Do not skip metadata when querying SELECT * FROM ...
#275
I think this logic belongs in the server. All drivers are susceptible to it. |
Although, we have to check:
|
True, @dkropachev was supposed to create a core issue for that. This PR is meant as a temporary solution to help a customer with correctness problems when they run this type of query, as solving it in the core would probably take some time. AFAIK there is currently no way to invalidate prepared statements when the schema changes. In CQL v5 that is solved and we could have something analogous, am I right @Lorak-mmk? |
This commit disables the skip-metadata optimization for `SELECT * FROM ...` queries due to potential correctness issues when the table is altered. This may decrease performance compared to always skipping metadata. To avoid that, it is recommended not to use `SELECT * FROM ...` and to list specific column names instead.
bf55aaa to c8f8c58
If we set |
Having read the protocol spec, I'm undecided.
So I lean towards interpretation that the protocol does allow returning metadata even when
In the Rust driver's case, cached metadata, if present, is used unconditionally. However, it would be a trivial change to compare the cached metadata with the most recent result metadata and invalidate the prepared statement in case of a mismatch, and also to use the result metadata (if present) instead of the cached metadata. |
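The driver-side change described in this comment could look roughly like the sketch below. It is written in Python for brevity and is not the Rust driver's code: `PreparedStatement`, `needs_reprepare`, and `pick_metadata` are hypothetical names, and a column spec is simplified to a `(name, type)` pair. What "invalidate" means in practice (here: flag the statement for re-preparation) is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

# Simplified column spec: (column name, CQL type name). Real drivers carry more detail.
ColumnSpec = Tuple[str, str]


@dataclass
class PreparedStatement:
    query: str
    cached_metadata: Tuple[ColumnSpec, ...]
    needs_reprepare: bool = False  # hypothetical "invalidated" flag


def pick_metadata(
    stmt: PreparedStatement,
    result_metadata: Optional[Sequence[ColumnSpec]],
) -> Tuple[ColumnSpec, ...]:
    """Choose the metadata used to decode a response.

    If the server sent metadata, prefer it over the cache. If it differs
    from the cache, also flag the statement so it can be re-prepared.
    """
    if result_metadata is None:
        # The server honoured the skip-metadata request: fall back to the cache.
        return stmt.cached_metadata
    fresh = tuple(result_metadata)
    if fresh != stmt.cached_metadata:
        stmt.needs_reprepare = True
    return fresh
```

With this shape, a server that ignores the skip-metadata request after a schema change is enough for the driver to decode rows correctly, without any protocol extension.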
I agree with your analysis. However, the Rust case proves that a server-side solution is insufficient and we must audit and possibly fix all the drivers anyway. So the question isn't whether to apply a server-side or client-side solution, it's whether to apply a (more accurate) server-side solution and audit/fix all the drivers, or just patch all the drivers. I think it's worthwhile (and easy) to do it in the server, so why not, even though it doesn't save us work on the drivers. |
@wprzytula - sounds like the classic case of "Be liberal in what you accept, and conservative in what you send.", no? You can ask for no metadata, but be prepared to process it if your request is ignored. Ignoring it seems like a bug on the server side though. |
Here, I'm proposing for the server to be less conservative: even though the driver requested to skip metadata, I'm sending it anyway. I'm relying on the driver to be liberal and accept it anyway. We already know the Rust driver will accept it but throw it away. |
If we do it on the server I think it would be good to use the solution from cql v5. In cql v5 when preparing a statement there are 2 IDs returned:
During EXECUTE, the driver sends both IDs. If the server detects that the result metadata ID is incorrect (because the schema changed), it sends the new result metadata ID (plus, of course, the new result metadata) in the RESULT, so no additional round trip is required. The driver can then handle the response correctly and update its local data to reflect the new result metadata. This solves the issue with SELECT * (and similar problems with UDTs) without sacrificing performance, because metadata does not need to be sent on each request, only once after the schema changes. |
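The exchange described above can be modelled with a toy client and server. This is only an illustration of the idea, not the CQL v5 wire format: `ToyServer`, `ToyClient`, `Response`, and the hash-based `metadata_id` are all hypothetical, the server holds a single statement, and rows are fabricated placeholders.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Tuple

Columns = Tuple[str, ...]


def metadata_id(columns: Columns) -> str:
    """Hypothetical result metadata ID: a digest of the column names."""
    return hashlib.sha256(",".join(columns).encode()).hexdigest()[:16]


@dataclass
class Response:
    rows: List[tuple]
    new_metadata_id: Optional[str] = None  # set only when the client's ID was stale
    new_columns: Optional[Columns] = None


class ToyServer:
    def __init__(self, columns: Columns):
        self.columns = columns

    def prepare(self):
        # Returns (statement ID, result metadata ID, result metadata).
        return "stmt-1", metadata_id(self.columns), self.columns

    def execute(self, statement_id: str, result_metadata_id: str) -> Response:
        # statement_id is unused because this toy server has one statement.
        rows = [tuple(range(len(self.columns)))]  # placeholder row
        current = metadata_id(self.columns)
        if result_metadata_id == current:
            return Response(rows)  # IDs match: metadata is skipped
        return Response(rows, current, self.columns)  # same round trip


class ToyClient:
    def __init__(self, server: ToyServer):
        self.server = server
        self.statement_id, self.result_metadata_id, self.columns = server.prepare()

    def execute(self) -> List[dict]:
        resp = self.server.execute(self.statement_id, self.result_metadata_id)
        if resp.new_metadata_id is not None:
            # Schema changed: refresh the local copy before decoding.
            self.result_metadata_id = resp.new_metadata_id
            self.columns = resp.new_columns
        return [dict(zip(self.columns, row)) for row in resp.rows]
```

After a simulated schema change (`server.columns = ("a", "b", "c")`), the first `client.execute()` receives the new metadata once, and subsequent calls go back to skipping it.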
A side note: Rust driver's
An argument for the second solution is that if a schema change affects the table queried by the particular PreparedStatement, then chances are the user's struct reflecting the table no longer fits, so a type check error would be thrown later anyway. Can we somehow differentiate between schema changes that affect a particular statement and those that don't? Preferably on server side, so that new metadata is only sent when the statement is affected. |
Solution from CQL v5 that I mentioned in the comment above does that. |
@dkropachev @avikivity @Lorak-mmk I think we agree that changes on the server side are needed, but what do you recommend doing with this PR? Is it okay to keep it as a temporary fix, or should we wait until it is fixed on the server side? |
@sylwiaszunejko, I think we need to merge it in. Let me quickly test other cases; I will come back with approval in a couple of hours. |
@sylwiaszunejko, I see that we have exactly the same problem with UDTs. Let's brainstorm whether there is a way to cover all the cases. |
Could you provide the reproducer for UDTs? |
UPDATED |
@sylwiaszunejko , |
Maybe we should reconsider going with the Rust driver approach and disabling metadata skipping by default, if there are so many cases where we risk incorrect results |
And create a core issue to address it on server side, and when it is merged we could return default to |
core issue |
We shouldn't block this series, server side changes are much slower to propagate. |
Are UDTs really affected? Yes, if a UDT has types added, the server will serialize a larger tuple, but the driver can/should just ignore the new fields (are they always added at the end? must be). It can't ignore new fields in SELECT * because fields aren't ordered wrt addition time. |
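To make the `SELECT *` half of this point concrete, here is a small sketch of what goes wrong when a row is decoded positionally with stale cached metadata. The table, the column names, and the position of the added column are made up for illustration; the only assumption taken from the comment is that a newly added column need not land at the end of the result.

```python
def decode_row(column_names, values):
    """Pair values with column names by position, as a driver would."""
    return dict(zip(column_names, values))


# Metadata cached when the statement was prepared.
cached_columns = ["id", "name"]

# Hypothetical layout after ALTER TABLE ... ADD email: the new column
# appears in the middle of the SELECT * result, not at the end.
current_columns = ["id", "email", "name"]
row_values = [1, "a@example.com", "Alice"]

stale = decode_row(cached_columns, row_values)
fresh = decode_row(current_columns, row_values)

print(stale)  # the email value is silently reported under "name"
print(fresh)  # every value is reported under its own column
```

A trailing extra field could simply be dropped, which is why appended UDT fields are less dangerous than a column inserted mid-row.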
@sylwiaszunejko , @avikivity and another case with |
As there are many other cases where the behavior is broken, I am closing this PR. I will open another one that changes the default behavior to disable skipping metadata until this is fixed on the server side
This PR disables the skip-metadata optimization for `SELECT * FROM ...` queries due to potential correctness issues when the table is altered. This may decrease performance compared to always skipping metadata. To avoid that, it is recommended not to use `SELECT * FROM ...` and to list specific column names instead.
Fixes: #261
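As a rough illustration of the check this description implies, the sketch below decides per query whether a driver may request skipped metadata. It is written in Python and is not the code from this PR: `should_skip_metadata` and the regular expression are hypothetical, and the pattern only covers the plain `SELECT * FROM ...` form (variants such as `SELECT JSON *` or queries preceded by comments are not handled).

```python
import re

# Matches queries that start with "SELECT * FROM", ignoring case and
# extra whitespace. An assumption for illustration, not a full CQL parser.
_SELECT_STAR = re.compile(r"^\s*SELECT\s+\*\s+FROM\b", re.IGNORECASE)


def should_skip_metadata(query: str) -> bool:
    """Return True if it is considered safe to skip result metadata."""
    return _SELECT_STAR.match(query) is None
```

Queries with an explicit column list keep the optimization, which is why listing columns avoids the performance cost mentioned above.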