Java keyExists() has an incorrect behaviour in most cases #12921
Hello @alim-akbashev, and thanks for reaching out to our community. I double-checked the code and didn't find any misbehavior. I also wrote a small test to verify your concerns. The
Originally this function used a BloomFilter, but today it may be replaced with a RibbonFilter, or for small databases it may use the cache.
Radek
Hi @rhubner, thanks for your reply. Yeah, it seems my first assumption about the reason for the behaviour was wrong. But I still have concerns regarding the current behaviour.
If it were so, that would not be an issue for me. But unfortunately it does not behave that way.
I believe the test will fail if we add a re-open of the db between put and keyExists/keyMayExist (to clear all kinds of possible caches). At least, that is exactly the behaviour I see using Java code: keyExists/keyMayExist returns
Hi @alim-akbashev ,
Feel free to update the test. Maybe I missed something. I also tried to compact the DB or flush the WAL, but it didn't have any effect on the test.
Radek
Yeah, I'm so sorry, @rhubner, my bad: I didn't double-check the steps to reproduce. It turned out that to reproduce the issue:
1. the db must be filled with much more data;
2. the app must be restarted; just closing and opening the db again does not show the issue.
Anyway, I've created a demo project which reproduces the issue 100% of the time (at least in my environment): https://github.com/alim-akbashev/keyExistsBugDemo It consists of two mini-apps:
Hope it helps with the investigation.
Hello @alim-akbashev, thanks for the example. I can replicate your behavior on my machine. I think it's key_may_exist that returns the wrong result. But this happens only for TtlDB; for normal RocksDB I'm getting correct results. Maybe different implementations? Let me investigate more.
Radek
Hello @alim-akbashev, I did some debugging over the weekend and today and moved a little bit further. I will be talking primarily about DBWithTTLImpl::KeyMayExist The first call
What's strange is that
Radek
Hello @rhubner, thanks for your efforts. Here's a quick update from my side: I've updated the demo app. It now contains only the settings which really affect keyExists/keyMayExist behaviour, and it is now a single app: https://github.com/alim-akbashev/keyExistsBugDemo/blob/main/src/main/java/akbashev/rocksdb/MinimalStepsToReproduceApp.java :
So now I am more or less certain that the behaviour depends on the index: if a db is configured to place indices in a block cache instead of on the heap (am I right that by default a bloom filter and an index are on the heap and are preloaded when opening a db?), keyMayExist does not trigger loading the corresponding index into the cache as
Thanks,
Hello @alim-akbashev, I think I found it. In DBImpl::KeyMayExist we always set
Then at the end, you can see that the method returns
I think the method should have something like this at the end:
    if (value_found != nullptr && s.IsIncomplete()) {
      *value_found = false;
    }
or line
@pdillinger, what do you think about it? Unfortunately I'm not very familiar with this part of the RocksDB codebase. After the proposed change,
Radek
cc: @adamretter
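To make the proposed change easier to follow, here is a minimal, self-contained sketch of the idea. It is a toy model, not RocksDB code: `ToyDb`, `ToyStatus`, `GetFromCacheOnly`, and the exact-match `filter` set are hypothetical names invented for illustration, and the return expression (treating both "ok" and "incomplete" as "may exist") is an assumption about the part of the method that is not quoted above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy status codes that mimic the three outcomes discussed in the thread.
enum class ToyStatus { kOk, kNotFound, kIncomplete };

// Hypothetical stand-in for a database whose lookups are limited to a cache.
struct ToyDb {
  std::set<std::string> filter;              // keys the filter reports as possibly present
  std::map<std::string, std::string> cache;  // key/value pairs currently cached

  // Cache-only lookup: it never performs I/O, so it may have to give up.
  ToyStatus GetFromCacheOnly(const std::string& key, std::string* value) const {
    if (filter.count(key) == 0) return ToyStatus::kNotFound;  // definitely absent
    auto it = cache.find(key);
    if (it == cache.end()) return ToyStatus::kIncomplete;  // would need I/O
    *value = it->second;
    return ToyStatus::kOk;
  }

  // Toy version of a "key may exist" check with the proposed adjustment.
  bool KeyMayExist(const std::string& key, std::string* value,
                   bool* value_found) const {
    assert(value != nullptr);
    if (value_found != nullptr) *value_found = true;  // optimistic default
    const ToyStatus s = GetFromCacheOnly(key, value);
    // Proposed change: the value was not fetched if the lookup was incomplete.
    if (value_found != nullptr && s == ToyStatus::kIncomplete) {
      *value_found = false;
    }
    // Assumption: an incomplete lookup still counts as "may exist".
    return s == ToyStatus::kOk || s == ToyStatus::kIncomplete;
  }
};
```

In this model a key that is known to the filter but not cached yields `true` with `*value_found == false`, which tells the caller that a full read is still needed to confirm the key and fetch its value.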
The method keyExists() returns a correct result only when the data exists in a block cache.
It seems that key_exists_helper at rocksjni.cc has a bug due to a side effect of the C++ impl of KeyMayExist, which changes the state of read_options at rocksdb/db/db_impl/db_impl.cc, line 3765 in b26b395.
So a subsequent call of db->Get(read_opts, ...) looks up only through the block cache tier too:
UPDATE: probably my assumption about the root cause is wrong. It seems that keyMayExist() just does not work as expected (and advertised in docs):
Actually it returns a false negative all the time :( It only returns a positive result if the data for the key is in a data block cache.
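The contract at stake here can be illustrated with a small toy model: a cache-only "may exist" check is allowed to return false positives but never false negatives, and the follow-up read that confirms the key has to search all tiers. Everything below (`TieredDb`, `TierStatus`, `ToyReadTier`, `ToyReadOptions`) is hypothetical and only loosely modelled on the names mentioned in this report; it is a sketch of the expected behaviour, not the actual rocksjni or RocksDB implementation.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class ToyReadTier { kAllTiers, kCacheOnly };
enum class TierStatus { kOk, kNotFound, kIncomplete };

// Hypothetical read options: only the tier matters in this sketch.
struct ToyReadOptions {
  ToyReadTier tier = ToyReadTier::kAllTiers;
};

struct TieredDb {
  std::map<std::string, std::string> disk;   // authoritative data
  std::map<std::string, std::string> cache;  // subset of disk that is cached

  TierStatus Get(const ToyReadOptions& opts, const std::string& key,
                 std::string* value) const {
    assert(value != nullptr);
    auto c = cache.find(key);
    if (c != cache.end()) {
      *value = c->second;
      return TierStatus::kOk;
    }
    // A cache-only read cannot tell whether the key is on disk.
    if (opts.tier == ToyReadTier::kCacheOnly) return TierStatus::kIncomplete;
    auto d = disk.find(key);
    if (d == disk.end()) return TierStatus::kNotFound;
    *value = d->second;
    return TierStatus::kOk;
  }

  // Works on a private copy of the options, so the caller's tier is untouched.
  bool KeyMayExist(const ToyReadOptions& opts, const std::string& key,
                   std::string* value) const {
    ToyReadOptions cache_only = opts;
    cache_only.tier = ToyReadTier::kCacheOnly;
    const TierStatus s = Get(cache_only, key, value);
    return s == TierStatus::kOk || s == TierStatus::kIncomplete;
  }

  // Two-step check: cheap cache-only probe, then a full read to confirm.
  bool KeyExists(const ToyReadOptions& opts, const std::string& key) const {
    std::string value;
    if (!KeyMayExist(opts, key, &value)) return false;
    return Get(opts, key, &value) == TierStatus::kOk;
  }
};
```

In this sketch a key that lives only on disk is still reported as existing, because the confirming `Get` runs with the caller's unmodified options. If the confirming read were limited to the cache tier as well, uncached keys would come back as false negatives, which is the symptom described in this report.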
Expected behavior
keyExists() should not produce false negatives.
Actual behavior
keyExists() returns a false negative if the data for a requested key is not in a block cache.
Steps to reproduce the behavior