Enable FDP for CacheBench #302
Comments
Yes, that was just the example path. I changed it to /dev/nvme0n1, which is a drive with FDP enabled, which I can verify with
Here is the log
and I should have FDP enabled here
|
@jmhands Actually, the name of the config is |
getting closer! That worked for enabling FDP but I'm getting an abort after the 30 seconds of runtime I specified
|
This happens when the liburing library (which FDP support currently depends on) is not available on the system and the CacheLib build system chose not to install it for some reason. |
I'm using Ubuntu 22.04.4 LTS with HWE kernel, 6.5.0-26-generic. io_uring works fine in fio, etc. |
after installing
|
Building and installing liburing from source appears to work. The following method works for FDP:
git clone https://github.com/axboe/liburing.git
git clone https://github.com/facebook/CacheLib.git |
After building liburing from source it still fails at Folly. I can get it to go farther by linking
after linking
|
@jmhands Is this error occurring even with a clean build after removing Yeah, those flags like Doesn't liburing have a Debian build as well? https://github.com/axboe/liburing/blob/master/make-debs.sh |
I was able to get it working with the following steps
this builds correctly but then I get an error when I run cachebench
but was able to resolve with
now that cachebench works... add these into config
|
Hi @jaesoo-fb, I have a few questions about testing CacheBench after enabling FDP.
1. WAF Expectation for KVCache
2. [Error] We saw an IO error in the log, but the test didn't fail. If the code below works correctly, I would expect this IO error never to happen.
3. [Fatal Error] The test failed due to an out-of-range issue. |
Hi @FletcherAtFADU ,
You might have selected a kvcache workload without BigHash (BH) enabled. Could you please check "navyBigHashSizePct" in the config.json file of the selected workload?
Could you check "nvmCacheSizeMB" against your device size? It looks like "nvmCacheSizeMB" might exceed the size of the NVMe namespace or partition that you have chosen. Could you attach the config.json and the initialization/run logs from cachebench? That would help us analyze it better.
|
Hi, @arungeorge83
=> I've checked that the default setting of "navyBigHashSizePct" is 0.
=> nvmCacheSizeMB is set to 932000 (MB), but the device capacity is 1.25 TB (1250602278912 bytes), which is larger than nvmCacheSizeMB. |
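The size comparison above can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `fits_on_device` is hypothetical, and it assumes the "MB" in nvmCacheSizeMB means 1024 * 1024 bytes (the more conservative reading; with 10^6 bytes per MB the cache would be smaller still).

```python
MIB = 1024 * 1024  # assumed size of one "MB" in nvmCacheSizeMB


def fits_on_device(nvm_cache_size_mb: int, device_bytes: int) -> bool:
    """Return True if the configured cache size does not exceed the device."""
    return nvm_cache_size_mb * MIB <= device_bytes


# Values quoted in the comment above.
cache_mb = 932000
device_bytes = 1250602278912

print(cache_mb * MIB)                          # 977272832000
print(fits_on_device(cache_mb, device_bytes))  # True
```

With these numbers the configured cache is about 977 GB, which is below the reported device capacity, so under this assumption the size alone would not explain an out-of-range error.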
Yes, it should be non-zero for BigHash-enabled cases. (I see that you have used test_configs/ssd_perf/kvcache_l2_wc/, which does not have BigHash enabled.) Please use the production traces mentioned at https://cachelib.org/docs/Cache_Library_User_Guides/Cachebench_FB_HW_eval#running-cachebench-with-the-trace-workload for FDP experiments. Or you can use test_configs/ssd_perf/flat_kvcache_reg after changing the device from /dev/md0 to /dev/nvme0n1. The IO errors look interesting. |
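A quick way to confirm whether a given workload config has BigHash enabled is to read the value out of its config.json. This is a minimal sketch, not part of CacheLib: the helper `bighash_enabled` is hypothetical, and it assumes the setting lives under a top-level "cache_config" object and is treated as 0 when absent, as reported above.

```python
import json


def bighash_enabled(config: dict) -> bool:
    """Return True if navyBigHashSizePct is present and non-zero."""
    # Assumption: the key sits under "cache_config"; a missing key counts as 0.
    cache_config = config.get("cache_config", {})
    return cache_config.get("navyBigHashSizePct", 0) > 0


# Inline sample standing in for a real config.json; the values are illustrative.
sample = json.loads('{"cache_config": {"navyBigHashSizePct": 0}}')
print(bighash_enabled(sample))  # False
```

For a real file, the dictionary can be loaded with `json.load(open(path))` and passed to the same function.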
@arungeorge83 Additionally, even when an IO Error occurs, the tests continue to run. |
@arungeorge83 buddy, could you please share your CacheLib test config for FDP SSD? |
@gaowayne Please find the sample FDP config used with kvcache production traces. |
@arungeorge83 @gaowayne There is still one thing that I do not understand. 1 More Question: |
@FletcherAtFADU It is great to know that you are able to re-produce the results.
The current code does not support that, though a configurable RUH allocation mechanism is being considered.
Interesting. We were able to test with the full capacity of the device. |
@arungeorge83 buddy, if I want to build this, can I start from here? https://github.com/arungeorge83/CacheLib/tree/fdp/fdp_upstream_PR2 |
@gaowayne Yes. You can also use the latest working code from the main branch; that PR has already been merged. |
@arungeorge83 Thank you so much, man. I tried to build CacheLib, but hit this on my OS. :(
|
I was seeing the same IO errors that @FletcherAtFADU reported in this thread. However, it would possibly be better to either query the device capabilities when setting the @jaesoo-fb Do you think this is an actual issue worth investigating a bit more to post a patch? |
@MaisenbacherD: yes. We'd really appreciate it if you could send out a patch for this. It'd be good to detect the right setting for deviceMaxWriteSize on start up. |
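One possible way to query a device limit at startup on Linux is to read the block-queue attributes under sysfs. The sketch below is an assumption-laden illustration, not CacheLib code: the helper `max_write_size_bytes` is hypothetical, and whether `max_sectors_kb` is the right source for deviceMaxWriteSize is itself an assumption that a real patch would need to confirm.

```python
from pathlib import Path


def max_write_size_bytes(device: str, sysfs_root: str = "/sys/block") -> int:
    """Read the block layer's per-request size limit for a device, in bytes.

    `device` is a block device name such as "nvme0n1" (no "/dev/" prefix).
    The sysfs attribute max_sectors_kb is expressed in KiB.
    """
    path = Path(sysfs_root) / device / "queue" / "max_sectors_kb"
    return int(path.read_text().strip()) * 1024
```

Calling `max_write_size_bytes("nvme0n1")` on a Linux host returns the current limit for that device; the `sysfs_root` parameter exists only so the function can be exercised against a temporary directory.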
Dear all, @arungeorge83 @MaisenbacherD
|
@ByteMansion Please use a Linux Kernel version 6.1.32 and higher. |
I upgraded the kernel to 6.2 and the segmentation fault disappeared. Thanks for your help. @arungeorge83 |
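The kernel requirement mentioned above (6.1.32 or higher) can be checked before running CacheBench. This is a minimal sketch with a hypothetical helper, `kernel_at_least`, that parses the leading numeric part of a release string such as the one `platform.release()` returns.

```python
import re


def kernel_at_least(release: str, minimum=(6, 1, 32)) -> bool:
    """Compare the numeric prefix of a kernel release string to a minimum."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch level (e.g. "6.2") is treated as 0.
    version = tuple(int(part) if part else 0 for part in match.groups())
    return version >= minimum


print(kernel_at_least("6.5.0-26-generic"))    # True
print(kernel_at_least("5.15.0-101-generic"))  # False
```

To check the running system, pass `platform.release()` from the standard library as the `release` argument.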
Reading the commit notes from
009e89b
I tried to enable FDP by adding
to the following. The Samsung docs I was following to enable FDP say to add
"devicePlacement": true,
. Which one is it? But when I run CacheBench I see
and it fails at