Support for Parallel File Systems #754

Open
henoby opened this issue Jun 5, 2023 · 4 comments
Comments


henoby commented Jun 5, 2023

Hi,

thanks a lot for your great work!
We would like to use GoCryptFS from multiple nodes on our HPC system concurrently to write to our parallel file systems (e.g. BeeGFS, Lustre, GPFS, NFS). However, when multiple processes write to a single file concurrently, we seem to hit a race condition.

henoby changed the title from "Supprt for Parallel File Systems" to "Support for Parallel File Systems" on Jun 5, 2023
Owner

rfjakob commented Jun 13, 2023

Work is ongoing in the https://github.com/rfjakob/gocryptfs/tree/LockSharedStorage branch.

This uses "open file description locks" (F_OFD_SETLKW), which have a sensible API and also work on NFS, according to
https://gavv.net/articles/file-locks/#differing-features
Note: the article http://0pointer.de/blog/projects/locking.html is from 2010, when F_OFD_SETLKW did not exist yet.

I will comment again once I have something ready to test.
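
For readers unfamiliar with open file description locks, here is a minimal Python sketch of taking a byte-range lock with F_OFD_SETLKW. It is not gocryptfs code. The helper name `ofd_lock` is made up for illustration, the `struct flock` layout assumes 64-bit Linux, and the numeric fallbacks 37 and 38 are the Linux values of F_OFD_SETLK and F_OFD_SETLKW for Python versions that do not export them.

```python
import errno
import fcntl
import os
import struct

# Linux command numbers; the fcntl module exports them from Python 3.9 on.
F_OFD_SETLK = getattr(fcntl, "F_OFD_SETLK", 37)    # non-blocking
F_OFD_SETLKW = getattr(fcntl, "F_OFD_SETLKW", 38)  # wait for the lock


def ofd_lock(fd, lock_type, start, length, blocking=True):
    """Apply lock_type (F_RDLCK, F_WRLCK or F_UNLCK) to [start, start+length).

    Returns False if a non-blocking request hits a conflicting lock.
    """
    # struct flock on 64-bit Linux: l_type, l_whence, l_start, l_len, l_pid.
    # l_pid must be 0 for open file description locks.
    flock = struct.pack("hhqqi4x", lock_type, os.SEEK_SET, start, length, 0)
    try:
        fcntl.fcntl(fd, F_OFD_SETLKW if blocking else F_OFD_SETLK, flock)
    except OSError as err:
        if not blocking and err.errno in (errno.EAGAIN, errno.EACCES):
            return False
        raise
    return True
```

Because the lock belongs to the open file description rather than to the process, two descriptors obtained by separate `open()` calls conflict with each other even inside one process, which is what makes the API usable from multi-threaded programs.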

rfjakob added a commit that referenced this issue Jun 19, 2023
With -sharedstorage, when we get a decryption error, we lock the
byte range and try again.

This makes concurrent R/W safe against torn writes.

#754
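
The "lock the byte range and try again" idea from this commit message can be sketched roughly as follows. This is an illustration in Python, not the gocryptfs implementation: a CRC32 trailer stands in for the real authenticated decryption, `BLOCK_SIZE`, `seal`, `unseal`, `read_block` and `write_block` are hypothetical names, the `struct flock` layout assumes 64-bit Linux, and having writers hold a write lock on the range is an assumption made here so that the retry has something to wait for.

```python
import fcntl
import os
import struct
import zlib

F_OFD_SETLKW = getattr(fcntl, "F_OFD_SETLKW", 38)  # Linux value as fallback
BLOCK_SIZE = 64  # hypothetical on-disk block: 60 payload bytes + 4-byte CRC32


def _ofd_lock(fd, lock_type, start, length):
    # struct flock on 64-bit Linux; l_pid must be 0 for OFD locks.
    flock = struct.pack("hhqqi4x", lock_type, os.SEEK_SET, start, length, 0)
    fcntl.fcntl(fd, F_OFD_SETLKW, flock)


def seal(payload):
    """Stand-in for encryption: append a checksum over the payload."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def unseal(block):
    """Stand-in for decryption: raise ValueError if the block is damaged."""
    if len(block) < 4:
        raise ValueError("short block")
    payload, (crc,) = block[:-4], struct.unpack("<I", block[-4:])
    if zlib.crc32(payload) != crc:
        raise ValueError("integrity check failed")
    return payload


def write_block(fd, block_no, payload):
    offset = block_no * BLOCK_SIZE
    _ofd_lock(fd, fcntl.F_WRLCK, offset, BLOCK_SIZE)
    try:
        os.pwrite(fd, seal(payload), offset)
    finally:
        _ofd_lock(fd, fcntl.F_UNLCK, offset, BLOCK_SIZE)


def read_block(fd, block_no):
    offset = block_no * BLOCK_SIZE
    try:
        # Fast path: read without any lock.
        return unseal(os.pread(fd, BLOCK_SIZE, offset))
    except ValueError:
        pass  # possibly a torn write in progress; retry under a lock
    _ofd_lock(fd, fcntl.F_RDLCK, offset, BLOCK_SIZE)
    try:
        # A failure here means the data on disk really is damaged.
        return unseal(os.pread(fd, BLOCK_SIZE, offset))
    finally:
        _ofd_lock(fd, fcntl.F_UNLCK, offset, BLOCK_SIZE)
```

The point of the optimistic first read is that the common, uncontended case pays no locking cost; only a failed integrity check triggers the lock and the second read.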
rfjakob added a commit that referenced this issue Jun 19, 2023
Complain loudly when the underlying storage does not support
byte-range locks.

#754
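
A probe for byte-range lock support could look roughly like the sketch below. It is not taken from gocryptfs: the function name `supports_byte_range_locks` and the warning text are made up, the `struct flock` layout assumes 64-bit Linux, and which error a given file system returns when it lacks lock support is not stated in this thread, so every error other than lock contention is treated as "unsupported".

```python
import errno
import fcntl
import os
import struct
import sys

F_OFD_SETLK = getattr(fcntl, "F_OFD_SETLK", 37)  # Linux value as fallback


def supports_byte_range_locks(fd):
    """Try a non-blocking one-byte read lock on fd and report the outcome."""

    def setlk(lock_type):
        # struct flock on 64-bit Linux; l_pid must be 0 for OFD locks.
        flock = struct.pack("hhqqi4x", lock_type, os.SEEK_SET, 0, 1, 0)
        fcntl.fcntl(fd, F_OFD_SETLK, flock)

    try:
        setlk(fcntl.F_RDLCK)
    except OSError as err:
        if err.errno in (errno.EAGAIN, errno.EACCES):
            # Somebody else holds a conflicting lock, so locking works.
            return True
        print(f"WARNING: byte-range locks unavailable: {err}", file=sys.stderr)
        return False
    setlk(fcntl.F_UNLCK)
    return True
```

Running such a check once at mount time, against a file on the backing storage, would surface the problem before any data is written.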
Owner

rfjakob commented Jun 19, 2023

Hi, this is ready to test. When mounting with -sharedstorage, all concurrent R/W and W/W should be safe now, as long as the storage supports F_OFD_SETLKW byte-range locks.

Author

henoby commented Jun 22, 2023

Hi,
thanks a lot for your very quick response. I have quickly tested it on BeeGFS from two client nodes using the following ior command:
mpirun -n 2 ior --dataPacketType=timestamp -C -Q 1 -g -G=-1367929591 -k -e -o /gocryptfs/mount/file -O stoneWallingStatusFile=/local/path/ior-hard.stonewall -t 47008 -b 47008 -s 10000000 -w -D 300 -a POSIX -O saveRankPerformanceDetailsCSV=/local/path/ior-hard-write.csv -O stoneWallingWearOut=1
Without the -sharedstorage flag, I get the expected "ERROR: write(17, 0x770000, 47008)" message.
With the flag, it works!
I will do some benchmarking to see the overall cost of using this lock, and will share the detailed results with you as soon as I have them.
Thanks a lot! :)

Owner

rfjakob commented Oct 30, 2024

Ping @henoby

rfjakob added a commit that referenced this issue Dec 4, 2024
With -sharedstorage, when we get a decryption error, we lock the
byte range and try again.

This makes concurrent R/W safe against torn writes.

#754
rfjakob added a commit that referenced this issue Dec 4, 2024
Complain loudly when the underlying storage does not support
byte-range locks.

#754