hdtCat error in LongArrayDisk with large files #211
Comments
Hi, could you try out this: https://github.com/the-qa-company/qEndpoint/wiki/qEndpoint-CLI-commands#hdtdiffcat-qep-specific. It is an evolution of the tool.
@D063520 thank you for pointing that out, I hadn't come across it yet. I'm trying it now.
@D063520 the qEndpoint tool worked! It seems a good bit faster as well, but it uses quite a bit more RAM. I had originally been using a max heap of 150 GB, but ended up increasing it three times until it worked with a 400 GB heap. Now I've got an HDT file containing 27.5 billion triples.
@D063520 actually I used …
If you used the -kcat option it's the same; otherwise, by default the qEndpoint CLI uses the disk-optimized version and the rdfhdt CLI uses the in-memory version. The memory one is slow and not efficient.
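For reference, here is a minimal sketch of doing the same concatenation through the hdt-java API instead of the shell scripts. It assumes `HDTManager.catHDT` keeps the `(location, hdtFile1, hdtFile2, options, listener)` signature of recent hdt-java releases; the file paths and working location are placeholders, and which implementation (disk-based or in-memory) gets picked may depend on the library version and options.

```java
import org.rdfhdt.hdt.hdt.HDT;
import org.rdfhdt.hdt.hdt.HDTManager;
import org.rdfhdt.hdt.options.HDTSpecification;

public class HdtCatExample {
    public static void main(String[] args) throws Exception {
        // Placeholder paths: the two HDT files to merge and the output file.
        String hdt1 = "part1.hdt";
        String hdt2 = "part2.hdt";
        String output = "merged.hdt";

        // Assumed signature: catHDT(workingLocation, hdtFile1, hdtFile2, options, listener).
        // The first argument is a location for intermediate files; a null listener
        // simply disables progress reporting.
        try (HDT merged = HDTManager.catHDT(output + ".tmp", hdt1, hdt2,
                new HDTSpecification(), null)) {
            // Persist the concatenated HDT to disk.
            merged.saveToHDT(output, null);
        }
    }
}
```

If memory is a concern at this scale, it may be worth checking whether the installed version exposes an option key for selecting the disk-based implementation before running the merge.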
I'm trying to merge two HDT files using `hdtCat.sh`. Each file has more than 13 billion triples:

- 13736601325 triples
- 13827925785 triples

After about 25 hours I get this error:

I tried both `v3.0.10` and `v3.0.9` with the same result. I can provide these files, but each is about 170 GB. I haven't run into this issue with any smaller files.