Working with huge graphs #62

Closed
sharpaper opened this issue May 14, 2020 · 3 comments

Comments

@sharpaper

"Huge" as in "too big to fit into memory". I don't think I've seen this mentioned in the README, so please excuse me if it's a dumb question. Considering that rdflib works in memory, I was wondering whether this plugin also needs to load the whole graph into memory in order to work.

@mwatts15
Collaborator

Great question, @sharpaper. rdflib-sqlalchemy can work with graphs larger than available memory, but there are caveats. It allocates memory proportional to the number of triples you add at once with addN, so a single large addN call can exhaust your memory. Likewise, a call to triples() allocates memory proportional to the number of triples returned.

As a work-around for adding triples, you can split the inserts into chunks that do fit in memory, although it's somewhat inconvenient. There isn't a great work-around for triples() memory usage other than writing queries with smaller result sets. That said, now that I see that memory usage is a problem, I don't think it would be too difficult to limit it here. I created a couple of issues (#63 and #64) to address this.
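To make the chunking work-around concrete, here is a minimal sketch of what it could look like. The helper names `chunked` and `add_in_chunks` and the default chunk size are hypothetical, not part of rdflib or rdflib-sqlalchemy; the only thing assumed about the graph is that its `addN` accepts an iterable of quads.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def add_in_chunks(
    add_n: Callable[[List[T]], object],
    quads: Iterable[T],
    size: int = 10_000,  # hypothetical default; tune to your memory budget
) -> int:
    """Call `add_n` once per chunk so only one chunk is held at a time.

    `add_n` is expected to behave like a graph's addN method.
    Returns the total number of items passed on.
    """
    total = 0
    for chunk in chunked(quads, size):
        add_n(chunk)
        total += len(chunk)
    return total
```

With a graph backed by this store, the call would be something like `add_in_chunks(graph.addN, quads)`, where `quads` is a generator that produces the quads lazily (for example while parsing a file) rather than a list that is already fully in memory.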

@sharpaper
Author

> There isn't a great work-around to triples() memory usage other than finding queries with smaller result sets

This makes sense; the client should limit the range of possible results. However, have you considered Python iterators? It could be a nice addition.
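As a rough illustration of the iterator idea, the sketch below streams rows from a database cursor in fixed-size batches instead of fetching the whole result set. It uses the standard-library `sqlite3` module and a made-up `triples` table; the schema and the names `make_demo_db` and `iter_triples` are assumptions for the example and do not reflect how rdflib-sqlalchemy stores or queries data.

```python
import sqlite3
from typing import Iterator, Tuple

Triple = Tuple[str, str, str]


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical triples table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
    rows = [(f"s{i}", "p", f"o{i}") for i in range(5)]
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def iter_triples(conn: sqlite3.Connection, batch_size: int = 1000) -> Iterator[Triple]:
    """Yield triples lazily, holding at most `batch_size` rows at a time."""
    cur = conn.cursor()
    cur.execute("SELECT subject, predicate, object FROM triples")
    try:
        while True:
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            yield from rows
    finally:
        # Runs when the generator is exhausted or closed early.
        cur.close()
```

A caller can then write `for s, p, o in iter_triples(conn): ...` and stop early without the remaining rows ever being loaded into Python.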

@mwatts15
Collaborator

mwatts15 commented May 14, 2020 via email
