Implements pagination #63
Comments
👋 Thanks for this. Adding pagination handling sounds good. I lean towards adding the functionality to the existing methods, as I don't love the idea of adding a bunch of new methods. If we went with new methods, I imagine we'd have to add a new method for every current method? P.S. pygbif is using the https://github.com/psf/black formatter now, so make sure you run it before pushing changes up. There are lots of text editor integrations, a command line tool, etc.
Any advancement concerning the limit of 300 records using
No. Note that this library is now maintained by the GBIF team; hopefully they'll chime in here to indicate whether that's something they're interested in.
Ok. I came up with a bit of code that solved my problem: not very clean, but functional enough!

    from pygbif import occurrences
    from tqdm import tqdm


    def paginated_search(max_limit, *args, **kwargs):
        """Fetch up to max_limit occurrences by paging through occurrences.search.

        In its current version, pygbif cannot search more than 300 occurrences
        at once; this solves a bit of the problem.
        """
        per_page = 100
        results = []
        # Count records, not bytes, in the progress bar.
        progress_bar = tqdm(total=max_limit, unit="rec")
        offset = 0
        while offset < max_limit:
            # Never request more records than are still missing.
            limit = min(per_page, max_limit - offset)
            resp = occurrences.search(*args, **kwargs, limit=limit, offset=offset)
            results += resp["results"]
            progress_bar.update(len(resp["results"]))
            if resp["endOfRecords"]:
                break
            offset += limit
        progress_bar.close()
        return results  # list of dicts
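For very long result sets, a lazy variant of the same loop may be easier on memory. The following is only a sketch: `iter_records` and `fake_fetch` are hypothetical names, not part of pygbif, and the code assumes a page function that accepts `limit` and `offset` keywords and returns a dict with `results` and `endOfRecords` keys, as in the snippet above.

```python
from itertools import islice


def iter_records(fetch_page, per_page=100):
    """Yield records one at a time, requesting a new page only when needed.

    fetch_page is a hypothetical callable taking limit and offset keyword
    arguments and returning a dict with 'results' and 'endOfRecords' keys.
    """
    offset = 0
    while True:
        resp = fetch_page(limit=per_page, offset=offset)
        yield from resp["results"]
        # Stop on the last page, or on an empty page to avoid looping forever.
        if resp["endOfRecords"] or not resp["results"]:
            return
        offset += per_page


def fake_fetch(limit, offset, total=250):
    """Stand-in for a real API call: serves the integers 0..total-1 in pages."""
    page = list(range(offset, min(offset + limit, total)))
    return {"results": page, "endOfRecords": offset + limit >= total}


# Only the first two pages are requested to produce 120 records.
first_120 = list(islice(iter_records(fake_fetch), 120))
everything = list(iter_records(fake_fetch))
```

With pygbif, `fetch_page` could be a small lambda that forwards `limit` and `offset` to `occurrences.search` together with the query arguments.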
Hi @sckott!
For a current project I'm working on with @damianooldoni, we'll need to access a long list of results from the name_usage() API call.
Instead of implementing the pagination/looping in our client code (like in this quick & dirty example: https://gist.github.com/niconoe/b9dcb6c468b996b6f77e18f51516e840), we were wondering if you'd be interested in receiving a PR to implement it in pygbif itself. That would be similar to what Damiano did (for rgbif) in ropensci/rgbif#291 and ropensci/rgbif#295.
The plan would be to:
- Handle pagination for name_usage(), but also for other functions that deal with paginated results from the GBIF API.
- Either add new functions (all_name_usages(), for example) or change the existing functions. To avoid breaking the API, we could add an optional parameter (that defaults to False) to tell pygbif to handle the pagination. So, for example, name_usage() and name_usage(handle_pagination=False) would keep the existing behaviour, but name_usage(handle_pagination=True) would take care of the pagination and return all results. I have a slight preference for the first option (new functions) because I find the API clearer, but it's up to you!
Just tell us what you think. If you're interested, we hope to start working on a PR soon!
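To make the second option more concrete, here is a minimal sketch of how an opt-in `handle_pagination` flag could be layered on top of an existing function without changing its default behaviour. Everything here is an assumption for illustration: `with_pagination` and `fake_name_usage` are hypothetical, and the wrapped function is assumed to accept `limit` and `offset` and to return a dict with `results` and `endOfRecords` keys. It is not how pygbif implements anything.

```python
import functools


def with_pagination(func, per_page=100):
    """Wrap func so that it accepts an optional handle_pagination keyword."""

    @functools.wraps(func)
    def wrapper(*args, handle_pagination=False, **kwargs):
        if not handle_pagination:
            # Default: existing behaviour, the single call is passed through.
            return func(*args, **kwargs)
        results = []
        offset = 0
        while True:
            page_kwargs = {**kwargs, "limit": per_page, "offset": offset}
            resp = func(*args, **page_kwargs)
            results.extend(resp["results"])
            # Stop on the last page, or on an empty page as a safety net.
            if resp["endOfRecords"] or not resp["results"]:
                break
            offset += per_page
        # Keep the shape of a single response, but with every record in it.
        return {**resp, "results": results}

    return wrapper


def fake_name_usage(limit=100, offset=0, total=230):
    """Stand-in for a paginated API call serving `total` dummy records."""
    keys = range(offset, min(offset + limit, total))
    return {
        "offset": offset,
        "limit": limit,
        "endOfRecords": offset + limit >= total,
        "results": [{"key": k} for k in keys],
    }


name_usage = with_pagination(fake_name_usage)
single_page = name_usage(limit=5)               # unchanged behaviour
all_pages = name_usage(handle_pagination=True)  # loops over every page
```

The first option (a separate all_name_usages() function) could reuse the same loop; the difference is only in how the behaviour is exposed to callers.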