Paging past 10k items #111
Comments
Is there any plan to use |
It hasn't been determined yet, but likely yes, as it keeps coming up. There's no ETA on this, though. The next version of pystac-client, due in Q1 of 2022, will allow splitting up searches and making async requests, which would be another, and perhaps better, way to get around this limit.
search_after with a stable sort, such as created ascending, is the preferred way to do this. One note: ES uses milliseconds since the epoch as the search_after value for datetime fields (which is unclear from their docs). To guarantee a stable next value, also sort on the item id and collection, so that documents with the same creation timestamp don't get arbitrarily reordered between queries and mess up pagination.
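A minimal sketch of what such a request body could look like, assuming the index exposes the fields as properties.created, id, and collection (these field paths, and the helper names to_epoch_millis and build_search_body, are illustrative assumptions, not stac-server's actual mapping or code):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_timestamp):
    """Convert an ISO 8601 timestamp to integer milliseconds since the epoch."""
    dt = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids float rounding errors.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def build_search_body(limit, last_item=None):
    """Build an Elasticsearch search body that pages with search_after.

    last_item is the final item of the previous page, or None for the
    first page. Field paths below are hypothetical.
    """
    body = {
        "size": limit,
        # Tie-breakers (id, collection) keep the order stable when two
        # documents share the same creation timestamp.
        "sort": [
            {"properties.created": "asc"},
            {"id": "asc"},
            {"collection": "asc"},
        ],
    }
    if last_item is not None:
        # One search_after value per sort key, in the same order.
        body["search_after"] = [
            to_epoch_millis(last_item["properties"]["created"]),
            last_item["id"],
            last_item["collection"],
        ]
    return body
```

The first page is requested with build_search_body(100), and each following page passes the last item of the previous page as last_item.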
Update: it looks like ES now supports specifying the ISO 8601 datetime directly instead of converting it to epoch milliseconds, but I don't know if that's supported in the 7.12 we're using.
Paging past 10k items throws an error that is meaningless to users: search_phase_execution_exception
Up to 10k items: https://earth-search.aws.element84.com/v0/search?limit=100&page=100
Past 10k items: https://earth-search.aws.element84.com/v0/search?limit=100&page=101
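The arithmetic behind these two URLs can be sketched as follows, assuming page is 1-based and is translated into an Elasticsearch from offset of (page - 1) * limit (an assumption about stac-server that is consistent with page=100 succeeding and page=101 failing). Elasticsearch rejects a request when from + size exceeds index.max_result_window, which defaults to 10,000:

```python
MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window


def window_end(limit, page):
    """Return from + size for a 1-based page (assumed page-to-offset mapping)."""
    offset = (page - 1) * limit
    return offset + limit


def within_window(limit, page, max_window=MAX_RESULT_WINDOW):
    """True if the requested page fits inside the result window."""
    return window_end(limit, page) <= max_window


if __name__ == "__main__":
    for page in (100, 101):
        print(page, window_end(100, page), within_window(100, page))
```

With limit=100, page 100 ends exactly at item 10,000 and is allowed, while page 101 would end at item 10,100 and is rejected.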
This limit can be changed: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html#index-max-result-window
or search_after can be used, which would change how pagination works in stac-server: https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html#search-after
Even if the limit were raised (which would consume more memory and might reduce performance, so performance testing would be required), there would still be a limit, only a higher one. So at the very least, stac-server should throw a meaningful error message when trying to page past 10k items.
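One way such a check might look, as a sketch only: the exception class PageLimitExceeded, the function check_page, and the message wording are all hypothetical, and the page is again assumed to be 1-based. stac-server itself is not written in Python; this only illustrates the idea of validating the request before it reaches Elasticsearch.

```python
class PageLimitExceeded(ValueError):
    """Raised when a request pages past the configured result window."""


def check_page(limit, page, max_window=10_000):
    """Reject requests whose last item would fall outside the result window.

    Assumes a 1-based page, so the last requested item is page * limit.
    """
    if page * limit > max_window:
        last_page = max_window // limit
        raise PageLimitExceeded(
            f"Cannot return page {page} with limit {limit}: paging is limited "
            f"to the first {max_window} items (last available page is "
            f"{last_page}). Narrow the search to reach further results."
        )


if __name__ == "__main__":
    check_page(100, 100)  # within the window, no error
    try:
        check_page(100, 101)
    except PageLimitExceeded as err:
        print(err)
```

Raising the limit would only change the max_window value passed in; the check and the message stay the same.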