github-to-sqlite should handle rate limits better #51
Comments
This just caused a failure in deploying the demo: https://github.com/dogsheep/github-to-sqlite/runs/1471304407?check_suite_focus=true
I don't have much experience with GitHub's rate limiting. In my day job we use the tenacity library to handle the HTTP errors we get.
I've been looking into how to get this data out of GitHub, especially now that there are "secondary rate limits" with no advertised allowance, separate from the regular rate limits. I've had decent success with the Airbyte GitHub extractor (aside from one data quality issue, airbytehq/airbyte#15420). Airbyte splits data extraction between the GraphQL and REST endpoints depending on the resource type, but they're very comprehensive. Before this, I tried a few solutions in my own custom wrapper, mentioned in this thread and its children (PyGithub/PyGithub#1989), but they weren't working as expected.
Also, it says that authenticated requests have a much higher "rate limit". Unauthenticated requests only get 60 requests per hour, which seems more like a quota than a "rate limit" (although I guess the two are semantically equivalent). You would want to use
But a more complete solution would bring authenticated requests to the other subcommands. I'm surprised only |
From #50 - right now it will crash with an error if it hits the rate limit. Since the rate limit information (including reset time) is available in the response headers, it could automatically sleep and try again instead.
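A rough sketch of the sleep-and-retry idea, not taken from github-to-sqlite itself. It assumes the GitHub REST API's `X-RateLimit-Remaining` and `X-RateLimit-Reset` (epoch seconds) headers, plus `Retry-After` for secondary limits. The helper names `seconds_to_wait` and `fetch_with_retry`, and the `fetch` callable returning `(status, headers, body)`, are hypothetical and exist only for illustration.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def seconds_to_wait(status: int, headers: Mapping[str, str], now: float) -> Optional[float]:
    """Return how long to sleep before retrying, or None if not rate limited.

    Header names are matched exactly here; with a plain dict they are
    case-sensitive (an HTTP client's header mapping may be more lenient).
    """
    if status not in (403, 429):
        return None
    # Secondary rate limits may include Retry-After, in seconds.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    # Primary limit: nothing remaining, reset time given as epoch seconds.
    if headers.get("X-RateLimit-Remaining") == "0":
        reset = float(headers.get("X-RateLimit-Reset", now))
        return max(0.0, reset - now) + 1.0  # one-second safety margin
    return None


def fetch_with_retry(
    fetch: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> Response:
    """Call fetch(), sleeping until the limit resets when rate limited."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = fetch()
        status, headers, _body = response
        wait = seconds_to_wait(status, headers, clock())
        # Return on success, on a non-rate-limit error, or when out of attempts.
        if wait is None or attempt == max_attempts - 1:
            return response
        sleep(wait)
    raise AssertionError("unreachable")
```

Passing `sleep` and `clock` in as parameters keeps the logic testable without real waiting. In practice `fetch` would wrap whatever HTTP call the tool already makes.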