Add option to skip RateLimitTransport #235
Conversation
LGTM
Hey there! My TF plans are taking 10 minutes in the PR, and 10 more minutes after merging, just to add a user to a team.
Hi @emoreth! Thank you for the PR, and I appreciate the opportunity to rethink how we want to approach this issue.

My concern with allowing unlimited concurrency is that the majority of users of the provider will likely hit the abuse rate limits. From my understanding, the abuse rate limits are different from the normal rate limits that include the 5000 requests/hour you referred to. Rate limiting: https://developer.github.com/v3/#rate-limiting. It appears to me that making multiple concurrent requests would trigger these abuse rate limits, so in order not to abuse GitHub's API, I think it's safer that we don't open that up.

That being said, perhaps we could rethink a different approach that is less restrictive, something that could be agreed upon by all users. One thought we had was to allow only 2 parallel requests at a time and see how that performs, and whether it triggers the abuse rate limits. I haven't tested that yet, so we'd have to see how it works. There is a concern with that as well: it goes against GitHub's documented best practice (https://developer.github.com/v3/guides/best-practices-for-integrators/#dealing-with-abuse-rate-limits).

Thanks!
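For illustration only, here is a minimal sketch (not the provider's actual code, and the names are invented) of what the "2 parallel requests" idea could look like: an `http.RoundTripper` that caps in-flight requests at a fixed number, using a buffered channel as a counting semaphore, instead of fully serializing them.

```go
// Hypothetical sketch only — the type and constructor names are illustrative,
// not part of terraform-provider-github.
package ghtransport

import "net/http"

type limitedTransport struct {
	base http.RoundTripper
	sem  chan struct{} // buffered channel used as a counting semaphore
}

func newLimitedTransport(base http.RoundTripper, maxInFlight int) *limitedTransport {
	if base == nil {
		base = http.DefaultTransport
	}
	return &limitedTransport{base: base, sem: make(chan struct{}, maxInFlight)}
}

func (t *limitedTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	t.sem <- struct{}{}        // acquire a slot; blocks while maxInFlight requests are pending
	defer func() { <-t.sem }() // release the slot once the response headers arrive
	return t.base.RoundTrip(req)
}
```

A client built as `&http.Client{Transport: newLimitedTransport(nil, 2)}` would approximate the two-parallel-requests experiment, though whether that avoids GitHub's abuse detection would still need to be tested against the real API.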
Hi @megan07! Thank you for your response. I totally agree with your concern about not abusing any limits, and I think GitHub knows that Terraform is not a form of abuse.

I'm currently managing a GitHub account with ~200 memberships, ~140 repositories, ~77 teams, ~150 associations between teams and repositories, and ~500 associations between teams and memberships. On top of that, every change needs to be planned on a PR and replanned/executed after merge. Still, all these operations add up to around 2.1K requests, which takes ~30 minutes; when using an older version of the plugin it cuts down to ~10 minutes. With that said, I don't have more than 2–5 GitHub change requests a day, which take me almost 3 hours to apply.

The setting I added was inactive by default, so whoever wants to manage it manually can opt in. Currently, it would be way easier for me to coordinate with my team to handle hitting the API limits than to have to wait for those plans. I do think that github-provider should not validate every single user/repo/etc. separately (as GitHub has endpoints that return multiple results), but I lack …

Since it took 23 days to get an answer for this PR, I'll wait for your approval before fixing these conflicts.

Regards,
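As an aside on the "endpoints that return multiple results" point, the sketch below shows one way to read an organization's members in pages of 100 through the REST list endpoint rather than issuing one request per member. The org name and token environment variable are placeholders, and error handling is trimmed; this is not code from the provider.

```go
// Sketch: list org members 100 at a time via GET /orgs/{org}/members,
// following the rel="next" links in the Link header for pagination.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strings"
)

type member struct {
	Login string `json:"login"`
}

// nextPageURL extracts the rel="next" URL from a GitHub Link header, or returns "".
func nextPageURL(link string) string {
	for _, part := range strings.Split(link, ",") {
		fields := strings.Split(part, ";")
		if len(fields) == 2 && strings.TrimSpace(fields[1]) == `rel="next"` {
			return strings.Trim(strings.TrimSpace(fields[0]), "<>")
		}
	}
	return ""
}

func main() {
	url := "https://api.github.com/orgs/my-org/members?per_page=100" // placeholder org
	for url != "" {
		req, err := http.NewRequest("GET", url, nil)
		if err != nil {
			panic(err)
		}
		req.Header.Set("Authorization", "token "+os.Getenv("GITHUB_TOKEN"))
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		var page []member
		if err := json.NewDecoder(resp.Body).Decode(&page); err != nil {
			panic(err)
		}
		resp.Body.Close()
		for _, m := range page {
			fmt.Println(m.Login)
		}
		url = nextPageURL(resp.Header.Get("Link"))
	}
}
```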
@emoreth have you looked at partitioning your GitHub organization(s) into multiple states and using data sources / remote states to share references between them? Given how GitHub rate limits (which IMO leaves much to be desired), it seems wasteful to check every single object in your entire org every time you make a change. At my org, each team or business unit has its own state where its memberships, repositories, etc. live. This acts as a natural limiter on your calls. I also manage several very large orgs with hundreds of users, repositories, branch protections, issue labels, repo topics, webhooks, etc. Even with these logical separations I ran into rate limits quite easily before adding caching and limiting the concurrency, but things are still generally well under 10 minutes for our larger teams, leaving the smaller teams quite snappy.

Terraform is a tool; it can be used to abuse a system even if that's not its intended use case. I have hit both types of rate limits with this provider. They are both concerns, but I can't quite recall how often or in what scenarios I have seen the abuse vs. normal rate limits.
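On the caching point, one way such client-side caching can be layered in Go — a sketch assuming the third-party github.com/gregjones/httpcache package, not necessarily what this commenter or the provider actually uses — is to wrap the HTTP transport with a memory-backed cache, so repeated reads of unchanged resources become conditional (ETag) requests answered from cache.

```go
// Sketch assuming the github.com/gregjones/httpcache package; names here are
// illustrative, not the provider's actual wiring.
package ghtransport

import (
	"net/http"

	"github.com/gregjones/httpcache"
)

// newCachingClient returns an *http.Client whose responses are cached in
// memory, so unchanged resources are revalidated with conditional requests
// instead of being fully re-fetched.
func newCachingClient(base http.RoundTripper) *http.Client {
	cache := httpcache.NewMemoryCacheTransport()
	if base != nil {
		cache.Transport = base // e.g. a concurrency-limited or rate-limited transport
	}
	return &http.Client{Transport: cache}
}
```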
@majormoses I totally agree that you have to comply with the limits; I'm just adding an opt-in. Making requests serially just adds time for everyone.
@emoreth I agree, I was trying to provide a suggestion to help improve things outside the context of this PR. I agree we should have a way to disable caching. If someone opts into a less efficient use of their API budget, it makes sense to allow it, since they only raise their own chances of exhausting that budget. I am still in favor of supporting this; I was just pointing out that there are ways to negate the need and still keep the time more reasonable.
As RateLimitTransport requires requests not to be concurrent (https://github.com/terraform-providers/terraform-provider-github/pull/145), this PR adds the option to skip it.

Many users will not hit the 5000 requests per hour stated by GitHub, or can handle the limit manually. Making all requests serially for everyone adds a time burden that small teams do not need, so it should be possible to skip it explicitly.
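For readers skimming the thread, here is a minimal sketch of the shape of such an opt-out. The flag name and wiring are invented for illustration and are not taken from this PR's diff; the real RateLimitTransport also contains locking, delay, and retry logic that is elided here.

```go
// Illustrative only: rateLimitTransport stands in for the provider's real
// RateLimitTransport, and skipRateLimit is a hypothetical name for the
// option this PR proposes.
package ghclient

import "net/http"

type rateLimitTransport struct {
	base http.RoundTripper
}

func (t *rateLimitTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// the real transport serializes requests and sleeps between them;
	// that logic is elided in this sketch
	return t.base.RoundTrip(req)
}

// newHTTPClient wraps the default transport with rate limiting unless the
// caller opts out, in which case requests may run concurrently and the
// caller accepts the risk of tripping GitHub's abuse detection.
func newHTTPClient(skipRateLimit bool) *http.Client {
	var rt http.RoundTripper = http.DefaultTransport
	if !skipRateLimit {
		rt = &rateLimitTransport{base: rt}
	}
	return &http.Client{Transport: rt}
}
```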