Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to skip RateLimitTransport #235

Closed

Conversation

emoreth
Copy link

@emoreth emoreth commented Jun 5, 2019

As RateLimitTransport requires requests not to be concurrent (https://github.com/terraform-providers/terraform-provider-github/pull/145), this PR adds the option to skip that.

Many users will not hit the 5000 requests per hour stated by Github or can handle it manually. Making all requests serially to everyone adds a time burden that small teams do not need to handle, thus could be explicitly skipped.

@ghost ghost added the size/XS label Jun 5, 2019
@emoreth emoreth force-pushed the add_option_to_skip_rate_limit branch from 72fe53e to bb1e3da Compare June 5, 2019 00:12
github/config.go Outdated Show resolved Hide resolved
@ghost ghost added size/S and removed size/XS labels Jun 7, 2019
Copy link
Contributor

@majormoses majormoses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ghost ghost added the Type: Documentation Improvements or additions to documentation label Jun 11, 2019
@emoreth
Copy link
Author

emoreth commented Jun 18, 2019

Hey there!
Is there anything else I can do for this to be approved/applied?

My TF plans are taking 10 minutes in the PR, 10 more minutes after merging, just to add a user in a team.

@megan07
Copy link
Contributor

megan07 commented Jun 27, 2019

Hi @emoreth !

Thank you for the PR, I appreciate the opportunity to rethink how we want to approach this issue. My concern with allowing unlimited concurrency is that the majority of the users who use the provider will likely hit the abuse rate limits. From my understanding, the abuse rate limits are different than the normal rate limits that include the 5000 requests/hour that you referred to.

Rate limiting: https://developer.github.com/v3/#rate-limiting
Abuse rate limits: https://developer.github.com/v3/#abuse-rate-limits

It appears to me that making multiple concurrent requests would trigger these abuse rate limits, thus, in order to not abuse GitHub’s API, I think it’s safe we don’t open that up. That being said, perhaps we could re-think a different approach that is less restricting, something that could be agreed upon by all users.

One thought we had was perhaps making only 2 parallel requests at a time and seeing how that performs, and if it triggers the abuse rate limits. I haven’t tested that yet, so we’d have to see how that works. There is a concern with that as well, that it is going against GitHub’s documented best practice (https://developer.github.com/v3/guides/best-practices-for-integrators/#dealing-with-abuse-rate-limits)

Make requests for a single user or client ID serially. Do not make requests for a single user or client ID concurrently.

Thanks!
Megan

@emoreth
Copy link
Author

emoreth commented Jun 27, 2019

Hi @megan07 !

Thank you for your response and I totally agree with your concern in not abusing any limits, and I think that GitHub knows that Terraform is not a form of abuse.

I'm currently managing a GitHub account with ~200 memberships, ~140 repositories, ~77 teams, ~150 associations with teams and repositories and ~500 associations with teams and memberships. On top of that, every change needs to be planned on a PR and replanned/executed after merge. Still, all these operations get to around 2.1K requests, which takes ~30 minutes. When using an older version of the plugin it cuts down to ~10 minutes.

With that said, I don't have more than 2~5 GitHub change requests a day, which takes me almost 3 hours to apply.

The setting I added was inactive by default, so whoever wants to manage it manually can opt-in. Currently, it would be way easier for me to coordinate with my team to handle hitting the API limits than having to wait for those plans.

I do think that github-provider should not validate every single user/repo/etc separately (as GitHub has endpoints that return multiple results) but I lack go skills to send you that PR.

Since it took 23 days to get an answer for this PR, I'll wait for your approval to fix these conflicts.

Regards,
Everton

@majormoses
Copy link
Contributor

majormoses commented Jun 28, 2019

@emoreth have you looked at partitioning your github organization(s) into multiple states and using data sources / remote states to share references between them? Given how github rate limits (which IMO leaves much to be desired) it seems wasteful to check every single object every time you make a change in your entire org. At my org each team or business unit has its own state where its memberships, repositories, etc live in. This acts as a natural limiter to your calls. I also manage several very large orgs with hundreds of users, repositories, branch protections, issue labels, repo topics, webhooks, etc. Even with these logical separations I ran into rate limits quite easily before adding caching and limiting the concurrency but things are still generally well under 10 minutes for our larger teams leaving the smaller teams quite snappy.

I think that GitHub knows that Terraform is not a form of abuse.

Terraform is a tool, it can be used to abuse a system even if that's not its intended use case.

I have hit both types of rate limits with this provider they are both concerns but I can't quite recall how often or in what scenarios I have seen the abuse vs normal rate limits.

@emoreth
Copy link
Author

emoreth commented Jun 28, 2019

@majormoses I totally agree that you have to comply with the limits, I'm just adding an opt-out for people who won't hit the limit and/or can manage by themselves.

Making it serially just adds time for everyone.

@emoreth emoreth closed this Jun 30, 2019
@emoreth emoreth deleted the add_option_to_skip_rate_limit branch June 30, 2019 18:19
@majormoses
Copy link
Contributor

@emoreth I agree, I was trying to provide a suggestion to help improve outside of the context of the PR. I agree we should have a way to disable caching. If someone opts into a less efficient solution for your "api budget", it makes sense to allow as they just raise their chances of exhausting the api budget. I am still in favor of supporting this I was just pointing out there are ways to negate the need and still keep the time more reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/S Type: Documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants