Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] NetKAN being rate limited by authenticated requests #2949

Closed
HebaruSan opened this issue Dec 23, 2019 · 2 comments · Fixed by #2950
Closed

[Bug] NetKAN being rate limited by authenticated requests #2949

HebaruSan opened this issue Dec 23, 2019 · 2 comments · Fixed by #2950
Labels
Bug Netkan Issues affecting the netkan data Network Issues affecting internet connections of CKAN

Comments

@HebaruSan
Copy link
Member

Problem

#2945 reported these inflation errors as being caused by unauthenticated API requests:

New inflation error for HalfRSS-Textures2048: GitHub API rate limit exceeded.
New inflation error for HalfRSS-Textures4096: GitHub API rate limit exceeded.
New inflation error for HalfRSS-Textures8192: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-Afterburner: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-AutoBalancingLandingLeg: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-DisableTempGauges: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-DailyFunds: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-DestroyAll: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-EditorCamUtilities: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-FillSpotsWithTourists: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-FPSLimiter: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-FPSViewer: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-KerbalScienceExchange: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-ModifiedExplosionPotential: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-PhysicalTimeRatioViewer: GitHub API rate limit exceeded.
New inflation error for KerboKatzSmallUtilities-RecoverAll: GitHub API rate limit exceeded.
New inflation error for KerboKatzUtilities: GitHub API rate limit exceeded.

In fact, they are all authenticated. The GitHub API rate limit exceeded string occurs in GitHubApi.Call, which makes authenticated API requests:

private string Call(string path, string mimeType = null)
{
var url = new Uri(ApiBase, path);
Log.DebugFormat("Calling {0}", url);
try
{
return _http.DownloadText(url, _oauthToken, mimeType);
}
catch (NativeAndCurlDownloadFailedKraken k)
{
if (k.responseStatus == 403 && k.responseHeader.Contains("X-RateLimit-Remaining: 0"))
{
throw new Kraken($"GitHub API rate limit exceeded: {path}");
}
throw;
}
}

So when we captured all the unauthenticated GitHub calls and made them authenticated in #2946, this problem got worse. Where before it was intermittent, now there are 11 mods on the status page that persistently suffer from it (plus 6 Stock Visual Enahncements modules that are mis-reported 404s).

http://status.ksp-ckan.space/

Cause

Further research suggests that GitHub may trigger rate limiting not because the documented limits were exceeded, but because an individual request is too expensive to complete:

https://www.reddit.com/r/github/comments/bif6io/abuse_detection_mechanism_after_5_searches/en8r2x6/

GitHub seems to classify most code searches as computationally intensive therefore they trigger the abuse detection.

If that's what's happening, this would be our culprit:

var json = Call($"repos/{reference.Repository}/releases?per_page=100");

Retrieving up to 100 releases (which would include all releases for most mods) is a lot of database and network traffic, and in most cases not necessary. This was added in #2681 so the --releases all option would not miss releases.

Suggestion

We could instead retrieve fewer releases per request and then retrieve additional groups if we need them. This would be tricky to get right, but would make the 80% case less intensive and hopefully possibly help with those errors.

@HebaruSan HebaruSan added Bug Netkan Issues affecting the netkan data Network Issues affecting internet connections of CKAN labels Dec 23, 2019
@HebaruSan
Copy link
Member Author

And before anyone suggests it, no, passing per_page=X with X as the value of the --releases parameter would not work, because we skip pre-releases. If the command line says --releases 5, and we request per_page=5, and the API returns 5 releases, but 3 of them are pre-releases, but there are several older regular releases, then we've just missed 3 releases that we should be finding.

@HebaruSan
Copy link
Member Author

Another possible cause is repeated requests to the same URL. We have several groups of modules that share the same GitHub repo (e.g. the KerboKatz utilities), each of which would request the same list of releases when it's being inflated. Usually these are alphabetically adjacent, meaning the repeated requests would follow one another quite rapidly.

Caching these calls for a few minutes could help with these errors, and as a side bonus eliminating network calls should also speed up the Inflator mildly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Netkan Issues affecting the netkan data Network Issues affecting internet connections of CKAN
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant