[working] webinstall.dev was down due to the recent GitHub API outage #962
Comments
Confirmed. We're having a repeat of #874. I'm investigating.
Confirmed that this was due to some incomplete error handling on our part, which was triggered by an internal problem with GitHub's Release API.
First we were getting this:
<p><strong>We couldn't respond to your request in time.</strong></p>
<p>Sorry about that. Please try refreshing and contact us if the problem persists.</p>
<div id="suggestions">
<a href="https://github.com/contact">Contact Support</a> —
<a href="https://www.githubstatus.com">GitHub Status</a> —
<a href="https://twitter.com/githubstatus">@githubstatus</a>
</div>
And afterwards simply:
Admittedly, this should just return an error for the packages whose metadata relies on the GitHub Releases API (which is most of them, but not Node, Zig, Go, Rust, or many other popular installers), not take down the entire service. However, there's a background task, which isn't part of the API start, that does not have the same error handling as the API routes. When it hits an error, the error bubbles up to an async task and then fails as an uncaught error.
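To illustrate the intended behavior for the API routes, here is a minimal sketch in Python (not the project's actual code). The names `UpstreamError`, `SOURCES`, `get_package`, and both fetch functions are hypothetical stand-ins: one simulates a source backed by the failing GitHub Releases API, the other a source that does not depend on it.

```python
class UpstreamError(Exception):
    """Hypothetical error raised when a release source cannot be reached."""


def fetch_from_github_releases(name):
    # Stand-in for a call to the GitHub Releases API during the outage.
    raise UpstreamError("GitHub Releases API did not respond in time")


def fetch_from_other_source(name):
    # Stand-in for a source that does not rely on the GitHub Releases API.
    return {"name": name, "versions": ["1.0.0"]}


# Hypothetical mapping of package name to its release source.
SOURCES = {
    "some-github-package": fetch_from_github_releases,
    "some-other-package": fetch_from_other_source,
}


def get_package(name):
    """Return release metadata, or an error for this one package only."""
    fetch = SOURCES[name]
    try:
        return {"ok": True, "releases": fetch(name)}
    except UpstreamError as err:
        # The failure is contained to the package that needed the upstream.
        return {"ok": False, "error": str(err)}
```

With a guard like this, a request for `some-github-package` gets an error response while `some-other-package` keeps working. The background task had no equivalent guard.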
This was just an oversight in the design. Then, when the restart occurs, the background task is immediately started and runs its first random update, which is 90% likely to use the GitHub Release API. That immediately causes the failure, triggering another restart, and then the restart rate limit.
Mitigation
As an immediate fix, I've simply removed the restart limit. I will also work on the real fix today, which is to make sure the background update task is wrapped with an error handler that simply logs the error and returns the best data from cache.
One concern I have is that repeatedly hitting the GitHub Release API in an error condition may trigger rate-limiting, which could cause updates to quit until a cool-off period. However, I doubt that GitHub would count internal errors against the API limits.
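The planned fix (wrapping the background update so that a failure is logged and the best cached data is returned) could look roughly like the sketch below. It is written in Python for illustration only and is not the project's implementation; `safe_update`, `_cache`, and the `fetch` callback are hypothetical names.

```python
import logging

logger = logging.getLogger("background_update")

# Hypothetical in-memory cache of the last good metadata per package.
_cache = {}


def safe_update(name, fetch):
    """Refresh one package's metadata; on failure, log and serve the cache."""
    try:
        _cache[name] = fetch(name)
    except Exception:
        # Log the full traceback instead of letting the error escape the
        # background task and crash the whole process.
        logger.exception("update failed for %s; serving cached data", name)
    # Returns the fresh value on success, the last good value on failure,
    # or None if nothing has ever been cached for this package.
    return _cache.get(name)
```

Because the error never leaves `safe_update`, a failing upstream can no longer cause the crash-and-restart loop described above; the worst case is stale data.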
curl -vv https://webinstall.dev