intermittent 404 Not Found on /pulls github calls #1019
Comments
This is happening to us too. It always happens when opening a PR, but then we comment
We are having this issue as well.
Are you guys implementing a retry/backoff mechanism to handle eventual consistency?
Looks like we need to add a retry.
FWIW we hadn't seen this for a while and started seeing it again yesterday (or the day before?). I'm sure it's a GitHub problem, not an Atlantis problem, but Atlantis probably needs to work around it.
We have been using Atlantis for ~1 week and we just saw this problem for the first time, in a project with ~30 terraform files and on a PR with only 1 changed file. We are using Atlantis v0.14.0.
Yeah totally, and it shouldn't be too hard to throw some retries in there.
We just started seeing messages like this on PRs. I wonder if more retries are needed, or whether exponential backoff should be implemented? Mostly commenting just to see if others who happen to stop by here are having the same issue.
We have been getting this more often as well.
After trying the version with the fix, we stopped seeing this.
We're running with the fix implemented in #1131 and have still seen this issue occur, relatively often within the past week (presumably due to GitHub performance), so it seems like it might be worth implementing a different retry strategy such as exponential backoff, as suggested in that PR.
* Improve github pull request call retries: retry with a fixed 1 second backoff up to 3 retries was added by #1131 to address #1019, but the issue continued to show up (#1453). Increase max attempts to 5 and use exponential backoff for a maximum total retry time of (2^n - n - 1) seconds, which is roughly 30 seconds for the current max attempts n = 5. Also move the sleep to the top of the loop so that we never sleep without sending the request again on the last iteration.
* Fix style with gofmt -s
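The strategy that commit message describes roughly corresponds to the loop sketched below. This is an illustrative Go sketch, not the actual Atlantis source: the go-github import version, package name, and function name are assumptions. Sleeping (2^i - 1) seconds at the top of iteration i gives a worst-case total wait of 2^n - n - 1 seconds and never sleeps after the final attempt.

```go
// Illustrative sketch of the retry strategy described in the commit message
// above: up to 5 attempts, sleeping (2^i - 1) seconds at the top of iteration
// i, so there is never a sleep after the last request. Import version and
// function name are assumptions, not the actual Atlantis source.
package vcs

import (
	"context"
	"math"
	"net/http"
	"time"

	"github.com/google/go-github/v31/github"
)

const maxAttempts = 5

func getPullRequestWithRetry(ctx context.Context, client *github.Client, owner, repo string, num int) (*github.PullRequest, error) {
	var pull *github.PullRequest
	var err error
	for i := 0; i < maxAttempts; i++ {
		// Sleep before every attempt except the first (2^0 - 1 = 0 seconds).
		time.Sleep(time.Duration(math.Pow(2, float64(i))-1) * time.Second)

		pull, _, err = client.PullRequests.Get(ctx, owner, repo, num)
		if err == nil {
			return pull, nil
		}
		// Only the intermittent 404s are worth retrying; anything else fails fast.
		ghErr, ok := err.(*github.ErrorResponse)
		if !ok || ghErr.Response == nil || ghErr.Response.StatusCode != http.StatusNotFound {
			return nil, err
		}
	}
	return nil, err
}
```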
We are still observing this issue fairly consistently on new pull requests with autoplan enabled, in 0.17.5.
This is a follow-on to resolve similar issues to runatlantis#1019. In runatlantis#1131 retries were added to GetPullRequest, and in runatlantis#1810 a backoff was included. However, those only resolve one potential request at the very beginning of a PR creation. The other request that happens early on during auto-plan is one to ListFiles to detect the modified files. This too can sometimes result in a 404 due to async updates on the GitHub side.
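As a rough sketch of that follow-on, the same 404 retry/backoff can wrap the modified-files listing. This continues the illustrative package above (same imports and maxAttempts) and assumes go-github's ListFiles signature; names are illustrative, not the actual fix.

```go
// Continuation of the sketch above: apply the same 404 retry/backoff to the
// call that lists a PR's changed files, which is also issued right after PR
// creation during autoplan and can hit the same eventual-consistency window.
func listFilesWithRetry(ctx context.Context, client *github.Client, owner, repo string, num int) ([]*github.CommitFile, error) {
	var files []*github.CommitFile
	var err error
	for i := 0; i < maxAttempts; i++ {
		time.Sleep(time.Duration(math.Pow(2, float64(i))-1) * time.Second)

		// A single page is requested here for brevity; a real implementation
		// would paginate through all pages of changed files.
		files, _, err = client.PullRequests.ListFiles(ctx, owner, repo, num, &github.ListOptions{PerPage: 100})
		if err == nil {
			return files, nil
		}
		ghErr, ok := err.(*github.ErrorResponse)
		if !ok || ghErr.Response == nil || ghErr.Response.StatusCode != http.StatusNotFound {
			return nil, err
		}
	}
	return nil, err
}
```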
Same on version 0.23.
Hi, sometimes when Atlantis is triggered on a PR in GitHub, Atlantis posts the following error onto the PR:
Looking at GitHub's API docs, that per_page=300 seems okay.
We can replan and it works, i.e., it appears to be intermittent. Looking in the Atlantis logs, I see the following (I've removed the timestamps and redacted IPs/private info):
We have a couple of theories but haven't been able to reproduce. First, it has only happened since we updated to v0.12.0, the current release (we also added --hide-prev-plan-comments and --disable-markdown-folding at this time). Second, it may happen with a largish number of directories, though generally our changed dirs are under 50 and changed files under 100.
The third theory is that it might happen when two unrelated repos are processing at the same time. That can be seen here; I've left the timestamps so you can see the overlap: "myrepo" is the same as above, and "REPO2" is the other repo that is planning.