-
-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flakiness downloading codecov-bash #1231
Comments
Replace |
So weirdly enough, it turns out this is actually an issue with Maybe |
Apparently curl's
We're not the only ones who noticed: pytest-dev/pytest#5951 (comment) Maybe wget would be a better choice, like @smurfix suggested? I didn't know this, but the man page says:
|
In the hopes that it copes better with flaky servers/networks than curl does. The hope is that this will finally resolve python-triogh-1231.
The hope is that this will finally resolve python-triogh-1231.
The hope is that this will finally resolve gh-1231.
So this is baffling to me, and I know it has nothing to do with anything we did, but it's super frustrating: for some reason, running
curl https://codecov.io/bash
has recently become super flaky and is making our CI runs fail all the time and we gotta do something.Observations:
The most common failure seems to be a timeout during connect. I've also seen this, which is super weird:
According to the curl docs,
curl --retry 5
should retry on timeouts but.... I can't tell whether it's actually doing that?Codecov as a whole is kind of flaky, so it's tempting to blame this on them. But I'm not sure this is actually their fault...
codecov.io
appears to be hosted by google (based on runningwhois
on the IP address), so I'm guessing it's some google CDN or something? (Or maybe it's just some random server running on google cloud with no CDN in front of it, who knows?)The failures seem to all be from Travis, not Azure. And I'm pretty sure Travis is running on Google Cloud servers. Wild supposition: maybe Travis's traffic is considered "internal" to Google Cloud and doesn't hit the CDN, while Azure's traffic is "external" and does hit the CDN? I have no idea. I guess we could try running traceroute from both Travis and Azure, but that seems like it might be running off down an unproductive rabbit-hole.
...oh wait, and actually it just failed on azure too, so never mind: https://dev.azure.com/python-trio/trio/_build/results?buildId=1031&view=logs&jobId=872bf439-86bb-5ce5-edcd-c35619d700a0
so... what the heck can we do about this? the obvious answer is to retry, but curl's
--retry
isn't working. Is that because there's something wrong with--retry
, or is it because when we hit one of these failures, it's somehow "sticky"? Would putting our own retry loop around curl help?I guess we could also like, stash a copy of codecov-bash somewhere more reliable, but that seems like it would create all kinds of operational annoyances.
The text was updated successfully, but these errors were encountered: