net/http: frequent failures in TestNoSniffExpectRequestBody_h2 since Oct. 28 #42256
I spent some time digging into this, somewhat out of curiosity but also to be sure it wasn't related to the fix for #42156 since the timing was suspect. It doesn't seem to be related to #42156 since http.Server.Shutdown is not involved. It would be nice to simulate
I'm pretty certain this has been fixed, but I haven't managed to bisect down to the exact point of the fix yet. It reproduces on these builders at the previously-failing commits with
Bisection results point at this being fixed by e02ab89: https://go.googlesource.com/go/+/e02ab89eb8994fa6f2dfa2924cdadb097633fcc1 (passes)
The test is flaky. The failures take ~10s to run, which points at the
cc @ianlancetaylor
https://golang.org/cl/266304 shouldn't affect any correct code. It's a minor optimization which will cause the runtime poller to be woken up less often to run timers. It will only affect programs that set an I/O deadline or a time.Timer, and then change the deadline or the Timer to run earlier than when it was created. Programs that do that will see slightly different, hopefully more efficient, scheduling behavior. It's really puzzling that it fails before CL 266304 and passes after it. If anything I would expect CL 266304 to wake up fewer goroutines. So why would waking up more goroutines cause a timeout to expire? I do not know what is happening here.
It's worth noting that this is what the h2 bits do in the Lines 9105 to 9108 in da7aa86
And then using Reset to have the timer fire earlier: Line 9141 in da7aa86
Well, at least that explains why the CL has some effect, though I still can't explain why it has the effect that it does.
Actually, something else I wondered about when digging into this is the use of Stop and Reset together. Expanding the code around the Reset: Lines 9140 to 9142 in da7aa86
if s.timer.Stop() {
	s.timer.Reset(s.delay)
}
Since this timer was created with AfterFunc, is it questionable to use Stop and Reset in this way?
That usage is fine. The
I can explain how the error occurs, but not why: the body is being written after 10s, even though the timer should have been cancelled.
https://go-review.googlesource.com/c/net/+/269058/ should fix this as well.
@bcmills Is this still an issue now that https://golang.org/cl/269058 is merged?
I don't see any occurrences of this test failure since October 30, 2020, so closing.
2020-10-28T14:25:56-b85c2dd/aix-ppc64
2020-10-28T13:46:11-a0a4439/solaris-amd64-oraclerel
2020-10-28T13:25:44-72dec90/aix-ppc64
2020-10-28T04:20:39-02335cf/aix-ppc64
2020-10-28T01:22:47-2414e1f/aix-ppc64
2020-10-28T01:18:38-7be9158/solaris-amd64-oraclerel
Marking as release-blocker because this appears to be a regression in 1.16.
The failures so far are on somewhat esoteric builders, but those are also some of our slowest builders — this could be a timing bug that just happens to reproduce more easily on them.