runtime: unexpected hangs by cmd/go and cmd/compile #29385
Comments
Sample timeout on a linux/amd64 trybot: https://storage.googleapis.com/go-build-log/20c62074/linux-amd64_360cb6cf.log
I saw the trybots still report this as a failed build instead of ignoring it.
Another instance was reported in #29418.
It's hard to say exactly what's going on here without a |
I've seen periodic hangs during all.bash for a while on git tip, on 64-bit x86 Fedora Linux. With the most recent one, which happened yesterday at +c043fc4f65, I got a stack trace:
I have in the past just ^C'd these and redone them, but I will now try to get stack dumps or actual cores (with |
I've seen hangs on This time I waited for a full three minutes, and
I seem to be able to reproduce this quite easily, so let me know if there's more information I could extract or fixes I could try. Below are my system details. Also, I'm bootstrapping with
I've run |
I've had
I'm running into this on 12-core and 16-core desktop machines. Every time I've caught the hang,
Given all of this, my wild speculation is that there is some path in GC where the GC forces a running goroutine to switch to GC activity while it holds a lock. If that GC activity then tries to take a lock itself, you could perhaps get a deadlock. I often see waiting semacquires for both
I'd be happy to point dlv/gdlv at one of my gcore-acquired core dumps if people tell me what to look for. I poked around briefly in one of them and didn't see much, although here is the dlv
Change https://golang.org/cl/156017 mentions this issue: |
For what it's worth, I manually patched my copy of git tip to include the changes from https://golang.org/cl/156017 and I now can't reproduce a hang under repeated runs of |
@aclements Thanks for mailing a fix. We are planning on cutting the RC this week, so please submit it as soon as possible.
Currently it's possible for the runtime to deadlock if checkPut is called in a non-preemptible context. In this case, checkPut may spin, so it won't leave the non-preemptible context, but the thread running gcMarkDone needs to preempt all of the goroutines before it can release the checkPut spin loops.

Fix this by returning from checkPut if it's called under any of the conditions that would prevent gcMarkDone from preempting it. In this case, it leaves a note behind that this happened; if the runtime does later detect left-over work it can at least indicate that it was unable to catch it in the act.

For #27993.
Updates #29385 (may fix it).

Change-Id: Ic71c10701229febb4ddf8c104fb10e06d84b122e
Reviewed-on: https://go-review.googlesource.com/c/156017
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
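To make the shape of that deadlock and its fix concrete, here is a toy model in ordinary Go. It is not the runtime's code: the types markState and worker, the ack helper, and the simulate function are all hypothetical names invented for this sketch, and "preemption" is modelled as each worker closing a channel. The coordinator plays the role described for gcMarkDone: it waits for every worker to be preempted before releasing the spinners. A worker flagged as non-preemptible returns from checkPut at once and leaves a note, rather than spinning where it can never be preempted.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// markState is a toy stand-in for shared GC state. All names are hypothetical.
type markState struct {
	release   int32 // set to 1 once every worker has been preempted
	bailedOut int32 // note left behind when a check had to be skipped
}

// worker models a goroutine that the coordinator must preempt.
type worker struct {
	nonPreemptible bool // e.g. the goroutine is holding a lock
	once           sync.Once
	preempted      chan struct{} // closed when the worker reaches a preemption point
}

// ack records that the worker has been preempted; it is safe to call twice.
func (w *worker) ack() { w.once.Do(func() { close(w.preempted) }) }

// checkPut spins until released, but only if the caller can be preempted.
// It reports whether it actually waited.
func (s *markState) checkPut(w *worker) bool {
	if w.nonPreemptible {
		// Spinning here would block the coordinator forever, so bail out
		// and leave a note instead.
		atomic.StoreInt32(&s.bailedOut, 1)
		return false
	}
	w.ack() // in this model the spin loop counts as a preemption point
	for atomic.LoadInt32(&s.release) == 0 {
		runtime.Gosched()
	}
	return true
}

// simulate runs n workers, treating the worker with index nonPreemptible
// (none if negative) as non-preemptible, and reports whether any check bailed out.
func simulate(n, nonPreemptible int) bool {
	s := &markState{}
	workers := make([]*worker, n)
	var wg sync.WaitGroup
	for i := range workers {
		w := &worker{nonPreemptible: i == nonPreemptible, preempted: make(chan struct{})}
		workers[i] = w
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.checkPut(w)
			w.ack() // after returning, the worker leaves its critical section
		}()
	}
	// Coordinator: preempt every worker first, then release the spinners.
	for _, w := range workers {
		<-w.preempted
	}
	atomic.StoreInt32(&s.release, 1)
	wg.Wait()
	return atomic.LoadInt32(&s.bailedOut) == 1
}

func main() {
	fmt.Println(simulate(4, -1)) // no worker is non-preemptible
	fmt.Println(simulate(4, 2))  // worker 2 bails out instead of spinning
}
```

If the early return in checkPut were removed, the non-preemptible worker in this model would spin without ever acknowledging preemption, and the coordinator would block on its channel forever. That is the same circular wait the commit message describes, reduced to a few lines.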
CL 156017 has been submitted. I believe this fixes the hangs observed here, so I'm closing this issue. If we continue to see hangs, we can re-open.
Several times recently while running all.bash I've seen the cmd/compile or cmd/go programs hang without making progress. These happen at different times during all.bash, usually in the cmd/go test or the final test directory (presumably because those run a lot of commands).
Here is a case that just happened with cmd/go hanging in the final test directory. This is the output after kill -QUIT.
I'm filing this in case anybody else sees this. On the builders this would show up as a timeout; we normally ignore timeouts in the builders and trybots.
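For readers unfamiliar with the technique: sending SIGQUIT to a Go program makes the runtime print goroutine stack traces and exit, which is how a hung cmd/go or cmd/compile can be inspected from outside. For a process you control, a similar dump can be taken from inside without killing it, using runtime.Stack. The sketch below is only an illustration; the helper name allStacks and the buffer sizes are assumptions, not anything used in this issue.

```go
package main

import (
	"fmt"
	"runtime"
)

// allStacks returns the stack traces of all goroutines in this process.
// The helper name and the initial buffer size are arbitrary choices.
func allStacks() string {
	buf := make([]byte, 64<<10)
	for {
		// With all=true, runtime.Stack formats every goroutine's stack.
		n := runtime.Stack(buf, true)
		if n < len(buf) {
			return string(buf[:n])
		}
		// The trace was probably truncated; retry with a larger buffer.
		buf = make([]byte, 2*len(buf))
	}
}

func main() {
	block := make(chan struct{})
	started := make(chan struct{})
	go func() {
		close(started)
		<-block // parked here, so this goroutine appears in the dump
	}()
	<-started

	fmt.Print(allStacks())
	close(block)
}
```

In a hang like the one reported here, the interesting part of such a dump is which goroutines are parked and on what, since that shows where progress stopped.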
CC @aclements @randall77