runtime: sigprof handler crashes backtracing runtime.nanotime #24925
Comments
Dup of #24142?
Well, not a dup as such, but perhaps fixed by https://golang.org/cl/97315.
Nope, it turned out we needed a very small additional change, which I'll mail out shortly.
Change https://golang.org/cl/107778 mentions this issue:
Hey, was this ever cherry-picked into an official release? Unfortunately, I had a service running on 1.10.1 that crashed a few times a day with the stack trace above whenever I enabled pprof. I applied the cherry-pick from https://golang.org/cl/107778 and it fixed the crash for good. I didn't see this CL referenced in the Go 1.10.2 or 1.10.3 milestones, and looking over the source (below), I don't see it there either. Or was this abandoned in favor of another fix? From runtime/proc.go in 1.10.3, lines 3689 to 3691:
@jgracenin This will be fixed in 1.11, but it is not in any earlier releases.
If a sigprof is received while runtime.nanotime is executing, the runtime may crash with a stack like this:
To add the running stack trace to the profile, the handler needs to walk the stack frame-by-frame, and to do that it needs to know how big each stack frame is. The compiler records that information in the pc-to-sp table, but runtime.nanotime does manual stack alignment:
go/src/runtime/sys_linux_amd64.s, lines 258 to 259 in 0a4b962
which isn't (can't be) tracked in the table. When the signal handler walks the stack, it computes the wrong size for that frame, and then reads the wrong place looking for the return address, finding stack garbage instead and crashing.
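To make the failure mode concrete, here is a minimal toy model in Go, not the runtime's actual unwinder. It assumes a simplified stack made of words, a hypothetical `funcInfo` table standing in for the pc-to-sp data, and invented addresses and frame sizes. The walker trusts the table's frame size to locate each return address. When `pad` extra words are pushed without being recorded (as manual alignment would do), the walker reads a garbage word as the return address and fails.

```go
package main

import "fmt"

// funcInfo is a hypothetical stand-in for one pc-to-sp table entry:
// for every pc in [entry, end) the frame holds frameWords words of locals.
type funcInfo struct {
	name       string
	entry, end uint64
	frameWords int
}

// table uses made-up addresses and frame sizes, for illustration only.
var table = []funcInfo{
	{"main", 0x1000, 0x1100, 2},
	{"caller", 0x2000, 0x2100, 3},
	{"nanotime", 0x3000, 0x3100, 2},
}

// lookup returns the table entry covering pc, or nil if pc is not code.
func lookup(pc uint64) *funcInfo {
	for i := range table {
		if pc >= table[i].entry && pc < table[i].end {
			return &table[i]
		}
	}
	return nil
}

// buildStack lays out nanotime <- caller <- main, lowest address first.
// pad is the number of unrecorded words added by manual stack alignment.
func buildStack(pad int) []uint64 {
	const garbage = 0xdeadbeef
	var s []uint64
	for i := 0; i < pad+2; i++ { // alignment padding plus nanotime's 2 locals
		s = append(s, garbage)
	}
	s = append(s, 0x2010)                    // return address into caller
	s = append(s, garbage, garbage, garbage) // caller's 3 locals
	s = append(s, 0x1010)                    // return address into main
	s = append(s, garbage, garbage)          // main's 2 locals
	return s
}

// walk unwinds frame by frame, trusting the table's frame size to find
// each return address. It stops at main or at the first bad pc.
func walk(stack []uint64, pc uint64, sp int) ([]string, error) {
	var names []string
	for {
		f := lookup(pc)
		if f == nil {
			return names, fmt.Errorf("pc %#x is not in any function", pc)
		}
		names = append(names, f.name)
		if f.name == "main" { // outermost frame in this toy model
			return names, nil
		}
		ret := sp + f.frameWords // where the table says the return address is
		if ret >= len(stack) {
			return names, fmt.Errorf("walked off the stack")
		}
		pc, sp = stack[ret], ret+1
	}
}

func main() {
	for _, pad := range []int{0, 1} {
		names, err := walk(buildStack(pad), 0x3010, 0)
		fmt.Println("pad:", pad, "frames:", names, "err:", err)
	}
}
```

With `pad` set to 0 the walk reaches `main`. With `pad` set to 1 the recorded frame size is one word short, so the walk stops after the first frame with a bogus pc. The real handler has no such clean error path, which is why it crashes instead.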
The problem was introduced in a158382, which shipped in 1.10, so this should be considered for a backport for 1.10.2.