Skip to content

runtime: sigprof handler crashes backtracing runtime.nanotime #24925

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
heschi opened this issue Apr 18, 2018 · 6 comments
Closed

runtime: sigprof handler crashes backtracing runtime.nanotime #24925

heschi opened this issue Apr 18, 2018 · 6 comments

Comments

@heschi
Copy link
Contributor

heschi commented Apr 18, 2018

If a sigprof is received while runtime.nanotime is executing, the runtime may crash with a stack like this:

runtime: invalid pc-encoded table [...]
runtime.throw(0xc52c894, 0x1c)
  src/runtime/panic.go:622 +0x8a fp=0xc435e35260 sp=0xc435e35240 pc=0x26adb4a
runtime.pcvalue(0x119a4080, 0x15ea31c0, 0xc401b4e585, 0x39ad635, 0xc435e35488, 0x15ea3101, 0x0)
  src/runtime/symtab.go:763 +0x529 fp=0xc435e35310 sp=0xc435e35260 pc=0x26cc029
runtime.funcspdelta(0x119a4080, 0x15ea31c0, 0x39ad635, 0xc435e35488, 0xb0080b1a00000000)
  src/runtime/symtab.go:815 +0x5f fp=0xc435e35380 sp=0xc435e35310 pc=0x26cc4df
runtime.gentraceback(0x26dfd17, 0x7fc8115df708, 0x0, 0xc43e926300, 0x0, 0xc435e356f8, 0x40, 0x0, 0x0, 0x6, ...)
  src/runtime/traceback.go:249 +0x1873 fp=0xc435e35698 sp=0xc435e35380 pc=0x26d38d3
runtime.sigprof(0x26dfd17, 0x7fc8115df708, 0x0, 0xc43e926300, 0xc43b160c00)
  src/runtime/proc.go:3755 +0x3e9 fp=0xc435e35938 sp=0xc435e35698 pc=0x26b81f9
runtime.sighandler(0xc40000001b, 0xc435e35bf0, 0xc435e35ac0, 0xc43e926300)
  src/runtime/signal_sighandler.go:33 +0x678 fp=0xc435e359b8 sp=0xc435e35938 pc=0x26c3218
runtime.sigtrampgo(0x1b, 0xc435e35bf0, 0xc435e35ac0)
  src/runtime/signal_unix.go:349 +0x21c fp=0xc435e35a68 sp=0xc435e359b8 pc=0x26c3c2c

To add the running stack trace to the profile, the handler needs to walk the stack frame-by-frame, and to do that it needs to know how big each stack frame is. The compiler records that information in the pc-to-sp table, but runtime.nanotime does manual stack alignment:

SUBQ $16, SP // Space for results
ANDQ $~15, SP // Align for C code

which isn't (can't be) tracked in the table. When the signal handler walks the stack, it computes the wrong size for that frame, and then reads the wrong place looking for the return address, finding stack garbage instead and crashing.

The problem was introduced in a158382, which shipped in 1.10, so this should be considered for a backport for 1.10.2.

@heschi heschi added this to the Go1.10.2 milestone Apr 18, 2018
@heschi heschi self-assigned this Apr 18, 2018
@ianlancetaylor
Copy link
Contributor

Dup of #24142?

@ianlancetaylor
Copy link
Contributor

Well, not a dup as such, but perhaps fixed by https://golang.org/cl/97315.

@heschi
Copy link
Contributor Author

heschi commented Apr 18, 2018

Nope, it turned out we needed a very small additional change, which I'll mail out shortly.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/107778 mentions this issue: runtime: use saved state in SIGPROF handler for vDSO calls

@jgracenin
Copy link

Hey, was this ever cherrypicked into an official release?

I unfortunately had a service running with 1.10.1 that crashed a few times daily if I enabled pprof with the same stack trace above. I took the cherrypick from https://golang.org/cl/107778 and it fixed it for good.

I didn't see this CL referenced in the Go 1.10.2 or 1.10.3 milestones. Looking over the source (below), I don't see it there either (or was this abandoned for another way to fix it?)

From runtime/proc.go in 1.10.3 line 3689 - 3691:
if gp == nil || sp < gp.stack.lo || gp.stack.hi < sp || setsSP(pc) { traceback = false }

@ianlancetaylor
Copy link
Contributor

@jgracenin This will be fixed in 1.11, but it is not in any earlier releases.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants