-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: libunwind is unable to unwind CGo to Go's stack #40044
Comments
This is different from #39524 . We switch stacks at Go/C boundaries. Go code runs on goroutine stacks (typically small), whereas C code runs on system stacks (typically large). Since they are not on the same stack, I would not expect any stack unwinding tool to work. Not sure if there is anything we could do. Maybe we could use the frame pointer to "fake" it? Not sure this is a good idea... |
Indeed. Thinking about it however, it doesn't feel that hard to do. Either by locally modifying asmcgocall, or, in a more ambitious way, via |
Also, weirdly enough, lldb is able to do it, without dwarf. |
I am also realizing this is very different from #39524 indeed. But in some way, since |
If you want to unwind from C++ back into Go you may want to try github.com/ianlancetaylor/cgosymbolizer. Although that will only help from the Go side, not the C++ side. In principle we could hand write unwind information for |
@ianlancetaylor thank you. The issue, on the iOS side, is that unwinding is done locally, on the device (presumably with libunwind), without DWARF. DWARF is only added later to symbolicate the crashes. That said, it could be useful for Android (which uses breakpad with minidumps) @cherrymui I tried that forsaken piece of code to, in order to call the backtrace method without cgo, and alas, the unwinding still stops before it somehow. This is based on the rustgo article: TEXT ·backtracetrampoline(SB),0,$2048
MOVQ SP, BX // Save SP in a callee-saved registry
ADDQ $2048, SP // Rollback SP to reuse this function's frame
ANDQ $~15, SP // Align the stack to 16-bytes
CALL backtrace(SB)
MOVQ BX, SP // Restore SP
RET |
@steeve Sorry, I'm not sure exactly what you're planning to do, and why Also, on what architecture? You mentioned iOS (presumably ARM64), but also AMD64 in your That said, does CL https://go-review.googlesource.com/c/go/+/241080 makes any difference (on ARM64)? Thanks. |
@cherrymui Thank you for the CL, I wasn't hoping as much. Will definitely try and let you know. My ultimate target is indeed iOS (and Android, to an extent). |
Change https://golang.org/cl/241158 mentions this issue: |
@steeve libunwind unwinds the stack using the unwind information, which is not DWARF but is approximately the same format as a subset of DWARF. That's what I was referring to when I suggested that we could write unwind information for |
I just tried your CL @cherrymui on a real device, and unfortunately, when I pause inside XCode's, I only see the stack up to In my case I did put a
Note that on amd64, |
The lack of frame pointer in I've tried with gdb, and it's also broken: Thread 1 hit Breakpoint 1, runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:835
835 MOVQ (g_sched+gobuf_sp)(SI), SP
(gdb) backtrace
#0 runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:835
#1 0x0000000000402f49 in runtime.cgocall (fn=0x45a320 <runtime.asmstdcall>, arg=0x4daae0, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/cgocall.go:167
#2 0x0000000000455314 in syscall.loadsystemlibrary (filename=0xc00010c000, absoluteFilepath=<optimized out>, handle=<optimized out>, err=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/syscall_windows.go:443
#3 0x000000000045b39c in syscall.loadsystemlibrary (filename=0x45a320 <runtime.asmstdcall>, absoluteFilepath=0x4daae0, handle=<optimized out>, err=<optimized out>) at <autogenerated>:1
... omitted
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) si
runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:840
840 SUBQ $64, SP
(gdb) backtrace
#0 runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:840
#1 0x00000000007afee8 in ?? ()
@steeve curiously, WinDbg and gdb can unwind the stack if #0 runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:811
#1 0x000000000042d3a5 in runtime.stdcall (fn=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1074
#2 0x000000000042d4bc in runtime.stdcall1 (fn=0x7ffca80d95d0 <LoadLibraryA>, a0=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1095
#3 0x000000000042ab85 in runtime.loadOptionalSyscalls () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:249
#4 0x000000000042b9b5 in runtime.osinit () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:551
#5 0x0000000000456637 in runtime.rt0_go () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:348
#6 0x00000000a80d26bd in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) si
812 MOVQ arg+8(FP), BX
(gdb) backtrace
#0 runtime.asmcgocall () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:812
#1 0x000000000042d3a5 in runtime.stdcall (fn=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1074
#2 0x000000000042d4bc in runtime.stdcall1 (fn=0x7ffca80d95d0 <LoadLibraryA>, a0=<optimized out>, ~r0=<optimized out>) at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:1095
#3 0x000000000042ab85 in runtime.loadOptionalSyscalls () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:249
#4 0x000000000042b9b5 in runtime.osinit () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/os_windows.go:551
#5 0x0000000000456637 in runtime.rt0_go () at C:/Users/qmuntaldiaz/code/golang-go/src/runtime/asm_amd64.s:348
#6 0x00000000a80d26bd in ?? () |
@ianlancetaylor @cherrymui go/src/cmd/internal/obj/x86/obj6.go Lines 609 to 626 in 1ba7341
|
Change https://go.dev/cl/459395 mentions this issue: |
I think that is a good direction. Thanks for looking into it. Are we sure what matters are all assembly functions? We don't have explicit NOFRAME control for compiled functions. I'm still not sure about stack transition in asmcgocall being "broken". Technically, it is running on two stacks -- the C functions run on a different stack. So if we are unwinding the physical stack, you shouldn't see both Go and C functions. You could argue we're expecting to unwind the logical stack. That is a reasonable argument. But I don't think a decision has been made for whether the unwinding should be the physical stack or the logical stack. Further, at least some debugger is not happy if the stack pointer suddenly changes direction. I don't think it is a good idea if the unwinding only works when the C stack is at a higher address than the Go stack, given that we don't generally control where the stacks are in the address space. For a (not quite accurate) analogy, what does the debugger do for C Thanks. |
The C Note that we do support unwinding the cgo stack to the Go stack via the What would work for libunwind is for us to write unwind information for |
I agree that longjmp is not really a good analogy because it could be an actual context switch (although it could be used to implement a temporary stack transition like asmcgocall, but the tools never know). I don't think there is anything in C that does a temporary stack switch? I think the Go traceback API is mostly showing only the "user frames", e.g. we hide compiler-generated wrappers. So hiding the asmcgocall stack transition when |
I don't expect Go to hide wrappers and stack transitions to external unwinders, if it is possible at all. I do expect Go to facilitate unwinding stack transitions, even from C to Go, and vice versa. This is certainly doable using Windows' SEH if a frame pointer is set in the function prologue and the linker emits the appropriate metadata. |
This CL marks non-leaf nosplit assembly functions as NOFRAME to avoid relying on the implicit amd64 NOFRAME heuristic, where NOSPLIT functions without stack were also marked as NOFRAME. Updates #57302 Updates #40044 Change-Id: Ia4d26f8420dcf2b54528969ffbf40a73f1315d61 Reviewed-on: https://go-review.googlesource.com/c/go/+/459395 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Quim Muntal <quimmuntal@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com>
This comment was marked as off-topic.
This comment was marked as off-topic.
Frame pointer is enabled on ARM64. When copying stacks, the saved frame pointers need to be adjusted. Updates #39524, #40044. Fixes #58432. Change-Id: I73651fdfd1a6cccae26a5ce02e7e86f6c2fb9bf7 Reviewed-on: https://go-review.googlesource.com/c/go/+/241158 Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
What version of Go are you using (
go version
)?master
as of the buildDoes this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
Following @cherrymui's comment on #39524, I figured I tried to check why lots of our backtraces on iOS stop at
runtime.asmcgocall
.Since I wanted to reproduce it on my computer and
lldb
manges to properly backtrace, I figured I'd givelibunwind
a try, since this is was iOS uses when a program crashes.Unfortunately
libunwind
didn't manage to walk the stack past CGo generated_Cfunc_
functions.Given this program:
It prints:
I tried doing Go(1) -> C(1) -> Go(2) -> C(2) and backtrace, and it only unwinds C(2).
Also, I tried to make set
asmcgocall
to have a 16 bytes stack, hoping that the generated frame pointer would help, but it didn't.What did you expect to see?
The complete backtrace.
What did you see instead?
A backtrace for C functions only.
The text was updated successfully, but these errors were encountered: