Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc/cgo: testplugin fails on Darwin with runtime: bad pointer in frame on the latest tip #24653

Closed
YoshikiShibata opened this issue Apr 3, 2018 · 17 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@YoshikiShibata
Copy link

Please answer these questions before submitting your issue. Thanks!

What did you do?

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

Try to rebuild the latest tip version with all.bash

What did you expect to see?

All tests would pass.

What did you see instead?

##### ../misc/cgo/testplugin
runtime: bad pointer in frame main.main at 0xc000301cf0: 0x8
fatal error: invalid pointer found on stack

runtime stack:
runtime.throw(0x750b94c, 0x1e)
	/Users/yoshiki/tools/go/src/runtime/panic.go:598 +0x72 fp=0x7ffeefbfe9a8 sp=0x7ffeefbfe978 pc=0x7486332
runtime.adjustpointers(0xc000301c90, 0x7ffeefbfeaa0, 0x7ffeefbfee50, 0x4170448, 0x4178360)
	/Users/yoshiki/tools/go/src/runtime/stack.go:592 +0x23c fp=0x7ffeefbfea18 sp=0x7ffeefbfe9a8 pc=0x74998dc
runtime.adjustframe(0x7ffeefbfed60, 0x7ffeefbfee50, 0x4178360)
	/Users/yoshiki/tools/go/src/runtime/stack.go:663 +0x325 fp=0x7ffeefbfead0 sp=0x7ffeefbfea18 pc=0x7499c25
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc000000180, 0x0, 0x0, 0x7fffffff, 0x750e220, 0x7ffeefbfee50, 0x0, ...)
	/Users/yoshiki/tools/go/src/runtime/traceback.go:310 +0x12a1 fp=0x7ffeefbfedc8 sp=0x7ffeefbfead0 pc=0x74a2111
runtime.copystack(0xc000000180, 0x4000, 0x7ffeefbfef01)
	/Users/yoshiki/tools/go/src/runtime/stack.go:891 +0x26c fp=0x7ffeefbfef80 sp=0x7ffeefbfedc8 pc=0x749a70c
runtime.newstack()
	/Users/yoshiki/tools/go/src/runtime/stack.go:1063 +0x315 fp=0x7ffeefbff118 sp=0x7ffeefbfef80 pc=0x749ab25
runtime.morestack()
	/Users/yoshiki/tools/go/src/runtime/asm_amd64.s:481 +0x8f fp=0x7ffeefbff120 sp=0x7ffeefbff118 pc=0x74a7baf

goroutine 1 [copystack]:
strings.NewReplacer(0x0, 0x0, 0x0, 0xc0000d4188)
	/Users/yoshiki/tools/go/src/strings/replace.go:23 +0x71b fp=0xc000301830 sp=0xc000301828 pc=0x74d3c2b
plugin2.init.0()
	/Users/yoshiki/tools/go/misc/cgo/testplugin/src/plugin2/plugin2.go:21 +0x32 fp=0xc000301860 sp=0xc000301830 pc=0x74d7482
plugin2.init()
	<autogenerated>:1 +0x60 fp=0xc000301870 sp=0xc000301860 pc=0x74d7740
plugin.open(0x40f3e83, 0x7, 0xc0000dc000, 0x200000, 0x200000)
	/Users/yoshiki/tools/go/src/plugin/plugin_dlopen.go:113 +0xac8 fp=0xc000301b28 sp=0xc000301870 pc=0x40b4478
plugin.Open(0x40f3e83, 0xa, 0x40f31d0, 0x1, 0xc0000ba3e8)
	/Users/yoshiki/tools/go/src/plugin/plugin.go:32 +0x35 fp=0xc000301b60 sp=0xc000301b28 pc=0x40b3575
main.main()
	/Users/yoshiki/tools/go/misc/cgo/testplugin/src/host/host.go:129 +0x918 fp=0xc000301f88 sp=0xc000301b60 pc=0x40b5d98
runtime.main()
	/Users/yoshiki/tools/go/src/runtime/proc.go:198 +0x207 fp=0xc000301fe0 sp=0xc000301f88 pc=0x402d5d7
runtime.goexit()
	/Users/yoshiki/tools/go/src/runtime/asm_amd64.s:1385 +0x1 fp=0xc000301fe8 sp=0xc000301fe0 pc=0x4053771

goroutine 2 [force gc (idle)]:
runtime.gopark(0x40fb368, 0x4186d20, 0x40f48cb, 0xf, 0x40fb214, 0x1)
	/Users/yoshiki/tools/go/src/runtime/proc.go:291 +0x105 fp=0xc000034768 sp=0xc000034748 pc=0x402da15
runtime.goparkunlock(0x4186d20, 0x40f48cb, 0xf, 0x14, 0x1)
	/Users/yoshiki/tools/go/src/runtime/proc.go:297 +0x5e fp=0xc0000347a8 sp=0xc000034768 pc=0x402dade
runtime.forcegchelper()
	/Users/yoshiki/tools/go/src/runtime/proc.go:248 +0xca fp=0xc0000347e0 sp=0xc0000347a8 pc=0x402d86a
runtime.goexit()
	/Users/yoshiki/tools/go/src/runtime/asm_amd64.s:1385 +0x1 fp=0xc0000347e8 sp=0xc0000347e0 pc=0x4053771
created by runtime.init.4
	/Users/yoshiki/tools/go/src/runtime/proc.go:237 +0x35

goroutine 3 [GC sweep wait]:
runtime.gopark(0x40fb368, 0x4186e00, 0x40f445a, 0xd, 0x401fc14, 0x1)
	/Users/yoshiki/tools/go/src/runtime/proc.go:291 +0x105 fp=0xc000034f60 sp=0xc000034f40 pc=0x402da15
runtime.goparkunlock(0x4186e00, 0x40f445a, 0xd, 0x14, 0x1)
	/Users/yoshiki/tools/go/src/runtime/proc.go:297 +0x5e fp=0xc000034fa0 sp=0xc000034f60 pc=0x402dade
runtime.bgsweep(0xc000064000)
	/Users/yoshiki/tools/go/src/runtime/mgcsweep.go:52 +0xa2 fp=0xc000034fd8 sp=0xc000034fa0 pc=0x401fcd2
runtime.goexit()
	/Users/yoshiki/tools/go/src/runtime/asm_amd64.s:1385 +0x1 fp=0xc000034fe0 sp=0xc000034fd8 pc=0x4053771
created by runtime.gcenable
	/Users/yoshiki/tools/go/src/runtime/mgc.go:216 +0x58

goroutine 18 [finalizer wait]:
runtime.gopark(0x40fb368, 0x41a31b8, 0x40f46d6, 0xe, 0x14, 0x1)
	/Users/yoshiki/tools/go/src/runtime/proc.go:291 +0x105 fp=0xc000030718 sp=0xc0000306f8 pc=0x402da15
runtime.goparkunlock(0x41a31b8, 0x40f46d6, 0xe, 0x14, 0x1)
	/Users/yoshiki/tools/go/src/runtime/proc.go:297 +0x5e fp=0xc000030758 sp=0xc000030718 pc=0x402dade
runtime.runfinq()
	/Users/yoshiki/tools/go/src/runtime/mfinal.go:175 +0xac fp=0xc0000307e0 sp=0xc000030758 pc=0x4016fec
runtime.goexit()
	/Users/yoshiki/tools/go/src/runtime/asm_amd64.s:1385 +0x1 fp=0xc0000307e8 sp=0xc0000307e0 pc=0x4053771
created by runtime.createfing
	/Users/yoshiki/tools/go/src/runtime/mfinal.go:156 +0x61

Does this issue reproduce with the latest release (go1.10.1)?

N/A

System details

go version devel +da02dcda02 Mon Apr 2 23:39:16 2018 +0000 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/yoshiki/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/yoshiki/gocode"
GORACE=""
GOROOT="/Users/yoshiki/tools/go"
GOTMPDIR=""
GOTOOLDIR="/Users/yoshiki/tools/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/rk/f21ym6ks1ydfj0rf1kqt1ctc0000gn/T/go-build235793060=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version devel +da02dcda02 Mon Apr 2 23:39:16 2018 +0000 darwin/amd64
GOROOT/bin/go tool compile -V: compile version devel +da02dcda02 Mon Apr 2 23:39:16 2018 +0000
uname -v: Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.13.3
BuildVersion:	17D102
lldb --version: lldb-902.0.79.2
  Swift-4.1
@odeke-em
Copy link
Member

odeke-em commented Apr 3, 2018

Thank you @YoshikiShibata for reporting this issue!

/cc @TocarIP @bradfitz, perhaps this just added CL https://go-review.googlesource.com/#/c/97255/ aka commit fd22542
which was the last to touch strings.Replace?

@odeke-em odeke-em changed the title cgo: testplugin failed with the latest tip verison misc/cgo: testplugin failed with runtime: bad pointer in frame on the latest tip Apr 3, 2018
@odeke-em odeke-em added the NeedsFix The path to resolution is known, but the work has not been done. label Apr 3, 2018
@odeke-em odeke-em added this to the Go1.11 milestone Apr 3, 2018
@bradfitz bradfitz changed the title misc/cgo: testplugin failed with runtime: bad pointer in frame on the latest tip misc/cgo: testplugin fails on Darwin with runtime: bad pointer in frame on the latest tip Apr 3, 2018
@bradfitz
Copy link
Contributor

bradfitz commented Apr 3, 2018

Duplicate of #23369 it seems? But actually hurting us now.

/cc @aclements @randall77 @crawshaw @hirochachacha @bcmills

@bradfitz
Copy link
Contributor

bradfitz commented Apr 3, 2018

I really don't want to roll back https://go-review.googlesource.com/#/c/97255/ (commit fd22542). It's innocent. It just tickled an existing bug.

But re-disabling plugin support on Darwin would also be sad.

@bradfitz
Copy link
Contributor

bradfitz commented Apr 3, 2018

/cc @mdempsky too

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed NeedsFix The path to resolution is known, but the work has not been done. labels Apr 3, 2018
@cherrymui
Copy link
Member

cherrymui commented Apr 3, 2018

The throwing part of the stack trace (from morestack to throw) has the PCs of runtime functions at 0x74xxxxx, whereas for goroutine 1 runtime.goexit and runtime.main have PCs at 0x40xxxxx. For one module, I think the linker will put all runtime functions together, adjacent in PC. So it seems that morestack from another module's runtime is called. In the second runtime, runtime.framepointer_enabled is not set by the linker, so the stack scan goes off from there.

If I understand correctly, we suppose to always use the same runtime, right? This seems broken on darwin/amd64.

@aclements
Copy link
Member

We probably shouldn't even allow loading a plugin where the value of framepointer_enabled doesn't match that of the main binary. Even if we're using one runtime, this would make it unable to unwind frames created by the plugin.

@aclements
Copy link
Member

Apparently this is a todo. In plugin/plugin_dlopen.go:

	// TODO(crawshaw): look for plugin note, confirm it is a Go plugin
	// and it was built with the correct toolchain.

(where "correct toolchain" should include the experiments string)

@cherrymui
Copy link
Member

In this case, the plugin is indeed built with the "correct toolchain", and frame pointer is indeed enabled. But the linker doesn't set runtime.framepointer_enabled for runtimes other than the main runtime, presumably because it is not supposed to be used. Always setting it does seem to make the test pass, but I guess it's just hiding the real bug.

@aclements
Copy link
Member

Okay, that makes more sense. (We should still check that it's compatible, but you're right that that's not the problem here.)

@aclements
Copy link
Member

So, this seems closely related to #24351 and #23133, which also indicated we were getting multiple runtimes.

@aclements
Copy link
Member

To clarify, it's expected to have a copy of the runtime in the plugin (e.g., we could have dead-coded parts of it that the plugin needs, which is why in general we need all of the packages the plugin uses in the plugin), but the problem is that we're still seeing multiple copies after dynamic linking. It seems like which copy of the global the code sees depends on which copy of the code is executing.

Some hypotheses @cherrymui and I have come up with:

  1. It might not be specific to this variable or even the runtime. Maybe all global variables are messed up like this and we just haven't noticed. It might require just the right alignment of dead-code elimination and calls from the plugin. Maybe the macOS dynamic linker finds the global variable from the same module as the text. Symbols aren't usually defined by multiple shared objects, so it wouldn't surprise me if there are dark corners here.

  2. Maybe this has something to do with runtime.framepointer_enabled being unexported.

  3. runtime.framepointer_enabled is particularly weird because the linker sets it. The linker's logic happens to not set it for plugins (which is probably a mistake), but one consequence of this is that it winds up in .rodata in the main binary and .noptrbss in the plugin. Maybe the differing sections prevent the dynamic linker from recognizing these as the same symbol.

@cherrymui
Copy link
Member

Running in the debugger

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 3.4
  * frame #0: 0x0000000007616910 plugin2.so`runtime.adjustframe
    frame #1: 0x000000000761dd58 plugin2.so`runtime.tracebackdefers + 216
    frame #2: 0x0000000007617215 plugin2.so`runtime.adjustdefers + 69
    frame #3: 0x000000000761762c plugin2.so`runtime.copystack + 380
    frame #4: 0x0000000007617b35 plugin2.so`runtime.newstack + 789
    frame #5: 0x0000000007624bbf plugin2.so`runtime.morestack + 143
    frame #6: 0x0000000007654750 plugin2.so`plugin2.init + 96
    frame #7: 0x00000000040b42e8 host`plugin.open + 2760
    frame #8: 0x00000000040b33e5 host`plugin.Open + 53
    frame #9: 0x00000000040b5c08 host`main.main + 2328
    frame #10: 0x000000000402d487 host`runtime.main + 519

(lldb) disas -a $pc
plugin2.so`runtime.adjustframe:
...
0x76169ff <+239>:  movq   0x3e91a(%rip), %rdx       ; (void *)0x000000000733996d: runtime.framepointer_enabled
...

which does seem to use GOT to load runtime.framepointer_enabled. However,

(lldb) x/a 0x76169ff+7+0x3e91a
0x07655320: 0x000000000733996d plugin1.so`runtime.framepointer_enabled

It is pointing to plugin1.so (note that we're in plugin2.so), not the main executable. Probably the main executable does not export the symbol, or the difference in segment/section (text/rodata vs. data/noptrbss) makes the dynamic linker resolve to the wrong one.

(lldb) image lookup -s runtime.framepointer_enabled
1 symbols match 'runtime.framepointer_enabled' in ./host:
        Address: host[0x00000000041088f1] (host.__TEXT.__rodata + 332177)
        Summary: host`runtime.framepointer_enabled
1 symbols match 'runtime.framepointer_enabled' in /Users/cherryyz/src/go-tip/misc/cgo/testplugin/plugin1.so:
        Address: plugin1.so[0x000000000012996d] (plugin1.so.__DATA.__noptrbss + 13)
        Summary: plugin1.so`runtime.framepointer_enabled
1 symbols match 'runtime.framepointer_enabled' in /Users/cherryyz/src/go-tip/misc/cgo/testplugin/sub/plugin1.so:
        Address: plugin1.so[0x00000000000c0988] (plugin1.so.__DATA.__noptrbss + 8)
        Summary: plugin1.so`runtime.framepointer_enabled
1 symbols match 'runtime.framepointer_enabled' in /Users/cherryyz/src/go-tip/misc/cgo/testplugin/plugin2.so:
        Address: plugin2.so[0x00000000001318ef] (plugin2.so.__DATA.__noptrbss + 15)
        Summary: plugin2.so`runtime.framepointer_enabled

@cherrymui
Copy link
Member

The symbol is not exported from the main executable:

$ nm host | grep framepointer
00000000041088f1 s runtime.framepointer_enabled

(lowercase s means not exported)

On Linux it is exported:

$ nm host | grep framepointer
000000000057932b R runtime.framepointer_enabled

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/104715 mentions this issue: cmd/link: put runtime.framepointer_enabled in DATA instead of RODATA

@bradfitz
Copy link
Contributor

bradfitz commented Apr 4, 2018

Thanks, @cherrymui!

@randall77
Copy link
Contributor

@gopherbot please consider for backport

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #26202 (for 1.10), #26203 (for 1.9).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

8 participants