
runtime: fatal error: unknown caller pc (libfuzz) #35158

Open

klauspost opened this issue Oct 25, 2019 · 11 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Comments

@klauspost
Contributor

What version of Go are you using (go version)?

go1.13.3.linux.amd64

Compiled through go-fuzz/libFuzzer

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

Fuzz test run on https://fuzzit.dev/

Linux on AMD64 is as much as I know.

What did you do?

I have 3 crashes from fuzzit.dev, where I am running continuous fuzz testing of my compression packages. Go 1.12.10 was used for 2 of the builds, Go 1.13.3 for one.

There is no assembly or "unsafe" involved, so there shouldn't be any reasonable way for memory corruption to occur. The fuzzing also runs strictly in a single goroutine, so races seem unlikely as well.

That said, I have no idea about the hardware stability of the servers running the tests.

Also, a lot of new code has just been added here, so there is a chance something in it is bad, though I don't see how it could trigger this particular error.

Crash logs: https://gist.github.com/klauspost/d4ec7bd6ecefa1bec56dd8ca4ac8ec39

Go 1.12.10 on top and bottom. Go 1.13.3 in the middle.

Completely different functions were preempted: flate.(*fastGen).matchlenLong vs. flate.(*decompressor).Read. All crashes were in mgcmark.go:711. The final crash was while executing bytes.(*Buffer).grow.

The crashes have not reproduced locally, so this could be a libFuzzer-specific problem. The build script is here: https://github.com/klauspost/compress/blob/master/fuzzit.sh#L17 - all crashes have been in the same fuzzer (flate), so it seems something in there is triggering it.
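
For context, the fuzz target that script builds follows the standard go-fuzz shape; go-fuzz-build's libFuzzer mode wraps a function like the one below into a libFuzzer binary. This is an illustrative sketch, not the exact fuzzer from compress-fuzz:

package fuzz

import (
	"bytes"
	"io/ioutil"

	"github.com/klauspost/compress/flate"
)

// Fuzz is the go-fuzz entry point: inflate the input and report whether
// it was interesting. go-fuzz-build -libfuzzer turns this into a
// libFuzzer-compatible binary for fuzzit.dev to run.
func Fuzz(data []byte) int {
	r := flate.NewReader(bytes.NewReader(data))
	defer r.Close()
	if _, err := ioutil.ReadAll(r); err != nil {
		return 0 // invalid or truncated stream; not interesting
	}
	return 1
}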

What did you expect to see?

No crash, or more information.

What did you see instead?

Crash.

@klauspost
Contributor Author

This does not use unsafe, and the only assembly would be in the stdlib. Imports here: https://godoc.org/github.com/klauspost/compress/flate?imports

"sync" is only used for a sync.Once and there are no goroutines, so no races should happen either.

Fuzz test imports: https://godoc.org/github.com/klauspost/compress-fuzz/flate?imports

This unfortunately happens completely at random.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 25, 2019
@aclements
Member

Just checking if this is related to #35777. What kernel version are you running? Does the application itself receive a lot of signals?

@klauspost
Contributor Author

I don't have access to the servers, so I cannot tell you. It is running on https://fuzzit.dev/ - https://twitter.com/fuzzitdev

@klauspost
Contributor Author

As Keith Randall noted on the original issue #20846:

This looks like the stack has been trashed somehow.
Not only the return address for gopark. gopark's arguments also look trashed. The gcBgMarkWorker failure looks similar, hard to tell for sure if its args are trashed as it has only one arg.

Not sure what might cause this. Could be misuse of unsafe, could be runtime bug (use after free of stack memory?).

@prestonvanloon

Same issue in Go 1.14; more info about our case here: prysmaticlabs/prysm#5131

@prestonvanloon

prestonvanloon commented Mar 20, 2020

I thought this might be resolved by #37782 in go 1.14.1, but we saw the issue again last night with libfuzzer on go 1.14.1.

@prestonvanloon

I was able to reproduce locally on kernel 5.3.0-45-generic.

@futurist
Contributor

I can provide some more context about this issue. We use https://github.com/nhooyr/websocket/releases/tag/v1.8.7 to set up a websocket tunnel with the default settings (RFC 7692 permessage-deflate compression on), and we have observed some panics in production. The stack trace is below:

runtime: g 79954: unexpected return pc for github.com/klauspost/compress/flate.(*decompressor).moreBits called from 0x1
stack: frame={sp:0xc002c19a30, fp:0xc002c19a68} stack=[0xc002c18000,0xc002c1a000)
0x000000c002c19930:  0x0000000000000001  0x0000000202c8d680 
0x000000c002c19940:  0x0000000000000000  0x0000000000000003 
0x000000c002c19950:  0x000000c002c30640  0x0000000000000000 
0x000000c002c19960:  0x0000000000000003  0x0000000000000004 
0x000000c002c19970:  0x0000000000416c91 <runtime.typedmemclr+0x0000000000000051>  0x000000c002c199b0 
0x000000c002c19980:  0x0000000000484e8e <sync.(*Pool).Get+0x000000000000008e>  0x000000c002944d40 
0x000000c002c19990:  0x000000c002944d20  0x0000000000000000 
0x000000c002c199a0:  0x0000000000c2dbc0  0x000000000144a280 
0x000000c002c199b0:  0x0000000000000000  0x0000000000000000 
0x000000c002c199c0:  0x0000000000000000  0x0000000000000000 
0x000000c002c199d0:  0x000000c002c19a20  0x0000000000453976 <runtime.sigpanic+0x00000000000002f6> 
0x000000c002c199e0:  0x0000000000c2dbc0  0x000000000144a280 
0x000000c002c199f0:  0x0000000002c19a00  0x000000c002c30640 
0x000000c002c19a00:  0x000000c002c19a38  0x0000000000b5c41f <nhooyr.io/websocket.(*msgReader).resetFlate+0x00000000000000bf> 
0x000000c002c19a10:  0x000000c002c19a38  0x0000000000407c65 <runtime.selectnbrecv+0x0000000000000025> 
0x000000c002c19a20:  0x000000c002c19a40  0x0000000000b4cdc0 <github.com/klauspost/compress/flate.(*decompressor).moreBits+0x0000000000000060> 
0x000000c002c19a30: <0x0000000000b4b771 <github.com/klauspost/compress/flate.(*decompressor).nextBlock+0x0000000000000031>  0x000000c002c19ae0 
0x000000c002c19a40:  0x000000c002c19a78  0x0000000000b4b9dc <github.com/klauspost/compress/flate.(*decompressor).Read+0x000000000000007c> 
0x000000c002c19a50:  0x000000c002c30640  0x0000000000cfebf9 
0x000000c002c19a60: !0x0000000000000001 >0x0000000000000000 
0x000000c002c19a70:  0x000000c002c30718  0x000000c002c19ae0 
0x000000c002c19a80:  0x0000000000b5f151 <nhooyr.io/websocket.(*limitReader).Read+0x0000000000000111>  0x000000c002c30640 
0x000000c002c19a90:  0x000000c002def000  0x0000000000001000 
0x000000c002c19aa0:  0x00000000000000f2  0x0000000002df2000 
0x000000c002c19ab0:  0x000000c002df21e0  0x000000c002c19a7c 
0x000000c002c19ac0:  0x000000c002df2360  0x0000000000000000 
0x000000c002c19ad0:  0x000000c002df2180  0x0000000000000000 
0x000000c002c19ae0:  0x000000c002c19b80  0x0000000000b5e6a5 <nhooyr.io/websocket.(*msgReader).Read+0x0000000000000165> 
0x000000c002c19af0:  0x000000c000e274a0  0x000000c002def000 
0x000000c002c19b00:  0x000000c002cd6d00  0x00000000000000f2 
0x000000c002c19b10:  0x0000000002df2000  0x010000c002c19b00 
0x000000c002c19b20:  0x0000000000f6e6a0  0x0000000000000000 
0x000000c002c19b30:  0x0000000000000000  0x0000000000000000 
0x000000c002c19b40:  0x0000000000b5e320 <nhooyr.io/websocket.(*Conn).reader.func2+0x0000000000000000>  0x0000000000000000 
0x000000c002c19b50:  0x0000000000000000  0x0000000000b5eb00 <nhooyr.io/websocket.(*msgReader).Read.func1+0x0000000000000000> 
0x000000c002c19b60:  0x000000c000130440 
fatal error: unknown caller pc

runtime stack:
runtime.throw({0xcfc8fe?, 0x1412580?})
	/usr/local/lib/go/src/runtime/panic.go:1047 +0x5d fp=0xc000443be0 sp=0xc000443bb0 pc=0x43d19d
runtime.gentraceback(0xc002944d20?, 0x10000c000443f00?, 0xc000038500?, 0xc000443fb8?, 0x0, 0x0, 0x7fffffff, 0xc000443fa0, 0x440566?, 0x0)
	/usr/local/lib/go/src/runtime/traceback.go:258 +0x1cf7 fp=0xc000443f50 sp=0xc000443be0 pc=0x462737
runtime.addOneOpenDeferFrame.func1()
	/usr/local/lib/go/src/runtime/panic.go:645 +0x6b fp=0xc000443fc8 sp=0xc000443f50 pc=0x43c32b
runtime.systemstack()
	/usr/local/lib/go/src/runtime/asm_amd64.s:492 +0x49 fp=0xc000443fd0 sp=0xc000443fc8 pc=0x46cb09

goroutine 79954 [running]:
runtime.systemstack_switch()
	/usr/local/lib/go/src/runtime/asm_amd64.s:459 fp=0xc002c198e0 sp=0xc002c198d8 pc=0x46caa0
runtime.addOneOpenDeferFrame(0xc002c19a82?, 0x3?, 0x1?)
	/usr/local/lib/go/src/runtime/panic.go:644 +0x69 fp=0xc002c19920 sp=0xc002c198e0 pc=0x43c269
panic({0xc2dbc0, 0x144a280})
	/usr/local/lib/go/src/runtime/panic.go:844 +0x112 fp=0xc002c199e0 sp=0xc002c19920 pc=0x43cab2
runtime.panicmem(...)
	/usr/local/lib/go/src/runtime/panic.go:260
runtime.sigpanic()
	/usr/local/lib/go/src/runtime/signal_unix.go:843 +0x2f6 fp=0xc002c19a30 sp=0xc002c199e0 pc=0x453976

The source of the problem seems to be this method:

https://github.com/klauspost/compress/blob/788b7f06fee85b7e1d2aa4a3a86f8dbbbcc771ae/flate/inflate.go#L865
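
For readers without the link handy, that method is a small bit-refill helper, roughly as follows (paraphrased; see the link for the exact code):

// moreBits reads one byte from the underlying reader and shifts it into
// the decompressor's bit buffer.
func (f *decompressor) moreBits() error {
	c, err := f.r.ReadByte()
	if err != nil {
		return noEOF(err)
	}
	f.roffset++
	f.b |= uint32(c) << f.nb
	f.nb += 8
	return nil
}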


Go version: 1.19.4
OS: Debian 11
Kernel: 4.14.81.bm.30-amd64 #1 SMP Debian 4.14.81.bm.30 Thu May 6 03:23:40 UTC 2021 x86_64 GNU/Linux
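
For reference, the connection setup described above has roughly this shape (an illustrative server-side sketch; the option and method names come from nhooyr.io/websocket's public API, but the handler itself is made up):

package main

import (
	"io"
	"net/http"

	"nhooyr.io/websocket"
)

func handler(w http.ResponseWriter, r *http.Request) {
	// The default CompressionMode negotiates RFC 7692 permessage-deflate,
	// which routes message reads through klauspost/compress/flate.
	c, err := websocket.Accept(w, r, &websocket.AcceptOptions{
		CompressionMode: websocket.CompressionNoContextTakeover,
	})
	if err != nil {
		return
	}
	defer c.Close(websocket.StatusInternalError, "")

	for {
		_, rd, err := c.Reader(r.Context())
		if err != nil {
			return
		}
		// Decompression happens inside this Read path (msgReader.Read in
		// the trace above).
		if _, err := io.Copy(io.Discard, rd); err != nil {
			return
		}
	}
}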

@klauspost
Contributor Author

@futurist I am not sure this is related. My best bet would be that there is a problem with the sync.Pool reuse mechanics causing a race. Unfortunately, the project seems pretty dead.
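
To make that hypothesis concrete, the hazardous pattern would look something like this (a hypothetical sketch, not code from either package):

package main

import (
	"compress/flate"
	"io"
	"sync"
)

var readerPool = sync.Pool{
	// Pool a decompressor and re-point it at each new source via Reset.
	New: func() interface{} { return flate.NewReader(nil) },
}

func decompress(src io.Reader, dst io.Writer) error {
	fr := readerPool.Get().(io.ReadCloser)
	fr.(flate.Resetter).Reset(src, nil)
	_, err := io.Copy(dst, fr)
	// BUG if anything can still read from fr after this point: two users
	// would then share one decompressor's internal state and corrupt it.
	readerPool.Put(fr)
	return err
}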

This issue should probably be closed, since there is now built-in fuzzing.

@futurist
Contributor

futurist commented Feb 3, 2023

@klauspost In that case, should I repost the error to https://github.com/klauspost/compress as a new issue, if it's really not related to the Go runtime?
Or maybe as a new Go issue?

@klauspost
Contributor Author

@futurist It is not a problem in the compression package. The stack looks messed up.

It looks like either a problem in the package you are using or a runtime problem. I can't tell you which.
