runtime: fatal error: found bad pointer in Go heap #26243
Comments
Which version of CentOS are you running? Which kernel version? This looks like memory corruption. Have you tried running your program under the race detector? See https://blog.golang.org/race-detector.
CentOS Linux release 7.4.1708 (Core) Linux version 3.10.0-693.21.1.el7.x86_64 |
When I run the program under the race detector, it produces no output.
Are you using cgo or unsafe? |
Is there a way that we can reproduce the problem ourselves? |
Yes, sometimes. (0x102cea0,0xc420628720) runtime stack: goroutine 50 [GC worker (idle)]: |
@davecheney @bradfitz @ianlancetaylor We are very distressed about it. |
@wgliang please respond to @ianlancetaylor 's request, #26243 (comment) |
It's a large project. Even in the testing phase, we can't run with the -race flag all the time; it is also a burden for us.
@ianlancetaylor |
I didn't see a clear answer to whether you use cgo or unsafe. From the limited information we have the most natural guess would be that your program is somehow producing invalid pointer values, which most commonly happens due to a violation of the cgo pointer passing rules (https://golang.org/cmd/cgo/#hdr-Passing_pointers). Would it be possible for you to run your program with the environment variable |
Just hit this on a trybot, I think. https://storage.googleapis.com/go-build-log/18080916/freebsd-amd64-12_0_e6698dd7.log |
CC @aclements See trybot link just above. |
@aclements, feel free to delegate if you're swamped on other things, but assigning to you by default for runtime. |
The error in the trybot failure is reminiscent of #24993 which was fixed 13 days ago. Since it happened on freebsd, it's also possible it's related to #28054. The failure in this issue is from go1.10, however, so it's unlikely to be related to either of those or the trybot failure. @kevinburke the hash at the top of that trybot run isn't in any branch I receive from a |
@mknyszek, you'd see it if you did:
Or you can search like: https://go-review.googlesource.com/q/18080916300fb1a035d03703f481224cb6bce9ca Which redirects to: https://go-review.googlesource.com/c/go/+/154423 (where it was PS3) |
@aclements Do you think this should be a release blocker? I am leaning towards no, since it’s extremely rare, does not feel like an easy fix, and has been an issue since 1.10. |
@bradfitz thanks for the tip! I'll note that for the future. @FiloSottile just to be precise, it's unclear at this point if this bug is related to anything but go1.10, or if it's a bug in the runtime at all. The linked trybot failure I think more information is needed before we can label it a release blocker, but I'll let @aclements have the final say. To summarize what others have asked for in this thread:
If I were to take a guess, and also assume that this is a bug in the runtime that's manifest, the most recent issue which has a similar-looking failure based on the stack traces provided above is #29362 whose fix has been backported to 1.10 already (#29567). The similarities in the stack traces between that issue and this one is a little subtle because in go1.10 there was |
In both of the provided traces, the pointer appears to be a legal Go heap pointer and we've already freed the span to which it points. In the first trace, we probably freed it recently since the pointer is still within the span bounds. In the second trace, the mspan has already been reused for another region of memory, since the bad pointer doesn't even fall into its bounds (so we picked up a stale span pointer from the spans array). Also in both cases, the object containing the bad pointer looks like it's probably a slice, and the pointer to its backing array is bad. If true, this is interesting because that pointer is largely hidden from user Go code. @wgliang, in addition to @mknyszek's question, could you check if your code uses @FiloSottile, given that we need a lot more information to debug this, and the original report is from a fairly old version of Go, I'm going to drop release-blocker from this. |
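The two observations above can be sketched in Go. This is only an illustration, assuming a 64-bit platform and the gc toolchain's three-word slice header layout. The addresses are copied from the first trace in this issue; `inSpan` and `sliceWords` are hypothetical helper names, not runtime functions.

```go
package main

import (
	"fmt"
	"unsafe"
)

// inSpan reports whether ptr lies in the half-open range [base, limit),
// which is how span.base() and span.limit are interpreted here.
func inSpan(ptr, base, limit uint64) bool {
	return ptr >= base && ptr < limit
}

// sliceWords returns the three machine words of a slice header
// (data pointer, length, capacity). This relies on the gc toolchain's
// layout and is an assumption, not a language guarantee.
func sliceWords(s []byte) [3]uintptr {
	return *(*[3]uintptr)(unsafe.Pointer(&s))
}

func main() {
	// Values from the first trace: the bad pointer and the bounds of
	// the span it was resolved to.
	const (
		badPtr    uint64 = 0xc423d9fe30
		spanBase  uint64 = 0xc423d9e000
		spanLimit uint64 = 0xc423da6000
	)
	fmt.Println("bad pointer within span bounds:", inSpan(badPtr, spanBase, spanLimit))

	// A slice with len and cap 0x1f has the same shape as the dumped
	// object: a pointer word followed by 0x1f and 0x1f.
	w := sliceWords(make([]byte, 0x1f))
	fmt.Printf("*(object+0)  = %#x (data pointer)\n", w[0])
	fmt.Printf("*(object+8)  = %#x (len)\n", w[1])
	fmt.Printf("*(object+16) = %#x (cap)\n", w[2])
}
```

The dumped object has a fourth word (`*(object+24) = 0x0`), so it may be a slice embedded in a larger 32-byte allocation; the sketch only shows why the first three words resemble a slice header.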
Timed out in state WaitingForInfo. Closing. (I am just a bot, though. Please speak up if this is a mistake or you have the requested information.) |
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
go version go1.10 linux/amd64
Does this issue reproduce with the latest release?
I'm not sure.
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/data/go"
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build690544410=/tmp/go-build -gno-record-gcc-switches"
What did you do?
On one test machine, the program crashes from time to time; the problem has not been seen elsewhere. This test machine is a virtual machine.
runtime: pointer 0xc423d9fe30 to unallocated span idx=0x1ecf span.base()=0xc423d9e000 span.limit=0xc423da6000 span.state=3
runtime: found in object at *(0xc424cd7480+0x0)
object=0xc424cd7480 k=0x621266b s.base()=0xc424cd6000 s.limit=0xc424cd8000 s.spanclass=6 s.elemsize=32 s.state=_MSpanInUse
*(object+0) = 0xc423d9fe30 <==
*(object+8) = 0x1f
*(object+16) = 0x1f
*(object+24) = 0x0
fatal error: found bad pointer in Go heap (incorrect use of unsafe or cgo?)
runtime stack:
runtime.throw(0xf4d6b0, 0x3e)
/usr/local/go/src/runtime/panic.go:619 +0x81 fp=0x7ffce1cc2e88 sp=0x7ffce1cc2e68 pc=0x42cd81
runtime.heapBitsForObject(0xc423d9fe30, 0xc424cd7480, 0x0, 0xc41fd9945b, 0xc400000000, 0x7fdf0407c3f0, 0xc420047c70, 0xa4)
/usr/local/go/src/runtime/mbitmap.go:425 +0x473 fp=0x7ffce1cc2ee0 sp=0x7ffce1cc2e88 pc=0x414293
runtime.scanobject(0xc424cd7480, 0xc420047c70)
/usr/local/go/src/runtime/mgcmark.go:1209 +0x251 fp=0x7ffce1cc2f88 sp=0x7ffce1cc2ee0 pc=0x41f551
runtime.gcDrain(0xc420047c70, 0xd)
/usr/local/go/src/runtime/mgcmark.go:965 +0x237 fp=0x7ffce1cc2fe0 sp=0x7ffce1cc2f88 pc=0x41ed37
runtime.gcBgMarkWorker.func2()
/usr/local/go/src/runtime/mgc.go:1865 +0x187 fp=0x7ffce1cc3020 sp=0x7ffce1cc2fe0 pc=0x457107
runtime.systemstack(0x0)
/usr/local/go/src/runtime/asm_amd64.s:409 +0x79 fp=0x7ffce1cc3028 sp=0x7ffce1cc3020 pc=0x459499
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1170 fp=0x7ffce1cc3030 sp=0x7ffce1cc3028 pc=0x431440
What did you expect to see?
What did you see instead?