
runtime: fatal error: bad pointer in write barrier #11689

Closed
mikioh opened this issue Jul 12, 2015 · 18 comments
@mikioh
Contributor

mikioh commented Jul 12, 2015

See http://build.golang.org/log/aafb6004d46f88ffe84788b31f51d64bcef6d9b1.

runtime: writebarrierptr *0xc82b3ee008 = 0x3
fatal error: bad pointer in write barrier

runtime stack:
runtime.throw(0x5d2e80, 0x1c)
    /tmp/workdir/go/src/runtime/panic.go:527 +0x96
runtime.writebarrierptr.func1()
    /tmp/workdir/go/src/runtime/mbarrier.go:133 +0xb3
runtime.systemstack(0xc820018000)
    /tmp/workdir/go/src/runtime/asm_amd64.s:262 +0x7c
runtime.mstart()
    /tmp/workdir/go/src/runtime/proc1.go:668

goroutine 20 [running]:
runtime.systemstack_switch()
    /tmp/workdir/go/src/runtime/asm_amd64.s:216 fp=0xc8210711d0 sp=0xc8210711c8
runtime.writebarrierptr(0xc82b3ee008, 0x3)
    /tmp/workdir/go/src/runtime/mbarrier.go:134 +0x70 fp=0xc821071200 sp=0xc8210711d0
compress/flate.(*compressor).init(0xc820070240, 0xc8210481b0, 0x3, 0x9, 0x0, 0x0)
    /tmp/workdir/go/src/compress/flate/deflate.go:398 +0xdab fp=0xc821071398 sp=0xc821071200
compress/flate.NewWriter(0xc8210481b0, 0x3, 0x9, 0x0, 0x0, 0x0)
    /tmp/workdir/go/src/compress/flate/deflate.go:487 +0x64 fp=0xc8210713d8 sp=0xc821071398
compress/flate.testSync(0xc8210481b0, 0x3, 0xc82b414000, 0x0, 0x0, 0x5b00f0, 0xe)
    /tmp/workdir/go/src/compress/flate/deflate_test.go:458 +0xaf fp=0xc821071a60 sp=0xc8210713d8
compress/flate.testToFromWithLevelAndLimit(0xc8210481b0, 0x3, 0xc82b414000, 0x186a3, 0x188a3, 0x5b00f0, 0xe, 0xc7ce)
    /tmp/workdir/go/src/compress/flate/deflate_test.go:302 +0xac1 fp=0xc821071d10 sp=0xc821071a60
compress/flate.testToFromWithLimit(0xc8210481b0, 0xc82b414000, 0x186a3, 0x188a3, 0x5b00f0, 0xe, 0x186b2, 0xc5da, 0xc710, 0xc7ce, ...)
    /tmp/workdir/go/src/compress/flate/deflate_test.go:307 +0x83 fp=0xc821071d60 sp=0xc821071d10
compress/flate.TestDeflateInflateString(0xc8210481b0)
    /tmp/workdir/go/src/compress/flate/deflate_test.go:351 +0x256 fp=0xc821071f58 sp=0xc821071d60
testing.tRunner(0xc8210481b0, 0x6906f8)
    /tmp/workdir/go/src/testing/testing.go:455 +0x98 fp=0xc821071f90 sp=0xc821071f58
runtime.goexit()
    /tmp/workdir/go/src/runtime/asm_amd64.s:1696 +0x1 fp=0xc821071f98 sp=0xc821071f90
created by testing.RunTests
    /tmp/workdir/go/src/testing/testing.go:560 +0x86d

goroutine 1 [chan receive]:
testing.RunTests(0x5f9d50, 0x690680, 0x12, 0x12, 0x80067d001)
    /tmp/workdir/go/src/testing/testing.go:561 +0x8ad
testing.(*M).Run(0xc82004cf08, 0x1a000)
    /tmp/workdir/go/src/testing/testing.go:493 +0x70
main.main()
    compress/flate/_test/_testmain.go:160 +0x116
FAIL    compress/flate  0.092s
@mikioh mikioh added this to the Go1.5Maybe milestone Jul 12, 2015
@mikioh
Contributor Author

mikioh commented Jul 13, 2015

/cc @aclements

#10603 happened again.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.5, Go1.5Maybe Jul 13, 2015
@ianlancetaylor
Contributor

CC @RLH

@spetrovic77
Contributor

I'm seeing the exact same problem when running Go on Android. It's easily reproducible, though the setup is quite complicated.

@rsc
Contributor

rsc commented Jul 21, 2015

This is a write barrier being called to write 3 to the slot:

runtime.writebarrierptr(0xc82b3ee008, 0x3)
    /tmp/workdir/go/src/runtime/mbarrier.go:134 +0x70 fp=0xc821071200 sp=0xc8210711d0

Because init was called with an io.Writer interface holding a 3 instead of a pointer:

compress/flate.(*compressor).init(0xc820070240, 0xc8210481b0, 0x3, 0x9, 0x0, 0x0)
    /tmp/workdir/go/src/compress/flate/deflate.go:398 +0xdab fp=0xc821071398 sp=0xc821071200

Because NewWriter was called with that same io.Writer (and level = 9):

compress/flate.NewWriter(0xc8210481b0, 0x3, 0x9, 0x0, 0x0, 0x0)
    /tmp/workdir/go/src/compress/flate/deflate.go:487 +0x64 fp=0xc8210713d8 sp=0xc821071398

But testSync was called with t=0xc8210481b0 and level=0x3:

compress/flate.testSync(0xc8210481b0, 0x3, 0xc82b414000, 0x0, 0x0, 0x5b00f0, 0xe)
    /tmp/workdir/go/src/compress/flate/deflate_test.go:458 +0xaf fp=0xc821071a60 sp=0xc8210713d8
compress/flate.testToFromWithLevelAndLimit(0xc8210481b0, 0x3, 0xc82b414000, 0x186a3, 0x188a3, 0x5b00f0, 0xe, 0xc7ce)
    /tmp/workdir/go/src/compress/flate/deflate_test.go:302 +0xac1 fp=0xc821071d10 sp=0xc821071a60

It is as if the (t, level) pair was copied directly to the NewWriter argument slots instead of the result of io.MultiWriter. Obviously the compiler isn't doing this all the time, or bad things would be happening.

Maybe a stack barrier at the wrong time, or a bad stack copy?

@rsc
Contributor

rsc commented Jul 22, 2015

@spetrovic77 Can you post the full failure dump you get? I want to see how much the two have in common. Thanks.

@rsc
Contributor

rsc commented Jul 22, 2015

And if you can make it happen repeatedly, posting three wouldn't hurt. Thanks.

@spetrovic77
Contributor

Sorry for the delay, here is the 1st stack:

runtime: writebarrierptr *0x91f2ab60 = 0x1
fatal error: bad pointer in write barrier

runtime stack:
runtime.throw(0xa2555bb8, 0x1c)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/panic.go:527 +0x78
runtime.writebarrierptr.func1()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/mbarrier.go:120 +0xac
runtime.systemstack(0x91c80a00)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/asm_arm.s:239 +0x80
runtime.mstart()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/proc1.go:668

goroutine 553 [running, locked to thread]:
runtime.systemstack_switch()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/asm_arm.s:187 +0x4 fp=0x920651c4 sp=0x920651c0
runtime.writebarrierptr(0x91f2ab60, 0x1)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/mbarrier.go:121 +0x74 fp=0x920651dc sp=0x920651c4
v.io/x/jni/util.jArg(0xb4aed1c0, 0xa23c9870, 0x91f2ab58, 0x91d22760, 0x12, 0x9218eda0, 0x920b3f01)

@spetrovic77
Contributor

2nd stack:

runtime: writebarrierptr *0x91ce4ec8 = 0x21
fatal error: bad pointer in write barrier

runtime stack:
runtime.throw(0xa2555bb8, 0x1c)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/panic.go:527 +0x78
runtime.writebarrierptr.func1()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/mbarrier.go:120 +0xac
runtime.systemstack(0xaeb01d48)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/asm_arm.s:239 +0x80
runtime.mstart()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/proc1.go:668

goroutine 638 [running, locked to thread]:
runtime.systemstack_switch()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/asm_arm.s:187 +0x4 fp=0x920d7118 sp=0x920d7114
runtime.writebarrierptr(0x91ce4ec8, 0x21)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/mbarrier.go:121 +0x74 fp=0x920d7130 sp=0x920d7118
v.io/x/jni/util.jArg(0xaec28940, 0xa23be978, 0x920e05c0, 0x91d36160, 0x2, 0x920d73ac, 0x92124fe0)

@spetrovic77
Contributor

3rd stack:

runtime: writebarrierptr *0x91f9cb10 = 0x29
fatal error: bad pointer in write barrier

runtime stack:
runtime.throw(0xa2555bb8, 0x1c)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/panic.go:527 +0x78
runtime.writebarrierptr.func1()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/mbarrier.go:120 +0xac
runtime.systemstack(0xaf1d1914)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/asm_arm.s:239 +0x80
runtime.mstart()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/proc1.go:668

goroutine 453 [running, locked to thread]:
runtime.systemstack_switch()
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/asm_arm.s:187 +0x4 fp=0x91edc83c sp=0x91edc838
runtime.writebarrierptr(0x91f9cb10, 0x29)
    /usr/local/google/home/spetrovic/vanadium/third_party/android/go/src/runtime/mbarrier.go:121 +0x74 fp=0x91edc854 sp=0x91edc83c
v.io/x/jni/util.jArg(0xb4b11040, 0xa23be978, 0x91d4fed0, 0x91d28130, 0x2, 0x91edcad0, 0x91fe6340)

@spetrovic77
Contributor

I forgot to mention: my Go client is synced to cc6554f. The problem persists at head for me; let me know if you want those stacks. (I should have sent those stacks instead...)

@aclements
Member

v.io/x/jni/util.jArg

There's a huge amount of unsafe code in this function. It's more likely that the bad write barrier in this case is the fault of jArg, and not the same cause as the original report.
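
As a hypothetical illustration only (this is not the actual jArg code, which is not shown in this thread): one common way for unsafe code to trip this check is to store a small integer, such as a JNI int argument, in a pointer-typed slot. A sketch of the safer pattern keeps such values in an integer-typed slot, which the garbage collector and write barrier never treat as a pointer. The jvalue type and intArg helper below are invented names for the example.

```go
package main

import "fmt"

// jvalue is a hypothetical stand-in for a foreign-call argument slot.
// It is deliberately an integer type (uintptr), not unsafe.Pointer, so
// small values such as 0x1 or 0x21 are never stored as Go pointers.
type jvalue uintptr

// intArg packs a 32-bit integer argument into a slot. Going through
// uint32 keeps only the low 32 bits, so negative values are not
// sign-extended on 64-bit platforms.
func intArg(v int32) jvalue {
	return jvalue(uint32(v))
}

func main() {
	// The values reported in the three Android traces, held as integers.
	args := []jvalue{intArg(0x1), intArg(0x21), intArg(0x29)}
	for i, a := range args {
		fmt.Printf("arg %d = %#x\n", i, uintptr(a))
	}
}
```

Because a []jvalue contains no pointer-typed fields, writes into it involve no write barrier at all, so the check that fired in the traces above cannot trigger on these stores.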

@spetrovic77
Contributor

True, this is probably our fault. We'll look into it.

@aclements
Member

Here are the occurrences of "bad pointer in write barrier" over the past month. I haven't looked at these to see if they look at all related, and the other two were before some memory corruption fixes (8c3533c and cc8f544).

2015-06-28T21:41:38-d0ed87d/openbsd-amd64-gce56
2015-07-01T16:10:38-596ddf4/nacl-arm-cheney
2015-07-11T07:02:57-d5004ee/freebsd-amd64-gce101

@rsc
Contributor

rsc commented Jul 30, 2015

2015-06-28T21:41:38-d0ed87d/openbsd-amd64-gce56 looks like map corruption.
2015-07-01T16:10:38-596ddf4/nacl-arm-cheney looks like the old cmd/link death, now fixed.
2015-07-11T07:02:57-d5004ee/freebsd-amd64-gce101 is this one, with the magically reverted arguments.

I don't see any links here. Possibly the map bug is related to the one we can reproduce on nacl, or possibly not. I suspect the nacl-arm cmd/link bug is fixed. And this one hasn't happened again, so I am going to assume it is fixed or hibernating until we have evidence otherwise.

@rsc rsc closed this as completed Jul 30, 2015
@davecheney
Contributor

I've been trying to get the nacl/arm builder back online. Spoiler alert: the build is not passing.

I tried updating to pepper44 yesterday without luck. I'll dump what I know so far into an issue.


@quentinmit
Contributor

Pinging this bug; @balboah's dupe #15831 seems like it could be evidence the bug is still there.

@mikioh
Contributor Author

mikioh commented Jun 17, 2016

@quentinmit, I think it's better to ping @aclements on #15831.

@aclements
Member

#15831 is unlikely to be related to this bug. (Lots of things can lead to "bad pointer in write barrier". Usually it means that something somewhere conjured a non-pointer into a pointer and then it got passed to the write barrier. The write barrier catches it, but it's not the write barrier's fault.)
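
The kind of sanity check described here can be sketched roughly as follows. This is an approximation for illustration, not the runtime's source: the threshold assumedPageSize and the function name looksLikeBadPointer are assumptions made for this example. The idea is that a non-nil value too small to be a plausible address is flagged when it is about to be stored in a pointer slot.

```go
package main

import "fmt"

// assumedPageSize is an illustrative threshold chosen for this sketch.
// It is an assumption, not a value taken from the runtime source.
const assumedPageSize = 4096

// looksLikeBadPointer reports whether a word about to be stored into a
// pointer slot is non-nil yet too small to be a plausible address.
// uint64 is used so the example also compiles on 32-bit platforms.
func looksLikeBadPointer(src uint64) bool {
	return src != 0 && src < assumedPageSize
}

func main() {
	// 0x3, 0x1, 0x21 and 0x29 are the values from the traces in this
	// thread; 0xc82b3ee008 is the (valid-looking) destination address
	// from the original report, included for contrast.
	for _, v := range []uint64{0x0, 0x3, 0x1, 0x21, 0x29, 0xc82b3ee008} {
		fmt.Printf("%#x suspicious=%v\n", v, looksLikeBadPointer(v))
	}
}
```

A check of this shape only detects the symptom at the point of the store; the code that produced the bogus value may be far away, which is why unrelated bugs can end in the same fatal error.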
