runtime: fatal error: MSpanList_Remove #14831

Closed
bits01 opened this issue Mar 15, 2016 · 11 comments
@bits01

bits01 commented Mar 15, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    go version go1.6 linux/amd64
  2. What operating system and processor architecture are you using (go env)?
    linux/amd64, CentOS Linux release 7.1.1503 (Core)
  3. What did you do?
    If possible, provide a recipe for reproducing the error.
    A complete runnable program is good.
    A link on play.golang.org is best.

Long running process crashes with:

failed MSpanList_Remove 0x7fb057ebfee8 0x7fb057fa2b30 0x8a2ed0 0x8a2ee0
fatal error: MSpanList_Remove

runtime stack:
runtime.throw(0x758680, 0x10)
    /home/me/go/src/runtime/panic.go:530 +0x90
runtime.(*mSpanList).remove(0x8a2ee0, 0x7fb057ebfee8)
    /home/me/go/src/runtime/mheap.go:911 +0x1ad
runtime.(*mcentral).freeSpan(0x8a2ec0, 0x7fb057ebfee8, 0x10, 0xc82062e000, 0xc82062fe00, 0xc820018200, 0xc8200a0300)
    /home/me/go/src/runtime/mcentral.go:178 +0x156
runtime.(*mspan).sweep(0x7fb057ebfee8, 0x300000000, 0xc800000001)
    /home/me/go/src/runtime/mgcsweep.go:319 +0x5e7
runtime.sweepone(0x437db2)
    /home/me/go/src/runtime/mgcsweep.go:112 +0x23e
runtime.gosweepone.func1()
    /home/me/go/src/runtime/mgcsweep.go:124 +0x21
runtime.systemstack(0xc82001b500)
    /home/me/go/src/runtime/asm_amd64.s:291 +0x79
runtime.mstart()
    /home/me/go/src/runtime/proc.go:1048

goroutine 18 [running]:
runtime.systemstack_switch()
    /home/me/go/src/runtime/asm_amd64.s:245 fp=0xc820062758 sp=0xc820062750
runtime.gosweepone(0x0)
    /home/me/go/src/runtime/mgcsweep.go:125 +0x3d fp=0xc820062780 sp=0xc820062758
runtime.bgsweep(0xc8200aa000)
    /home/me/go/src/runtime/mgcsweep.go:66 +0xb6 fp=0xc8200627b8 sp=0xc820062780
runtime.goexit()
    /home/me/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc8200627c0 sp=0xc8200627b8
created by runtime.gcenable
    /home/me/go/src/runtime/mgc.go:191 +0x53
  4. What did you expect to see?
    Not a crash
  5. What did you see instead?
    A crash
@bradfitz bradfitz changed the title from "fatal error: MSpanList_Remove" to "runtime: fatal error: MSpanList_Remove" Mar 15, 2016
@ianlancetaylor ianlancetaylor added this to the Go1.6.1 milestone Mar 15, 2016
@bits01
Author

bits01 commented Mar 15, 2016

I have also seen the program simply hang with this trace:

fatal error: unexpected signal during runtime execution
[signal 0xb code=0x1 addr=0xeb pc=0x41b1fb]

goroutine 5 [running]:
runtime.throw(0x5d5e00, 0x2a)
    /home/me/go/src/runtime/panic.go:530 +0x90 fp=0xc820064ea8 sp=0xc820064e90
runtime.sigpanic()
    /home/me/go/src/runtime/sigpanic_unix.go:12 +0x5a fp=0xc820064ef8 sp=0xc820064ea8
runtime.gcMarkRootCheck()
    /home/me/go/src/runtime/mgcmark.go:86 +0xeb fp=0xc820064f20 sp=0xc820064ef8
runtime.gcMarkDone()
    /home/me/go/src/runtime/mgc.go:1066 +0xcf fp=0xc820064f40 sp=0xc820064f20
runtime.gcBgMarkWorker(0xc820016000)
    /home/me/go/src/runtime/mgc.go:1479 +0x448 fp=0xc820064fb8 sp=0xc820064f40
runtime.goexit()
    /home/me/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc820064fc0 sp=0xc820064fb8
created by runtime.gcBgMarkStartWorkers
    /home/me/go/src/runtime/mgc.go:1329 +0x92

@bits01
Author

bits01 commented Mar 15, 2016

I should note that I'm doing some IOCTL calls during the lifetime of the program. I think I narrowed it down to one IOCTL call that seems to be causing it, but not all the time. I tried forcing a runtime.GC() call right after the IOCTL call and it reproduces faster. I will have to review that code some more and make sure the kernel is not writing past the buffer I'm passing to it (buffer is allocated in Go, just a []byte) and corrupting something in the process.

@davecheney
Contributor

@bits01 please provide a runnable sample program that demonstrates the issue so we can try to reproduce it.

@ianlancetaylor
Contributor

Looks like the memory span had an empty free list but was on the nonempty span list. The traceback shows the call to c.empty.remove in mcentral.go:mcentral.freeSpan:

    wasempty := s.freelist.ptr() == nil
    ...
    if wasempty {
        c.empty.remove(s)
        c.nonempty.insert(s)
    }

The print before the panic shows that the span's list field, expected to point to c.empty, instead points to c.nonempty.

I don't understand this code. It's clear that mallocgc can leave a span's freelist set to nil, and I don't see anything moving that span from the nonempty list to the empty list in that case.

@bits01
Author

bits01 commented Mar 15, 2016

@davecheney Unfortunately it's not easy to provide sample code because it involves proprietary code and hardware that requires IOCTLs to talk to. But any suggestions on how to troubleshoot as much as I can on my side are appreciated.

@ianlancetaylor
Contributor

Certainly memory corruption would explain this problem....

@bits01
Author

bits01 commented Mar 15, 2016

I will continue looking on my side; hopefully the problem is over here and not with the Go runtime.

@bits01
Author

bits01 commented Mar 16, 2016

Sorry for the false alarm. It turns out a 4-byte buffer overrun in a kernel driver was causing the occasional memory corruption.

@bits01 bits01 closed this as completed Mar 16, 2016
@googollee

I'm hitting this issue too, and I'm sure I haven't changed any kernel settings.

Our program is a server that holds long-lived TCP connections: it receives data from one TCP connection and forwards it to another. After we upgraded to Go 1.6, it runs for a long time and then panics with the same error.

  1. What version of Go are you using (go version)?
    go version go1.6 linux/amd64
  2. What operating system and processor architecture are you using (go env)?
    Linux xxxx 3.19.0-25-generic unknown pc's #26~14.04.1-Ubuntu SMP Fri Jul 24 21:16:20 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
  3. What did you do?
    Our program is a server that holds long-lived TCP connections: it receives data from one TCP connection and forwards it to another. After we upgraded to Go 1.6, it runs for a long time and then panics with the same error. Log attached below:
failed MSpanList_Remove 0x7f27485cf7f0 0x7f27481d3aa8 0xb6f170 0xb6f180
fatal error: MSpanList_Remove

runtime stack:
runtime.throw(0x947fd0, 0x10)
        /opt/go1.6/src/runtime/panic.go:530 +0x90
runtime.(*mSpanList).remove(0xb6f180, 0x7f27485cf7f0)
        /opt/go1.6/src/runtime/mheap.go:911 +0x1ad
runtime.(*mcentral).freeSpan(0xb6f160, 0x7f27485cf7f0, 0x16, 0xc8cbdc4080, 0xc8cbdc5b80, 0xc847e62100, 0x100000001)
        /opt/go1.6/src/runtime/mcentral.go:178 +0x131
runtime.(*mspan).sweep(0x7f27485cf7f0, 0x3efd50003ef00, 0xc800021b01)
        /opt/go1.6/src/runtime/mgcsweep.go:319 +0x613
runtime.sweepone(0x4)
        /opt/go1.6/src/runtime/mgcsweep.go:112 +0x23e
runtime.gosweepone.func1()
        /opt/go1.6/src/runtime/mgcsweep.go:124 +0x21
runtime.systemstack(0x7f272ab15db8)
        /opt/go1.6/src/runtime/asm_amd64.s:307 +0xab
runtime.gosweepone(0xb6eb08)
        /opt/go1.6/src/runtime/mgcsweep.go:125 +0x3d
runtime.deductSweepCredit(0x8000, 0x0)
        /opt/go1.6/src/runtime/mgcsweep.go:384 +0xc6
runtime.(*mcentral).cacheSpan(0xb70a50, 0x7f27703fd000)
        /opt/go1.6/src/runtime/mcentral.go:36 +0x56
runtime.(*mcache).refill(0x7f278da73000, 0xc900000042, 0xc867dc1380)
        /opt/go1.6/src/runtime/mcache.go:119 +0xcc
runtime.mallocgc.func2()
        /opt/go1.6/src/runtime/malloc.go:642 +0x2b
runtime.systemstack(0xc82002b500)
        /opt/go1.6/src/runtime/asm_amd64.s:291 +0x79
runtime.mstart()
        /opt/go1.6/src/runtime/proc.go:1048

@bradfitz
Contributor

@googollee, this issue is closed and we don't re-use issues. Please open a new one. You can reference this one.

@googollee

ok

@adg adg removed this from the Go1.6.1 milestone Apr 7, 2016
@golang golang locked and limited conversation to collaborators Apr 8, 2017