
testing: go 1.17 regression: deadlock in test suite #48402

Closed
@tbonfort


What version of Go are you using (go version)?

$ go version
go version go1.17 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/tbonfort/.cache/go-build"
GOENV="/home/tbonfort/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/tbonfort/go/pkg/mod"
GOOS="linux"
GOPATH="/home/tbonfort/go"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2515039918=/tmp/go-build -gno-record-gcc-switches"

What did you do?

My cgo test suite deadlocks. Commenting out a seemingly random cgo call in a test that runs two tests before the deadlock fixes the issue.

Failing test run:

...snip...
=== RUN   TestOpen
--- PASS: TestOpen (0.00s)
=== RUN   TestOpenUpdate
--- PASS: TestOpenUpdate (0.00s)
=== RUN   TestClosingErrors
--- PASS: TestClosingErrors (0.00s)
=== RUN   TestOpenShared
--- PASS: TestOpenShared (0.00s)
=== RUN   TestRegister
--- PASS: TestRegister (0.00s)
^Csignal: interrupt
FAIL	github.com/airbusgeo/godal	9.929s

Commenting out this line in the TestOpenShared test fixes the deadlock:
https://github.com/airbusgeo/godal/blob/00980a3df723677b2d949a10eedd78ab6c5d6aee/godal_test.go#L1242

Here is a gdb stacktrace when deadlocked:

sudo gdb ~/dev/godal/godal.test 1904477
GNU gdb (Ubuntu 10.1-2ubuntu2) 10.1.90.20210411-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/tbonfort/dev/godal/godal.test...
Attaching to program: /home/tbonfort/dev/godal/godal.test, process 1904477
[New LWP 1904478]
[New LWP 1904479]
[New LWP 1904480]
[New LWP 1904481]
[New LWP 1904482]
Error while reading shared library symbols for /lib/x86_64-linux-gnu/libpthread.so.0:
Cannot find user-level thread for LWP 1904482: generic error
runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:520
520		MOVL	AX, ret+40(FP)
Loading Go Runtime support.
(gdb) bt
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:520
#1  0x000000000043b6f6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=2, ns=4670691) at /usr/local/go/src/runtime/os_linux.go:44
#2  0x00000000004155f9 in runtime.lock2 (l=0xfffffffffffffe00) at /usr/local/go/src/runtime/lock_futex.go:107
#3  0x0000000000410318 in runtime.lockWithRank (l=0x4744e3 <runtime.futex+35>, rank=<optimized out>)
    at /usr/local/go/src/runtime/lockrank_off.go:23
#4  runtime.lock (l=0x4744e3 <runtime.futex+35>) at /usr/local/go/src/runtime/lock_futex.go:48
#5  runtime.chanrecv (c=0xc0001cd340, ep=0xc00025fb67, block=true) at /usr/local/go/src/runtime/chan.go:508
#6  0x0000000000410198 in runtime.chanrecv1 (c=0xfffffffffffffe00, elem=0x2) at /usr/local/go/src/runtime/chan.go:439
#7  0x0000000000513d95 in testing.(*T).Run (t=0xc0003b41a0, name=..., f={void (testing.T *)} 0xc00025fbd8)
    at /usr/local/go/src/testing/testing.go:1307
#8  0x00000000005159ae in testing.runTests.func1 (t=0xc000283040) at /usr/local/go/src/testing/testing.go:1598
#9  0x0000000000513142 in testing.tRunner (t=0xc000283040, fn={void (testing.T *)} 0xc00025fc68)
    at /usr/local/go/src/testing/testing.go:1259
#10 0x000000000051585f in testing.runTests (matchString={void (string, string, bool *, error *)} 0xc00025fd10, 
    tests=[]testing.InternalTest = {...}, deadline=...) at /usr/local/go/src/testing/testing.go:1596
#11 0x00000000005145bd in testing.(*M).Run (m=0xc00011c800, code=0) at /usr/local/go/src/testing/testing.go:1504
#12 0x00000000009ca56b in main.main () at _testmain.go:177
(gdb) thread apply all bt

Thread 6 (LWP 1904482 "godal.test"):
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:520
#1  0x000000000043b6f6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4670691) at /usr/local/go/src/runtime/os_linux.go:44
#2  0x0000000000415947 in runtime.notesleep (n=0xfffffffffffffe00) at /usr/local/go/src/runtime/lock_futex.go:160
#3  0x00000000004461d1 in runtime.templateThread () at /usr/local/go/src/runtime/proc.go:2385
#4  0x0000000000444c93 in runtime.mstart1 () at /usr/local/go/src/runtime/proc.go:1407
#5  0x0000000000444bd9 in runtime.mstart0 () at /usr/local/go/src/runtime/proc.go:1365
#6  0x0000000000470445 in runtime.mstart () at /usr/local/go/src/runtime/asm_amd64.s:248
#7  0x0000000000474b65 in runtime.mstart () at <autogenerated>:1
#8  0x00000000009d00b2 in crosscall_amd64 () at gcc_amd64.S:40
#9  0x00007fb86affd640 in ?? ()
#10 0x0000000000000000 in ?? ()

Thread 5 (LWP 1904481 "godal.test"):
#0  runtime.epollwait () at /usr/local/go/src/runtime/sys_linux_amd64.s:666
#1  0x000000000043b45c in runtime.netpoll (delay=<optimized out>) at /usr/local/go/src/runtime/netpoll_epoll.go:127
#2  0x00000000004474d3 in runtime.findrunnable () at /usr/local/go/src/runtime/proc.go:2947
#3  0x00000000004486d9 in runtime.schedule () at /usr/local/go/src/runtime/proc.go:3367
#4  0x0000000000448c2d in runtime.park_m (gp=0xc000282340) at /usr/local/go/src/runtime/proc.go:3516
#5  0x00000000004704c3 in runtime.mcall () at /usr/local/go/src/runtime/asm_amd64.s:307
#6  0xf934f160b63ffd00 in ?? ()
#7  0x0000000000800000 in text/template/parse.(*lexer).errorf (l=0x0, format=..., args=...) at /usr/local/go/src/text/template/parse/lex.go:188
#8  text/template/parse.lexComment (l=0x0) at /usr/local/go/src/text/template/parse/lex.go:319
#9  0x0000000000000000 in ?? ()

Thread 4 (LWP 1904480 "godal.test"):
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:520
#1  0x000000000043b6f6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4670691) at /usr/local/go/src/runtime/os_linux.go:44
#2  0x0000000000415947 in runtime.notesleep (n=0xfffffffffffffe00) at /usr/local/go/src/runtime/lock_futex.go:160
#3  0x0000000000444d8a in runtime.mPark () at /usr/local/go/src/runtime/proc.go:1441
#4  0x00000000004462d8 in runtime.stopm () at /usr/local/go/src/runtime/proc.go:2408
#5  0x00000000004477a5 in runtime.findrunnable () at /usr/local/go/src/runtime/proc.go:2984
#6  0x00000000004486d9 in runtime.schedule () at /usr/local/go/src/runtime/proc.go:3367
#7  0x0000000000448c2d in runtime.park_m (gp=0xc0000011e0) at /usr/local/go/src/runtime/proc.go:3516
#8  0x00000000004704c3 in runtime.mcall () at /usr/local/go/src/runtime/asm_amd64.s:307
#9  0xf934f160b63ffd00 in ?? ()
#10 0x0000000000800000 in text/template/parse.(*lexer).errorf (l=0x0, format=..., args=...) at /usr/local/go/src/text/template/parse/lex.go:188
#11 text/template/parse.lexComment (l=0x0) at /usr/local/go/src/text/template/parse/lex.go:319
#12 0x0000000000000000 in ?? ()

Thread 3 (LWP 1904479 "godal.test"):
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:520
#1  0x000000000043b6f6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=2, ns=4670691) at /usr/local/go/src/runtime/os_linux.go:44
#2  0x00000000004155f9 in runtime.lock2 (l=0xfffffffffffffe00) at /usr/local/go/src/runtime/lock_futex.go:107
#3  0x000000000040f55a in runtime.lockWithRank (l=0x4744e3 <runtime.futex+35>, rank=<optimized out>) at /usr/local/go/src/runtime/lockrank_off.go:23
#4  runtime.lock (l=0x4744e3 <runtime.futex+35>) at /usr/local/go/src/runtime/lock_futex.go:48
#5  runtime.chansend (c=0xc0001cd340, ep=0xc000406e4f, block=true, callerpc=<optimized out>) at /usr/local/go/src/runtime/chan.go:200
#6  0x000000000040f47d in runtime.chansend1 (c=0xfffffffffffffe00, elem=0x2) at /usr/local/go/src/runtime/chan.go:143
#7  0x00000000005139e8 in testing.tRunner.func1.1 () at /usr/local/go/src/testing/testing.go:1189
#8  0x00000000005134e7 in testing.tRunner.func1 () at /usr/local/go/src/testing/testing.go:1250
#9  0x000000000051317b in testing.tRunner (t=0xc0003b41a0, fn={void (testing.T *)} 0xc000406fc8) at /usr/local/go/src/testing/testing.go:1265
#10 0x0000000000513e4a in testing.(*T).Run·dwrap·21 () at /usr/local/go/src/testing/testing.go:1306
#11 0x0000000000472701 in runtime.goexit () at /usr/local/go/src/runtime/asm_amd64.s:1581
#12 0x0000000000000000 in ?? ()

Thread 2 (LWP 1904478 "godal.test"):
#0  runtime.usleep () at /usr/local/go/src/runtime/sys_linux_amd64.s:146
#1  0x000000000044d2d1 in runtime.sysmon () at /usr/local/go/src/runtime/proc.go:5337
#2  0x0000000000444c93 in runtime.mstart1 () at /usr/local/go/src/runtime/proc.go:1407
#3  0x0000000000444bd9 in runtime.mstart0 () at /usr/local/go/src/runtime/proc.go:1365
#4  0x0000000000470445 in runtime.mstart () at /usr/local/go/src/runtime/asm_amd64.s:248
#5  0x0000000000474b65 in runtime.mstart () at <autogenerated>:1
#6  0x00000000009d00b2 in crosscall_amd64 () at gcc_amd64.S:40
#7  0x00007fb871623640 in ?? ()
#8  0x0000000000000000 in ?? ()

Thread 1 (LWP 1904477 "godal.test"):
#0  runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:520
#1  0x000000000043b6f6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=2, ns=4670691) at /usr/local/go/src/runtime/os_linux.go:44
#2  0x00000000004155f9 in runtime.lock2 (l=0xfffffffffffffe00) at /usr/local/go/src/runtime/lock_futex.go:107
#3  0x0000000000410318 in runtime.lockWithRank (l=0x4744e3 <runtime.futex+35>, rank=<optimized out>) at /usr/local/go/src/runtime/lockrank_off.go:23
#4  runtime.lock (l=0x4744e3 <runtime.futex+35>) at /usr/local/go/src/runtime/lock_futex.go:48
#5  runtime.chanrecv (c=0xc0001cd340, ep=0xc00025fb67, block=true) at /usr/local/go/src/runtime/chan.go:508
#6  0x0000000000410198 in runtime.chanrecv1 (c=0xfffffffffffffe00, elem=0x2) at /usr/local/go/src/runtime/chan.go:439
#7  0x0000000000513d95 in testing.(*T).Run (t=0xc0003b41a0, name=..., f={void (testing.T *)} 0xc00025fbd8) at /usr/local/go/src/testing/testing.go:1307
#8  0x00000000005159ae in testing.runTests.func1 (t=0xc000283040) at /usr/local/go/src/testing/testing.go:1598
#9  0x0000000000513142 in testing.tRunner (t=0xc000283040, fn={void (testing.T *)} 0xc00025fc68) at /usr/local/go/src/testing/testing.go:1259
#10 0x000000000051585f in testing.runTests (matchString={void (string, string, bool *, error *)} 0xc00025fd10, tests=[]testing.InternalTest = {...}, deadline=...) at /usr/local/go/src/testing/testing.go:1596
#11 0x00000000005145bd in testing.(*M).Run (m=0xc00011c800, code=0) at /usr/local/go/src/testing/testing.go:1504
#12 0x00000000009ca56b in main.main () at _testmain.go:177
(gdb) q
A debugging session is active.

	Inferior 1 [process 1904477] will be detached.

Quit anyway? (y or n) y
Detaching from program: /home/tbonfort/dev/godal/godal.test, process 1904477
[Inferior 1 (process 1904477) detached]
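
For readers unfamiliar with the frames above: Thread 1 (`testing.(*T).Run`, in `chanrecv`) and Thread 3 (`testing.tRunner.func1.1`, in `chansend`) are operating on the same channel (`c=0xc0001cd340`), and both are stuck in `runtime.lock` rather than parked on the channel itself. The sketch below is a simplified illustration of that parent/subtest handshake, not the actual `testing` source. The names `miniT`, `run` and `runner` are hypothetical, and the buffered channel of size 1 is an assumption about how the real package sets it up. In a healthy process this handshake completes immediately.

```go
package main

import "fmt"

// miniT is a hypothetical stand-in for testing.T, reduced to the one
// channel the parent and the subtest use to hand off completion.
type miniT struct {
	name   string
	signal chan bool
}

// run mirrors the shape of testing.(*T).Run: it starts the subtest in its
// own goroutine and then blocks receiving on the subtest's signal channel
// (the position of Thread 1 in the trace).
func (t *miniT) run(name string, f func(*miniT)) bool {
	sub := &miniT{name: name, signal: make(chan bool, 1)}
	go runner(sub, f)
	return <-sub.signal
}

// runner mirrors the shape of testing.tRunner: it runs the test body and
// reports completion from a deferred function by sending on the signal
// channel (the position of Thread 3 in the trace).
func runner(t *miniT, f func(*miniT)) {
	defer func() { t.signal <- true }()
	f(t)
}

func main() {
	root := &miniT{name: "root"}
	ok := root.run("TestOpenShared", func(t *miniT) {})
	fmt.Println(ok) // prints "true"
}
```

Because the send and the receive each need only a brief hold on the channel's internal lock, seeing both sides blocked while acquiring that lock suggests the problem lies below the `testing` package's own logic, which would be consistent with the cgo call affecting the outcome.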

I have bisected the issue down to 1c59066 (from @bcmills). Reverting this specific commit in master fixes the deadlock.

    Labels

    FrozenDueToAge
    NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
    WaitingForInfo: Issue is not actionable because of missing required information, which needs to be provided.
