net: failures with signal arrived during cgo execution #60132

Open
bcmills opened this issue May 11, 2023 · 15 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-FreeBSD
Milestone

Comments

@bcmills
Contributor

bcmills commented May 11, 2023

#!watchflakes
post <- goos == "freebsd" && pkg == "net" && `signal arrived during cgo execution`
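For readers unfamiliar with the rule syntax, the condition above can be read as a three-part filter over a failing test result. The following is only a rough sketch of that reading, not the watchflakes implementation: the `failure` struct and `matches` function are hypothetical names, and treating the backquoted phrase as a plain substring match against the log is a simplifying assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// failure holds the fields the rule inspects. The struct and its field
// names are hypothetical; they only mirror the identifiers used in the rule.
type failure struct {
	goos string // target OS of the builder, e.g. "freebsd"
	pkg  string // package under test, e.g. "net"
	log  string // raw text of the failing test log
}

// matches reports whether f satisfies all three parts of the rule:
// the OS is FreeBSD, the package is net, and the log contains the phrase.
// A substring check stands in for the rule's backquoted pattern (assumption).
func matches(f failure) bool {
	return f.goos == "freebsd" &&
		f.pkg == "net" &&
		strings.Contains(f.log, "signal arrived during cgo execution")
}

func main() {
	f := failure{
		goos: "freebsd",
		pkg:  "net",
		log:  "SIGSEGV: segmentation violation\nsignal arrived during cgo execution\n",
	}
	fmt.Println(matches(f))
}
```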
@bcmills bcmills added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 11, 2023
@bcmills bcmills changed the title net: net: failures with signal arrived during cgo execution May 11, 2023
@bcmills
Contributor Author

bcmills commented May 11, 2023

One of these was reported in #27992 (comment), but doesn't match the prior failure mode for which that issue was created.

I saw another in a TryBot in https://storage.googleapis.com/go-build-log/55480854/freebsd-amd64-12_3_a94bfe2e.log.

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 11, 2023
@mknyszek
Contributor

Yeah, that definitely looks different from #27992, but I'm not sure this is necessarily going to be a C&RT issue? It seems like there's a segfault in C code during a net package test. signal arrived during cgo execution is just indicating that the crash happened in cgo, so the stack is going to be truncated. Tentatively removing the compiler/runtime label, but feel free to add it back if you disagree.
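As a triage aid, the first few lines of such a crash log already say which signal fired, at what PC, and whether the runtime was inside a cgo call. Here is a minimal sketch of pulling those three facts out of a log, assuming the header layout seen in the logs in this issue. `crashHeader` and `parseCrashHeader` are hypothetical helpers, not part of the Go runtime or any Go tool.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// crashHeader summarizes the first lines of a Go runtime crash log.
// The type is hypothetical and exists only for this illustration.
type crashHeader struct {
	Signal string // signal name as printed, e.g. "SIGSEGV"
	PC     string // faulting program counter as printed, e.g. "0x8009f9f1c"
	InCgo  bool   // true if the log says the signal arrived during cgo execution
}

// parseCrashHeader scans log line by line and records the signal name,
// the PC value, and whether the cgo marker line is present.
func parseCrashHeader(log string) crashHeader {
	var h crashHeader
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case h.Signal == "" && strings.HasPrefix(line, "SIG") && strings.Contains(line, ": "):
			// "SIGSEGV: segmentation violation" -> "SIGSEGV"
			h.Signal, _, _ = strings.Cut(line, ": ")
		case strings.HasPrefix(line, "PC="):
			// "PC=0x8009f9f1c m=0 sigcode=1" -> "0x8009f9f1c"
			h.PC = strings.TrimPrefix(strings.Fields(line)[0], "PC=")
		case line == "signal arrived during cgo execution":
			h.InCgo = true
		}
	}
	return h
}

func main() {
	log := "SIGSEGV: segmentation violation\n" +
		"PC=0x8009f9f1c m=0 sigcode=1\n" +
		"signal arrived during cgo execution\n"
	fmt.Printf("%+v\n", parseCrashHeader(log))
}
```

When `InCgo` is true, the Go frames that follow only show the path down to the cgo call; the faulting C frames are not included.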

@mknyszek mknyszek removed the compiler/runtime Issues related to the Go compiler and/or runtime. label May 17, 2023
@bcmills
Contributor Author

bcmills commented May 17, 2023

signal arrived during cgo execution is just indicating that the crash happened in cgo, so the stack is going to be truncated.

Ah, I see. That makes it tricky to track down the actual failure, though — is there some way we can provide the equivalent of a default runtime.SetCgoTraceback for the cgo dependencies in the standard library? (Can we make assumptions about system libraries that allow for something simpler than the full generality of SetCgoTraceback that we need for user C code?)

(CC @ianlancetaylor)

@ianlancetaylor
Contributor

If anything, system libraries are harder to get a traceback from, because they are always heavily optimized and because often the debug info is stored somewhere else. (Without the debug info the traceback is close to useless, it's just a list of PC values.)

For debugging purposes a blank import of github.com/ianlancetaylor/cgosymbolizer will often get a C backtrace, but I really can't recommend making that part of the Go standard library. It's 17,000 lines of C code.

@gopherbot
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "net" && `signal arrived during cgo execution`
2023-05-10 22:43 freebsd-amd64-12_3 go@639957eb net.TestLookupDotsWithRemoteSource (log)
SIGSEGV: segmentation violation
PC=0x8009f9f1c m=0 sigcode=1
signal arrived during cgo execution

rax    0xc2c3
rbx    0x437e
rcx    0xc2c3
rdx    0xffffffffffffffff
rdi    0x0
rsi    0x0
...
	/tmp/workdir/go/src/net/lookup_unix.go:96 +0xa5 fp=0xc00032bd98 sp=0xc00032bd38 pc=0x561325
net.(*Resolver).LookupCNAME(0xc000392310?, {0x6dcc10?, 0x879980?}, {0x685212, 0xb})
	/tmp/workdir/go/src/net/lookup.go:472 +0x2b fp=0xc00032bde0 sp=0xc00032bd98 pc=0x55deeb
net.LookupCNAME(...)
	/tmp/workdir/go/src/net/lookup.go:455
net.testDots(0xc0001df380, {0x682ee0, 0x3})
	/tmp/workdir/go/src/net/lookup_test.go:676 +0x12e fp=0xc00032bef0 sp=0xc00032bde0 pc=0x5b918e
net.TestLookupDotsWithRemoteSource(0xc0001df380)
	/tmp/workdir/go/src/net/lookup_test.go:658 +0x157 fp=0xc00032bf70 sp=0xc00032bef0 pc=0x5b8fb7
testing.tRunner(0xc0001df380, 0x699c48)
2023-05-22 16:48 freebsd-amd64-13_0 go@10fbd925 net.TestLookupDotsWithRemoteSource (log)
SIGSEGV: segmentation violation
PC=0x800a115e1 m=0 sigcode=1
signal arrived during cgo execution

rax    0x1ffff8
rbx    0x8008621b0
rcx    0xffffffffffffffff
rdx    0x8008620c0
rdi    0xffffffffffffffff
rsi    0x800862280
...
	/tmp/workdir/go/src/net/lookup_unix.go:96 +0xa5 fp=0xc00023bd98 sp=0xc00023bd38 pc=0x563c85
net.(*Resolver).LookupCNAME(0xc000311cb0?, {0x6e3a90?, 0x883bc0?}, {0x68a838, 0xb})
	/tmp/workdir/go/src/net/lookup.go:472 +0x2b fp=0xc00023bde0 sp=0xc00023bd98 pc=0x56082b
net.LookupCNAME(...)
	/tmp/workdir/go/src/net/lookup.go:455
net.testDots(0xc00020cea0, {0x6884f3, 0x3})
	/tmp/workdir/go/src/net/lookup_test.go:676 +0x12e fp=0xc00023bef0 sp=0xc00023bde0 pc=0x5bd08e
net.TestLookupDotsWithRemoteSource(0xc00020cea0)
	/tmp/workdir/go/src/net/lookup_test.go:658 +0x157 fp=0xc00023bf70 sp=0xc00023bef0 pc=0x5bceb7
testing.tRunner(0xc00020cea0, 0x69f4f0)
2023-05-22 19:05 freebsd-amd64-13_0 go@6761bff4 net.TestLookupDotsWithRemoteSource (log)
SIGSEGV: segmentation violation
PC=0x800a115e1 m=0 sigcode=1
signal arrived during cgo execution

rax    0x1ffff8
rbx    0x8008621b0
rcx    0xffffffffffffffff
rdx    0x8008620c0
rdi    0xffffffffffffffff
rsi    0x800862280
...
	/tmp/workdir/go/src/net/lookup_unix.go:96 +0xa5 fp=0xc0003abd98 sp=0xc0003abd38 pc=0x563c85
net.(*Resolver).LookupCNAME(0xc000358b10?, {0x6e3a90?, 0x883bc0?}, {0x68a838, 0xb})
	/tmp/workdir/go/src/net/lookup.go:472 +0x2b fp=0xc0003abde0 sp=0xc0003abd98 pc=0x56082b
net.LookupCNAME(...)
	/tmp/workdir/go/src/net/lookup.go:455
net.testDots(0xc0003689c0, {0x6884f3, 0x3})
	/tmp/workdir/go/src/net/lookup_test.go:676 +0x12e fp=0xc0003abef0 sp=0xc0003abde0 pc=0x5bd08e
net.TestLookupDotsWithRemoteSource(0xc0003689c0)
	/tmp/workdir/go/src/net/lookup_test.go:658 +0x157 fp=0xc0003abf70 sp=0xc0003abef0 pc=0x5bceb7
testing.tRunner(0xc0003689c0, 0x69f4f0)
2023-05-22 19:37 freebsd-amd64-13_0 go@8c445b7c net.TestLookupDotsWithRemoteSource (log)
SIGSEGV: segmentation violation
PC=0x800a115e1 m=0 sigcode=1
signal arrived during cgo execution

rax    0x1ffff8
rbx    0x8008621b0
rcx    0xffffffffffffffff
rdx    0x8008620c0
rdi    0xffffffffffffffff
rsi    0x800862280
...
	/tmp/workdir/go/src/net/lookup_unix.go:96 +0xa5 fp=0xc00023dd98 sp=0xc00023dd38 pc=0x563c85
net.(*Resolver).LookupCNAME(0xc000580dd0?, {0x6e3a90?, 0x883bc0?}, {0x68a838, 0xb})
	/tmp/workdir/go/src/net/lookup.go:472 +0x2b fp=0xc00023dde0 sp=0xc00023dd98 pc=0x56082b
net.LookupCNAME(...)
	/tmp/workdir/go/src/net/lookup.go:455
net.testDots(0xc0002b4ea0, {0x6884f3, 0x3})
	/tmp/workdir/go/src/net/lookup_test.go:676 +0x12e fp=0xc00023def0 sp=0xc00023dde0 pc=0x5bd08e
net.TestLookupDotsWithRemoteSource(0xc0002b4ea0)
	/tmp/workdir/go/src/net/lookup_test.go:658 +0x157 fp=0xc00023df70 sp=0xc00023def0 pc=0x5bceb7
testing.tRunner(0xc0002b4ea0, 0x69f4f0)
2023-05-23 11:36 freebsd-amd64-13_0 go@380529d5 net.TestLookupDotsWithRemoteSource (log)
SIGSEGV: segmentation violation
PC=0x800a115e1 m=0 sigcode=1
signal arrived during cgo execution

rax    0x1ffff8
rbx    0x8008621b0
rcx    0xffffffffffffffff
rdx    0x8008620c0
rdi    0xffffffffffffffff
rsi    0x800862280
...
	/tmp/workdir/go/src/net/lookup_unix.go:96 +0xa5 fp=0xc0003d5d98 sp=0xc0003d5d38 pc=0x563c85
net.(*Resolver).LookupCNAME(0xc0000c87f0?, {0x6e3a90?, 0x883bc0?}, {0x68a838, 0xb})
	/tmp/workdir/go/src/net/lookup.go:472 +0x2b fp=0xc0003d5de0 sp=0xc0003d5d98 pc=0x56082b
net.LookupCNAME(...)
	/tmp/workdir/go/src/net/lookup.go:455
net.testDots(0xc000133a00, {0x6884f3, 0x3})
	/tmp/workdir/go/src/net/lookup_test.go:676 +0x12e fp=0xc0003d5ef0 sp=0xc0003d5de0 pc=0x5bd08e
net.TestLookupDotsWithRemoteSource(0xc000133a00)
	/tmp/workdir/go/src/net/lookup_test.go:658 +0x157 fp=0xc0003d5f70 sp=0xc0003d5ef0 pc=0x5bceb7
testing.tRunner(0xc000133a00, 0x69f4f0)
2023-05-23 16:36 freebsd-amd64-13_0 go@d9f7efed net.TestLookupDotsWithRemoteSource (log)
SIGSEGV: segmentation violation
PC=0x800a115e1 m=0 sigcode=1
signal arrived during cgo execution

rax    0x1ffff8
rbx    0x8008621b0
rcx    0xffffffffffffffff
rdx    0x8008620c0
rdi    0xffffffffffffffff
rsi    0x800862280
...
	/tmp/workdir/go/src/net/lookup_unix.go:96 +0xa5 fp=0xc0000cfd98 sp=0xc0000cfd38 pc=0x563c85
net.(*Resolver).LookupCNAME(0xc00009c2b0?, {0x6e3a90?, 0x883bc0?}, {0x68a838, 0xb})
	/tmp/workdir/go/src/net/lookup.go:472 +0x2b fp=0xc0000cfde0 sp=0xc0000cfd98 pc=0x56082b
net.LookupCNAME(...)
	/tmp/workdir/go/src/net/lookup.go:455
net.testDots(0xc000596820, {0x6884f3, 0x3})
	/tmp/workdir/go/src/net/lookup_test.go:676 +0x12e fp=0xc0000cfef0 sp=0xc0000cfde0 pc=0x5bd08e
net.TestLookupDotsWithRemoteSource(0xc000596820)
	/tmp/workdir/go/src/net/lookup_test.go:658 +0x157 fp=0xc0000cff70 sp=0xc0000cfef0 pc=0x5bceb7
testing.tRunner(0xc000596820, 0x69f4f0)
2023-05-23 19:06 freebsd-amd64-13_0 go@ef2bb813 net.TestLookupDotsWithRemoteSource (log)
SIGSEGV: segmentation violation
PC=0x800a115e1 m=0 sigcode=1
signal arrived during cgo execution

rax    0x1ffff8
rbx    0x8008621b0
rcx    0xffffffffffffffff
rdx    0x8008620c0
rdi    0xffffffffffffffff
rsi    0x800862280
...
	/tmp/workdir/go/src/net/lookup_unix.go:96 +0xa5 fp=0xc00022bd98 sp=0xc00022bd38 pc=0x5640c5
net.(*Resolver).LookupCNAME(0xc000388f20?, {0x6e3b70?, 0x883bc0?}, {0x68a838, 0xb})
	/tmp/workdir/go/src/net/lookup.go:472 +0x2b fp=0xc00022bde0 sp=0xc00022bd98 pc=0x560c6b
net.LookupCNAME(...)
	/tmp/workdir/go/src/net/lookup.go:455
net.testDots(0xc00039b860, {0x6884f3, 0x3})
	/tmp/workdir/go/src/net/lookup_test.go:676 +0x12e fp=0xc00022bef0 sp=0xc00022bde0 pc=0x5bd4ce
net.TestLookupDotsWithRemoteSource(0xc00039b860)
	/tmp/workdir/go/src/net/lookup_test.go:658 +0x157 fp=0xc00022bf70 sp=0xc00022bef0 pc=0x5bd2f7
testing.tRunner(0xc00039b860, 0x69f528)

watchflakes

@bcmills
Contributor Author

bcmills commented Jun 30, 2023

Iiiiinteresting, all freebsd-amd64-13_0.

attn @golang/freebsd !

@ayang64
Member

ayang64 commented Jul 14, 2023

Okay -- so I'll see if I can reproduce by running the tests with github.com/ianlancetaylor/cgosymbolizer, and if I bump into it, I'll post the trace.

I'm curious: are we running FreeBSD 14 trybots? It might be interesting to know if this was fixed in later releases -- might give me a place to start bisecting.

@bcmills
Contributor Author

bcmills commented Jul 14, 2023

are we running FreeBSD 14 trybots?

Appears not. https://cs.opensource.google/go/x/build/+/master:env/freebsd-amd64/make.bash only shows versions up to 13.0-SNAPSHOT. (You're welcome to update the scripts, though — someone on release interrupts should be able to help you deploy the image.)

@evanj
Contributor

evanj commented Oct 10, 2023

See issue #55197 which appears to have the same flakes. Issue #27992 has flakes for this same test which are different (e.g. no such host, server misbehaving).

@gopherbot
Contributor

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "freebsd" && pkg == "net" && `signal arrived during cgo execution`
2024-06-26 22:21 go1.21-freebsd-riscv64 release-branch.go1.21@c9be6ae7 net.TestLookupDotsWithRemoteSource [ABORT] (log)
=== RUN   TestLookupDotsWithRemoteSource
SIGSEGV: segmentation violation
PC=0x405cacc8 m=4 sigcode=2
signal arrived during cgo execution

goroutine 778 [syscall]:
runtime.cgocall(0x426df0, 0x8817b790)
	/usr/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/cgocall.go:157 +0x48 fp=0x8817b768 sp=0x8817b738 pc=0x22b260
net._C2func_res_ninit(0x8ca3d280)
	_cgo_gotypes.go:222 +0x44 fp=0x8817b788 sp=0x8817b768 pc=0x406e54
...
a3  0xffff	a4  0x0
a5  0x409f1180	a6  0x409f1170
a7  0x1	s2  0x8ca3d280
s3  0x409f1160	s4  0xffffffffffffffff
s5  0x3ffff	s6  0xffffffffc0000000
s7  0x409f1070	s8  0x8817bd08
s9  0x8817bbc0	s10 0x8822fb90
s11 0x409f1040	t3  0x4060efdc
t4  0xff00	t5  0xfefefefefefefeff
t6  0x409f1020	pc  0x405cacc8
2024-07-02 18:51 go1.21-freebsd-riscv64 release-branch.go1.21@12e9b968 net.TestLookupDotsWithRemoteSource [ABORT] (log)
=== RUN   TestLookupDotsWithRemoteSource
SIGSEGV: segmentation violation
PC=0x405cacc8 m=3 sigcode=2
signal arrived during cgo execution

goroutine 763 [syscall]:
runtime.cgocall(0x426df0, 0x882a1790)
	/usr/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/cgocall.go:157 +0x48 fp=0x882a1768 sp=0x882a1738 pc=0x22b260
net._C2func_res_ninit(0x8c016280)
	_cgo_gotypes.go:222 +0x44 fp=0x882a1788 sp=0x882a1768 pc=0x406e54
...
a3  0xffff	a4  0x0
a5  0x409d9180	a6  0x409d9170
a7  0x1	s2  0x8c016280
s3  0x409d9160	s4  0xffffffffffffffff
s5  0x3ffff	s6  0xffffffffc0000000
s7  0x409d9070	s8  0x882a1d08
s9  0x882a1b00	s10 0x880de030
s11 0x409d9040	t3  0x4060efdc
t4  0xff00	t5  0xfefefefefefefeff
t6  0x409d9020	pc  0x405cacc8

watchflakes

@enihcam

enihcam commented Sep 2, 2024

Any progress on this? CentOS is having the same issue.

image

@ianlancetaylor
Contributor

@enihcam Please post plain text as plain text, not as an image. Images are much harder to read. Also, please include all the text; your image seems to be missing the first line or two. Thanks.

That said, the issue you are encountering does not seem to be the one that this bug report is about. This issue is about a failure on FreeBSD, and you are using CentOS. The logs in this issue are all about crashes in res_ninit. Yours seems to be a crash in getaddrinfo. So I suggest that you open a new issue.
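One quick way to tell such reports apart is to look at which C function the Go side was calling when the signal arrived. cgo names its generated Go wrappers `_Cfunc_<name>` or `_C2func_<name>`, so the function can be read off the goroutine trace. A minimal sketch, where `cgoCalls` is a hypothetical helper and the sample frame is copied from the riscv64 log earlier in this issue:

```go
package main

import (
	"fmt"
	"regexp"
)

// cgoCallRE matches the Go-side wrapper names that cgo generates for C
// calls, such as net._C2func_res_ninit, and captures the C function name.
var cgoCallRE = regexp.MustCompile(`\b_C2?func_(\w+)\(`)

// cgoCalls returns the C function names found in trace, in order of
// appearance. It is a hypothetical helper for this illustration.
func cgoCalls(trace string) []string {
	var names []string
	for _, m := range cgoCallRE.FindAllStringSubmatch(trace, -1) {
		names = append(names, m[1])
	}
	return names
}

func main() {
	trace := "runtime.cgocall(0x426df0, 0x8817b790)\n" +
		"net._C2func_res_ninit(0x8ca3d280)\n"
	fmt.Println(cgoCalls(trace))
}
```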

When you open a new issue: does your problem repeat consistently? Do you have a test case you could share? Thanks.

@enihcam

enihcam commented Sep 3, 2024

Issue resolved. It was due to a glibc incompatibility; I replaced glibc with musl libc.

@evanj
Contributor

evanj commented Sep 6, 2024

@enihcam based on your description and the traceback, I suspect you may be running into #63567. Any chance your program is calling os.Setenv()? That specific crash won't happen with musl since its DNS resolver does not use environment variables.

@enihcam

enihcam commented Sep 6, 2024

Yes, you are right. The program was compiled against an old version of glibc (with the corresponding old libnss) and crashes when run on an OS with a newer glibc and libnss, because glibc loads libnss dynamically. musl has no such issue because it uses its own built-in resolver for domain names, just like netdns=go.
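For anyone who wants the netdns=go behaviour without switching libc, the net package offers two routes: set GODEBUG=netdns=go in the environment of the process, or use a net.Resolver with PreferGo set. A minimal sketch of the second route follows; `pureGoResolver` is a hypothetical helper name, and the actual lookup is left out so the example does not depend on network access.

```go
package main

import (
	"fmt"
	"net"
)

// pureGoResolver returns a resolver that asks the net package to use its
// built-in Go DNS client instead of the C library resolver.
func pureGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	r := pureGoResolver()
	fmt.Println(r.PreferGo)
	// A real program would now call, for example,
	// r.LookupHost(ctx, "example.com") with a context of its choosing.
}
```

Note that this only affects lookups made through that resolver value; the GODEBUG setting applies to the whole process.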
