runtime: continuing TestSegv/SegvInCgo failures with "unknown pc" #50979
This failure mode has now also occurred on Given how rare the failure seems to be and how little time is left in the Go 1.18 cycle, marking as release-blocker for Go 1.19 (instead of 1.18).
2022-02-04T22:34:05-f9763a6/linux-amd64-clang |
Four flakes in a month during the testing-lull that is the code freeze makes me think this test is too noisy to leave enabled in the If we aren't going to be able to fix it ahead of the 1.18 release — especially given that one of the failures was observed on a first-class port — I think it at least needs a skip. |
Change https://go.dev/cl/385154 mentions this issue: |
This test has failed on four different builders in the past month. Moreover, because every Go program depends on "runtime", it is likely to be run any time a user runs 'go test all' in their own program. Since the test is known to be flaky, let's skip it to avoid introducing testing noise until someone has time to investigate. It seems like we have enough samples in the builder logs to at least start with. For #50979 Change-Id: I9748a82fbb97d4ed95d6f474427e5aa6ecdb023d Reviewed-on: https://go-review.googlesource.com/c/go/+/385154 Trust: Bryan Mills <bcmills@google.com> Run-TryBot: Bryan Mills <bcmills@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
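As a rough illustration of the skip described in the commit message above, the check can be sketched as a substring test on the crashed program's output. The helper name `isKnownFlake` and the sample output are hypothetical, and treating "unknown pc" as the exact text the test matches on is an assumption taken from the issue title, not from the CL itself.

```go
package main

import (
	"fmt"
	"strings"
)

// knownFlakeSignature is taken from the issue title. Using it as the
// exact match string is an assumption of this sketch.
const knownFlakeSignature = "unknown pc"

// isKnownFlake reports whether the output of the crashed test program
// contains the signature of the known flaky failure mode.
func isKnownFlake(output string) bool {
	return strings.Contains(output, knownFlakeSignature)
}

func main() {
	// Hypothetical output, made up for illustration only.
	out := "SIGSEGV: segmentation violation\nruntime: unknown pc 0x1234\n"
	if isKnownFlake(out) {
		fmt.Println("skip: known flake, see #50979")
	} else {
		fmt.Println("fail: unexpected output")
	}
}
```

In a real test the first branch would call something like `t.Skipf` rather than print, so that the flake is recorded as a skip instead of a failure.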
This failure mode is now skipped, so this is no longer a 1.18 release-blocker. (I'll leave it up to @cherrymui and @prattmic to decide whether to prioritize a fix or move it to the Backlog.) |
@prattmic, it looks like the change in the
2022-03-08T21:16:53-c3c7477/linux-mips64le-mengzhuo |
Apologies, I thought I checked for these references, but didn't do a good job. |
Change https://go.dev/cl/391139 mentions this issue: |
CL 390034 changed this throw message to add the goid, breaking the match. For #50979. Change-Id: I52d97695484938701e5b7c269e2caf0c87d44d7a Reviewed-on: https://go-review.googlesource.com/c/go/+/391139 Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com>
ping -- what's the status of this issue? |
The failures here should be skipped, but we'd still like to investigate. Not a release blocker, though. |
Rolling forward to 1.20. Please comment if you disagree. Thanks. |
Change https://go.dev/cl/430375 mentions this issue: |
@golang/runtime, this failure mode should either be fixed or skipped, and there is an open CL (https://go.dev/cl/430375) that does the latter. Marking as release-blocker pending a decision to either investigate and fix or merge the skip. Please don't leave flaky tests running if they aren't actively being worked on. |
Hmm.... Actually, looking at that CL it is for |
Looking at recent failures, I think this continues to not be a release blocker, but it has flaked a bunch of times on some ports: 2022-11-21T17:11:59-d685946/openbsd-amd64-68 |
For the openbsd failures
@golang/openbsd do you know if the OpenBSD kernel may report a user-sent SIGSEGV as kernel-sent (i.e. the signal code being not SI_USER)? Thanks. |
I think this is the same problem as #52963 and @pmur created a CL to skip this failure https://go.dev/cl/430375. There is a Go signal handler that tries to do a Go backtrace but gets sent to a thread running C code which won't work. |
If the Go signal handler is trying to obtain a backtrace, does the test binary need to call (It looks like the fact that the thread is running C code is an intentional part of the test.) Or is the problem that the thread is running C code without any Go frames on the stack? In that case, would it help to thread-lock the |
This is not easily reproducible. I think @pmur was able to make it fail and found that it was on a stack that was running only C code. But the Go stack tracer only works on Go stack frames, at least on PPC64: in Go, slot 0 of the frame holds the LR value, but in C it holds the caller's stack pointer, which is why you get the unknown PC error when running C code. See @cherrymui's suggestion in the CL for fixing the test. |
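The slot 0 mismatch described above can be illustrated with a toy model. This is not the runtime's unwinder: the function table, the addresses, and the helper names `lookupPC` and `describeSlot0` are all hypothetical. The sketch only shows that reading a stack pointer as if it were a return PC produces an address that belongs to no known function.

```go
package main

import "fmt"

// funcRange is a hypothetical stand-in for a table of known Go functions.
type funcRange struct {
	name       string
	start, end uint64
}

// goFuncs holds made-up address ranges for illustration.
var goFuncs = []funcRange{
	{"main.main", 0x1000, 0x1100},
	{"main.work", 0x1100, 0x1200},
}

// lookupPC returns the function containing pc, if any.
func lookupPC(pc uint64) (string, bool) {
	for _, f := range goFuncs {
		if pc >= f.start && pc < f.end {
			return f.name, true
		}
	}
	return "", false
}

// describeSlot0 interprets slot 0 of a frame the way a Go-only tracer
// would: as a saved return address that must fall inside a Go function.
func describeSlot0(slot0 uint64) string {
	if name, ok := lookupPC(slot0); ok {
		return "return pc in " + name
	}
	return fmt.Sprintf("unknown pc %#x", slot0)
}

func main() {
	goFrameSlot0 := uint64(0x1040)      // Go frame: slot 0 holds a saved LR
	cFrameSlot0 := uint64(0xc000012340) // C frame: slot 0 holds the caller's SP
	fmt.Println(describeSlot0(goFrameSlot0))
	fmt.Println(describeSlot0(cFrameSlot0))
}
```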
greplogs --dashboard -md -l -e '\Anetbsd-.*(?:\n.*)*FAIL: TestSegv/SegvInCgo .*(?:\n .*)*unknown pc' --since=2022-01-07
2022-02-01T16:10:04-93fe469/netbsd-arm-bsiegert
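To make the query above easier to read, here is a minimal sketch that applies the same pattern to log text with Go's `regexp` package. The helper name `matchesDashboardQuery` and the sample log are hypothetical, and whether greplogs evaluates the pattern in exactly this way is an assumption. The pattern requires the log to start with a `netbsd-` builder name, to contain the `FAIL: TestSegv/SegvInCgo` line somewhere later, and to contain "unknown pc" within the indented lines that follow it.

```go
package main

import (
	"fmt"
	"regexp"
)

// queryRE is the pattern from the greplogs command, copied as written.
var queryRE = regexp.MustCompile(
	`\Anetbsd-.*(?:\n.*)*FAIL: TestSegv/SegvInCgo .*(?:\n .*)*unknown pc`)

// matchesDashboardQuery reports whether a builder log matches the query.
func matchesDashboardQuery(log string) bool {
	return queryRE.MatchString(log)
}

// sampleLog is a made-up log for illustration, not real builder output.
const sampleLog = "netbsd-arm-bsiegert at 93fe469\n" +
	"ok  \tos\t1.234s\n" +
	"--- FAIL: TestSegv/SegvInCgo (0.05s)\n" +
	"    crash_cgo_test.go: output follows\n" +
	"    runtime: unknown pc 0x12345\n"

func main() {
	fmt.Println(matchesDashboardQuery(sampleLog))
}
```

Because `.` does not match a newline by default in Go regular expressions, the `(?:\n.*)*` groups are what let the pattern span multiple lines, and the `\A` anchor restricts matches to logs whose first line names a NetBSD builder.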
It is not obvious to me whether this has the same underlying cause as #50605.
(See previously #49182; CC @prattmic @cherrymui.)