cmd/go,os: build cache checksum errors in x/tools/cmd/callgraph.TestCallgraph on windows/arm64 #50706
The The error appears to indicate file corruption in the Given the failure mode, I think the bug is more likely in We are also running a relatively old Windows 10 build (#48946, CC @golang/release, @zx2c4), so I can't rule out a bug in the underlying platform either.
This is a release-blocker via #11811, but given that this is not a first-class port and appears to be a platform-specific bug affecting only one test, I plan to add a test skip for this specific builder in If we also observe this failure mode on the new
Change https://golang.org/cl/379734 mentions this issue:
The 'bad checksum' means we read a file that was named for a sha256 hash and the content did not match that sha256.
The fact that this is only windows/arm64 and that we've seen absolutely no mentions of it on other systems or in other bug reports makes me feel okay with this not being a release-blocker. If there really is corruption, the content-addressed and checksum-checked nature of the cache means that the system is either failstop or works correctly. So far we are getting no reports of failstop other than this one.
I agree, but the failure rate for
t.Skips are always OK in my book.
We don't know whether this failure is due to a Go bug or a platform bug, so we'll skip it on the one builder to reduce noise, but not the GOOS/GOARCH as a whole. If we do not observe failures on other windows/arm64 builders, we can perhaps chalk it up to a platform bug. If we do observe failures on other builders, then we'll have more data to investigate with.

For golang/go#50706

Change-Id: I52511dd4a5cff80953823d9cf901975ff4657457
Reviewed-on: https://go-review.googlesource.com/c/tools/+/379734
Trust: Bryan Mills <bcmills@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Trust: Daniel Martí <mvdan@mvdan.cc>
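A skip scoped to one builder, rather than to the whole GOOS/GOARCH, might look roughly like the sketch below. It assumes the builder's name is exposed through the `GO_BUILDER_NAME` environment variable; `flakyBuilders` and `skipReason` are hypothetical helpers and are not taken from the CL itself.

```go
package main

import (
	"fmt"
	"os"
)

// flakyBuilders maps a builder name to the issue that explains the skip.
// The map and its single entry are illustrative assumptions.
var flakyBuilders = map[string]string{
	"windows-arm64-10": "golang/go#50706",
}

// skipReason returns a non-empty message if the test should be skipped on
// the named builder, and "" otherwise (including when builder is empty,
// as it would be on a developer's machine).
func skipReason(builder string) string {
	issue, ok := flakyBuilders[builder]
	if !ok {
		return ""
	}
	return fmt.Sprintf("skipping on %s due to suspected file corruption; see %s", builder, issue)
}

func main() {
	// Inside a test this would read:
	//
	//	if r := skipReason(os.Getenv("GO_BUILDER_NAME")); r != "" {
	//		t.Skip(r)
	//	}
	fmt.Printf("%q\n", skipReason(os.Getenv("GO_BUILDER_NAME")))
}
```

Keying the skip on the builder name keeps the test running on any other windows/arm64 machine, which is what would produce the extra data points the commit message asks for.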
Change https://golang.org/cl/381314 mentions this issue:
I uploaded https://go-review.googlesource.com/c/go/+/381314 just to have around if we need to patch it in to investigate this further. No intent to submit it.
Now observed on
A couple more. Whatever the cause, this is not fixed in Windows 11.
2022-04-01T20:25:27-153e30b-32ff9b5/windows-arm64-11
Change https://go.dev/cl/397996 mentions this issue:
This test produces apparent file corruption on all of the windows/arm64 builders. I suspect that this is a low-level bug (in either the platform itself or the Go standard library on windows/arm64).

Since windows/arm64 is not yet a first-class port, this test can be skipped for now. However, if windows/arm64 becomes a first-class port the underlying file-corruption bug should be investigated and fixed.

Updates golang/go#50706.

Change-Id: I0bc80cefee50895d40acc658286eb7ef8790493a
Reviewed-on: https://go-review.googlesource.com/c/tools/+/397996
Reviewed-by: Russ Cox <rsc@golang.org>
Trust: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@qmuntal, this is one of the issues I think should block promoting
@bcmills what is the approximate occurrence rate of this issue? I've been running Will keep it running a couple more days, but I'm leaning towards a HW capacity issue related to #51019.
@qmuntal, before we started skipping It does seem plausible that this could be a defect (or a bad interaction with a platform bug) somewhere in the virtualization stack used to host the builder.
I haven't been able to reproduce this failure locally yet. Let's unskip
greplogs --dashboard -md -l -e 'reading srcfiles list: cache entry not found: bad checksum' --since=2021-01-01
2022-01-19T20:29:36-7c251d6-9de1ac6/windows-arm64-10
2021-11-02T15:54:27-058ed05-c3cb1ec/windows-arm64-10
2021-11-01T13:50:47-513e3fb-4a84298/windows-arm64-10
2021-10-14T17:38:39-e69ba9d-011fd00/windows-arm64-10
2021-09-14T02:53:17-384e5da-ee91bb8/windows-arm64-10