x/build: AIX builders fail with "unknown directive "Disconnected" " #68481
Comments
cc @golang/aix
These tests also show flaky but frequent failures on the netbsd-arm-bsiegert builder, e.g. https://build.golang.org/log/7d88bbec36f25867219a11c16791e0c24ea622e7 . I am not sure whether these are separate builder issues or related. cc @bsiegert
I was pinged by the AIX maintainers to take a look. Something on the VM is misbehaving. @ayappanec, I suspect this VM might be due for updates. Would you be able to investigate or verify that it is up to date?
This may actually be caused by the Go test in some way. The other day, I noticed that /dev/null on the netbsd-arm builder had been overwritten with a file containing one line of text. Could it be that something specifies /dev/null as an output file and the program is misbehaving somehow?
Yes, the machine is due for updates. I will check on this.
It wouldn't be surprising if some tests write to /dev/null, but usually that should be fine just like it is common to do things like |
It appeared something was recreating I modified sshd_config and syslog to place sshd's logs into I think everything is running as root, so there are few guardrails. I don't have enough background with AIX to say what is clobbering the file, but sshd seems suspect. |
I assume this happens if the software tries to be extra clever by creating the file under a different name and using rename(2) to put it into place.
This is still happening. I wonder if a test is rewriting the file. I couldn't reproduce it when running the go dist tests on the latest 1.22 or 1.23rc releases.
@bsiegert this sounds plausible. The go command does move the output file from the temp WORK directory to the output. But it special cases |
I think I found the culprit after cycling through the x repos. Something in x/oscar (which coincidentally is failing to build on CI now) seems to be causing /dev/null to be deleted. I'll look more into this later today.
Thanks for finding that. Note that in the LUCI build configuration (see here), x/oscar is defined as PT.TOOL, a repository intended to be tested only on a few first-class platforms like Linux/Windows/macOS. Hopefully this builder can be migrated to LUCI soon (issue #67299), since we're migrating away from the coordinator and it won't be maintained indefinitely. In the short term, it would be fine to reconfigure the coordinator not to test x/oscar on the GOOS=aix builder.
And the culprit is found. I've opened #68558.
Change https://go.dev/cl/600515 mentions this issue: |
Rework TestErrors to use tmp files contained within its test specific tmpdir. This fixes the accidental deletion of /dev/null on the AIX builder when run as a privileged user.

Fixes golang/go#68558
Fixes golang/go#68481

Change-Id: I31c4ca3ea7963b013516ce6d85bbc91c7483981e
Reviewed-on: https://go-review.googlesource.com/c/oscar/+/600515
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
AIX golang CI recently started failing with the below error.
https://build.golang.org/log/28dadf4b964f28d9137fb36e3f3181f03394faa7
It looks like there is a problem with the CI machine, because our internal CI is working fine.
Any idea what the problem could be? Any hints would be useful for fixing that CI machine.