x/build: add LUCI netbsd-arm builder #63698

Open
bsiegert opened this issue Oct 23, 2023 · 16 comments
Assignees
Labels
Builders x/build issues (builders, bots, dashboards) NeedsFix The path to resolution is known, but the work has not been done. new-builder OS-NetBSD
Milestone

Comments

@bsiegert
Contributor

Hostname: netbsd-arm-bsiegert

netbsd-arm-bsiegert.csr.txt

/cc @golang/release

@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Oct 23, 2023
@gopherbot gopherbot added this to the Unreleased milestone Oct 23, 2023
@cherrymui cherrymui added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 30, 2023
@bsiegert
Contributor Author

Any updates? It's been a month.

@dmitshur
Contributor

Thanks for pinging; it looks like we missed this new-builder issue, sorry.

We'll pick this up at the start of next week since this week is short due to US holidays.

@dmitshur dmitshur self-assigned this Nov 27, 2023
@gopherbot
Contributor

Change https://go.dev/cl/545536 mentions this issue: main.star: add netbsd-arm, netbsd-arm64, openbsd-riscv64 builders

gopherbot pushed a commit to golang/build that referenced this issue Nov 28, 2023
For golang/go#63698.
For golang/go#63614.
For golang/go#64176.

Change-Id: I2a203cd2a1e2e80ee44cfd5ce11c1ce5bbd60002
Reviewed-on: https://go-review.googlesource.com/c/build/+/545536
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Bypass: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
@dmitshur
Contributor

dmitshur commented Nov 30, 2023

Here's the resulting certificate: netbsd-arm-bsiegert-1701367164.cert.txt.

The builder definitions have been added in CL 545536 so your bot should be able to connect once you follow the rest of the steps on your end.

We have some more work to do to make the dependencies built for the netbsd/arm port and available in CIPD, which will be needed for the builds to complete successfully. We'll update this issue once that's done.

@dmitshur
Contributor

dmitshur commented Dec 8, 2023

more work to do to make the dependencies built for the netbsd/arm port and available in CIPD

I mailed crrev.com/c/5086069 for this.

@dmitshur
Contributor

That CL is submitted and the dependencies are built.

If you give it a shot to connect with the builder, we can see what the next steps are for this.

@dmitshur dmitshur assigned bsiegert and unassigned dmitshur Dec 15, 2023
@bsiegert
Contributor Author

Thanks Dmitri! The bot is now up and running at https://chromium-swarm.appspot.com/bot?id=netbsd-arm-bsiegert. It shows up as

@bsiegert bsiegert reopened this Dec 29, 2023
@bsiegert
Contributor Author

bsiegert commented Dec 29, 2023

Sorry, hit Submit too fast.

It shows up as

cipd_platform=netbsd-armv6l
cpu=evbarm | evbarm-32

@dmitshur
Contributor

I'm not seeing successful builds in https://ci.chromium.org/ui/p/golang/builders/ci/gotip-netbsd-arm?limit=200, and the builder isn't showing up as connected now. Can you take a look at what its current status is? I'll reopen this issue so we can track what's still left to do here.

@dmitshur dmitshur reopened this Jan 23, 2024
@gopherbot
Contributor

Change https://go.dev/cl/558517 mentions this issue: main.star: fix cipd_platform value for GOHOSTARCH=arm

gopherbot pushed a commit to golang/build that referenced this issue Jan 26, 2024
Handle another way in which cipd_platform is slightly
different from Go's GOHOSTOS and GOHOSTARCH values.

For golang/go#65241.
For golang/go#63698.
For golang/go#63601.

Change-Id: I3caad897b821208939b8b411663ba417c4c21df7
Reviewed-on: https://go-review.googlesource.com/c/build/+/558517
TryBot-Bypass: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
@dmitshur
Contributor

dmitshur commented Jan 26, 2024

CL 558517 fixed the cipd_platform value in the builder definition, and triggered some work, e.g., https://chromium-swarm.appspot.com/task?id=675f6b5f66994610. It's failing with an internal failure:

swarming_bot_logs: 2024-01-26 16:28:59.383: Starting run_isolated script
swarming_bot_logs: 2024-01-26 16:28:59.537: Trimming caches. min_ts: 1704472139, free_disk: 35253248000, min_free_space: 62578626346
swarming_bot_logs: 2024-01-26 16:28:59.542: trimming cache with dir /home/swarming/.swarming/cas_cache
swarming_bot_logs: 2024-01-26 16:28:59.546: trimming cache with dir /home/swarming/.swarming/c
swarming_bot_logs: 2024-01-26 16:28:59.549: trim_caches: took 0 seconds
swarming_bot_logs: 2024-01-26 16:29:04.267: Installed CIPD client
10397 2024-01-26 16:29:05.190 E: internal failure: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/run_isolated.py", line 858, in map_and_run
    with data.install_packages_fn(run_dir, cas_client_dir) as cipd_info:
  File "/usr/pkg/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/run_isolated.py", line 1199, in install_client_and_packages
    package_pins = _install_packages(run_dir, cipd_cache_dir, client,
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/run_isolated.py", line 1124, in _install_packages
    pins = client.ensure(
  File "/home/swarming/.swarming/swarming_bot.2.zip/client/cipd.py", line 245, in ensure
    result_json = json.load(jfile)
  File "/usr/pkg/lib/python3.10/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/pkg/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/pkg/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/pkg/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Expecting value: line 1 column 1 (char 0)

We need to figure out what's causing that and fix it to make progress here.

Edit: I think https://source.chromium.org/chromium/infra/infra/+/main:luci/client/cipd.py;l=245 is the relevant line. The check for exit_code happens slightly later, on line 260, so it's possible that something went wrong during the invocation of cipd ensure and we just don't see what it was in the log above. If you can find a way to reproduce it locally and share its output, that'd be helpful.
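
The ordering problem described above can be illustrated with a minimal Python sketch. This is not the actual cipd.py code: the helper name run_and_load_json and its error message are hypothetical, and the sketch only assumes that the tool exits non-zero when it fails. Checking the exit status before parsing the JSON result means the tool's own stderr gets reported, instead of a JSONDecodeError raised on an empty result file.

```python
import json
import subprocess


def run_and_load_json(cmd, json_path):
    """Run cmd, then parse the JSON file it was asked to write.

    Hypothetical helper for illustration; not taken from cipd.py.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # Check the exit status first, so a failed run reports its own stderr
    # rather than a confusing JSON parse error on an empty result file.
    if proc.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with code {proc.returncode}: {proc.stderr.strip()}"
        )
    with open(json_path) as jfile:
        return json.load(jfile)
```

With this ordering, a crash inside the invoked binary would surface directly in the raised error, which is the information missing from the traceback above.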

@dmitshur dmitshur self-assigned this Jan 26, 2024
@dmitshur
Contributor

dmitshur commented Feb 5, 2024

In addition to trying to reproduce this by running cipd ensure manually on the builder, you might be able to check if there's more information in /home/swarming/.swarming/logs/task_runner.log (or run_isolated.log).

@bsiegert
Contributor Author

bsiegert commented Feb 6, 2024

Thanks for the pointers, I will take a look and report back.

@bsiegert
Contributor Author

bsiegert commented Feb 7, 2024

This from run_isolated.log looks related:

22683 2024-01-26 16:03:45.157 U: Installed CIPD client
22683 2024-01-26 16:03:45.159 I: Installing packages {'': [('infra/tools/luci/bbagent/${platform}', 'git_revision:1f801c4894a7ced859ae672642feeeb8960da330')]} into /home/swarming/.swarming/w/ir
22683 2024-01-26 16:03:45.283 D: Running ['/home/swarming/.swarming/cipd_cache/bin/cipd', 'ensure', '-root', '/home/swarming/.swarming/w/ir', '-ensure-file', '/tmp/cipd-ensure-file-y5cmbnt9.txt', '-verbose', '-json-output', '/tmp/cipd-ensure-result-0pjif55u.json', '-cache-dir', '/home/swarming/.swarming/cipd_cache/cache', '-service-url', 'https://chrome-infra-packages.appspot.com/']
22683 2024-01-26 16:03:45.769 D: cipd client: runtime: this system has multiple CPUs and must use
22683 2024-01-26 16:03:45.806 D: cipd client: atomic synchronization instructions. Recompile using GOARM=7.
22683 2024-01-26 16:03:45.906 E: internal failure: Expecting value: line 1 column 1 (char 0)

So the cipd binary needs to be recompiled with GOARM=7 set.
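
For anyone repeating this diagnosis, here is a small Python sketch that pulls the cipd client's own output out of run_isolated.log and joins the wrapped lines. The line layout (pid, date, time, level, message) is inferred from the excerpt above, and the function name cipd_client_output is hypothetical.

```python
import re

# Matches lines such as:
#   22683 2024-01-26 16:03:45.769 D: cipd client: runtime: this system ...
# The layout (pid, date, time, level, message) is inferred from the excerpt.
LINE_RE = re.compile(r"^\d+ \S+ \S+ [A-Z]: cipd client: (.*)$")


def cipd_client_output(log_text):
    """Return the text the cipd client printed, joined into one string."""
    parts = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            parts.append(m.group(1).strip())
    return " ".join(parts)
```

Applied to the excerpt above, this would yield the single runtime message that ends in "Recompile using GOARM=7."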

@dmitshur
Contributor

dmitshur commented Feb 7, 2024

Downloading the Go binary from here and running go version -m on it prints:

	build	CGO_ENABLED=0
	build	GOARCH=arm
	build	GOOS=netbsd
	build	GOARM=6

So it is built with GOARM=6 now (even though the cross-compilation default for GOARM is 7 as of Go 1.21).
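
To run the same check on other binaries, the build settings printed by go version -m can be collected into a dictionary. This is a minimal Python sketch that assumes the "build KEY=VALUE" line layout shown above; the function name parse_build_settings is hypothetical.

```python
def parse_build_settings(version_output):
    """Collect 'build KEY=VALUE' lines from `go version -m` output."""
    settings = {}
    for line in version_output.splitlines():
        # Split off the leading "build" word; keep the rest intact so that
        # values containing spaces are not broken apart.
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0] == "build" and "=" in fields[1]:
            key, value = fields[1].split("=", 1)
            settings[key] = value.strip()
    return settings
```

For the output above, settings["GOARM"] would be the string "6".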

Searching finds entries like this, this, and this that all suggest making a change isn't quite straightforward, because the "v6l" suffix of the "netbsd-armv6l" CIPD platform dimension corresponds to GOARM=6.

Maybe it's possible to make it work with GOARM=6 anyway, through changes to the atomic operations, if it turns out there's not much to do? For example, I recently did crrev.com/c/5268803 which was enough to resolve the problem for linux/arm (with GOARM=6). But if it's much more invasive, that path might be harder. Edit: I see this is coming from the Go runtime (i.e., here), and it seems you'd need to not have multiple CPUs to work around it.

If the builder for this port cannot work with GOARM=6 binaries and really needs GOARM=7, we can see how involved that might be.

@bsiegert
Contributor Author

bsiegert commented Feb 8, 2024

I wonder why the Chromium infra thinks the architecture is "armv6l". That's a different sub-architecture. On this machine, uname -p prints earmv7hf, not the older earmv6hf. We do not have an ARMv6 builder at the moment; these are kind of old and crufty in general.

FWIW, http://wiki.netbsd.org/ports/evbarm/ shows the different sub-architectures.

This comment says:

  For example, on ARMv7 machines we claim that we are in fact running ARMv6
  (which is subset of ARMv7), since we don't really care about v7 over v6
  difference and want to reduce the variability in supported architectures
  instead.

Which is clearly a wrong assumption.
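
To make the mismatch concrete, here is a small Python sketch contrasting the sub-architecture reported by uname -p with the collapsed value described in the quoted comment. Both function names are hypothetical, and this is only an illustration, not how Swarming or CIPD actually derive the platform. It assumes NetBSD names of the form earmv6hf and earmv7hf, as mentioned above.

```python
import re


def goarm_for_netbsd_arch(machine_arch):
    """Map a NetBSD `uname -p` value to a GOARM level, or None if unknown.

    Hypothetical helper: it only reads the digit after "earmv".
    """
    m = re.match(r"earmv(\d)", machine_arch)
    if m is None:
        return None
    version = int(m.group(1))
    # GOARM accepts 5, 6, and 7; anything else is treated as unknown here.
    return version if version in (5, 6, 7) else None


def collapsed_goarm(machine_arch):
    """Apply the quoted policy: report ARMv7 machines as ARMv6."""
    level = goarm_for_netbsd_arch(machine_arch)
    return 6 if level == 7 else level
```

For earmv7hf, the first function gives 7 while the collapsed policy gives 6, which matches the GOARM=6 cipd binary that failed on this multi-CPU machine.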

Projects
Status: In Progress

5 participants
@bsiegert @dmitshur @gopherbot @cherrymui and others