Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix macOS CI #883

Merged
merged 2 commits into from
Feb 22, 2024
Merged

Fix macOS CI #883

merged 2 commits into from
Feb 22, 2024

Conversation

dave-tucker
Copy link
Member

@dave-tucker dave-tucker commented Feb 15, 2024

This PR contains 3 commits to fix the macOS CI job which has been failing since ~12th February.

The causes was that Rust nightly updated to LLVM v18.1-RC2 but bpf-linker was linked against LLVM 17. Even with the bpf-linker v0.9.11 release (which supports the LLVM RC) macOS CI still fails due to LLVM 18.1-RC2 not being installable via brew to link against. Solving this with homebrew tap/bottles and/or binstalls of bpf-linker is the best long term solution. In the short term we revert to the last working rust nightly toolchain + bpf-linker v0.9.10

This was pretty hard to debug due to missing output from cargo xtask integration-test.
A commit in this PR provides a fix for that too which should make this easier to debug in future.

Note: At some point in CI we were also seeing errors about missing libelf. This seems to be a red herring 🐟
Commits in this PR that added libelf to the brew dependencies have been reverted

Copy link

netlify bot commented Feb 15, 2024

Deploy Preview for aya-rs-docs ready!

Name Link
🔨 Latest commit 6e9dcee
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/65d770d8a65adb000809c61f
😎 Deploy Preview https://deploy-preview-883--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mergify mergify bot added the test A PR that improves test cases or CI label Feb 15, 2024
@tamird
Copy link
Member

tamird commented Feb 18, 2024

Just opened #884 and then saw this. The true failure is coming from bpf-linker but we're not getting its logs (again).

@tamird tamird mentioned this pull request Feb 18, 2024
@vadorovsky
Copy link
Member

@dave-tucker @tamird bpf-linker is fixed, but it uses LLVM 18 now. To fix the latest failure, we need to find some way to install LLVM 18 on macOS.

@tamird
Copy link
Member

tamird commented Feb 19, 2024

We shouldn't merge this - it doesn't fix anything. We should figure out why we aren't getting errors from bpf-linker.

@vadorovsky
Copy link
Member

We were getting errors from bpf-linker, because nightly started using LLVM 18, bpf-linker was using LLVM 17, so it ended up trying to link incompatible bitcodes. That was fixed here aya-rs/bpf-linker#184 and in yesterday's 0.9.1 release. And that error is gone after resetting the jobs.

However, the lasest failure is because we don't have LLVM 18 installed on macOS. Homebrew doesn't ship it, which is understandable - it's just a -rc release.

Copy link
Member

@vadorovsky vadorovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on hold until we sort out the LLVM 18 on macOS thing

@tamird
Copy link
Member

tamird commented Feb 19, 2024

The first failing job was https://github.com/aya-rs/aya/actions/runs/7896427886/job/21550389236.

What I'm saying is that I don't see the bitcode error there, it's not clear why the job failed from the logs.

@vadorovsky
Copy link
Member

The second failing job was https://github.com/aya-rs/aya/actions/runs/7915187581/job/21606490379, not the libelf thing anymore.

error: failed to run custom build command for `integration-test v0.1.0 (/Users/runner/work/aya/aya/test/integration-test)`

Caused by:
  process didn't exit successfully: `/Users/runner/work/aya/aya/target/debug/build/integration-test-014cde848726027a/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at test/integration-test/build.rs:258:9:
  assertion `left == right` failed: cd "/Users/runner/work/aya/aya/test/integration-ebpf" && env -u RUSTC -u RUSTUP_TOOLCHAIN CARGO_CFG_BPF_TARGET_ARCH="x86_64" "cargo" "build" "-Z" "build-std=core" "--bins" "--message-format=json" "--release" "--target" "bpfel-unknown-none" "--target-dir" "/Users/runner/work/aya/aya/target/x86_64-unknown-linux-musl/debug/build/integration-test-bc58de130b9ea635/out/integration-ebpf" failed: ExitStatus(unix_wait_status(25856))
    left: Some(101)
   right: Some(0)
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: AYA_BUILD_INTEGRATION_BPF="true" "cargo" "build" "--message-format=json" "--target" "x86_64-unknown-linux-musl" "--config" "target.x86_64-unknown-linux-musl.linker = \"rust-lld\"" "--package" "integration-test" "--tests" "--profile" "dev" failed: ExitStatus(unix_wait_status(25856))
Error: Process completed with exit code 123.

Our bad for not logging the whole stderr/out, but this is for sure the bitcode linking mismatch we've seen in other places.

@tamird
Copy link
Member

tamird commented Feb 19, 2024

Yes, this is exactly what I'm saying: we can't see the actual error.

@dave-tucker
Copy link
Member Author

@tamird see the last commit I just pushed. It seems cargo::warning decided not print anything after a newline character...
So we have to manually print the longer message line-by-line to get the error output from build.rs 😓

With the change applied a failing run now produces the following output, which includes the error message that @vadorovsky provided:

arning: integration-test@0.1.0:   = note: 14:01:43 [ERROR] llvm: Invalid record (Producer: 'LLVM18.1.0-rust-1.78.0-nightly' Reader: 'LLVM 17.0.6')
warning: integration-test@0.1.0:           error: failure linking module core-96a21db970ed0e7f.core.1934461c7d58be1e-cgu.00.rcgu.o from /Users/dave/dev/aya/target/aarch64-unknown-linux-musl/debug/build/integration-test-0da7decb9fffdd68/out/integration-ebpf/bpfel-unknown-none/release/deps/libcore-96a21db970ed0e7f.rlib14:01:43 [WARN] ignoring file "/var/folders/yg/xdvg2sbs6fj19g06_ttyz1n40000gn/T/rustcVLTxyq/symbols.o": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           14:01:43 [WARN] ignoring archive item "lib.rmeta": no embedded bitcode
warning: integration-test@0.1.0:           
warning: integration-test@0.1.0: 
warning: integration-test@0.1.0: 
warning: integration-test@0.1.0: error: aborting due to 1 previous error
warning: integration-test@0.1.0: 
warning: integration-test@0.1.0: 
warning: integration-test@0.1.0: error: could not compile `integration-ebpf` (bin "log") due to 2 previous errors

error: failed to run custom build command for `integration-test v0.1.0 (/Users/dave/dev/aya/test/integration-test)`

Caused by:
  process didn't exit successfully: `/Users/dave/dev/aya/target/debug/build/integration-test-f879e55cb8bd8929/build-script-build` (exit status: 101)
  --- stderr
  thread 'main' panicked at test/integration-test/build.rs:260:9:
  assertion `left == right` failed: cd "/Users/dave/dev/aya/test/integration-ebpf" && env -u RUSTC -u RUSTUP_TOOLCHAIN CARGO_CFG_BPF_TARGET_ARCH="aarch64" "cargo" "build" "-Z" "build-std=core" "--bins" "--message-format=json" "--release" "--target" "bpfel-unknown-none" "--target-dir" "/Users/dave/dev/aya/target/aarch64-unknown-linux-musl/debug/build/integration-test-0da7decb9fffdd68/out/integration-ebpf" failed: ExitStatus(unix_wait_status(25856))
    left: Some(101)
   right: Some(0)
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: AYA_BUILD_INTEGRATION_BPF="true" "cargo" "build" "--message-format=json" "--target" "aarch64-unknown-linux-musl" "--config" "target.aarch64-unknown-linux-musl.linker = \"rust-lld\"" "--package" "integration-test" "--tests" "--profile" "dev" failed: ExitStatus(unix_wait_status(25856))

@tamird
Copy link
Member

tamird commented Feb 19, 2024

Nasty! We have the same code in xtask/src/run.rs that we should also fix.

@@ -234,7 +234,7 @@ jobs:
find /usr/local/bin -type l -exec sh -c 'readlink -f "$1" \
| grep -q ^/Library/Frameworks/Python.framework/Versions/' _ {} \; -exec rm -v {} \;
HOMEBREW_NO_INSTALLED_DEPENDENTS_CHECK=1 \
brew install dpkg findutils gnu-tar llvm pkg-config qemu
brew install dpkg findutils gnu-tar llvm pkg-config libelf qemu
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should remove this commit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaict we still need it - it was an implicit dependency of libbpf before but we didn't install it.
once macOS CI is green again (which i'm hoping my last commit should do) i'll do an experiment to see whether it is still required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please drop a note here when you learn the answer

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it seems I was wrong 😢 libelf isn't needed.
Commit reverted. I'm unable to find or reproduce the error that led me to believe it was libelf in the first place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It logs a warning :)

@dave-tucker dave-tucker force-pushed the libelf branch 2 times, most recently from e9cd636 to 322c883 Compare February 22, 2024 12:51
@tamird
Copy link
Member

tamird commented Feb 22, 2024

(reminder to update the PR title plz)

@dave-tucker dave-tucker changed the title ci: Install libelf on macOS Fix macOS CI Feb 22, 2024
@dave-tucker
Copy link
Member Author

Note: Clippy will fail until #885 has merged. We can merge either first, but one will be red (unless I incorporate the clippy fix here too).

.github/workflows/ci.yml Outdated Show resolved Hide resolved
@dave-tucker dave-tucker changed the title Fix macOS CI DO NOT MERGE (YET): Fix macOS CI Feb 22, 2024
@dave-tucker dave-tucker marked this pull request as draft February 22, 2024 15:56
The cargo::warning seems to ignore output after a newline.
Iterate over the entire rendered message and print it line-by-line.

Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
LLVM-18 hasn't been released on macOS yet

Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
@dave-tucker dave-tucker changed the title DO NOT MERGE (YET): Fix macOS CI Fix macOS CI Feb 22, 2024
@dave-tucker dave-tucker marked this pull request as ready for review February 22, 2024 16:56
@dave-tucker
Copy link
Member Author

Merging to get CI back to green!

@dave-tucker dave-tucker merged commit 664bb47 into aya-rs:main Feb 22, 2024
21 checks passed
@dave-tucker dave-tucker deleted the libelf branch February 22, 2024 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test A PR that improves test cases or CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants