Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust Minimal Workflow Failing #449

Closed
maspe36 opened this issue Dec 17, 2024 · 16 comments · Fixed by #452
Closed

Rust Minimal Workflow Failing #449

maspe36 opened this issue Dec 17, 2024 · 16 comments · Fixed by #452

Comments

@maspe36
Copy link
Collaborator

maspe36 commented Dec 17, 2024

Our latest Rust Minimal workflow failed with the following.

   Starting >>> rclrs
  --- output: rclrs
  error: package `home v0.5.11` cannot be built because it requires rustc 1.81 or newer, while the currently active rustc version is 1.75.0
  Either upgrade to rustc 1.81 or newer, or use
  cargo update home@0.5.11 --precise ver
  where `ver` is the latest version of `home` supporting rustc 1.75.0
  ---
  --- stderr: rclrs
  error: package `home v0.5.11` cannot be built because it requires rustc 1.81 or newer, while the currently active rustc version is 1.75.0
  Either upgrade to rustc 1.81 or newer, or use
  cargo update home@0.5.11 --precise ver
  where `ver` is the latest version of `home` supporting rustc 1.75.0
  ---
  Failed   <<< rclrs [1.32s, exited with code 1]

Changelog for home can be found here

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

Any objections to committing a lockfile for rclrs @esteve @jhdcs @maspe36 ?

The guidance from the cargo team has shifted over the years. Originally they advised to commit lockfiles for applications and not libraries, but in recent years they've advised to probably just go ahead and always commit lockfiles. I've been doing the latter for my projects the last 1-2 years and haven't had any issues with it.

@maspe36
Copy link
Collaborator Author

maspe36 commented Dec 18, 2024

I don't fully understand what's going on here in this issue. How are we seeing a home crate upgrade? Shouldn't that be part of the toolchain or the cargo version? I don't know where this crate gets pulled in, I just know its source code lives in cargos repo.

How would a lockfile enable both minimum and stable rust version CI?

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

Great questions, I'll walk us through my understanding of what's happening.

First of all, where does home come from? Going into the rclrs package and running $ cargo tree gives the following information:

[build-dependencies]
├── bindgen v0.66.1
│   └── which v4.4.2
│       ├── home v0.5.9

So rclrs depends on bindgen which depends on which which depends on home. This tree was taken from my local checkout of the main branch prior to this issue happening. If I run $ cargo update, then cargo will fetch the latest "compatible" version of all dependencies, including transitive dependencies like home, and update my local Cargo.lock file to track this update. Now I'll end up with the following tree:

├── bindgen v0.66.1
│   └── which v4.4.2
│       ├── home v0.5.11

Notice that home has gone from v0.5.9 to v0.5.11.

We don't have direct control over what version of home gets fetched by cargo. The which crate has some degree of control over this. In the Cargo.toml of which they likely have something like

home = "0.5"

and cargo will make sure that it only fetches versions of home whose semantic versions are compatible with 0.5, which includes both 0.5.9 and 0.5.11.

One thorny problem with semantic versioning is that it only accounts for API compatibility. It does not account for the Rust compiler version. In order for us to guarantee support for Rust version 1.75, we need stricter requirements than semantic versioning offers. If I were to commit my lockfile that specifies v0.5.9 for home then the minimal version CI should go green.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

Oh but one thing that saving the lockfile does not help with... downstream crates that depend on rclrs will not have access to a lockfile that we commit to the rclrs repo. So committing our lockfile will allow CI to go green for us, but it wouldn't help a downstream user of rclrs who needs to use Rust compiler version 1.75. I'll need to do some research on whether there's a solution for that.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

Okay I've given it more thought...

We only care about supporting Rust compiler version 1.75 in the context of making sure rclrs will be compatible with the ROS build farm. In that context, if/when rclrs is ever built on the ROS build farm, cargo will be restricted to only using dependencies (including transitive dependencies) that are also available on the build farm. That means home will need to be added to the build farm before rclrs can be added. When adding home, we will have to select a version that can be built by Rust compiler version 1.75, so we'll be forced to select a version prior to 0.5.11. At that point, cargo will be forced to select a version which is compatible with Rust 1.75.

So I do think committing lockfiles for all our packages will be sufficient for our needs.

One additional thing to consider is that maybe the workflow for stable should include a $ cargo update so that the stable CI is still testing against the latest versions of all our dependencies.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

An alternative way to solve this:

For each crate (rclrs and the various examples) we save a known-good lockfile for 1.75 under a name like Cargo.lock.1-75 and then in our CI for Minimal we add a line like $ cp Cargo.lock.1-75 Cargo.lock so the committed lockfile only affects our minimal CI build. Behavior will be unchanged for the Stable CI and for users who checkout the repo.

@esteve
Copy link
Collaborator

esteve commented Dec 18, 2024

I've tested locally and added this as a build dependency for rclrs in its Cargo.toml:

home = "=0.5.9"

it seemed to work both with Rust 1.75 and 1.83, would this be ok to commit?

However, the core issue here is that the which crate depends on the home crate, which is definitely not meant for external use, from home's README:

This crate is maintained by the Cargo team, primarily for use by Cargo and Rustup and not intended for external use. This crate may make major changes to its APIs or be deprecated without warning.

There is, in fact, a PR for which that switches to env_home (harryfei/which-rs#105)

If everyone is ok with the fix I proposed, we can commit it temporarily and once which has a new release that uses env_home, we can drop it.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

@esteve I have a few concerns about taking that approach:

  1. This will work around the CI issue for home specifically, but this problem will almost certainly keep popping up for other dependencies (including transitive dependencies) as other libraries gradually start using new language features without breaking their API compatibility. We'll find ourselves playing whack-a-mole specifying more and more transitive dependency versions, and then we'll need to start keeping track of whether those transitive dependencies are even being used by our real dependencies as our real dependencies get updated.
  2. Maybe not too much of a concern for home, but definitely a concern for other transitive dependencies: We shouldn't put an artificial ceiling on how recent of a version of these libraries can be obtained by users of rclrs just so that we can confirm in our CI that rclrs maintains compatibility with 1.75. It's not hard to imagine there could be bug fixes or performance improvements in newer versions of these libraries that are API-compatible, but those newer versions depend on more recent language features. We should make sure that users who pull rclrs from crates.io can still benefit from all that.

I would promote the last approach that I suggested: we keep a 1.75-compatible "known-good" lockfile in each crate directory. Maybe call it Cargo.lock.minimal to be more generic. Then the Minimal CI workflow just does

$ cp Cargo.lock.minimal Cargo.lock

before building and testing each crate. This will have no negative effect on the Stable CI nor on users of rclrs.

@esteve
Copy link
Collaborator

esteve commented Dec 18, 2024

@mxgrey the fix I propose is a one time thing for a very specific case. The crate that's causing this issue (which) already has a PR fixing it, so I'd prefer to have a localized fix than generalize for potential issues that haven't even happened yet.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

Even for the specific case, I don't think it's good dependency hygiene to lock all users of rclrs into a specific version of a transitive dependency just so we can get green checkmarks in a CI that exists for aspirational reasons (good and noble aspirations that we should absolutely hold to, but nevertheless aspirational).

rclrs users could end up in a position where they can't combine rclrs with another dependency that requires a newer version of home.

What I'm proposing not only avoids any pitfalls for users but also is a more appropriate simulation of what rclrs will experience if it gets put on the buildfarm: It will be locked into a certain compiler-version-compatible subset of its dependencies. It will also be a one-and-done fix so no other dependencies (transitive or otherwise) will be able to do this kind of rug pull on us.

@maspe36
Copy link
Collaborator Author

maspe36 commented Dec 18, 2024

So the idea is just to have a minimal rustc lockfile, not one for stable? If so, I'm okay with that approach.

Before pulling the trigger, we should look around the ecosystem. Surely this is not a unique problem to support a minimum version and stable.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 18, 2024

Surely this is not a unique problem to support a minimum version and stable.

Truthfully I suspect it's a very very small portion of the community that would want to explicitly support both a frozen compiler version and the latest stable compiler at the same time. Those who want to support a frozen compiler version will most likely want all their dependencies to be frozen, so they'll just use a lockfile and/or their own crate registry for all their dependencies.

I just tested out my lockfile idea with #450 but it failed. It looks like colcon-cargo might not give any regard to the lockfile in the source code, but I'd need to do more local experimentation to figure out if that's the actual problem.

I also attempted a different solution where we would use $ cargo add home@=0.5.9 inside the CI workflow script for Minimal. That should have the effect that @esteve 's suggestion is achieving but applied only to the Minimal CI workflow, with no impact on downstream users. Somehow that is also failing.

There must be something that I don't understand about how ros-tooling/action-ros-ci works, because somehow even directly modifying the Cargo.toml of rclrs using cargo add isn't stopping colcon-cargo from using version 0.5.9 of home, even though the same exact steps work when I try them locally on my own computer.

It's very late for me now, so I'll try revisiting this tomorrow.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 19, 2024

I've figured out why none of my attempts were working. It wasn't the fault of colcon-cargo, it's because of how action-ros-ci functions.

Instead of building the source directory that the GitHub workflows starts you out with, action-ros-ci will patch the contents of your .repos file to include the url of the repo that is being tested, and then clone it from the GitHub url. Therefore we end up with a directory structure something like this

rclrs
rosidl_runtime_rs
rosidl_generator_rs
examples
ros2_rust_ws
  - src
    - ros2_rust
      - rclrs
      - rosidl_runtime_rs
      - rosidl_generator_rs
      - examples

where the copy of ros2_rust that's inside of ros2_rust_ws/src a totally fresh clone of the repo, containing none of the tweaks that I tried to make from inside the workflow.

I'm still looking through action-ros-ci to see if it has any option or hook that we can take advantage of, but the impression that I have right now is it was designed to accommodate a very specific sequence of CI steps, so it might not be possible for us to force a different set of dependencies for different workflow runs. This makes me worried that we could end up in a situation where supporting Rust 1.75 in the CI will force us to freeze the versions of more and more dependencies in a way that could be detrimental to crates.io users.

@mxgrey
Copy link
Collaborator

mxgrey commented Dec 19, 2024

After a lot of experimentation and iterating, I came up with #451

It's essentially what @esteve was suggesting, but we keep it confined to the Minimal CI so there's no risk of a detrimental impact on downstream users.

@Xaeroxe
Copy link

Xaeroxe commented Dec 20, 2024

which 7.0.1 is now released which should resolve this problem.

@esteve
Copy link
Collaborator

esteve commented Dec 23, 2024

@Xaeroxe thanks for your feedback, we've instead opted to bump the version of bindgen to 0.70, which dropped which as a dependency

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants