Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imported audits may change imports.lock each time cargo vet is run #272

Closed
alexcrichton opened this issue Jul 20, 2022 · 4 comments · Fixed by #296
Closed

Imported audits may change imports.lock each time cargo vet is run #272

alexcrichton opened this issue Jul 20, 2022 · 4 comments · Fixed by #296
Labels
p1 do now

Comments

@alexcrichton
Copy link

After a discussion with Wasmtime folks and @bholley yesterday about using cargo vet in the Bytecode Alliance, specifically the Wasmtime project, it sounded like we would likely want to get into a situation where audits are imported from Firefox (and perhaps vice-versa one day). In doing so though it sounded like each time cargo vet is run locally by Wasmtime contributors that the imports.lock would continue to grow with the audits added to Firefox.

One situation that we were worried about was where a contributor sends a PR with changes to imports.lock, presumably imported from Firefox. When checking this locally to verify that imports.lock doesn't change for us it sounded like imports.lock could indeed change if Firefox added more audits relative to when the contributor themsleves ran cargo vet. I believe this was because imports.lock largely copied the audits.toml from imported files without filter for what was actually used. This felt surprising because we would not be able to guarantee locally that running cargo vet wouldn't actually change the imports.lock file if the contributor had correctly run cargo vet to import an audit from Firefox.

Two possible ideas we thought might mitigate this were:

  • Fetching the most recent contents of audits.toml from improted sources could be delayed until there's an actual audit failure. This still sort of runs the issue of something needs to check that imports.lock is indeed contained in the remote audits.toml
  • The contents of imports.lock could be filtered to a subset of imported audits which apply to the crates in the local dependency graph. This way so long as Firefox doesn't remove audits for example no matter what new crates were added we'd still generate the same imports.lock locally.
@mystor
Copy link
Collaborator

mystor commented Jul 20, 2022

Fetching the most recent contents of audits.toml from improted sources could be delayed until there's an actual audit failure. This still sort of runs the issue of something needs to check that imports.lock is indeed contained in the remote audits.toml

Delaying fetching audits until there is a failure may not be the best option as there are also violations which can be recorded, and someone running cargo vet likely wants to find out that an upstream audit source recorded some violations for a crate they're using without first encountering some other cargo vet failure. It could also lead to an update to remote audits which they weren't actually needed, if someone runs cargo vet, finds a failure, and then ends up doing a local audit which doesn't depend on any remote audits so the import was unnecessary.

I'm not sure how much I like this approach, as it means that the imports can be quite out-of-date in some cases, and that updates are somewhat arbitrary, and relate more to what specific commands the person doing a local cargo update happens to run (e.g. whether they run cargo vet or cargo vet certify first).

Something like this might work best if it was instead that when running cargo vet suggest we fetch but don't save the latest version of upstream audits, and use it for suggestions, and when running cargo vet, we first run against the local copy, and if that fails we fetch the upstream version and try to cargo vet again (generating suggestions based on the second run). If that second vet succeeded, we'd then update imports.lock to contain the latest versions of upstream audits. cargo certify would also need to do a "phantom update" to perform audit version suggestions based on audits which haven't been saved locally, but would only actually update upstream audits if one of them was being used (perhaps?).

Definitely opens some interesting questions about which commands would work off of what version of the universe, and whether they should be allowed to trigger an update.

The contents of imports.lock could be filtered to a subset of imported audits which apply to the crates in the local dependency graph. This way so long as Firefox doesn't remove audits for example no matter what new crates were added we'd still generate the same imports.lock locally.

This could potentially be something with the current approach which we take to importing remote audits (where they're extracted and stored locally in a toml file), though is somewhat counter to the approach which we have been taking in #238,#261 which are trying to preserve the verbatim remote sources to report errors etc. directly against the source line numbers. Perhaps we could work around that by only doing that when running an unlocked import (by fetching the remote sources), and not get nice errors when running a locked version which depends on imports.lock (as hopefully someone has already run it locally, and that version should only be on CI)

If we were to do this, we'd probably need to have commands like cargo vet suggest fetch remote audits every time in order to provide reliable suggestions to e.g. use delta audits against a remotely-audited source, and something like cargo vet certify would also need to fetch the remote sources to potentially update imports.lock when a change is made to reflect new remote audits being depended upon.

If we decide to go this approach, we probably shouldn't merge #261 as-is, as we don't actually want to preserve imported files 1:1 with upstream in imports.lock.

@Gankra How reasonably can we actually determine which remote audits we depend on for a cargo vet to succeed to make sure we import only that subset of audits? I worry a bit that our resolver doesn't really record that information, or that figuring it out is somewhat undecidable (especially because the shortest audit path for a given version could easily change as upstream releases new audits, leading to unnecessary local changes...)

@bholley
Copy link
Collaborator

bholley commented Jul 20, 2022

I'm persuaded that it would be valuable to only cache the imports that are actually used. This would substantially reduce incidental noise in a world where people were pumping out lots of audits and make it trivial to inspect the actual trust points. Non-CI runs will already need to re-fetch everything (since we'll want to ensure that the cache isn't stale), so we can use those full copies for error reporting.

@repi
Copy link

repi commented Aug 10, 2022

That sounds like a reasonable solution with only importing and storing what is used.

We (Embark) recently moved our initial audits to a shared public file as well and also import from Firefox and Bytecode Alliance, so 3 imports in total which does even now early on add up to quite a bit of additions on on runs and including lots of crates we do not actually use in many projects (but some we do, hence the import).

@mystor
Copy link
Collaborator

mystor commented Aug 22, 2022

This should be better now with #296. The next time you run after the update, audits you're not using should disappear, and new audits should (generally) only be imported for a package when they are needed.

aarongable pushed a commit to chromium/chromium that referenced this issue Mar 20, 2024
In the future we want `cargo vet` to be used by
`third_party/rust/PRESUBMIT.py` and when used in this mode, we want
`cargo vet` to be somewhat hermetic.  This CL helps with that by
ensuring that `--locked --frozen` switches work when used as follows:

    ```
    $ tools/crates/run_cargo_vet.py check \
        --locked --frozen \
        --no-minimize-exemptions --no-registry-suggestions \
        --output-format=json
    {
      "conclusion": "success",
      "vetted_fully": [
        {
          "name": "aho-corasick",
          "version": "1.1.2"
        },
    ...
    ```

The CL achieves the above by:

* Reintroducing a pre-populated `imports.lock` file.
  - Having this in the repo means that code reviews will explicitly
    include information about newly imported audits.
  - Note that (per mozilla/cargo-vet#272)
    `imports.lock` should hopefully see minimal churn, because it
    should only cover imports that are actually used.
  - In a sense, this CL reverts/reconsiders `imports.lock`-related parts
    of https://crrev.com/c/5368308
* Tweaking `vet_config.toml.hbs` to produce `config.toml` with the exact
  formatting expected by `cargo vet`.
  - This has some downsides - e.g. we loose the
    '@generated by tools/crates/gnrt vendor. Do not edit.' comment.
  - This is necessary because even formatting-only discrepancies trigger
    an error in `--locked` mode: "A file in the store is not correctly
    formatted [...] run `cargo vet` without --locked to reformat files
    in the store"
  - This is somewhat related to
    mozilla/cargo-vet#589
  - In a sense, this CL makes obsolete the `config.toml`-related parts
    of https://crrev.com/c/5368308.  OTOH, detecting changes to
    `config.toml` (including a comments-only change) is not harmful, so
    let's leave the detection logic in place (in `run_cargo_vet.py).

Bug: 41494083
Change-Id: Ia5967fa8e8b13b885f703810ef84c5d79030dfae
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5368743
Reviewed-by: danakj <danakj@chromium.org>
Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1275813}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p1 do now
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants