Imported audits may change `imports.lock` each time `cargo vet` is run #272

alexcrichton · 2022-07-20T14:31:22Z

After a discussion with Wasmtime folks and @bholley yesterday about using cargo vet in the Bytecode Alliance, specifically the Wasmtime project, it sounded like we would likely want to get into a situation where audits are imported from Firefox (and perhaps vice-versa one day). In doing so though it sounded like each time cargo vet is run locally by Wasmtime contributors that the imports.lock would continue to grow with the audits added to Firefox.

One situation that we were worried about was where a contributor sends a PR with changes to imports.lock, presumably imported from Firefox. When checking this locally to verify that imports.lock doesn't change for us it sounded like imports.lock could indeed change if Firefox added more audits relative to when the contributor themsleves ran cargo vet. I believe this was because imports.lock largely copied the audits.toml from imported files without filter for what was actually used. This felt surprising because we would not be able to guarantee locally that running cargo vet wouldn't actually change the imports.lock file if the contributor had correctly run cargo vet to import an audit from Firefox.

Two possible ideas we thought might mitigate this were:

Fetching the most recent contents of audits.toml from improted sources could be delayed until there's an actual audit failure. This still sort of runs the issue of something needs to check that imports.lock is indeed contained in the remote audits.toml
The contents of imports.lock could be filtered to a subset of imported audits which apply to the crates in the local dependency graph. This way so long as Firefox doesn't remove audits for example no matter what new crates were added we'd still generate the same imports.lock locally.

The text was updated successfully, but these errors were encountered:

mystor · 2022-07-20T15:24:56Z

Fetching the most recent contents of audits.toml from improted sources could be delayed until there's an actual audit failure. This still sort of runs the issue of something needs to check that imports.lock is indeed contained in the remote audits.toml

Delaying fetching audits until there is a failure may not be the best option as there are also violations which can be recorded, and someone running cargo vet likely wants to find out that an upstream audit source recorded some violations for a crate they're using without first encountering some other cargo vet failure. It could also lead to an update to remote audits which they weren't actually needed, if someone runs cargo vet, finds a failure, and then ends up doing a local audit which doesn't depend on any remote audits so the import was unnecessary.

I'm not sure how much I like this approach, as it means that the imports can be quite out-of-date in some cases, and that updates are somewhat arbitrary, and relate more to what specific commands the person doing a local cargo update happens to run (e.g. whether they run cargo vet or cargo vet certify first).

Something like this might work best if it was instead that when running cargo vet suggest we fetch but don't save the latest version of upstream audits, and use it for suggestions, and when running cargo vet, we first run against the local copy, and if that fails we fetch the upstream version and try to cargo vet again (generating suggestions based on the second run). If that second vet succeeded, we'd then update imports.lock to contain the latest versions of upstream audits. cargo certify would also need to do a "phantom update" to perform audit version suggestions based on audits which haven't been saved locally, but would only actually update upstream audits if one of them was being used (perhaps?).

Definitely opens some interesting questions about which commands would work off of what version of the universe, and whether they should be allowed to trigger an update.

The contents of imports.lock could be filtered to a subset of imported audits which apply to the crates in the local dependency graph. This way so long as Firefox doesn't remove audits for example no matter what new crates were added we'd still generate the same imports.lock locally.

This could potentially be something with the current approach which we take to importing remote audits (where they're extracted and stored locally in a toml file), though is somewhat counter to the approach which we have been taking in #238,#261 which are trying to preserve the verbatim remote sources to report errors etc. directly against the source line numbers. Perhaps we could work around that by only doing that when running an unlocked import (by fetching the remote sources), and not get nice errors when running a locked version which depends on imports.lock (as hopefully someone has already run it locally, and that version should only be on CI)

If we were to do this, we'd probably need to have commands like cargo vet suggest fetch remote audits every time in order to provide reliable suggestions to e.g. use delta audits against a remotely-audited source, and something like cargo vet certify would also need to fetch the remote sources to potentially update imports.lock when a change is made to reflect new remote audits being depended upon.

If we decide to go this approach, we probably shouldn't merge #261 as-is, as we don't actually want to preserve imported files 1:1 with upstream in imports.lock.

@Gankra How reasonably can we actually determine which remote audits we depend on for a cargo vet to succeed to make sure we import only that subset of audits? I worry a bit that our resolver doesn't really record that information, or that figuring it out is somewhat undecidable (especially because the shortest audit path for a given version could easily change as upstream releases new audits, leading to unnecessary local changes...)

bholley · 2022-07-20T16:35:45Z

I'm persuaded that it would be valuable to only cache the imports that are actually used. This would substantially reduce incidental noise in a world where people were pumping out lots of audits and make it trivial to inspect the actual trust points. Non-CI runs will already need to re-fetch everything (since we'll want to ensure that the cache isn't stale), so we can use those full copies for error reporting.

repi · 2022-08-10T22:35:41Z

That sounds like a reasonable solution with only importing and storing what is used.

We (Embark) recently moved our initial audits to a shared public file as well and also import from Firefox and Bytecode Alliance, so 3 imports in total which does even now early on add up to quite a bit of additions on on runs and including lots of crates we do not actually use in many projects (but some we do, hence the import).

mystor · 2022-08-22T21:03:57Z

This should be better now with #296. The next time you run after the update, audits you're not using should disappear, and new audits should (generally) only be imported for a package when they are needed.

@generated

In the future we want `cargo vet` to be used by `third_party/rust/PRESUBMIT.py` and when used in this mode, we want `cargo vet` to be somewhat hermetic. This CL helps with that by ensuring that `--locked --frozen` switches work when used as follows: ``` $ tools/crates/run_cargo_vet.py check \ --locked --frozen \ --no-minimize-exemptions --no-registry-suggestions \ --output-format=json { "conclusion": "success", "vetted_fully": [ { "name": "aho-corasick", "version": "1.1.2" }, ... ``` The CL achieves the above by: * Reintroducing a pre-populated `imports.lock` file. - Having this in the repo means that code reviews will explicitly include information about newly imported audits. - Note that (per mozilla/cargo-vet#272) `imports.lock` should hopefully see minimal churn, because it should only cover imports that are actually used. - In a sense, this CL reverts/reconsiders `imports.lock`-related parts of https://crrev.com/c/5368308 * Tweaking `vet_config.toml.hbs` to produce `config.toml` with the exact formatting expected by `cargo vet`. - This has some downsides - e.g. we loose the '@generated by tools/crates/gnrt vendor. Do not edit.' comment. - This is necessary because even formatting-only discrepancies trigger an error in `--locked` mode: "A file in the store is not correctly formatted [...] run `cargo vet` without --locked to reformat files in the store" - This is somewhat related to mozilla/cargo-vet#589 - In a sense, this CL makes obsolete the `config.toml`-related parts of https://crrev.com/c/5368308. OTOH, detecting changes to `config.toml` (including a comments-only change) is not harmful, so let's leave the detection logic in place (in `run_cargo_vet.py). Bug: 41494083 Change-Id: Ia5967fa8e8b13b885f703810ef84c5d79030dfae Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5368743 Reviewed-by: danakj <danakj@chromium.org> Commit-Queue: Łukasz Anforowicz <lukasza@chromium.org> Cr-Commit-Position: refs/heads/main@{#1275813}

alexcrichton mentioned this issue Jul 20, 2022

Enable cargo-vet bytecodealliance/wasmtime#4444

Merged

alexcrichton mentioned this issue Jul 20, 2022

Automatic verification of untrusted contributors adding crates which are trusted by imported sources #271

Open

bholley added the p1 do now label Jul 20, 2022

This was referenced Jul 20, 2022

un-nest imports.lock, treat it as N SourceFiles #238

Closed

Store verbatim source in imports.lock #261

Closed

mystor mentioned this issue Aug 19, 2022

Update imports.lock less frequently #296

Merged

bholley closed this as completed in #296 Aug 22, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Imported audits may change `imports.lock` each time `cargo vet` is run #272

Imported audits may change `imports.lock` each time `cargo vet` is run #272

alexcrichton commented Jul 20, 2022

mystor commented Jul 20, 2022

bholley commented Jul 20, 2022

repi commented Aug 10, 2022 •

edited

Loading

mystor commented Aug 22, 2022

Imported audits may change imports.lock each time cargo vet is run #272

Imported audits may change imports.lock each time cargo vet is run #272

Comments

alexcrichton commented Jul 20, 2022

mystor commented Jul 20, 2022

bholley commented Jul 20, 2022

repi commented Aug 10, 2022 • edited Loading

mystor commented Aug 22, 2022

Imported audits may change `imports.lock` each time `cargo vet` is run #272

Imported audits may change `imports.lock` each time `cargo vet` is run #272

repi commented Aug 10, 2022 •

edited

Loading