Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn when using multiple versions of a package #2753

Closed
wants to merge 7 commits into from

Conversation

schuster
Copy link

Here's a possible change to have Cargo warn the user when it's going to use multiple versions of the same package (see #2064 (comment)).

It probably needs a number of changes before being merged, so I'm looking for feedback. Here are a few particular issues that should be addressed:

  • Should two crates with the same name from a different source be considered "the same" crate for the purpose of this warning? Is that even possible?
  • I ended up sorting the list of versions just to get deterministic test results, which seems like a silly workaround. Is there a better solution?
  • As I understand the code, Cargo will display the warning every time the user runs build, fetch, update, or generate-lockfile. That might be too often, but what do others think? Should it change to only warn if the resolution graph changes? If so, is there an easy way to check for that? (admittedly, I haven't looked very hard for such a check yet).

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @wycats (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.


if !package_version_map.contains_key(&key) {
package_version_map.insert(key, vec![package_id.version()]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may actually be able to just be:

package_version_map.entry(key).or_insert(Vec::new()).push(package_id.version())

@alexcrichton
Copy link
Member

Thanks for the PR @schuster! I think we'll definitely want to consider the UI here as I think that's pretty important. I agree that this should at most only ever warn when the dependency graph changes, although detecting that can indeed get interesting.

Even then, though, I'm somewhat wary of doing this by default. There are numerous situations where multiple versions are fine and a warning would otherwise just be annoying, so maybe this should be an opt-in of some form?

For your other questions, though:

Should two crates with the same name from a different source be considered "the same" crate for the purpose of this warning? Is that even possible?

Nah they should stay distinct, you can for example have the same crate name from two different git repositories, which may be pretty unlikely to be the same crate or very likely to intentionally get pulled in twice.

I ended up sorting the list of versions just to get deterministic test results, which seems like a silly workaround. Is there a better solution?

Perhaps a BTreeMap which stores keys sorted?

@schuster
Copy link
Author

schuster commented Jun 6, 2016

Sorry for the slow response - been a busy week and weekend. I agree that some kind of opt-in makes sense here. How about a Cargo config option? The only concern there is whether such an option would still make sense in a hypothetical future where Cargo uses some kind of SAT/ILP solver to solve this issue in most cases, but I could see that even then one might want this option.

Thanks for the code suggestion to update package_version_map. That does indeed work, and is much shorter.

I'll take a look at the BTreeMap option and the config option above either later this week or weekend.

@alexcrichton
Copy link
Member

I feel like we may want to try to get this working without a config option because once we require that it's probably gonna see far less usage than it would otherwise. If we only warn when lock files change that seems somewhat reasonable (maybe). Although that does have the downside that if you check out a new library repo you're likely to almost always get warnings, which kinda isn't great...

I guess I kinda have two minds on this, on one hand having a warning that's off by default isn't too useful, but if it's on it's going to give too many false positives. Typically I err on the side of omitting a warning with a lack of a better solution, but others may feel differently!

@schuster schuster force-pushed the multi-version-warning branch from f11b0c5 to df6ea2e Compare June 15, 2016 00:43
@schuster
Copy link
Author

Ok, just added some logic to print a warning only if the current resolution has multiple versions of a given package, and the previous resolution only had one or zero. The previous resolution is based on the one passed to ops::resolve_with_previous, which is only None in the case of cargo generate-lockfile, and in that case giving another warning might make sense since it appears a "fresh" lockfile is desired. That part of the code could easily change, though.

I've included a few unit tests, but the change probably bears testing out in practice, too, to see if the number of warnings is about right.

I still need to look at the BTreeMaps for deterministic testing; that's next on the list.

@schuster
Copy link
Author

It also might be necessary to see if this has a significant impact on performance. The code still runs and walks the dependency tree every time a resolution is done, even if it doesn't print any warnings. My hunch is that the rest of the resolution operation will dwarf the running time for the warning computation, but it's probably worth checking.

@alexcrichton
Copy link
Member

Nice! This is looking great to me, thanks @schuster!

@alexcrichton
Copy link
Member

And to elaborate on that, tests look good to me (modulo them passing), and yeah I wouldn't worry about the speed here as Cargo will basically always scream through this.

@schuster
Copy link
Author

Does this need anything more at this point? Code review/cleanup, squashing the commits, or anything else?

@alexcrichton
Copy link
Member

Thanks! Did the tests disappear though? Or if not, it'd be good to have some tests as well :)

cc @rust-lang/tools, thoughts on this? To me it seems reasonable as it's a one-time warning that goes away quickly, and you typically do want to be warned about it.
cc @larsbergstrom, not sure if this would be of much interest to Servo, I believe there's already a check there, right?

@brson
Copy link
Contributor

brson commented Jun 23, 2016

I'm mildly worried about all the false positives this will cause - it's not wrong to use multiple versions of a crate (it's only wrong it it causes problems...). If the user does see this warning what is their recourse? Do we have a tool that will help them figure out what to change? I suspect that without further guidance this could be quite frustrating and users will train themselves to ignore it.

One thing this warning might do is explain the chain of dependencies that lead to each version.

@schuster
Copy link
Author

@alexcrichton I still see the tests in the committed files for this PR - are they not running or something? Or are there additional tests you'd like to see?

@schuster
Copy link
Author

@brson I can look into the chain of dependencies when I have time (likely not until after this weekend at least).

@alexcrichton
Copy link
Member

Whoa weird maybe I was looking at the wrong diff... Tests look good to me.

@brson yeah I agree that we don't want too many false positives here (e.g. my stance on lints). This seems somewhat better to me as it's only seen once (when a lock file changes) rather than every build, but we should still tread carefully.

Along the lines of the discussion with public/private dependencies, it's somewhat wrong to have multiple versions of a public dependency (e.g. shared between two crates) but it should be fine to have multiple versions of a private dependency (as they aren't reachable). Right now Cargo can't differentiate between these two, however.

I do think it's a good point that we may want a chained explanation why the crates are brought in, but I'd still be somewhat on the fence about this warning. Maybe something off-by-default but a config option could turn on if we really want, but even that seems like overkill...

@larsbergstrom
Copy link

We handle this in servo via a python script that looks for duplicates:
https://github.com/servo/servo/blob/master/python/tidy/servo_tidy/tidy.py#L248-L294

We've also added an exceptions filter as, as you mentioned above, sometimes it is either innocuous or required to have extra copies of a crate. While this feature is less important now that sys crates have the "highlander rule" in place, for anyone targeting mobile or embedded platforms that care about code side this is still going to be a concern. The Firefox Android team, for example, tries to shave off a few KB here and there wherever possible because it has such a huge impact on user experience during upgrades, so for projects with similar concerns I'd expect them to also consider duplicate packages less than desirable.

That said, having to opt into it (via cargo.toml, commandline arguments, or both) seems perfectly reasonable to me.

@schuster
Copy link
Author

schuster commented Jul 5, 2016

Just added some code to report the dependency chain for each version rather than just list the versions. The output format was just my first thought, so feel free to suggest something better.

Any further thoughts on making a config option for this, or otherwise adding some opt-in/opt-out functionality?

@bors
Copy link
Contributor

bors commented Jul 5, 2016

☔ The latest upstream changes (presumably #2759) made this pull request unmergeable. Please resolve the merge conflicts.

@schuster schuster force-pushed the multi-version-warning branch from fbb6281 to 7ee0a85 Compare July 6, 2016 13:34
@alexcrichton
Copy link
Member

The tools team got a chance to discuss this PR yesterday during triage, but our feeling was that we probably don't want to merge this into Cargo proper just yet. The experience with Servo shows that we probably at least want some more configuration of at least having a whitelist, and at that point this is probably best done as an external cargo subcommand for now.

I wonder if perhaps cargo outdated or something like cargo clippy would be a good place to put this for now? Basically another subcommand which already takes a look and analyzes the dependency graph would be a great place to put this!

I'm gonna close this for now though as we don't want to merge it into Cargo quite just yet, but thanks regardless for the PR!

@schuster
Copy link
Author

No problem, that makes sense. I may look around and see if there's some kind of subcommand solution that makes sense (although I may just not get around to it - people are welcome to take the code from this PR and put it somewhere useful if they like).

In the meantime, would it still be worth my time to play around with using an ILP solver to resolve dependencies? I thought if I could at least get something running and post the code, it makes it easier for anyone to experiment with different "scoring" functions for the solver to maximize.

@alexcrichton
Copy link
Member

I'm personally pretty wary of throwing a constraint solver at Cargo because it's not really solving the root of the problem, which is that if you have two crates which share different versions of a dependency, Cargo should give an error for that rather than deferring to the compiler. A constraint solver would fix some cases today which are resolving to obviously the wrong choice when there's another correct choice that wasn't hit yet, but there will still be cases where Cargo resolves to the wrong choice (legitimately) and then the compile fails with a super obscure error.

So along those lines I'd probably hold off for now, but I'll keep the relevant issue posted!

@gwillen
Copy link

gwillen commented Aug 26, 2016

Is there a documented way to enforce a "highlander rule" for a crate? Sorry for necroposting on this pull request, but I just discovered this multiple-linking behavior and find it creepy and weird. ;-) I'm kind of surprised there isn't more discussion of it. I do see the bug about public/private imports, but I don't think that necessarily solves all the problems.

@alexcrichton
Copy link
Member

@gwillen not currently, no. A crate can't say that there should be only one version, but that's covered by #73

@gwillen
Copy link

gwillen commented Aug 26, 2016

@alexcrichton Thanks. I've left a comment (#2064 (comment)) on the pub-priv import issue, explaining my concerns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants