Cargo should understand distinctions between public/private dependencies #2064
Comments
The SAT solver reference was not a joke btw - the yum package manager uses a SAT solver to do its job.
Yeah I've been worried about this in the past as well. Right now the resolver is kinda just a pile of heuristics, and it always returns the first solution it finds (despite there being many, in situations such as this). It may be the case that we could avoid a full SAT solver by applying smarter heuristics for the time being (e.g. we'd need some sort of "scoring system" for any solution from a SAT solver anyway), and that may also keep the resolution stage speedy.
cc @wycats, curious on your thoughts about this
Yeah, the scoring system is what I'm most unsure about. It should prefer newer versions of libraries to older versions and try to minimize the number of copies of dependencies, but how those should interplay and factor into the solver seem like pretty nontrivial questions.
I also noticed that, with a setup like:

[package]
name = "foo"

[dependencies]
bar = "*"
libc = ">= 0.1, < 0.3"

[package]
name = "bar"

[dependencies]
libc = "0.1"

cargo will pull in multiple copies of libc at version 0.1 and 0.2, even though the requirements can be satisfied with just 0.1. (Same issue happens with a wildcard dependency.) Is that the same issue as here? It seems related but like a fix might be different.
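A minimal sketch of the check this example asks for, assuming versions are reduced to (major, minor) pairs and each requirement is a half-open range. The names Req and newest_shared are hypothetical and are not part of Cargo; the real resolver is considerably more involved.

```rust
/// A simplified version number: (major, minor) only; patch is ignored.
type Version = (u32, u32);

/// A half-open requirement `[min, max)` on acceptable versions.
/// Hypothetical type for illustration, not Cargo's own representation.
#[derive(Clone, Copy, Debug)]
struct Req {
    min: Version,
    max: Version,
}

impl Req {
    /// True when `v` falls inside the half-open range.
    fn contains(&self, v: Version) -> bool {
        self.min <= v && v < self.max
    }
}

/// Return the newest available version accepted by every requirement,
/// or `None` if no single copy can satisfy all of them.
fn newest_shared(available: &[Version], reqs: &[Req]) -> Option<Version> {
    available
        .iter()
        .copied()
        .filter(|v| reqs.iter().all(|r| r.contains(*v)))
        .max()
}
```

With libc 0.1 and 0.2 available, foo's requirement [0.1, 0.3) and bar's requirement [0.1, 0.2) both accept 0.1, so newest_shared yields Some((0, 1)) and a single copy would be enough. Only when it returns None would a second copy be unavoidable.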
Yeah I think it's basically the same issue as this.
I think this issue really limits the usefulness of dependencies in crates and makes life hard for library authors. Particularly, it makes it dangerous to use any types in your public API that are defined by another library. To use these types in your public API, users must add the other library as one of their dependencies. The hard part, however, is that if these dependencies are at different versions it won't compile. For example, my objc crate works fine with either libc 0.1 or 0.2. I described the dependency as such in my Cargo.toml and assumed that being more widely compliant would be best so that the dependency doesn't conflict with other libraries that require a specific version. However, what's happened is that cargo always chooses to satisfy that dependency with 0.2, even when the user is using libc 0.1, resulting in a conflict and compiler error. The result is that, currently, you're forced to choose a specific version for your crate and make the user use it, as well. If a dependency releases a new version, allowing use of it requires dropping support for the old version and publishing a breaking change, splitting your users and meaning the older versions stop receiving updates. This is frustrating for users, because they must ensure all their dependencies use the same version of a crate and they will have to update their dependencies more often as there are more frequent breaking changes. Not to get melodramatic, but I think easing these multiple version issues will be important for rust and cargo to be pleasant to use in the future. I'm hoping this issue is on the radar of folks smarter than me and is being thought about, sorry if this turned into a bit of a rant.
@SSheldon I believe people should be able to force the use of specific versions with |
@huonw yeah, it can be avoided if you're careful with your Cargo.lock. Though libraries are recommended to not commit that, so each new user would be required to do this. Maybe another thing that'd help here would be a way to say in your Cargo.toml, for example, "I want to use the objc crate and I want it to use libc 0.1"? Currently I don't know of any way to do that besides overriding the Cargo file of your dependency.
sass-sys specifies a constraint of "0.1", but sass-rs specifies "0", which allows version 0.2 and above. This shouldn't be a problem, because a "0.1" version would satisfy both constraints, but I think the reason this isn't working is because of rust-lang/cargo#2064, which might not get resolved anytime soon. Without this fix, building sass-rs fails because it tries to use libc 0.2, which has a different interface from the 0.1 that sass-sys used. This gets things building again for now, but we should look into updating to the 0.2 interface when we get a chance.
I've been playing with hooking the Z3 SMT solver up to rust recently and realized this bug exists in cargo, so I sketched out a little semver-solver using it. I know it's a bit like breaking a walnut with a tank (Z3 is a sort of intense 14MB C++ library) but since Z3 (as of v4.4.1) also contains a numerical optimizer it's really trivial to express a mixed constraint-solving / optimization problem with it. It's just a bit of plumbing, maintaining a mapping from packages to Z3 terms, package-versions to integers, and expressing version constraints as integer inequalities on the terms. Code's here: https://github.com/graydon/z3-rs/blob/master/tests/semver_tests.rs#L158-L267
That's awesome!
For what it's worth, the problem of supporting duplicates makes the exact constraints (and their weights) a little bit more complicated than the usual approach to this problem. In most versions of this problem, dependency "conflicts" simply fail to resolve, but we have another option available to us. That said, over time, I've increasingly come to believe that we were too willing to use that solution in Cargo, and we'll probably ratchet down our usage of allowed-duplicates somewhat.
Yeah, I think cargo casually permitting multiple copies of a dependency in a project is ... a dangerous default. It needs to be permitted (and we engineered the symbol mangling scheme to explicitly support it) but it's also relatively likely that any given instance of it is a bug. A potentially-very-subtle bug, like "two instances of a library that both think they are in charge of talking to OS subsystem / native library foo". I think it should probably default to not-working, and require opting in. Though it might also break the ecosystem today if you set that to be the case. I'd be interested to run the experiment on the existing package set, see how many of them can't build with their existing dependency declarations, and if so how much work it'd take / how many duplicate-opt-ins or dependency-range-adjustments to fix.
@graydon I think it's a little more subtle than that. "Shared" dependencies (dependencies whose types are used in other pub types, or which are transported to other crates through type inference) must not be duplicated. "Internal" dependencies (dependencies whose types are only used internally) can be safely duplicated (but you might not want to for other reasons, like binary size). We've talked some about how packages could say that a dependency was "internal", which would then permit duplication and error out if the type was actually made public. Even if we could perfectly detect this, it would still be important to explicitly opt in, because switching from an "internal" to "shared" dependency would be a breaking change to downstream crates.
Hm. I think either we're talking past each other, or I disagree with your interpretation here, or something's changed I don't understand in the problem space. As far as I know (and as it was initially designed), if crate X exports type T, two copies of X (with the same or different versions, doesn't matter) can both be imported independently in the same composite crate w/o any linkage or type system problems. Their types should be given independent, disjoint identities in any composition of them. The risk is only, as you say, in binary-size and (imo more seriously) violating uniqueness assumptions each copy of X might assume, such as "I'm the one that calls the init routine on a C library I'm linked against" or "I'm in charge of handling sigusr1" or whatever. If I'm wrong about this, something has .. significantly changed in the design. What part do you think goes wrong?
Here's a simple example of the problem I'm talking about:
Now let's say another crate ( It's possible for In this case, both |
Ok, that's .. definitely an instance where you'd want non-duplication. And I can think of others! I mentioned some above. But it's not a hard requirement of any of the parts in the system that "public types" implies "can't duplicate", right? I'm just trying to get clear on what you're arguing. Last time I discussed this (in the context of linkage), I was assured that cargo passed a disambiguating sort of name/ver/source triple of metadata to rustc for each crate in a project, such that two copies of the same symbolic-named crate wouldn't clash at metadata-load or link time. I hope that's still true, even if I think it should not be the default behavior. To be clear on what I'm arguing:
Yep. Cargo will do that so there won't be any linker errors, but you will run into compile errors like "Expected type |
@pornel I do believe so, yes, but it may be pretty non-trivial to implement (haven't thought too deeply)
I'm just chiming in because I recently discovered that cargo will silently compile multiple versions of the same crate into the same binary, and this feels weird and scary to me, and I see that a warning was proposed but rejected. I'm concerned that it doesn't seem like people are necessarily all on the same page with respect to @graydon's objection, which I (a total Rust outsider, I grant you) completely agree with -- it's not necessarily safe to link a crate twice EVEN if the linkage is private, and if it's not safe, the resulting bugs will be very subtle at best. I like @graydon's example of "two instances of a library that both think they are in charge of talking to OS subsystem / native library foo", which is disastrous EVEN if there's no type clash for the linker to detect because both copies are private. The best example I could come up with was a logging crate -- suppose that it by default opens a logfile whose name is, say, the name of the binary it's in, plus some other stuff (the string ".log", the current time, the pid, etc.). Then two private copies of the crate will open the same file, and the resulting behavior will be the worst kind of totally inexplicable, and probably someone will tear their hair out for weeks trying to debug it. The pull request for the rejected warning mentioned that the sys crates are "highlanders" -- they enforce non-duplication -- and that servo has a tool for catching duplicate crates. This seems like evidence to me that some important cases require avoiding this currently-unavoidable behavior.
So there's been some breakage this weekend with the release of openssl-sys 0.9.0, due to the requirement that native libraries are linked by only one crate. Any package that depends on git2 ^0.4 no longer builds out of the box due to two versions of openssl-sys being pulled in. Can we get a SAT solver already pretty please? :)
Some interesting discussion of these issues: https://research.swtch.com/version-sat
A question about the openssl-sys case -- do we actually need a SAT solver to fix it, or would ad-hoc adjustment to use the highest version of the dependency have been sufficient? (That is, did the package depending on the lower version specifically exclude the higher?) It sounds from @sfackler's link like other systems may perform this ad-hoc adjustment (and that it is in fact necessary to solve the non-NP-complete case correctly.)
Any progress on this in 2017? It just came up again for one of Ruma's users, who ended up with three different versions of a public dependency in their program, which resulted in confusing error messages. Does this issue fit into any of the 2017 roadmap issues? Better error messages from the compiler, which make it clearer that there are conflicts between versions, could go a long way in helping new users figure out what the problem is before there is a more formal solution to public/private dep management with Cargo.
I believe rust-lang/rfcs#1977 is being relatively actively worked on.
rust-lang/rfcs#1977 has been accepted, and there is now a tracking issue: rust-lang/rust#44663 which IMO replaces this issue, so I'm going to close this, but note that it will be implemented!
Current report
Dependencies for crates today can be separated into two categories: those that are implementation details and those whose types are present in the public API of the crate. Cargo does not support distinguishing these two cases today, but it can often be crucial for dependency management.
If a dependency is an implementation detail (i.e. no types are exposed) then it can be easily duplicated in the dependency graph without worry (other than concerns like binary size). In other words, it's safe for Cargo to have multiple semver-incompatible versions of this dependency because they won't talk with one another.
If a dependency is part of the public API, however, then it's critical that every crate which can share its types uses the same version of it. In other words, Cargo's logic of allowing multiple semver-incompatible versions is almost always guaranteed to go awry.
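A small illustration of why a duplicated public dependency goes awry, assuming two modules in one file can stand in for two semver-incompatible copies of the same crate. The names dep_v1, dep_v2, Handle, consume, and same_type are all hypothetical.

```rust
use std::any::TypeId;

// Hypothetical stand-ins for two semver-incompatible copies of one crate.
// Both define a type with the same name and the same layout.
mod dep_v1 {
    pub struct Handle(pub u32);
}
mod dep_v2 {
    pub struct Handle(pub u32);
}

/// A library function whose public signature exposes the dependency's type.
/// Passing a `dep_v2::Handle` here is rejected at compile time with a
/// mismatched-types error, even though the two definitions look identical.
fn consume(h: dep_v1::Handle) -> u32 {
    h.0
}

/// True only if the compiler considers the two `Handle` types to be the same.
fn same_type() -> bool {
    TypeId::of::<dep_v1::Handle>() == TypeId::of::<dep_v2::Handle>()
}
```

Here same_type() returns false: the two Handle types have disjoint identities, which is harmless while each copy stays internal and becomes a compile error as soon as one crate hands its Handle to a crate built against the other copy.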
If Cargo were to understand a distinction between these private/public (or internal/exposed) dependencies then it could solve problems like:
Original report
Cargo is overeager to pull in multiple copies of a crate
Say we have a crate with the following Cargo.toml:
r2d2_postgres version 0.9.3 depends on postgres version 0.10, while version 0.9.2 and lower depend on postgres version 0.9. Cargo would ideally pick r2d2_postgres version 0.9.2 and postgres version 0.9.6, but it instead picks r2d2_postgres version 0.9.3, which pulls in postgres version 0.10, as well as postgres version 0.9.6. This ends up totally blocking any use of these dependencies.
This issue rather significantly impedes crates' abilities to upgrade their dependencies.
It seems like Cargo's resolution logic should have the property that if there exists a resolved dependency graph with no more than one copy of each dependency, it should never pull in more than one copy of a dependency. Unfortunately, questions like "does such a resolved dependency graph exist" are NP-complete, but I suspect that it's tractable in practice (we don't expect crates to be randomly permuting their dependencies every minor version, for example).
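The property described above can be sketched as a plain backtracking search, used here in place of a real SAT solver. This is only an illustration under stated assumptions: Dep, Candidate, and resolve are hypothetical names, requirements are half-open ranges over (major, minor, patch) tuples, and nothing here reflects Cargo's actual resolver.

```rust
use std::collections::HashMap;

type Version = (u32, u32, u32);

/// Half-open requirement `[min, max)` on a named package (hypothetical type).
#[derive(Clone)]
struct Dep {
    name: &'static str,
    min: Version,
    max: Version,
}

/// One published version of a package together with its dependencies.
struct Candidate {
    name: &'static str,
    version: Version,
    deps: Vec<Dep>,
}

/// Depth-first search for an assignment with exactly one version per package.
/// `pending` holds requirements still to be checked; `chosen` is the partial
/// result and is restored to its entry state whenever the search fails.
fn resolve(
    index: &[Candidate],
    mut pending: Vec<Dep>,
    chosen: &mut HashMap<&'static str, Version>,
) -> bool {
    let dep = match pending.pop() {
        Some(dep) => dep,
        None => return true, // nothing left to satisfy
    };
    if let Some(&v) = chosen.get(dep.name) {
        // Already picked: the single copy must satisfy this requirement too.
        return dep.min <= v && v < dep.max && resolve(index, pending, chosen);
    }
    // Try every matching candidate, newest first, backtracking on failure.
    let mut matches: Vec<&Candidate> = index
        .iter()
        .filter(|c| c.name == dep.name && dep.min <= c.version && c.version < dep.max)
        .collect();
    matches.sort_by(|a, b| b.version.cmp(&a.version));
    for c in matches {
        chosen.insert(c.name, c.version);
        let mut next = pending.clone();
        next.extend(c.deps.iter().cloned());
        if resolve(index, next, chosen) {
            return true;
        }
        chosen.remove(c.name);
    }
    false
}
```

Assuming the root crate requires both r2d2_postgres and postgres in the range [0.9.0, 0.10.0), the search first tries r2d2_postgres 0.9.3, finds that its postgres 0.10 requirement cannot share a copy with the root's, backs up, and settles on r2d2_postgres 0.9.2 with postgres 0.9.6. The worst case is exponential, consistent with the NP-completeness noted above, but small graphs like this one resolve immediately.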
In the meantime before someone has a chance to integrate a SAT solver into Cargo, I think there are some things we can do to help users correct Cargo when it fails to resolve things "properly". For example, cargo update could warn whenever it has to pull in multiple copies of a crate, identifying the portion of the object graph that forced that to happen. This should provide users an easier starting point to poking the graph into shape with cargo update --precise.