Dependency resolver downloads each version of one of two requested packages, due to choosing version of a dependency before inspecting both requested packages. #8893
Comments
There is (currently) no such metadata anywhere, so pip used to just go ¯\_(ツ)_/¯ and declare the latest
What you see is a common problem in dependency resolution logic. It is easy to reverse-infer the best resolution strategy when you know beforehand what the result should be, but the resolver does not know it should choose that strategy, because the same strategy may produce worse results in a different situation (hindsight is always 20/20, as they say). It is quite possible that the current logic has room for improvement, but that should not be driven by anecdotes; it would require a relatively comprehensive survey covering many common dependency combinations. |
My main surprise is that the If it wasn't stopped by timeouts or errors, I suspect that it would eventually resolve to the expected solution, but it's taking a very long route to get there. |
The problem here is likely to be that the resolver starts by assuming the latest version of every package is the best choice. When that fails, the resolver backtracks, which is the point at which it decides to try an earlier version of the local package. In theory, this shouldn't matter, as the algorithm will eventually find a correct solution, but in practice the strategy used for searching the set of solutions drastically affects the number of steps required. It would probably be useful if we could include some heuristics to guide the search strategy - "pick earlier versions of dependencies before picking earlier versions of root packages" might be a reasonable heuristic, for example - but:
|
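To make the point about search strategy concrete, here is a toy backtracking search in Python. It is only a sketch and not pip's actual resolver: the INDEX data, the resolve function and the three jbafarm versions are invented for illustration (package names are borrowed from the original report), and dependencies are modelled as plain sets of acceptable versions rather than real specifiers.

```python
from typing import Dict, List, Optional, Set, Tuple

Version = Tuple[int, ...]
# package -> version -> {dependency: set of acceptable versions}
Index = Dict[str, Dict[Version, Dict[str, Set[Version]]]]

GDAL_OLD, GDAL_NEW = (2, 2, 0), (3, 1, 3)

# Hypothetical index contents, made up for this example.
INDEX: Index = {
    "jbafarm": {v: {"gdal": {GDAL_OLD}} for v in [(1, 10, 31), (1, 10, 30), (1, 10, 29)]},
    "litmus": {(1, 4, 2): {"gdal": {GDAL_OLD, GDAL_NEW}}},
    "gdal": {GDAL_NEW: {}, GDAL_OLD: {}},
}


def resolve(order: List[str], index: Index):
    """Depth-first search over `order`, trying the newest version first.

    Returns (pins, downloads); `downloads` lists each candidate whose
    dependency metadata had to be fetched.
    """
    downloads: List[Tuple[str, Version]] = []

    def fits(pkg: str, version: Version, pins: Dict[str, Version]) -> bool:
        # Constraints that already-pinned packages place on `pkg` are known
        # without downloading the candidate.
        for other, other_version in pins.items():
            wanted = index[other][other_version].get(pkg)
            if wanted is not None and version not in wanted:
                return False
        # The candidate's own requirements are only known after a download.
        if (pkg, version) not in downloads:
            downloads.append((pkg, version))
        for dep, wanted in index[pkg][version].items():
            if dep in pins and pins[dep] not in wanted:
                return False
        return True

    def search(position: int, pins: Dict[str, Version]) -> Optional[Dict[str, Version]]:
        if position == len(order):
            return dict(pins)
        pkg = order[position]
        for version in sorted(index[pkg], reverse=True):
            if fits(pkg, version, pins):
                found = search(position + 1, {**pins, pkg: version})
                if found is not None:
                    return found
        return None  # every candidate failed, so the caller backtracks

    return search(0, {}), downloads


# The shared dependency is pinned before the second root package is inspected.
pins_a, downloads_a = resolve(["litmus", "gdal", "jbafarm"], INDEX)
# Both root packages are inspected before the shared dependency is pinned.
pins_b, downloads_b = resolve(["litmus", "jbafarm", "gdal"], INDEX)

print(pins_a == pins_b, len(downloads_a), len(downloads_b))  # True 6 3
```

Both orderings reach the same pins, but the first one fetches every jbafarm version before it reconsiders gdal. With three toy versions that costs 6 downloads instead of 3, and the gap grows with each extra release of the root package.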
I hope this logic will be improved before the resolver becomes the default. Or will there be a way to disable it?
|
When the new resolver becomes the default, it will initially still be possible for users to opt back into the old resolver, so switching on the new resolver isn't a hard deadline. |
To add another data point to this, we ran into this issue testing the AWS CLI and Boto3 with the new resolver. If they're installed together with one dependency explicitly pinned, we end up making hundreds of calls to Warehouse to finish resolution. It sounds like there isn't a great way to resolve this with metadata right now, but are there any concerns around infrastructure with that kind of increase in call volume? Repro:
Also, in cases where build environments can't complete this resolution within their timeout, is the suggested workaround to disable the resolver for now? |
The simplest workaround would be to manually provide the extra dependency information to the resolver. Pin a package’s version if it is backtracked too many times. Using the top post’s example, since the user knows that every
|
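One way to hand the resolver a pin without editing the requirement list is pip's constraints file (the -c / --constraint option). The sketch below only writes such a file and builds the command line; it does not run pip. The helper name pinned_install_command is hypothetical, and pinning gdal to 2.2.0 is an assumption based on the top post's example, not something this comment spells out.

```python
import sys
import tempfile
from pathlib import Path
from typing import Dict, List


def pinned_install_command(
    requirements: List[str], pins: Dict[str, str], directory: Path
) -> List[str]:
    """Write `pins` to a constraints file and return the matching pip command."""
    constraints = directory / "constraints.txt"
    lines = [f"{name}=={version}" for name, version in sorted(pins.items())]
    constraints.write_text("\n".join(lines) + "\n", encoding="utf-8")
    # `-c` tells pip to apply the constraints without installing anything extra.
    return [sys.executable, "-m", "pip", "install", "-c", str(constraints), *requirements]


with tempfile.TemporaryDirectory() as tmp:
    # Assumed pin: the one gdal version that both root packages accept.
    command = pinned_install_command(["jbafarm", "litmus"], {"gdal": "2.2.0"}, Path(tmp))
    constraints_text = Path(command[5]).read_text(encoding="utf-8")

print(" ".join(command[1:]))
```

Passing the returned list to subprocess.run would perform the install; a private index would still need to be configured separately, for example via --extra-index-url.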
@DanielFEvans I made some changes to the resolver logic. Can you modify your pip installation locally and check whether the resolver can catch the conflict earlier? Code is at https://github.com/pypa/pip/pull/8924/files |
@uranusjr - the modified version behaves much better, thanks! It chooses the latest versions of Thanks for looking into it! One thing that looks a little funny is that it states Output from install with modified version
|
Yeah, this is a known UX problem caused by fitting the more complex resolver into the naive output format currently in use. pip does not currently know how to “merge” requirements, so the Collecting line is just arbitrarily picking one of them to display. It’s unfortunately difficult to change the output right now since it’s shared with the naive resolver. |
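For illustration only, "merging" requirements for display could look roughly like the sketch below. The merge_for_display helper is hypothetical and is not how pip builds its output; it simply joins the specifiers requested for the same project with commas, which is valid requirement syntax.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def merge_for_display(requirements: Iterable[Tuple[str, str]]) -> List[str]:
    """Combine the specifiers requested for each project into one string."""
    merged: Dict[str, List[str]] = defaultdict(list)
    for name, specifier in requirements:
        # Skip exact duplicates so the same specifier is not shown twice.
        if specifier not in merged[name]:
            merged[name].append(specifier)
    return [f"{name}{','.join(specs)}" for name, specs in merged.items()]


lines = merge_for_display([("gdal", ">=2.2.0"), ("gdal", "==2.2.0"), ("litmus", "")])
for line in lines:
    print("Collecting", line)
# Collecting gdal>=2.2.0,==2.2.0
# Collecting litmus
```

A display line built this way shows every specifier that applies to a project, instead of an arbitrary one of them.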
I'll note that we do plan on re-visiting pip's output (#4649) and getting "rid" of the old resolver will play a big role in the same. :) |
Now that #8924 is merged I'm marking this as closed, since per #8893 (comment) it looks like the original issue is resolved and we already know about the visual quirk. Thanks, everyone! |
What did you want to do?
In a fresh virtual environment, I wish to install two locally developed and hosted packages, jbafarm (v1.10.31) and litmus (v1.4.2). These packages have a single common dependency, gdal, which we also have internally hosted versions of. The latest version of jbafarm has the requirement gdal==2.2.0, while litmus has gdal>=2.2.0. gdal 2.2.0 is available on the package server, and is clearly (to the human observer) the only obvious resolution.
Output
When one tries to install a specific version of jbafarm, the install proceeds successfully - gdal==2.2.0 is picked up, satisfying both packages' requirements:
However, when the jbafarm requirement is left open, the resolver 'fails' by downloading each version of jbafarm individually. We've not yet seen what happens when it finally exhausts all versions - timeouts on CI builds, on the package server, or our own patience kick in!
The issue appears to be that Pip downloads litmus, picks up the gdal>=2.2.0 requirement, and settles on the latest available version (gdal==3.1.3). Only after this does it download the latest jbafarm and find that this does not satisfy that GDAL requirement. It then downloads the previous version of jbafarm, finds the same, and so on.
Additional information
We use an instance of pypi-server (version 1.3.2) to host these packages, and configure Pip to use it via --extra-index-url (either CLI or via Pip's config file).
I'm surprised that Pip resorts to downloading each jbafarm version in turn. I've never seen this behaviour before - does Pip usually use some metadata from PyPI to avoid all these downloads, which our local pypi-server instance is missing? Either way, I'm surprised that it doesn't check the requirements for (the latest versions of) all user-specified packages before settling on some resolutions.
It may be that Pip changes the gdal choice after inspecting every jbafarm version, but it'll take a very long time to achieve that. It doesn't seem like a good tactic to search older versions of a package to see if it ever allowed a higher version of some dependency; firstly, because it seems fairly unlikely that an older version of a package allowed a higher version of a dependency, and secondly, because as a user I expect it to install the latest possible version of the specified package, even if some dependencies aren't the latest available, rather than vice versa.
The issue still occurs on the current development version of Pip. Please say if you'd like some more verbose logging output - I didn't include it initially as pip -v is very verbose.