Dependency resolver downloads each version of one of two requested packages, due to choosing version of a dependency before inspecting both requested packages. #8893

Closed
DanielFEvans opened this issue Sep 21, 2020 · 12 comments

Comments

@DanielFEvans

DanielFEvans commented Sep 21, 2020

What did you want to do?

In a fresh virtual environment, I wish to install two locally developed and hosted packages, jbafarm (v1.10.31) and litmus (v1.4.2). These packages have a single common dependency, gdal, which we also have internally hosted versions of. The latest version of jbafarm has the requirement gdal==2.2.0, while litmus has gdal>=2.2.0. gdal 2.2.0 is available on the package server, and is clearly (to the human observer) the only obvious resolution.

Output

When one tries to install a specific version of jbafarm, the install proceeds successfully - gdal==2.2.0 is picked up, satisfying both packages' requirements:

$ pip install "jbafarm==1.10.31" litmus --use-feature=2020-resolver        
Looking in indexes: https://pypi.org/simple, http://<internal url>
Collecting jbafarm==1.10.31
  Downloading http://<internal url>/packages/jbafarm-1.10.31-cp36-cp36m-linux_x86_64.whl (822 kB)
     |████████████████████████████████| 822 kB 33.7 MB/s
Collecting gdal==2.2.0
  Downloading http://<internal url>/packages/GDAL-2.2.0-cp36-cp36m-manylinux1_x86_64.whl (23.7 MB)
     |████████████████████████████████| 23.7 MB 47.3 MB/s
Collecting litmus
  Downloading http://<internal url>/packages/litmus-1.4.2-py2.py3-none-any.whl (18 kB)

...

However, when the jbafarm requirement is left open, the resolver 'fails' by downloading each version of jbafarm individually. We've not yet seen what happens when it finally exhausts all versions - timeouts on CI builds, on the package server, or our own patience kick in!

The issue appears to be that Pip downloads litmus, picks up the gdal>=2.2.0 requirement, and settles on the latest available version (gdal==3.1.3). Only after this does it download the latest jbafarm, and find that its gdal==2.2.0 requirement conflicts with that choice. It then downloads the previous version of jbafarm, finds the same, and so on.

$ pip install jbafarm litmus --use-feature=2020-resolver
Looking in indexes: https://pypi.org/simple, http://<internal url>
Collecting litmus
  Downloading http://<internal url>/packages/litmus-1.4.2-py2.py3-none-any.whl (18 kB)
Requirement already satisfied: wheel in /home/jbanorthwest.co.uk/danielevans/venvs/farmmap3/lib/python3.6/site-packages (from litmus) (0.35.1)
Collecting gdal>=2.2.0
  Downloading http://<internal url>/packages/GDAL-3.1.3-cp36-cp36m-manylinux1_x86_64.whl (28.4 MB)
     |████████████████████████████████| 28.4 MB 35.5 MB/s
Collecting jbafarm
  Downloading http://<internal url>/packages/jbafarm-1.10.31-cp36-cp36m-linux_x86_64.whl (822 kB)
     |████████████████████████████████| 822 kB 41.4 MB/s
  Downloading http://<internal url>/packages/jbafarm-1.10.28-cp36-cp36m-linux_x86_64.whl (821 kB)
     |████████████████████████████████| 821 kB 39.3 MB/s
  Downloading http://<internal url>/packages/jbafarm-1.10.27-cp36-cp36m-linux_x86_64.whl (816 kB)
     |████████████████████████████████| 816 kB 28.3 MB/s
  Downloading http://<internal url>/packages/jbafarm-1.10.26-cp36-cp36m-linux_x86_64.whl (816 kB)
     |████████████████████████████████| 816 kB 29.7 MB/s
  Downloading http://<internal url>/packages/jbafarm-1.10.25-cp36-cp36m-linux_x86_64.whl (816 kB)
     |████████████████████████████████| 816 kB 38.8 MB/s
  Downloading http://<internal url>/packages/jbafarm-1.10.24-cp36-cp36m-linux_x86_64.whl (816 kB)
     |████████████████████████████████| 816 kB 27.5 MB/s
  Downloading http://<internal url>/packages/jbafarm-1.10.23-cp36-cp36m-linux_x86_64.whl (816 kB)
     |████████████████████████████████| 816 kB 37.7 MB/s

[Continues until timeout or cancelled]

Additional information

We use an instance of pypi-server (version 1.3.2) to host these packages, and configure Pip to use it via --extra-index-url (either CLI or via Pip's config file).

I'm surprised that Pip resorts to downloading each jbafarm version in turn. I've never seen this behaviour before - does Pip usually use some metadata from PyPI to avoid all these downloads, which our local pypi-server instance is missing? Either way, I'm surprised that it doesn't check the requirements for (the latest versions of) all user-specified packages before settling on some resolutions.

It may be that Pip changes the gdal choice after inspecting every jbafarm version, but it'll take a very long time to achieve that. It doesn't seem like a good tactic to search older versions of a package to see if it ever allowed a higher version of some dependency; firstly, because it seems fairly unlikely that an older version of a package allowed a higher version of a dependency, and secondly, because as a user I expect it to install the latest possible version of the specified package, even if some dependencies aren't the latest available, rather than vice versa.

The issue still occurs on the current development version of Pip. Please say if you'd like some more verbose logging output - I didn't include it initially as pip -v is very verbose.

@uranusjr
Member

uranusjr commented Sep 21, 2020

does Pip usually use some metadata from PyPI to avoid all these downloads, which our local pypi-server instance is missing?

No such metadata (currently) exists anywhere, so pip used to just go ¯\_(ツ)_/¯ and declare the latest jbafarm is good enough. The 2020-resolver feature is actually trying to do proper resolution in this situation for the first time, and downloading previous package versions is the only way to do it (currently).

What you see is a common problem in dependency resolution logic. It is easy to reverse-infer the best resolution strategy when you know what the result should be beforehand, but the resolver does not know it should choose that strategy, because the same strategy may produce worse results in a different situation (hindsight is always 20/20, as they say). It is quite possible that the current logic has room for improvement, but that should not be driven by anecdotes; it would require a relatively comprehensive survey including many common dependency combinations.

@DanielFEvans
Author

My main surprise is that the gdal package version is chosen before inspecting both latest versions of the requested packages. My expectation is that Pip's default behaviour would be to install the latest versions of both requested packages, assuming they're compatible, as they are. Are there 'rules' for such cases noted down anywhere to reference?

If it wasn't stopped by timeouts or errors, I suspect that it would eventually resolve to the expected solution, but it's taking a very long route to get there.

@pfmoore
Member

pfmoore commented Sep 21, 2020

The problem here is likely to be that the resolver starts by assuming the latest version of every package is the best choice. When that fails, the resolver backtracks, which is the point at which it decides to try an earlier version of the local package. In theory, this shouldn't matter, as the algorithm will eventually find a correct solution, but in practice the strategy used for searching the set of solutions affects the number of steps required drastically.

It would probably be useful if we could include some heuristics to guide the search strategy - "pick earlier versions of dependencies before picking earlier versions of root packages" might be a reasonable heuristic, for example - but:

  1. I'm not sure the resolver currently has a way to add such heuristics, so we may need to change resolvelib to handle this.
  2. It's possible that a heuristic which is good for one case could be very bad for another, so we need to be careful not to just move the problem somewhere else.
  3. Once we start to get into heuristics, the question of whether to allow the user to control what heuristics are used arises, and we get into UI questions.
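
The effect of search order described above can be illustrated with a toy backtracking search. This is only a sketch, not pip's or resolvelib's actual algorithm: the `INDEX` data, the `resolve` function and the rule that reading a candidate's dependencies costs one download are all assumptions made for illustration, and it assumes every jbafarm release listed in the original log pins gdal==2.2.0.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, ...]
Spec = Tuple[str, Version]  # (operator, version), e.g. ("==", (2, 2, 0))


def v(text: str) -> Version:
    """Parse '1.10.31' into (1, 10, 31) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version: Version, spec: Spec) -> bool:
    op, target = spec
    if op == "==":
        return version == target
    if op == ">=":
        return version >= target
    raise ValueError(f"unsupported operator: {op}")


JBAFARM_VERSIONS = ["1.10.31", "1.10.28", "1.10.27", "1.10.26",
                    "1.10.25", "1.10.24", "1.10.23"]

# Hypothetical index: package -> version -> {dependency: spec}.
INDEX: Dict[str, Dict[Version, Dict[str, Spec]]] = {
    "gdal": {v("3.1.3"): {}, v("2.2.0"): {}},
    "litmus": {v("1.4.2"): {"gdal": (">=", v("2.2.0"))}},
    # Assumption: every listed jbafarm release pins gdal==2.2.0.
    "jbafarm": {v(s): {"gdal": ("==", v("2.2.0"))} for s in JBAFARM_VERSIONS},
}


def resolve(order: List[str]):
    """Pin packages in the given order, newest candidates first."""
    downloads: List[Tuple[str, Version]] = []

    def search(i: int, pins: Dict[str, Version],
               constraints: Dict[str, List[Spec]]) -> Optional[Dict[str, Version]]:
        if i == len(order):
            return pins
        name = order[i]
        for version in sorted(INDEX[name], reverse=True):
            # Rejecting a candidate on its version alone needs no download.
            if not all(satisfies(version, s) for s in constraints.get(name, [])):
                continue
            downloads.append((name, version))  # reading dependencies costs a download
            deps = INDEX[name][version]
            if any(d in pins and not satisfies(pins[d], s) for d, s in deps.items()):
                continue  # conflicts with an earlier pin: try an older version
            merged = {k: list(specs) for k, specs in constraints.items()}
            for d, s in deps.items():
                merged.setdefault(d, []).append(s)
            found = search(i + 1, {**pins, name: version}, merged)
            if found is not None:
                return found
        return None  # candidates exhausted: the caller backtracks

    return search(0, {}, {}), downloads


def count(log: List[Tuple[str, Version]], name: str) -> int:
    return sum(1 for pkg, _ in log if pkg == name)


# Dependency pinned before the second root package (the order seen in the log).
deps_first, deps_first_log = resolve(["litmus", "gdal", "jbafarm"])
# Both root packages inspected before their shared dependency is pinned.
roots_first, roots_first_log = resolve(["litmus", "jbafarm", "gdal"])

print("jbafarm downloads, gdal pinned first:", count(deps_first_log, "jbafarm"))
print("jbafarm downloads, roots pinned first:", count(roots_first_log, "jbafarm"))
```

Both orders end at the same pins (jbafarm 1.10.31, litmus 1.4.2, gdal 2.2.0), but in this toy model the first order reads every jbafarm release before it reconsiders gdal, while the second reads only the latest one.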

@jontwo

jontwo commented Sep 21, 2020

I hope this logic will be improved before the resolver becomes the default. Or will there be a way to disable it?

We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

@pfmoore
Member

pfmoore commented Sep 21, 2020

I hope this logic will be improved before the resolver becomes the default.

When the new resolver becomes the default, it will initially still be possible for users to opt back into the old resolver, so switching on the new resolver isn't a hard deadline.

@nateprewitt
Member

To add another data point to this, we ran into this issue testing the AWS CLI and Boto3 with the new resolver. If they're installed together with one dependency explicitly pinned, we end up making hundreds of calls to warehouse to finish resolution. It sounds like there isn't a great way to resolve this with metadata right now, but are there any concerns around infrastructure with that kind of call volume increase?

Repro:
pip install --use-feature=2020-resolver awscli==1.17.0 'boto3<1.16'

Also in cases where build environments can't complete this resolution within their timeout lifespan, is the suggested workaround to disable the resolver for now?

@uranusjr
Member

The simplest workaround would be to manually provide the extra dependency information to the resolver. If the resolver backtracks on a package too many times, pin or constrain that package’s version. Using the top post’s example, since the user knows that every gdal later than 2.2.0 is incompatible, this should make the resolution fast:

pip install jbafarm litmus 'gdal<=2.2.0' --use-feature=2020-resolver
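
To see why the extra requirement helps, here is a minimal sketch of how it narrows the gdal candidates before any jbafarm release has to be inspected. The `candidates` helper is hypothetical and handles only the three operators used in this thread; it is not how pip evaluates specifiers internally.

```python
import operator
from typing import List, Tuple

OPS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def parse(text: str) -> Tuple[int, ...]:
    """Parse '2.2.0' into (2, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def candidates(versions: List[str], specs: List[Tuple[str, str]]) -> List[str]:
    """Keep the versions that satisfy every (operator, version) pair."""
    return [
        ver for ver in versions
        if all(OPS[op](parse(ver), parse(target)) for op, target in specs)
    ]


GDAL_VERSIONS = ["3.1.3", "2.2.0"]  # only the releases named in this thread

# litmus alone leaves gdal 3.1.3 as the newest (and first tried) candidate.
without_hint = candidates(GDAL_VERSIONS, [(">=", "2.2.0")])
# Adding the user's 'gdal<=2.2.0' removes it before any backtracking starts.
with_hint = candidates(GDAL_VERSIONS, [(">=", "2.2.0"), ("<=", "2.2.0")])

print(without_hint, with_hint)
```

With the hint in place only gdal 2.2.0 remains, so the latest jbafarm is compatible with the first gdal candidate tried.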

@uranusjr
Member

uranusjr commented Sep 27, 2020

@DanielFEvans I made some changes to the resolver logic. Can you modify your pip installation locally and check whether the resolver can catch the conflict earlier?

Code is at https://github.com/pypa/pip/pull/8924/files

@DanielFEvans
Author

@uranusjr - the modified version behaves much better, thanks! It chooses the latest versions of jbafarm and litmus, and chooses the compatible version of gdal. The overall resolve/install time is normal, and I don't see pip churning through package downloads anywhere.

Thanks for looking into it!

One thing that looks a little funny is that it states Collecting gdal>=2.2.0 when it has (presumably) resolved it to gdal==2.2.0, but that's almost certainly an entirely unrelated visual quirk (I presume it's quoting the first reference in any requirements file, not the actual resolved requirement).

Output from install with modified version
$ pip install jbafarm litmus --use-feature=2020-resolver
Looking in indexes: https://pypi.org/simple, http://<internal url>
Collecting litmus
  Downloading http://<internal url>/packages/litmus-1.4.2-py2.py3-none-any.whl (18 kB)
Collecting jbafarm
  Downloading http://<internal url>/packages/jbafarm-1.10.34-cp36-cp36m-linux_x86_64.whl (822 kB)
     |████████████████████████████████| 822 kB 27.8 MB/s
Collecting numpy
  Using cached numpy-1.16.4-cp36-cp36m-manylinux1_x86_64.whl (17.3 MB)
Collecting gdal>=2.2.0
  Downloading http://<internal url>/packages/GDAL-2.2.0-cp36-cp36m-manylinux1_x86_64.whl (23.7 MB)
     |████████████████████████████████| 23.7 MB 13.3 MB/s
Collecting scipy==1.2.1
  Using cached scipy-1.2.1-cp36-cp36m-manylinux1_x86_64.whl (24.8 MB)
[...]

@uranusjr
Member

One thing that looks a little funny is that it states Collecting gdal>=2.2.0 when it has (presumably) resolved it to gdal==2.2.0, but that's almost certainly an entirely unrelated visual quirk (I presume it's quoting the first reference in any requirements file, not the actual resolved requirement).

Yeah, this is a known UX problem caused by fitting the more complex resolver into the naive output format currently in use. pip does not currently know how to “merge” requirements, so the Collecting line is just arbitrarily picking one of them to display. It’s unfortunately difficult to change the output right now since it’s shared with the naive resolver.

@pradyunsg
Member

I'll note that we do plan on re-visiting pip's output (#4649) and getting "rid" of the old resolver will play a big role in the same. :)

@brainwane
Contributor

Now that #8924 is merged I'm marking this as closed, since per #8893 (comment) it looks like the original issue is resolved and we already know about the visual quirk. Thanks, everyone!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 9, 2021

7 participants