new dependency resolver runs "forever" due to incompatible package versions #8922
Comments
Looks similar to #8893.
OK no, this is a separate problem. I'm not sure why the resolver doesn't quit right after it fails to find a marshmallow-sqlalchemy version that works.
Ah, I know why now. The resolver is not smart enough to tell that psycopg2-binary and scipy are irrelevant to the marshmallow-sqlalchemy conflict, and goes on to try them anyway, since it is theoretically possible they provide a solution to the conflict. The direct URL requirement feature (PEP 508) makes this even harder. I don't have a good solution to this problem, given how Python packaging works, unfortunately 😞
Good analysis. I agree this is a fundamental problem with Python packaging, and I also don't see an obvious way of fixing it. The frustration here is that the cases that go wrong are rare, but we have to assume the worst - so everyone pays a cost for a minority feature. Hmm, one possible way we might be able to address this - if we merge constraints like
That seems like a sneaky feature of the packaging system - I'm learning all kinds of new things about it today. @pfmoore, I'm not sure your suggestion quite works (at least not by itself), because the issue is that another repository might provide a URL-keyed version of the package in question.

Three thoughts on things that pip might do:
There might be some other things that could be cut as well; I didn't check on the combination of packages in this issue, but in our production app, when I ran in verbose mode, pip seemed to be repeatedly re-downloading the index of available versions for the same packages.
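The caching fix being discussed here can be sketched in a few lines. This is a toy memoization sketch, not pip's actual code; `find_versions`, its return value, and the loop are all made up for illustration:

```python
# Toy sketch of index caching (NOT pip's real code): memoize the
# "list available versions" lookup so repeated backtracking over the
# same package doesn't refetch the index every time.
from functools import lru_cache

fetch_count = 0  # stands in for the number of network round-trips

@lru_cache(maxsize=None)
def find_versions(package):
    """Hypothetical stand-in for downloading a package's index page."""
    global fetch_count
    fetch_count += 1
    return ("0.1", "0.2", "0.3")

# A backtracking resolver may revisit the same package many times...
for _ in range(100):
    find_versions("numpy")

print(fetch_count)  # 1: only the first call pays the "network" cost
```

With the cache in place, the cost of revisiting a package during backtracking drops from a network round-trip to a dictionary lookup.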
You hit the nail right on the head here 🙂 I think one simple thing pip can do is ask for help eagerly, even if it hasn't exhausted all possibilities. It is very rare that the user actually wants marshmallow-sqlalchemy 0.1.0 when they specify a large version range, and pip can stop after trying a few times and just abort, saying "hey, I haven't tried everything, but marshmallow-sqlalchemy is taking too much time, please tell me more specifically what you need". This goes back to the
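That "abort early and point at the culprit" idea could look roughly like the following. This is a hypothetical sketch, not pip's API; `ResolutionTooDeep`, the budget value, and the input structure are all invented for illustration:

```python
# Hypothetical sketch (not pip's real code) of giving up early with an
# actionable message, instead of exhausting every combination.
class ResolutionTooDeep(Exception):
    """Raised when too many versions of one package have been rejected."""

def check_budget(candidates_tried, budget=5):
    """candidates_tried maps package name -> number of rejected versions.

    Abort as soon as any single package has blown through the budget,
    naming the package so the user can pin it more tightly.
    """
    for name, tries in candidates_tried.items():
        if tries > budget:
            raise ResolutionTooDeep(
                f"Tried {tries} versions of {name!r} without success; "
                "please tell pip more specifically which version you need."
            )
    return "ok"

try:
    check_budget({"marshmallow-sqlalchemy": 42, "scipy": 1})
except ResolutionTooDeep as exc:
    print(exc)
```

The key design point is that the message names the problematic package, rather than just reporting that resolution failed.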
We can trim the search tree by remembering the causes of conflicts we've seen, and refusing to re-explore the same subtree. If alpha 1.0 conflicts with beta 3.0, then once we figure that out, we can record the conflict and never need to rediscover it (basically CDCL).
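The conflict-learning (CDCL-style) idea can be sketched in a few lines. The package names and data structures here are illustrative only, not pip's internals:

```python
# Sketch of CDCL-style conflict learning for a dependency resolver:
# remember combinations of pins known to clash, and prune any partial
# solution containing one of them without exploring it further.

# Learned conflicts: frozensets of (package, version) pins that clash.
learned_conflicts = set()

def record_conflict(*pins):
    """Remember that this combination of pins can never work."""
    learned_conflicts.add(frozenset(pins))

def is_doomed(partial_solution):
    """True if the partial solution contains any known conflict."""
    pins = set(partial_solution.items())
    return any(conflict <= pins for conflict in learned_conflicts)

# Once we discover alpha 1.0 clashes with beta 3.0...
record_conflict(("alpha", "1.0"), ("beta", "3.0"))

# ...any later candidate containing both is pruned immediately, no
# matter what other packages (scipy, psycopg2-binary, ...) it pins.
candidate = {"alpha": "1.0", "beta": "3.0", "scipy": "1.5.2"}
print(is_doomed(candidate))  # True: pruned without trying scipy versions
```

This is exactly what would stop the resolver from re-testing the marshmallow-sqlalchemy/sqlalchemy clash once per scipy and psycopg2-binary version.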
Maybe we could add some logging that just says "pip has tested N possibilities" and gets updated as the work goes on (like the progress bars, but without a total known in advance). Being able to collect some feedback from users (or from example test cases) on what typical numbers look like might help us work out a more reasonable metric...
Certainly anything that can be done to trim the search tree helps - and more generically than just this issue, if done correctly. I think adding logging about the number of tested possibilities would be helpful, but it's more useful for the user to know what pip's having trouble with - in this case, knowing that pip was having trouble finding an acceptable version of sqlalchemy, and that packages A, B, and C depend on it, would be super useful in troubleshooting.

If you were looking for a metric on when to give up, I'd suggest basing it on time spent, excluding downloads. Not sure whether it's 1 minute or 5 (I'd guess closer to 1), but by the time pip has spent 5 minutes of CPU time running the resolver (not counting time spent downloading files), your average user is going to get frustrated and give up.

One other thing I'd point out: this seems like something (like many other issues with the new resolver) where it would be super helpful to have some sort of cache of package requirements, at least at the PyPI level. If pip could download a list of all versions of a package and their requirements, that could save quite a bit of time, especially for packages like numpy that can run 20MB per version. At that point, one could even start pre-combining requirements (e.g. maybe marshmallow_sqlalchemy versions 0.1 through 0.14 require sqlalchemy >= 0.7 and versions 0.15+ require sqlalchemy >= 0.8; then instead of the resolver needing to consider 42 different specs, it can just consider 2). Given that most packages probably change requirements for only a fairly small fraction of their releases, that sort of thing could drastically reduce the search space.
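The pre-combining idea above can be sketched as follows. The version/requirement data here is illustrative, not marshmallow_sqlalchemy's real metadata:

```python
# Sketch of "pre-combining" requirements: collapse runs of versions
# that share identical requirements into one group, so the resolver
# considers a handful of groups instead of dozens of versions.
from itertools import groupby

# (version, requirements) pairs, ordered oldest to newest (made-up data)
versions = [
    ("0.13", ("sqlalchemy>=0.7",)),
    ("0.14", ("sqlalchemy>=0.7",)),
    ("0.15", ("sqlalchemy>=0.8",)),
    ("0.16", ("sqlalchemy>=0.8",)),
]

# Group consecutive versions whose requirements are identical.
grouped = [
    (reqs, [version for version, _ in group])
    for reqs, group in groupby(versions, key=lambda item: item[1])
]

for reqs, vers in grouped:
    print(f"versions {vers[0]}..{vers[-1]} all require {reqs}")

# Two groups to reason about instead of four individual versions.
print(len(grouped))  # 2
```

Since requirements rarely change between consecutive releases, the number of groups is usually far smaller than the number of versions, which is exactly the reduction described above.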
100% true. The problem is that we already have a cache of wheels, which basically means that getting wheel metadata is already done. And we have a cache of wheels built from sdists, which covers a lot of the rest. What's not obvious (to me, at least!) is why that's not enough - or to put it another way, given that we have those caches, where's the time being spent? We really need to instrument the code, and then fire some of the bigger/more problematic examples at it, but so far, no-one's had the time to do that 🙁
In fact, the caches do seem to be working. We get a distinct improvement from #8912, which basically eliminates a lot of unnecessary network calls to PyPI. I suspect that with #8912 included, what's left in this issue is just the fundamental problem that pip doesn't know it can give up without checking everything...
As far as I can tell, it's the network overhead of reaching out and getting back a "you have it in the cache already" response. :(
As I just noted in #8683 (comment), the team discussed this problem in last week's meeting and work is now in progress so that pip will inform the user when the resolver is doing a lot of backtracking. We'll probably be giving the user a short in-terminal error message and pointing to longer documentation, as we did with the conflict resolution documentation.
@bkurtz Thank you for the issue report! In #8683, #8346, #8975, and some related issues and pull requests, we addressed the fundamental problem of how much and in what cases pip backtracks, and a user-visible part of that: the informative messages and errors pip outputs during backtracking. In today's team meeting we decided that this is sufficient to close this issue. @pradyunsg will file a followup issue about the idea of printing package names better in error output. Thanks @bkurtz, and I hope you can also test the pip 20.3 beta release in case there are other issues for you to report!
The new resolver in pip iterates over all versions of all packages, even if a small number (e.g. 2) of the specifications are completely incompatible.
Tested with latest pip from master.
What did you want to do?
Something like (this is just a toy example; using different packages for our application):
pip install --use-feature=2020-resolver marshmallow-sqlalchemy sqlalchemy==0.6.0 psycopg2-binary scipy
All available versions of marshmallow-sqlalchemy require sqlalchemy of at least 0.7, so the first two specifications are incompatible and will reasonably quickly (~11 seconds on my machine) give a failure if specified by themselves. However, adding additional packages with un-pinned versions (e.g. scipy, which depends on numpy) introduces an exponentially increasing number of options. The new dependency resolver apparently feels the need to test all of these, even though it could reasonably fail after discovering that the first two specifications were incompatible. I didn't let it run to completion, but I suspect it would literally take days.
Additional information
We had something similar in our application (a pinned version of sqlalchemy, plus an unpinned version of another package that depended on it and had ~200 published versions), and it literally ran for 45 minutes with no output to the console (verbose mode did help). I ended up having to debug by iteratively removing things until I got down to the incompatible packages.