Max Backtracking Option and print out current failure causes #10417
Comments
In principle, I'm +1 on this. However, backtracking is done a lot in normal processing (as I imagine you are aware) and it will be extremely hard to give good advice on what would be a "reasonable" number to set the backtracking parameter to. I think the hardest part of doing this is likely to be to document it in a way that will (a) help users not familiar with the backtracking algorithm when they hit complex resolution issues, and (b) improve the quality of issue reports we receive when users cannot work out what's going wrong.
I agree; my thought here is that it's "better than nothing", which is the current situation for some users. For example, in #10373 pip is not helping the user much: it backtracks seemingly forever with no useful messages as to why. Personally, I could spend an hour or two on the issue, inject print statements and breakpoints into pip's codebase, and eventually figure out how to reorder the requirements so that they resolve more quickly, but this doesn't help the general user. I feel that "--max-backtracks 0" or a "--no-backtrack" flag in particular would be extremely useful for projects which ship very tight requirements, such as Airflow, so they can put it into their CI/CD pipeline and resolve the requirements file as issues come up. It may also help tools that want to build on top of pip but do not want pip to do the resolution. My idea in suggesting "--max-backtracks" rather than just "--no-backtrack" was to give users flexibility in their approach. It would be difficult for pip to suggest a "reasonable" number, but a motivated enough user could figure out their own, e.g. they may determine that under normal circumstances their project needs fewer than N backtracks, and that if there are more than 10*N backtracks something has gone terribly wrong. Another idea I had to improve the information given to users and the quality of bug reports to pip is to log the failure causes when they change, similar to how raising
There’s another issue on this somewhere and I mused about something similar, but I couldn’t find it now. There are several “strategies” to tweak for the resolution. Never backtracking is one (and definitely useful; the problem is mainly how to expose it in a sensible way). Others include:
The three are orthogonal, so we would need a way to combine them into one option (say
I agree that including something that has the effect of no backtracking in the Use case 2 though is still not covered: you have a limited amount of CPU time in which to let pip run, and you would rather fail than spend more. Through your own infrastructure you can implement some kind of kill signal for the pip process, but currently when you do that the logs give you nothing useful about why pip unexpectedly spent so long backtracking. Perhaps, though, sarugaku/resolvelib#81 and #10258 are sufficient to let the user know what went wrong in that time to cause pip to backtrack so much. I was meaning to test them against the list of known problematic requirements I have been working on, but I realized I don't actually know how to test something that requires both
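The "kill signal from your own infrastructure" approach described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `run_with_time_limit` and `pip_install_command` are hypothetical helper names, and the limit is wall-clock time rather than CPU time. It also shows the limitation raised in the comment: when the limit is hit, the caller learns nothing about why pip was backtracking.

```python
import subprocess
import sys


def pip_install_command(requirements_path):
    """Build a pip command line for the current interpreter (hypothetical helper)."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements_path]


def run_with_time_limit(command, timeout_seconds):
    """Run a command; return its exit code, or None if the time limit was hit.

    subprocess.run kills the child process before raising TimeoutExpired,
    so nothing is left running after a timeout.
    """
    try:
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The only information available here is that time ran out.
        return None
    return completed.returncode
```

A CI job might call `run_with_time_limit(pip_install_command("requirements.txt"), 600)` and fail the build when the result is `None`.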
Yeah, it does not cover use case 2. But IMO it’s a lost cause to even try to do that. Installing any kind of requirement that does not do strict pinning (
I just hit an issue where, due to a bad git merge, a lockfile had incompatible requirements. Pip was backtracking on several of my CI runners for 4+ hours. @uranusjr Is backtracking in a correct lockfile always an indication of an error? I.e. does pip ever backtrack when given a complete set of compatible requirements, where all of them have If so, it would be nice to have something that makes
@MrMino do you have a reproducible example? I have an open PR that significantly improves the speed of backtracking when there are lots of possible causes: #12459. Every example I've tried so far has sped up backtracking from hours to minutes; it would be good to see if your issue would also be solved by this.
@notatallshaw I don't think I can share one, sorry. Even if I could, the lockfile in question contains hundreds of packages that aren't available on PyPI, so I doubt this would be of any value to you. Edit:
Pip has some known performance issues for very large pinned files; there is a different open PR which should significantly reduce the time spent in this scenario: #12453
@notatallshaw I'm unable to reproduce my original issue on my local VM, so I can't tell if your patch speeds things up. I cannot install your version on my runner either, so I have no results. Sorry. The root cause of my issue might be connected to the CI cache that I'm using, or a rate limit imposed by my index. Not sure. Intuitively, judging by the description of your MR, I don't expect much difference in my case. I would expect
The problem at the moment is that the backtrack can choose causes which don't really conflict. By preferring conflicts, resolvelib can much more quickly prove that it's impossible to resolve. But certainly, this won't help all situations where pip can get stuck backtracking. It's why I ask for reproducible examples; unfortunately, situations like yours are pretty common in that there's no public way to reproduce. But thanks for trying.
What's the problem this feature will solve?
When a user has a complex requirements set it's possible that the backtracking can take hours / days / years to find a solution.
In a large requirements list it can be unclear why pip is backtracking, as the conflict may sit many layers deep in dependencies the user does not know about.
Adding a max backtracking option attempts to solve 2 use cases:
Describe the solution you'd like
Add self._backtrack_count to the Resolution object
Increment self._backtrack_count and check if it exceeds the maximum backtrack count
raise ResolutionImpossible(causes) so the user can inspect the current error that was causing the backtracking
Additional context
This requires adding to the pip CLI, updating resolvelib, adding many test cases, and updating the documentation. I currently don't have a strong enough understanding of pip's code base to implement all of this. But if no one else works on this I will try and eventually submit the relevant PRs.
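The counter proposed above can be sketched as follows. This is a minimal standalone illustration, not pip's or resolvelib's actual code: `BacktrackLimiter`, `record_backtrack`, and `max_backtracks` are hypothetical names, and `ResolutionImpossible` here is a local stand-in for the exception named in the proposal.

```python
class ResolutionImpossible(Exception):
    """Local stand-in for the exception named in the proposal."""

    def __init__(self, causes):
        super().__init__(causes)
        self.causes = causes


class BacktrackLimiter:
    """Hypothetical helper: counts backtracks and enforces a maximum."""

    def __init__(self, max_backtracks=None):
        # None means no limit, i.e. unbounded backtracking.
        self.max_backtracks = max_backtracks
        self._backtrack_count = 0

    def record_backtrack(self, causes):
        """Call once per backtrack, passing the current failure causes."""
        self._backtrack_count += 1
        if (
            self.max_backtracks is not None
            and self._backtrack_count > self.max_backtracks
        ):
            # Surface the causes so the user can see what triggered the backtrack.
            raise ResolutionImpossible(causes)


limiter = BacktrackLimiter(max_backtracks=2)
limiter.record_backtrack(["a conflicts with b"])
limiter.record_backtrack(["a conflicts with b"])
try:
    limiter.record_backtrack(["c conflicts with d"])
except ResolutionImpossible as exc:
    print("gave up; last failure causes:", exc.causes)
```

With `max_backtracks=0` the first backtrack raises immediately, which corresponds to the "--max-backtracks 0" / "--no-backtrack" behaviour discussed in the comments.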