-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
An example of resolution failure while it does have a valid resolution. #134
Comments
I can reproduce both steps on Pip 23.2.1 Python 3.9 on Linux. The resolution impossible gives the error:
Which is certainly weird, here is the debug output: https://gist.github.com/notatallshaw/f39c7986aa81fb05ee3a710787484551 I took a quick look now to see if the issue is related to specifiers rather than resolvelib, but I didn't get very far. I can investigate more in a couple of days if someone hasn't figured it out by then. |
Actually it falls into this branch: resolvelib/src/resolvelib/resolvers.py Lines 317 to 318 in 77b256c
Where the causes are related to To fix this, we should considered a less hard fail for this case, for example, when no state is found, just pop the last pin. |
I tested this on Pip directly by creating a copy the States ahead of time and if backjumping fails falling back to the old backtracking by restoring the state and doing the backtrack. It appears to fix this situation, it can correctly resolve:
But I do not have the knowledge of how backjumping is working to be confident to raise a PR or know what tests to add, there is also probably a far more elegant approach to this. So here is the patch if anyone wants to adapt this specific approach this code is free to use: diff --git a/src/pip/_vendor/resolvelib/resolvers.py b/src/pip/_vendor/resolvelib/resolvers.py
index 2c3d0e306..e63377c8e 100644
--- a/src/pip/_vendor/resolvelib/resolvers.py
+++ b/src/pip/_vendor/resolvelib/resolvers.py
@@ -307,22 +307,31 @@ class Resolution(object):
# Remove the state that triggered backtracking.
del self._states[-1]
- # Ensure to backtrack to a state that caused the incompatibility
- incompatible_state = False
- while not incompatible_state:
- # Retrieve the last candidate pin and known incompatibilities.
- try:
+ _states_backup = [
+ State(s.mapping.copy(), s.criteria.copy(), s.backtrack_causes[:])
+ for s in self._states
+ ]
+ try:
+ # Ensure to backtrack to a state that caused the incompatibility
+ incompatible_state = False
+ while not incompatible_state:
+ # Retrieve the last candidate pin and known incompatibilities.
broken_state = self._states.pop()
name, candidate = broken_state.mapping.popitem()
- except (IndexError, KeyError):
- raise ResolutionImpossible(causes)
- current_dependencies = {
- self._p.identify(d)
- for d in self._p.get_dependencies(candidate)
- }
- incompatible_state = not current_dependencies.isdisjoint(
- incompatible_deps
- )
+ current_dependencies = {
+ self._p.identify(d)
+ for d in self._p.get_dependencies(candidate)
+ }
+ incompatible_state = not current_dependencies.isdisjoint(
+ incompatible_deps
+ )
+ except (IndexError, KeyError):
+ # Backjumping failed, so fall back to backtrack where
+ # we just pop the previous state
+ self._states = _states_backup
+ broken_state = self._states.pop()
+ name, candidate = broken_state.mapping.popitem()
+ del _states_backup
incompatibilities_from_broken = [
(k, list(v.incompatibilities))
|
I created this much more tighly bounded set of requirements that fails when it shouldn't, it should be relatively resistent to any changes on PyPi and hopefully easier to debug as
Notice that these requirements succeeds:
And:
I also notice in the original PR @pradyunsg made a comment about this exception: https://github.com/sarugaku/resolvelib/pull/113/files#r1097012024 but closed it without any interaction from the PR author @bennati |
An even tighter set of requirements, this resolves:
This does not:
Not sure any of this is helpful but at the begining of the method [
RequirementInformation(requirement=SpecifierRequirement('pystac<=1.8.3,>=1.8.2'), parent=None),
RequirementInformation(requirement=SpecifierRequirement('pystac~=1.2.0'), parent=LinkCandidate('https://files.pythonhosted.org/packages/ff/a3/f0e871f97d77ac638b58f634dfa545aca5f866d2310a299b1eb1a51e84f2/pystac_client-0.3.2-py3-none-any.whl (from https://pypi.org/simple/pystac-client/) (requires-python:>=3.7)'))
] And this is what self.states looks like: [
State(mapping=OrderedDict(), criteria={'pandas': Criterion((SpecifierRequirement('pandas<=1.4.0,>=1.3.5'), via=None)), 'pystac': Criterion((SpecifierRequirement('pystac<=1.8.3,>=1.8.2'), via=None)), 'pystac-client': Criterion((SpecifierRequirement('pystac-client<=0.3.3,>=0.3.2'), via=None)), 'sat-stac': Criterion((SpecifierRequirement('sat-stac<=0.1.1'), via=None))}, backtrack_causes=[]),
State(mapping=OrderedDict([('pandas', LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))]), criteria={'pandas': Criterion((SpecifierRequirement('pandas<=1.4.0,>=1.3.5'), via=None)), 'pystac': Criterion((SpecifierRequirement('pystac<=1.8.3,>=1.8.2'), via=None)), 'pystac-client': Criterion((SpecifierRequirement('pystac-client<=0.3.3,>=0.3.2'), via=None)), 'sat-stac': Criterion((SpecifierRequirement('sat-stac<=0.1.1'), via=None)), 'python-dateutil': Criterion((SpecifierRequirement('python-dateutil>=2.8.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), 'pytz': Criterion((SpecifierRequirement('pytz>=2020.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), 'numpy': Criterion((SpecifierRequirement('numpy>=1.18.5; platform_machine != "aarch64" and platform_machine != "arm64" and python_version < "3.10"'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), '<Python from Requires-Python>': Criterion((RequiresPythonRequirement('>=3.8'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')))}, backtrack_causes=[]),
State(mapping=OrderedDict([('pandas', LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), ('<Python from Requires-Python>', <pip._internal.resolution.resolvelib.candidates.RequiresPythonCandidate object at 0x7fd6585e33a0>)]), criteria={'pandas': Criterion((SpecifierRequirement('pandas<=1.4.0,>=1.3.5'), via=None)), 'pystac': Criterion((SpecifierRequirement('pystac<=1.8.3,>=1.8.2'), via=None)), 'pystac-client': Criterion((SpecifierRequirement('pystac-client<=0.3.3,>=0.3.2'), via=None)), 'sat-stac': Criterion((SpecifierRequirement('sat-stac<=0.1.1'), via=None)), 'python-dateutil': Criterion((SpecifierRequirement('python-dateutil>=2.8.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), 'pytz': Criterion((SpecifierRequirement('pytz>=2020.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), 'numpy': Criterion((SpecifierRequirement('numpy>=1.18.5; platform_machine != "aarch64" and platform_machine != "arm64" and python_version < "3.10"'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), '<Python from Requires-Python>': Criterion((RequiresPythonRequirement('>=3.8'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')))}, backtrack_causes=[]),
State(mapping=OrderedDict([('pandas', LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), ('<Python from Requires-Python>', <pip._internal.resolution.resolvelib.candidates.RequiresPythonCandidate object at 0x7fd6585e33a0>), ('pystac', LinkCandidate('https://files.pythonhosted.org/packages/d2/35/efb3ada4f8db776144d786338a41d38e5128f2c1e4a86b681c658fe1151e/pystac-1.8.3-py3-none-any.whl (from https://pypi.org/simple/pystac/) (requires-python:>=3.8)'))]), criteria={'pandas': Criterion((SpecifierRequirement('pandas<=1.4.0,>=1.3.5'), via=None)), 'pystac': Criterion((SpecifierRequirement('pystac<=1.8.3,>=1.8.2'), via=None)), 'pystac-client': Criterion((SpecifierRequirement('pystac-client<=0.3.3,>=0.3.2'), via=None)), 'sat-stac': Criterion((SpecifierRequirement('sat-stac<=0.1.1'), via=None)), 'python-dateutil': Criterion((SpecifierRequirement('python-dateutil>=2.8.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), (SpecifierRequirement('python-dateutil>=2.7.0'), via=LinkCandidate('https://files.pythonhosted.org/packages/d2/35/efb3ada4f8db776144d786338a41d38e5128f2c1e4a86b681c658fe1151e/pystac-1.8.3-py3-none-any.whl (from https://pypi.org/simple/pystac/) (requires-python:>=3.8)'))), 'pytz': Criterion((SpecifierRequirement('pytz>=2020.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), 'numpy': Criterion((SpecifierRequirement('numpy>=1.18.5; platform_machine != "aarch64" and platform_machine != "arm64" and python_version < "3.10"'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), '<Python from Requires-Python>': Criterion((RequiresPythonRequirement('>=3.8'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), (RequiresPythonRequirement('>=3.8'), via=LinkCandidate('https://files.pythonhosted.org/packages/d2/35/efb3ada4f8db776144d786338a41d38e5128f2c1e4a86b681c658fe1151e/pystac-1.8.3-py3-none-any.whl (from https://pypi.org/simple/pystac/) (requires-python:>=3.8)')))}, backtrack_causes=[]),
State(mapping=OrderedDict([('pandas', LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), ('<Python from Requires-Python>', <pip._internal.resolution.resolvelib.candidates.RequiresPythonCandidate object at 0x7fd6585e33a0>), ('pystac', LinkCandidate('https://files.pythonhosted.org/packages/d2/35/efb3ada4f8db776144d786338a41d38e5128f2c1e4a86b681c658fe1151e/pystac-1.8.3-py3-none-any.whl (from https://pypi.org/simple/pystac/) (requires-python:>=3.8)'))]), criteria={'pandas': Criterion((SpecifierRequirement('pandas<=1.4.0,>=1.3.5'), via=None)), 'pystac': Criterion((SpecifierRequirement('pystac<=1.8.3,>=1.8.2'), via=None)), 'pystac-client': Criterion((SpecifierRequirement('pystac-client<=0.3.3,>=0.3.2'), via=None)), 'sat-stac': Criterion((SpecifierRequirement('sat-stac<=0.1.1'), via=None)), 'python-dateutil': Criterion((SpecifierRequirement('python-dateutil>=2.8.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), (SpecifierRequirement('python-dateutil>=2.7.0'), via=LinkCandidate('https://files.pythonhosted.org/packages/d2/35/efb3ada4f8db776144d786338a41d38e5128f2c1e4a86b681c658fe1151e/pystac-1.8.3-py3-none-any.whl (from https://pypi.org/simple/pystac/) (requires-python:>=3.8)'))), 'pytz': Criterion((SpecifierRequirement('pytz>=2020.1'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), 'numpy': Criterion((SpecifierRequirement('numpy>=1.18.5; platform_machine != "aarch64" and platform_machine != "arm64" and python_version < "3.10"'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)'))), '<Python from Requires-Python>': Criterion((RequiresPythonRequirement('>=3.8'), via=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), (RequiresPythonRequirement('>=3.8'), via=LinkCandidate('https://files.pythonhosted.org/packages/d2/35/efb3ada4f8db776144d786338a41d38e5128f2c1e4a86b681c658fe1151e/pystac-1.8.3-py3-none-any.whl (from https://pypi.org/simple/pystac/) (requires-python:>=3.8)')))}, backtrack_causes=[RequirementInformation(requirement=SpecifierRequirement('python-dateutil>=2.8.1'), parent=LinkCandidate('https://files.pythonhosted.org/packages/eb/fa/6cbc442e86f625dc403fbceb79e869893fc09486cfba79bd4ba33e366293/pandas-1.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (from https://pypi.org/simple/pandas/) (requires-python:>=3.8)')), RequirementInformation(requirement=SpecifierRequirement('python-dateutil>=2.7.0'), parent=LinkCandidate('https://files.pythonhosted.org/packages/d2/35/efb3ada4f8db776144d786338a41d38e5128f2c1e4a86b681c658fe1151e/pystac-1.8.3-py3-none-any.whl (from https://pypi.org/simple/pystac/) (requires-python:>=3.8)')), RequirementInformation(requirement=SpecifierRequirement('python-dateutil~=2.7.5'), parent=LinkCandidate('https://files.pythonhosted.org/packages/4a/2b/558c882901d6c6d2593673d434141b915c80f11819290be2fe564d891296/sat-stac-0.1.1.tar.gz (from https://pypi.org/simple/sat-stac/)')), RequirementInformation(requirement=SpecifierRequirement('requests>=2.25'), parent=LinkCandidate('https://files.pythonhosted.org/packages/97/6b/8d156f9d1fefbf56cecaa5e4907910758fe9e4496c1868befbbf56bdb778/pystac_client-0.3.3-py3-none-any.whl (from https://pypi.org/simple/pystac-client/) (requires-python:>=3.7)')), RequirementInformation(requirement=SpecifierRequirement('requests~=2.19.1'), parent=LinkCandidate('https://files.pythonhosted.org/packages/93/70/0e9b2768c16b9de617946e0f89c31f90dfba9752fe65b5c8068a83d5c144/sat-stac-0.1.0.tar.gz (from https://pypi.org/simple/sat-stac/)'))])
] |
So, this issue appears to be that backjumping as implemented by #113 is incorrectly discarding state. I beleive this also causes Pip 23.3 to fail to install Although in that case not the same exception triggers. |
Experimenting tonight with the examples a bit I beleive the assumptions in #113 that you can safely discard state until there are no intersection between the causes and the now current dependencies is fault for two possible reasons:
This is all currently speculation and I'm just sharing in the hopes the issue is more obvious to someone else, I'll try and investigate further when I have time. |
Another triggering requirement, at least in Pip 23.2.1: It no longer occurs in Pip 23.3 but that is because of optimizations applied on Pip side that seems to side step this issue for this specific set of requirements, details of testing here: pypa/pip#12376 (comment) |
The requirements I hit this bug with were Narrowing the packages to either I've spent a fair chunk of the weekend trying to setup a test case, I think my first stop might be to make some adjustments to py2index. |
@notatallshaw 's solution seems a viable way to fix it, can you make a PR please? |
I don't think it is actually, it only works in the case that the incorrect backjumping is the one that causes states or state.mapping to be exhausted. However other than the initial example it seems the incorrect backjumping skips over a valid state earlier in the resolution process, meaning that this specific error doesn't raise but you still get a resolution impossible incorrectly. My plan currently is to take another look at it this week and if I can't figure out a patch that works for all found cases I was going create a PR to revert backjumping out of resolvelib. |
Can you please provide more context on your set-up, I was able to reproduce the
And it does look like backjumping is the issue, it seems to jump over the valid requirement |
@notatallshaw Oh weird, I've never hit the too deep problem. The only other context I have is:
I'm using
|
Here's a zip that contains an index file that can be used for the test (it took a few days of crawling to generate). With this I was able to recreate the This changes to |
I am looking for some feedback from resolvelib maintainers, @frostming @pradyunsg @uranusjr, I have multiple solutions and don't have a strong opinion on which one should be implemented, I am happy to answer questions and gather data but there are basically 4 different options and I don't want to waste time on options never going to happen. Here is the current state: There are two PRs that directly address the issue on resolvelib side, #142 reverts backjumping, #144 adds a fallback of backjumping to backtracking in a relatively simple way without having to start the whole process over from the begining. There are two coupled PRs that indirectly address the issue but for Pip only, pypa/pip#12459 (with the accompining resolvelib PR #145), this seems to cause Pip to choose the causes in a way that satisfies the backjumping assumptions about conflicts (or at least I can no longer reproduce this issue with that PR) and it also improves the performance of Pip that removing backjumping is not a significant performance degradation. This leads me to think of a third PR I could raise for resolvelib, one that gives the calling library a switch of whether to choose backtracking or backjumping, and leave it up to the calling library to decide whether they can trust the implict assumptions of backjumping and how they provide their dependency graph and prefer causes not to cause an incorrect ResolutionImpossibles. |
@notatallshaw Thanks for all the efforts you have put into addressing this problem. I personally like the fallback approach. I also would like to review your PRs and leave some comments, but it needs inputs from the pip maintainers for the final decision since resolvelib is a significant component of it. However, they haven't shown up for unknown reasons but let's be patient. |
Sounds good, no worries. I think everyone has just got busy outside of OSS work. Pip is seemingly precariously resource constrained, I wouldn't be able to add much time either if I was an OSS maintainer. If I have some time I will turn my attention back to the fallback solution to see if there's any simplifications or other improvements I can make. |
@frostming can you clarify what you need from the pip maintainers? I don’t have an in-depth understanding of the resolver algorithms, so I can’t give a view on the technicalities. What I can say is:
Beyond that, I have no view unless you have specific questions. In particular, I’d leave it you you to decide between the two “resolvelib only” options. |
I have been unhappy with either of the two PRs I created that workaround this bug, I have therefore been working on trying to make a significantly simpler reproducer that a test can be written against, and hopefully find the underlying cause. Therefore I have worked on creating a minimal pip-resolver-benchmarks scenario that produces the same error for a resolvable scenario. And here it is: {
"input": {
"requirements": [
"pandas<=1.4.0,>=1.3.5",
"pystac<=1.8.3,>=1.8.2",
"pystac-client<=0.3.3,>=0.3.2",
"sat-stac<=0.1.1"
],
"timestamp": "2024-05-12T15:44:05.521946",
"allow_sdists_for": [],
"environment": {
"markers": {
"implementation_name": "cpython",
"implementation_version": "3.9.19",
"os_name": "posix",
"platform_machine": "x86_64",
"platform_release": "5.15.146.1-microsoft-standard-WSL2",
"platform_system": "Linux",
"platform_version": "#1 SMP Thu Jan 11 04:09:03 UTC 2024",
"python_full_version": "3.9.19",
"platform_python_implementation": "CPython",
"python_version": "3.9",
"sys_platform": "linux"
},
"tags": [
"py39-none-any"
]
}
},
"packages": {
"pandas": {
"1.4.0": {
"depends_by_extra": {
"": [
"python-dateutil>=2.8.1"
]
}
},
"1.3.5": {
"depends_by_extra": {
"": [
"python-dateutil>=2.7.3"
]
}
}
},
"pystac": {
"1.8.3": {
"depends_by_extra": {
"": [
"python-dateutil>=2.7.0"
]
}
},
"1.8.2": {
"depends_by_extra": {
"": [
"python-dateutil>=2.7.0"
]
}
}
},
"pystac-client": {
"0.3.3": {
"depends_by_extra": {
"": [
"requests>=2.25",
"pystac>=1.4.0"
]
}
},
"0.3.2": {
"depends_by_extra": {
"": [
"requests>=2.25",
"pystac~=1.2.0"
]
}
}
},
"sat-stac": {
"0.1.1": {
"depends_by_extra": {
"": [
"python-dateutil~=2.7.5"
]
}
},
"0.1.0": {
"depends_by_extra": {
"": [
"requests~=2.19.1",
"python-dateutil~=2.7.5"
]
}
}
},
"python-dateutil": {
"2.9.0": {
"depends_by_extra": {}
},
"2.7.5": {
"depends_by_extra": {}
}
},
"requests": {
"2.31.0": {
"depends_by_extra": {}
}
}
}
} I had planned to turn it into a packse scenario and then debug pip against the index packse provides, but so far I have not been able to reproduce the error with packse. I am going to work on breaking about pip-resolver-benchmarks so I can use it to provide the wheels and independently run pip against those wheels it provides. |
Okay, I am now able to reproduce this without running via
Then you can run the scenario with:
Which produces the bug (and is independant of the Python version you use):
I have a hunch what the issue is and applied that hunch and this and other scenarios are now correctly resolving, and it's not a hacky workaround or removing backjumping entirely. But before I post a PR I want to carefully walk through each step, gain a strong intuitive understanding, recreate the scenario in packse, and have a good idea how to write a test for it. So I expect it to take at least a few days, if someone wants to beat me to the punch and post a PR first I'll happily test it. |
A PR is ready to review: #152 |
So #152 fixes the original example, but not I have been able to reproduce |
As I've looked deeper into I've written more on that PR, but I beleive backjumping should either be reverted or made opt-in. |
Is it worthwhile to try backjumping, and revert to backtracking if it yields no results? |
I think it would generally work but there are a few issues:
|
I’m thinking reverting to where the first backjump happens (so all the way back I guess?) The second point is the deciding point. Whether this is worthwhile probably depends on how much faster backjumping is than simple backtracking. Say it is 90% faster, the fall-back would cause a 10% slowdown in the worst case (than backtracking—I don’t think keeping the current implementation is viable at this point), which should be definitely worwhile. If it’s just 10% faster, the slowdown would be 90% and definitely not worthwhile. The real value is probably somewhere in between. We should probably investigate some examples, but I’m wondering if you have a ballpark idea we can use to make a quick decision from. |
Not quite sure that you mean, here are the possible points where it makes sense to me to revert to:
My PR implemented the last one, as anything after this point is logically unsound.
Thinking on this, there are two perspectives on the impact it has:
From perspective 1, the worst case scenario is the revert point is near the beginning of the resolution, but otherwise the resolution doesn't make much difference, and therefore reverting from backjumping to backtracking to confirm a resolution impossible doubles the amount of time, and the best case scenario is the revert point is near the end of the resolution and reverting to confirm resolution time is negligible. From perspective 2 however, the worst case is this can turn confirmed impossible resolutions from completing quickly to taking effectively an infinite amount of time. If users like @bennati believe backjumping is not unsound for their provider then it makes sense to add an opt-out of reverting and falling back to backtracking.
I can work on collecting some real world benchmarks of valid ResolutionImpossibles, but my concern is there is no simple relationship between a dependency graph and performance impact, very small changes can turn simple resolutions into pathological examples. So it will be hard to treat any sort of data collection as much more than anecdotal. |
I was thinking 4 too so most of the reasoning you made above makes sense to me. Personally I feel we should lean toward correctness than speed so if we can’t reliably ensure backjumping is logically correct, and doing the resolution twice (BJ + BT) takes too much additional time than simply BT, I would prefer to remove BJ entirely. If BJ is still mostly valuable for people, maybe we can treat BJ like compiler optimisations—optional, and if you turn it on we don’t promise it always does the right thing, but it should mostly do a good enough job to be worthwhile. |
Oh, in general terms of resolutions, there are definitely resolutions that will not complete if back jumping is removed because they will take too long. At least, until pip prefers directly conflicting dependencies, which will have a similar performance improvement for pathological examples as back jumping has. And I believe that back jumping is actually sound once pip (the provider) prefers directly conflicting dependencies. |
And sorry if I wasn't clear on this before, but backjumping within itself is sound in the context of dependency resolution. I'm just saying it requires the provder to make the "correct" preferences while backtracking, and resolvelib has no way of enforcing, or even checking, it is doing that. |
Could we not simply document that the provider must do this? I've only been following this discussion at a high level, but I'd rather have an efficient algorithm that places some requirements on providers over something that's less efficient but handles every possible case. If this means requiring pypa/pip#12459, then I'm fine with that - although as I said in that PR, I'd like resolvelib to document clearly what it expects from providers, so we don't end up with tight coupling between pip and resolvelib because no-one else can work out how to write a conforming provider 😉 |
It's certainly an option, but currently none of resolvelib's test providers implement it, the pip test provider would need to be updated at least, which I can work on, so it can pass at least the tests in #154. Also, in resolvelib's current API design, I don't beleive it's possible for the provider to always prefer directly conflicting conflicts without introducing a new O(n2) issue. And my proposed fix to avoid O(n2) is #145, which I had planned to work on after "fixing backjumping", but I guess it could all be part of the same thing. |
Okay, I think I found an actual logical fix: #155 |
Run resolution on this requirement will result in a failure:
While it does have a valid resolution if you remove the second dep but it is in the final resolution:
Which resolves to:
It seems an issue with the backtracking process.
Original report: pdm-project/pdm#2193
The text was updated successfully, but these errors were encountered: