
Handle restricted dependencies as implicit multiple-constraints dependencies #6969

Draft · wants to merge 1 commit into main
Conversation

radoering (Member)

Pull Request Check List

Resolves: #5506

  • Added tests for changed code.
  • Updated documentation for changed code.

Although I think this PR makes the solver more correct, it comes with a massive performance regression that is far from acceptable.

I carried out some measurements with example pyproject.toml files from other PRs. If locking succeeds without this PR, the same lock file is generated with this PR; it just takes longer...

Times for `poetry lock` with a warm cache:

| pyproject.toml from ... | time without PR | time with PR |
| --- | --- | --- |
| #3367 | 0.8 s | 2.8 s |
| #4670 | 1.8 s | 4.2 s |
| #4870 | 71 s | 11800 s (not a typo) |
| #5506 | error after 4.5 s | 1090 s |
| shootout example | 3.9 s | 250 s |

Number of overrides:

| pyproject.toml from ... | number of overrides without PR | number of overrides with PR |
| --- | --- | --- |
| #3367 | 4 | 10 |
| #4670 | 16 | 19 |
| #4870 | 46 | 4179 |
| #5506 | - | 288 |
| shootout example | 0 | 69 |

The data shows that the time correlates with the number of overrides. Thus, I assume a more sophisticated algorithm to reduce the number of overrides, or even a complete overhaul of how multiple-constraints dependencies are handled, might be necessary. I can imagine making the VersionSolver marker-aware so that a version conflict is only a conflict if the intersection of markers is not empty. This way, overrides would not be necessary anymore and everything could be solved at once. However, that's probably a huge task.
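Purely as an illustration of that idea, here is a minimal sketch of a marker-aware conflict check. The `Requirement` objects with `marker` and `constraint` attributes are hypothetical stand-ins, not Poetry's actual solver API:

```python
# Hypothetical sketch: two requirements on the same package only conflict
# if their environment markers can be true at the same time.
def is_real_conflict(req_a, req_b) -> bool:
    # If the markers cannot both apply (empty intersection), the
    # requirements live in disjoint environments and never clash, e.g.
    # sys_platform == 'linux' vs sys_platform == 'win32'.
    if req_a.marker.intersect(req_b.marker).is_empty():
        return False
    # Only within the shared environment must the version ranges overlap;
    # an empty intersection of constraints is then a genuine conflict.
    return req_a.constraint.intersect(req_b.constraint).is_empty()
```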

@jeertmans

Thanks for linking this to #8670 @radoering :-)

Maybe we should rewrite Poetry in Rust if speed is an issue ^^'

Jokes aside, having a resolving time this long is really an issue...

@jorenham left a comment


I left some notes on potential performance improvements; perhaps they could help speed things up :)

```python
                inverted_marker_dep = deps[0].with_constraint(EmptyConstraint())
                inverted_marker_dep.marker = inverted_marker
                deps.append(inverted_marker_dep)
        return [dep for deps in by_name.values() for dep in deps]
```


Suggested change

```diff
-        return [dep for deps in by_name.values() for dep in deps]
+        return itertools.chain.from_iterable(by_name.values())
```

```python
        self,
        dependencies: Iterable[Dependency],
        active_extras: Collection[NormalizedName] | None,
    ) -> list[Dependency]:
```


The return value here is used only by `_get_dependencies_with_overrides`, which (unlike its annotations suggest) should accept any `Iterable[Dependency]`. So it doesn't need to return a list; any iterable will do:

Suggested change

```diff
-    ) -> list[Dependency]:
+    ) -> Iterable[Dependency]:
```

With this, you can avoid creating the entire dependency list, e.g. using `itertools`, or by turning this method into a generator.

Comment on lines +882 to +894
```python
        by_name: dict[str, list[Dependency]] = defaultdict(list)
        for dep in dependencies:
            by_name[dep.name].append(dep)
        for _name, deps in by_name.items():
            marker = marker_union(*[d.marker for d in deps])
            if marker.is_any():
                continue
            inverted_marker = marker.invert()
            if self._is_relevant_marker(inverted_marker, active_extras):
                # Set constraint to empty to mark dependency as "not required".
                inverted_marker_dep = deps[0].with_constraint(EmptyConstraint())
                inverted_marker_dep.marker = inverted_marker
                deps.append(inverted_marker_dep)
```


These loops could be merged if 1) you use `itertools.groupby`, with e.g. `operator.attrgetter('name')` as the key function, and 2) turn this method into a generator (e.g. with a `yield from` in the first branch and a `yield` in the second). This way you can avoid creating temporary lists altogether, for a significant speedup; see the sketch below.
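For concreteness, a rough sketch of that rewrite; `marker_union`, `EmptyConstraint`, and `_is_relevant_marker` are the names from the diff above. Note that `itertools.groupby` only groups *consecutive* equal keys, so the input has to be sorted by name first:

```python
import itertools
import operator

def _add_implicit_dependencies(self, dependencies, active_extras):
    # Generator variant of the loops above; groupby needs consecutive
    # equal keys, hence the sort.
    key = operator.attrgetter("name")
    for _name, group in itertools.groupby(sorted(dependencies, key=key), key=key):
        deps = list(group)  # the group is used twice, so materialize it
        yield from deps
        marker = marker_union(*[d.marker for d in deps])
        if marker.is_any():
            continue
        inverted_marker = marker.invert()
        if self._is_relevant_marker(inverted_marker, active_extras):
            # Set constraint to empty to mark dependency as "not required".
            inverted_marker_dep = deps[0].with_constraint(EmptyConstraint())
            inverted_marker_dep.marker = inverted_marker
            yield inverted_marker_dep
```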

Comment on lines +886 to +888

```python
            marker = marker_union(*[d.marker for d in deps])
            if marker.is_any():
                continue
```


Is `marker_union` also needed when e.g. `len(deps) == 1`? At a glance, `marker_union` looks like a rather expensive function call, so a short-circuit (sketched below) might pay off.
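A hypothetical short-circuit along those lines, assuming the union of a single marker is just that marker:

```python
# Skip the (potentially expensive) union for the single-dependency case.
if len(deps) == 1:
    marker = deps[0].marker
else:
    marker = marker_union(*[d.marker for d in deps])
```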

```diff
@@ -570,6 +570,9 @@ def complete_package(
                 continue
             self.search_for_direct_origin_dependency(dep)
 
+        active_extras = None if package.is_root() else dependency.extras
+        _dependencies = self._add_implicit_dependencies(_dependencies, active_extras)
```


Since `_dependencies` is only used once, it's probably better to skip the variable assignment by inlining it into the `_add_implicit_dependencies` call.
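Reading that as folding the one-shot `active_extras` expression into the call (the variable assigned and then used exactly once in the diff above), the inlined form might look like this:

```python
# Hypothetical inlined form of the two added lines; behavior is unchanged.
_dependencies = self._add_implicit_dependencies(
    _dependencies, None if package.is_root() else dependency.extras
)
```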

```python
        # any other dependency for sure.
        for i, dep in enumerate(dependencies):
            if dep.constraint.is_empty():
                new_dependencies.append(dependencies.pop(i))
```


The `list.pop` method can be a very slow operation, and I think it can be avoided here by using a "blacklist" approach, e.g.

```python
blacklist = set()
for dep in dependencies:
    if dep.constraint.is_empty():
        blacklist.add(dep)
        break
```

Then later on, in `itertools.product`, use `repeat=len(dependencies) - len(blacklist)`. And when looping over `dep in dependencies` again, simply skip it if `dep in blacklist`.

This avoids the `list.pop` operation, which has O(n) time complexity, by relying on `set.__contains__`, which is only O(1).
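For completeness, a sketch of how the blacklist would then be consumed; the surrounding `itertools.product` call is not shown in the excerpt, so this is only the skip pattern:

```python
for dep in dependencies:
    if dep in blacklist:  # O(1) set lookup instead of an O(n) list.pop
        continue
    # ... handle dep exactly as before ...
```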

Comment on lines +2107 to +2111

```python
        ("python_version < '3.7'", "python_version >= '3.7'"),
        ("sys_platform == 'linux'", "sys_platform != 'linux'"),
        (
            "python_version < '3.7' and sys_platform == 'linux'",
            "python_version >= '3.7' and sys_platform == 'linux'",
```


I don't think `python < 3.7` is relevant anymore.

@dimbleby (Contributor)

> I left some notes on potential performance improvements; perhaps it could help speed things up :)

I think it is likely that you are micro-optimizing essentially irrelevant parts of the code. If you want to make performance improvements, I recommend profiling first, so that you spend your time optimizing the right things.
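For example, a minimal `cProfile` sketch; `run_lock()` is a hypothetical stand-in for whatever entry point actually drives `poetry lock`, not a real Poetry API:

```python
import cProfile
import pstats

# Profile one lock run and write the raw stats to a file.
cProfile.run("run_lock()", filename="lock.prof")

# Print the 20 functions with the highest cumulative time.
pstats.Stats("lock.prof").sort_stats("cumulative").print_stats(20)
```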

But perhaps I am wrong, and you are now seeing results much better than those in the comment at the top of the thread? If so - submit a merge request!

@jorenham

> > I left some notes on potential performance improvements; perhaps it could help speed things up :)
>
> I think it is likely that you are micro-optimizing essentially irrelevant parts of the code.

I don't agree that improvements to the runtime complexity are the same as "micro-optimizing".

Plus, my suggestions will also result in fewer lines of code, without harming readability. So even if the performance benefits are minimal, at the very least there are no disadvantages.

Successfully merging this pull request may close this issue: Solver breaks with related dependencies that are both conditional (#5506).