Faster resolver: use a inverse-index to not activate the causes of conflict #5213

Eh2406 · 2018-03-20T17:00:28Z

This adds a test for #4810 (comment) with two extensions that make it harder. It is the last reproducible and in the wild exponentially slow resolution (that I have found).

The problem in the test is backtrack_trap0 = "*" is a lot of ways of saying constrained = ">=1.1.0, <=2.0.0" but constrained= "2.0.1" is already picked. Only then to try and solve constrained= "~1.0.0" which is incompatible. Our parent knows that we have been poisoned, and wont try to activate us again. Because of the order we evaluate deps we end up backtracking to where constrained: 1.1.0 is set instead of our parent. And so the poisoning does not help. This is harder then #4810 (comment) because:

Having multiple semver compatible versions of constrained in play makes for a lot more bookkeeping. Specifically bookkeeping I forgot when I first submitted this PR.
The problematic dependencies are added deep in a combinatorial explosion of possibilities. So if we don't correctly handle caching that backtrack_trap0 = "*" is doomed then we will never finish looking thru the different possibilities for level0 = "*"

This PR also includes a proof of concept solution for the test, which proves that it does solve #4810 (comment). The added code is tricky to read. It also adds a O(remaining_deps) job to every activation on the happy path, slower if the past_conflicting_activations is not empty.

I'd like some brainstorming on better solutions.

rust-highfive · 2018-03-20T17:00:39Z

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

alexcrichton · 2018-03-21T05:01:50Z

I wonder, would it be possible to shift more work from the "just after we activated a dependency see if it's doomed to fail" to the "we've exhausted a list of candidates, we may backtrack" area? I think that way we wouldn't ever have to worry too much about performance (as backtracking never happens on the happy path).

I"m not sure how feasible this is though!

Eh2406 · 2018-03-21T15:44:15Z

So thinking about that, some kind of inverse-index. At a minimum a set of all package_id that have ever appeared in a conflicting_activations. Then we only do the O(remaining_deps) operation if we are in the set, and thus never do it on the happy path.

Eh2406 · 2018-03-21T20:06:18Z

I'm now nervous that this does not get the correct answer for cases that involve conflicting_activations with len > 1. But don't have the cycle to confirm it with a test case.

Eh2406 · 2018-03-22T15:13:43Z

Updated the test so that it requires multiple things to be activated to cause backtracking. Indeed it fails with the proof of concept solution. (Failing to find a solution when one is available.) The test passes on master if DEPTH and BRANCHING_FACTOR are set very small, if not small enough it runs ~forever.

Eh2406 · 2018-03-22T15:51:02Z

That was surprisingly easy to fix, just conflicting_activations.extend(conflict). So I think we have:

does it make sense? (think yes)
is there a better way of doing it? (feel like yes, but have not thought of it yet)
is the big O acceptable? (think just barely yes)
is it clear what it does from the code? (no)
is it commented so someone else can pick it up? (no)

@alexcrichton what do you think?

alexcrichton · 2018-03-22T17:31:15Z

src/cargo/core/resolver/mod.rs

@@ -987,6 +987,10 @@ fn activate_deps_loop(
        Vec<HashMap<PackageId, ConflictReason>>,
    > = HashMap::new();

+    // `past_conflict_triggers` is an inverse-index of `past_conflicting_activations`.
+    // For every `PackageId` this lists the `Dependency`s that mention it in `past_conflicting_activations`.
+    let mut past_conflict_triggers: HashMap<PackageId, HashSet<Dependency>> = HashMap::new();


Since this function is already pretty tricky, perhaps this along with past_conflicting_activations could be moved to a separate data structure so they're managed in sync?

That's a good thought, It will at least move the large comment on the structure of the cache out of activate_deps_loop.

Working on it, should be done tomorrow some time.

alexcrichton · 2018-03-22T17:32:28Z

src/cargo/core/resolver/mod.rs

@@ -1255,10 +1265,41 @@ fn activate_deps_loop(
                            .flat_map(|x| x)
                            .find(|con| cx.is_conflicting(None, con))
                        {
-                            conflicting_activations.extend(conflicting.clone());
+                            let mut conflicting = conflicting.clone();
+                            conflicting.remove(&pid);


Could you add a comment for this is removed? To me it's not immediately obvious but I've always got a hard time following the specifics here anyway!

Yes, and then you can help me rewrite it so someone else can understand it. :-P

alexcrichton · 2018-03-22T17:42:08Z

src/cargo/core/resolver/mod.rs

+                                            .and_then(|past_bad| {
+                                                past_bad.iter().find(|con| {
+                                                    con.contains_key(&pid)
+                                                        && cx.is_conflicting(None, con)


I think this is the same as cx.is_conflicting(Some(&pid), con), right?

Unfortunately, no. That checks if pid is active. (We know it is. We just activated it.) This checks if the past conflict involves pid.

Ah yes, indeed!

alexcrichton · 2018-03-22T17:44:01Z

src/cargo/core/resolver/mod.rs

+                        if let Some(rel_deps) = past_conflict_triggers.get(&pid) {
+                            'deps: for debs in remaining_deps.iter() {
+                                for (_, (other_dep, _, _)) in debs.remaining_siblings.clone() {
+                                    if rel_deps.contains(&other_dep) {


I've personally found a convenient way to structure these sorts of loops to heavily use continue to avoid rightward drift, for example:

for (_, (other_dep, _, _)) in ... { if !rel_deps.contains(&other_dep) { continue; } // ... }

Also to help address this we may want to try to more aggressively move out portions of this function into helper functions maybe?

Ya the 6 layers of nesting is geting to be vary code-smelly.

I think I have replaced the loops with flat_map \ filter_map \ filter whether it is more readable, I don't know.

Looks great!

alexcrichton · 2018-03-22T17:53:30Z

src/cargo/core/resolver/mod.rs

                            has_past_conflicting_dep = true;
                        }
                    }
+                    if !has_past_conflicting_dep {
+                        // TODO: this is ugly, replace!


Ok I'm gonna do my best to rephrase what's happening here in english and you can tell me how wrong I am!

Here we're iterating over all remaining work we have to do (aka all of remaining_deps). We're looking at every single remaining Dependency left to work with. One of those Dependency structures may be known to fail because of a known set of conflicting activated packages. If that conflicting set includes the candidate we just activated and we've activated everything in the set then we know this Dependency is going to fail.

Since we know this Dependency is going to fail we want to extend our known set of conflicting activations. These conflicting activations are for when our own candidate can't be activated, and we've just deduced that activating ourselves will cause something else to fail.

After this though... I think I get a little lost. We take this conflicting set for some remote dependency, we remove ourselves from it, and we insert that remote dependency's parent with the reason we just removed. How come we do the swich here?

In any case we finally reach conflicting_activations.extend(conflict) after all that. If we entirely fail activation then this ensures that conflicting_activations contains an exhaustive set as to the reasons why we failed to activated.

I think this all makes sense? It seems plausible at least!

Exactly correct! The switch is just a kinda change in perspective.

The conflict we found is "other dep will not succeed if we are activated." We want to add "our dep will not succeed if other dep is in remaining_deps" but that is not how the cache is set up. So we add the less general but much faster, "our dep will not succeed if other dep's parent is activated".

We can remove ourselves from the list because we know that if our dep is being looked at then we are going to be activated. We have to remove ourselves from the list as when our parent considers if we will succeed we won't have been activated yet.

It seems plausible at least!

This resolver things is complicated enough, and I have been wrong about things I was sure of enough, that we should not merge unless we are sure it is correct and the comments make it possible to pick it up agen.

Ok I think that definitely makes sense. I think it's a direct implication that if the other dep's parent is activated that means that other dep will eventually be in remaining_deps, which means that if we're activated again it'll cause that to fail.

That definitely cleared things up for me! I'm definitely much more confident in this now

Eh2406 · 2018-03-23T15:54:21Z

Suggestions incorporated. I'd love more feedback, especially better wording for comments.

If it looks good to you, I still need to fix the commit history/messages.

alexcrichton

Ok mostly just nits and such, but otherwise looks great!

alexcrichton · 2018-03-23T17:45:17Z

src/cargo/core/resolver/mod.rs

-                            conflicting_activations.extend(conflicting.clone());
+                            let mut conflicting = conflicting.clone();
+
+                            // If one of our deps conflicts with


I think there's a missing word after "with"?

Good catch. Replaced with "known unresolvable"

alexcrichton · 2018-03-23T17:46:56Z

src/cargo/core/resolver/mod.rs

+                                })
+                                .next()
+                            {
+                                let mut conflict = conflict.clone();


I think the clone here can be avoided, right? We could use extend with filter to remove pid and then later insert manually other_parent into conflicting_activations?

That makes sense, but I can't get the iter to match correctly.

conflicting_activations.extend(conflicting.iter().filter(|&(p, _)| p == &pid));

gives:

error[E0271]: type mismatch resolving `<std::iter::Filter<std::collections::hash_map::Iter<'_, core::package_id::PackageId, core::resolver::ConflictReason>, [closure@src/cargo\core\resolver\mod.rs:1228:86: 1228:105 pid:_]> as std::iter::IntoIterator>::Item == (core::package_id::PackageId, core::resolver::ConflictReason)` --> src/cargo\core\resolver\mod.rs:1228:53 | 1228 | conflicting_activations.extend(conflicting.iter().filter(|&(p, _)| p == &pid)); | ^^^^^^ expected reference, found struct `core::package_id::PackageId` | = note: expected type `(&core::package_id::PackageId, &core::resolver::ConflictReason)` found type `(core::package_id::PackageId, core::resolver::ConflictReason)`

I was reading the error message backwards. added a:
.map(|(p, r)| (p.clone(), r.clone()) and it works.

alexcrichton · 2018-03-23T17:48:20Z

src/cargo/core/resolver/conflict_cache.rs

+            }
+        }
+    }
+    pub(super) fn get_dep_from_pid(&self, pid: &PackageId) -> Option<&HashSet<Dependency>> {


Perhaps this could be named dependencies_conflicting_with?

alexcrichton · 2018-03-23T17:48:58Z

src/cargo/core/resolver/conflict_cache.rs

+    ) -> Option<&HashMap<PackageId, ConflictReason>> {
+        self.filter_conflicting(cx, dep, |_| true)
+    }
+    pub(super) fn insert(&mut self, dep: &Dependency, con: &HashMap<PackageId, ConflictReason>) {


Could a doc comment be added here saying that we're adding a conflict where dep is known to be unresolvable if all the PackageId entries are activated?

alexcrichton · 2018-03-23T17:49:11Z

src/cargo/core/resolver/conflict_cache.rs

+}
+
+impl ConflictCache {
+    pub(super) fn new() -> ConflictCache {


Oh I think it's fine to just use pub here rather than pub(super)

oops yes, removed from the methods. If I remove it from the type, I get:
private type `core::resolver::ConflictReason` in public interface on the methods. So I got confused.

Ah I see, ok no worries!

FYI, I made a separate file, as IMO mod.rs is getting long. I'd prefer to move more out, but for now this seemed like a good chunk.

alexcrichton · 2018-03-23T17:49:59Z

src/cargo/core/resolver/conflict_cache.rs

+    where
+        for<'r> F: FnMut(&'r &HashMap<PackageId, ConflictReason>) -> bool,
+    {
+        self.con_from_dep.get(dep).and_then(|past_bad| {


This might be clearer to use ? to propagate the None from get

alexcrichton · 2018-03-23T17:50:46Z

src/cargo/core/resolver/conflict_cache.rs

+            dep_from_pid: HashMap::new(),
+        }
+    }
+    pub(super) fn filter_conflicting<F>(


Perhaps this could be called find_conflicting with a doc comment saying it finds any known set of conflicts, if any, which are activated in cx and pass the filter specified?

alexcrichton · 2018-03-23T17:51:07Z

src/cargo/core/resolver/mod.rs

+        Box::new(
+            self.remaining_siblings
+                .clone()
+                .map(move |(_, (d, _, _))| (self.parent.package_id().clone(), d)),


If we worry about perf at some point I think the clone() here could be deferred to a later point probably

It is an Arc clone, and I could not convinced the borrow checker that it lived long enough without it.
edit: now it works. good call.

alexcrichton · 2018-03-23T17:51:55Z

src/cargo/core/resolver/mod.rs

+                    // ourselves are incompatible with that dep, so we know that deps
+                    // parent conflict with us.
+                    if !has_past_conflicting_dep {
+                        if let Some(rel_deps) = past_conflicting_activations.get_dep_from_pid(&pid)


rel_deps here may be better named something like known_bad_deps perhaps?

Eh2406 · 2018-03-23T20:46:09Z

Another round of suggestions incorporated.

If it looks good to you, I still need to fix the commit history/messages.

Eh2406 · 2018-03-23T23:54:24Z

The history is ugly, but I have not succeeded at cleaning it up. I am ok with either scwashing or leaving it. At this point I'd rather have it in the public record than look at it more. :-P

The odd rebase error is fixed. Part of one commit was placed in a totally different test, and I am sure it was in the correct test before the rebase.

alexcrichton

Looks great to me! Did you want to squash things a bit though?

alexcrichton · 2018-03-24T14:28:46Z

src/cargo/core/resolver/conflict_cache.rs

+            }
+        }
+    }
+    pub fn dependencies_conflicting_with(&self, pid: &PackageId) -> Option<&HashSet<Dependency>> {


As a random drive-by-comment, it might be neat to store an empty HashSet<Dependency> on this structure so you could do something like:

self.dep_from_pid.get(pid).unwrap_or(&self.empty)

alexcrichton · 2018-03-24T14:29:07Z

src/cargo/core/resolver/conflict_cache.rs

+use core::{Dependency, PackageId};
+use core::resolver::{ConflictReason, Context};
+
+pub(super) struct ConflictCache {


I think this pub(super) can change to pub like the ones below, right?

If I remove it from the type, I get:
private type `core::resolver::ConflictReason` in public interface on the methods.
I could just make everything pub, but these are all just implementation details of core::resolver so I think pub(super) is appropriate.

Eh2406 · 2018-03-24T15:42:26Z

History is slightly cleaner. I am ok for merge.

When this hits nightly I will make an announcement on /r/rust, users, and internals asking people to test if cargo +nightly generate-lockfile gets the same result as cargo +stable generate-lockfile.

alexcrichton · 2018-03-24T16:59:56Z

@bors: r+

Sounds great!

bors · 2018-03-24T16:59:57Z

📌 Commit a7a1341 has been approved by alexcrichton

bors · 2018-03-24T17:00:04Z

⌛ Testing commit a7a1341 with merge 93b0b12...

Faster resolver: use a inverse-index to not activate the causes of conflict This adds a test for #4810 (comment) with two extensions that make it harder. It is the last reproducible and in the wild exponentially slow resolution (that I have found). The problem in the test is `backtrack_trap0 = "*"` is a lot of ways of saying `constrained = ">=1.1.0, <=2.0.0"` but `constrained= "2.0.1"` is already picked. Only then to try and solve `constrained= "~1.0.0"` which is incompatible. Our parent knows that we have been poisoned, and wont try to activate us again. Because of the order we evaluate deps we end up backtracking to where `constrained: 1.1.0` is set instead of our parent. And so the poisoning does not help. This is harder then #4810 (comment) because: 1. Having multiple semver compatible versions of constrained in play makes for a lot more bookkeeping. Specifically bookkeeping I forgot when I first submitted this PR. 2. The problematic dependencies are added deep in a combinatorial explosion of possibilities. So if we don't correctly handle caching that `backtrack_trap0 = "*"` is doomed then we will never finish looking thru the different possibilities for `level0 = "*"` This PR also includes a proof of concept solution for the test, which proves that it does solve #4810 (comment). The added code is tricky to read. It also adds a `O(remaining_deps)` job to every activation on the happy path, slower if the `past_conflicting_activations` is not empty. I'd like some brainstorming on better solutions.

bors · 2018-03-24T17:33:50Z

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 93b0b12 to master...

remove all 4 RC clones in min_candidates. allowing it to be inlined So I was looking at a profile, and noted that `DepsFrame::min_candidates` was taking ~10% of the runtime. The odd part is that it should be a thin wrapper around `Vec::len()`, and so should be completely inlined away. Also it is the key for the `BinaryHeap` so it gets called a lot! Looking into it `remaining_siblings.clone()` clones the RC in the `RcVecIter` then `.next()` clones `T` witch is a `DepInfo` each part of which is an RC that needs to be cloned. All 4 of these RC clones can be removed, but it is apparently too much for the optimizer. So I added a 'peek' method that uses a normal reference to the inner value instead of an RC clone. After this `DepsFrame::min_candidates` does not appear in the profile results. Probably as the name is inlined away. But is the inlined code faster? before: 20000000 ticks, 104s, 192.308 ticks/ms after: 20000000 ticks, 87s, 229.885 ticks/ms So yes ~16% faster! All profiling/benchmark was done by commenting out the code from #5213 so its test case would run for a long time. But this should improve the happy path as well.

rust-highfive assigned alexcrichton Mar 20, 2018

Eh2406 force-pushed the faster_resolver branch from 930b8b6 to 74d9f6d Compare March 21, 2018 18:25

Eh2406 force-pushed the faster_resolver branch from e55cdd9 to 3c6e752 Compare March 22, 2018 15:05

alexcrichton reviewed Mar 22, 2018

View reviewed changes

Eh2406 changed the title ~~[WIP] Faster resolver: ... solution needed~~ [WIP] Faster resolver: use a inverse-index to not activate the causes of conflict Mar 22, 2018

alexcrichton reviewed Mar 23, 2018

View reviewed changes

Eh2406 added 2 commits March 23, 2018 17:23

Add a test for rust-lang#4810 (comment)

66e3d1c

A slow and ugly solution, but it works.

816203e

Eh2406 force-pushed the faster_resolver branch from fffac37 to 6779ae4 Compare March 23, 2018 21:25

Eh2406 changed the title ~~[WIP] Faster resolver: use a inverse-index to not activate the causes of conflict~~ Faster resolver: use a inverse-index to not activate the causes of conflict Mar 23, 2018

alexcrichton reviewed Mar 24, 2018

View reviewed changes

Eh2406 added 6 commits March 24, 2018 11:31

problematic extension and solution

efefe00

maintain an inverse-index

0f42afa

test the new backtracking does all of its bookkeeping

91c062e

remove some loops

f86346c

add comments

aac04b6

Move caching to a support struct and file

997b3dc

address suggestions

a7a1341

Eh2406 force-pushed the faster_resolver branch from f280b79 to a7a1341 Compare March 24, 2018 15:36

bors merged commit a7a1341 into rust-lang:master Mar 24, 2018

Eh2406 mentioned this pull request Mar 30, 2018

cargo could use a persistent hash map in it's resolver orium/rpds#6

Closed

Eh2406 deleted the faster_resolver branch April 4, 2018 02:25

Eh2406 mentioned this pull request Jun 10, 2018

remove all 4 RC clones in min_candidates. allowing it to be inlined #5625

Merged

Eh2406 mentioned this pull request Jul 26, 2018

Infinite cycle in depgraph resolution #4810

Closed

Eh2406 mentioned this pull request Oct 24, 2018

Tracking issue for RFC 1977: public & private dependencies, as it relates to the resolver #6129

Open

ehuss added this to the 1.26.0 milestone Feb 6, 2022

Faster resolver: use a inverse-index to not activate the causes of conflict #5213

Faster resolver: use a inverse-index to not activate the causes of conflict #5213

Conversation

Eh2406 commented Mar 20, 2018 • edited Loading

rust-highfive commented Mar 20, 2018

alexcrichton commented Mar 21, 2018

Eh2406 commented Mar 21, 2018

Eh2406 commented Mar 21, 2018

Eh2406 commented Mar 22, 2018

Eh2406 commented Mar 22, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Eh2406 commented Mar 23, 2018 • edited Loading

alexcrichton left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Eh2406 commented Mar 23, 2018

Eh2406 commented Mar 23, 2018

alexcrichton left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Eh2406 commented Mar 24, 2018

alexcrichton commented Mar 24, 2018

bors commented Mar 24, 2018

bors commented Mar 24, 2018

bors commented Mar 24, 2018

Eh2406 commented Mar 20, 2018 •

edited

Loading

Eh2406 commented Mar 22, 2018 •

edited

Loading

Eh2406 commented Mar 23, 2018 •

edited

Loading