Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster resolver: use a inverse-index to not activate the causes of conflict #5213

Merged
merged 9 commits into from
Mar 24, 2018

Conversation

Eh2406
Copy link
Contributor

@Eh2406 Eh2406 commented Mar 20, 2018

This adds a test for #4810 (comment) with two extensions that make it harder. It is the last reproducible and in the wild exponentially slow resolution (that I have found).

The problem in the test is backtrack_trap0 = "*" is a lot of ways of saying constrained = ">=1.1.0, <=2.0.0" but constrained= "2.0.1" is already picked. Only then to try and solve constrained= "~1.0.0" which is incompatible. Our parent knows that we have been poisoned, and wont try to activate us again. Because of the order we evaluate deps we end up backtracking to where constrained: 1.1.0 is set instead of our parent. And so the poisoning does not help. This is harder then #4810 (comment) because:

  1. Having multiple semver compatible versions of constrained in play makes for a lot more bookkeeping. Specifically bookkeeping I forgot when I first submitted this PR.
  2. The problematic dependencies are added deep in a combinatorial explosion of possibilities. So if we don't correctly handle caching that backtrack_trap0 = "*" is doomed then we will never finish looking thru the different possibilities for level0 = "*"

This PR also includes a proof of concept solution for the test, which proves that it does solve #4810 (comment). The added code is tricky to read. It also adds a O(remaining_deps) job to every activation on the happy path, slower if the past_conflicting_activations is not empty.

I'd like some brainstorming on better solutions.

@rust-highfive
Copy link

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@alexcrichton
Copy link
Member

I wonder, would it be possible to shift more work from the "just after we activated a dependency see if it's doomed to fail" to the "we've exhausted a list of candidates, we may backtrack" area? I think that way we wouldn't ever have to worry too much about performance (as backtracking never happens on the happy path).

I"m not sure how feasible this is though!

@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 21, 2018

So thinking about that, some kind of inverse-index. At a minimum a set of all package_id that have ever appeared in a conflicting_activations. Then we only do the O(remaining_deps) operation if we are in the set, and thus never do it on the happy path.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 21, 2018

I'm now nervous that this does not get the correct answer for cases that involve conflicting_activations with len > 1. But don't have the cycle to confirm it with a test case.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 22, 2018

Updated the test so that it requires multiple things to be activated to cause backtracking. Indeed it fails with the proof of concept solution. (Failing to find a solution when one is available.) The test passes on master if DEPTH and BRANCHING_FACTOR are set very small, if not small enough it runs ~forever.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 22, 2018

That was surprisingly easy to fix, just conflicting_activations.extend(conflict). So I think we have:

  • does it make sense? (think yes)
  • is there a better way of doing it? (feel like yes, but have not thought of it yet)
  • is the big O acceptable? (think just barely yes)
  • is it clear what it does from the code? (no)
  • is it commented so someone else can pick it up? (no)

@alexcrichton what do you think?

@@ -987,6 +987,10 @@ fn activate_deps_loop(
Vec<HashMap<PackageId, ConflictReason>>,
> = HashMap::new();

// `past_conflict_triggers` is an inverse-index of `past_conflicting_activations`.
// For every `PackageId` this lists the `Dependency`s that mention it in `past_conflicting_activations`.
let mut past_conflict_triggers: HashMap<PackageId, HashSet<Dependency>> = HashMap::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function is already pretty tricky, perhaps this along with past_conflicting_activations could be moved to a separate data structure so they're managed in sync?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good thought, It will at least move the large comment on the structure of the cache out of activate_deps_loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on it, should be done tomorrow some time.

@@ -1255,10 +1265,41 @@ fn activate_deps_loop(
.flat_map(|x| x)
.find(|con| cx.is_conflicting(None, con))
{
conflicting_activations.extend(conflicting.clone());
let mut conflicting = conflicting.clone();
conflicting.remove(&pid);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment for this is removed? To me it's not immediately obvious but I've always got a hard time following the specifics here anyway!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and then you can help me rewrite it so someone else can understand it. :-P

.and_then(|past_bad| {
past_bad.iter().find(|con| {
con.contains_key(&pid)
&& cx.is_conflicting(None, con)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the same as cx.is_conflicting(Some(&pid), con), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, no. That checks if pid is active. (We know it is. We just activated it.) This checks if the past conflict involves pid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, indeed!

if let Some(rel_deps) = past_conflict_triggers.get(&pid) {
'deps: for debs in remaining_deps.iter() {
for (_, (other_dep, _, _)) in debs.remaining_siblings.clone() {
if rel_deps.contains(&other_dep) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've personally found a convenient way to structure these sorts of loops to heavily use continue to avoid rightward drift, for example:

for (_, (other_dep, _, _)) in ... {
    if !rel_deps.contains(&other_dep) {
        continue;
    }
    // ...
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also to help address this we may want to try to more aggressively move out portions of this function into helper functions maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya the 6 layers of nesting is geting to be vary code-smelly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have replaced the loops with flat_map \ filter_map \ filter whether it is more readable, I don't know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

has_past_conflicting_dep = true;
}
}
if !has_past_conflicting_dep {
// TODO: this is ugly, replace!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'm gonna do my best to rephrase what's happening here in english and you can tell me how wrong I am!

Here we're iterating over all remaining work we have to do (aka all of remaining_deps). We're looking at every single remaining Dependency left to work with. One of those Dependency structures may be known to fail because of a known set of conflicting activated packages. If that conflicting set includes the candidate we just activated and we've activated everything in the set then we know this Dependency is going to fail.

Since we know this Dependency is going to fail we want to extend our known set of conflicting activations. These conflicting activations are for when our own candidate can't be activated, and we've just deduced that activating ourselves will cause something else to fail.

After this though... I think I get a little lost. We take this conflicting set for some remote dependency, we remove ourselves from it, and we insert that remote dependency's parent with the reason we just removed. How come we do the swich here?

In any case we finally reach conflicting_activations.extend(conflict) after all that. If we entirely fail activation then this ensures that conflicting_activations contains an exhaustive set as to the reasons why we failed to activated.

I think this all makes sense? It seems plausible at least!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly correct! The switch is just a kinda change in perspective.

The conflict we found is "other dep will not succeed if we are activated." We want to add "our dep will not succeed if other dep is in remaining_deps" but that is not how the cache is set up. So we add the less general but much faster, "our dep will not succeed if other dep's parent is activated".

We can remove ourselves from the list because we know that if our dep is being looked at then we are going to be activated. We have to remove ourselves from the list as when our parent considers if we will succeed we won't have been activated yet.

It seems plausible at least!

This resolver things is complicated enough, and I have been wrong about things I was sure of enough, that we should not merge unless we are sure it is correct and the comments make it possible to pick it up agen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I think that definitely makes sense. I think it's a direct implication that if the other dep's parent is activated that means that other dep will eventually be in remaining_deps, which means that if we're activated again it'll cause that to fail.

That definitely cleared things up for me! I'm definitely much more confident in this now

@Eh2406 Eh2406 changed the title [WIP] Faster resolver: ... solution needed [WIP] Faster resolver: use a inverse-index to not activate the causes of conflict Mar 22, 2018
@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 23, 2018

Suggestions incorporated. I'd love more feedback, especially better wording for comments.

If it looks good to you, I still need to fix the commit history/messages.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok mostly just nits and such, but otherwise looks great!

conflicting_activations.extend(conflicting.clone());
let mut conflicting = conflicting.clone();

// If one of our deps conflicts with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a missing word after "with"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Replaced with "known unresolvable"

})
.next()
{
let mut conflict = conflict.clone();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the clone here can be avoided, right? We could use extend with filter to remove pid and then later insert manually other_parent into conflicting_activations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, but I can't get the iter to match correctly.

conflicting_activations.extend(conflicting.iter().filter(|&(p, _)| p == &pid));

gives:


error[E0271]: type mismatch resolving `<std::iter::Filter<std::collections::hash_map::Iter<'_, core::package_id::PackageId, core::resolver::ConflictReason>, [closure@src/cargo\core\resolver\mod.rs:1228:86: 1228:105 pid:_]> as std::iter::IntoIterator>::Item == (core::package_id::PackageId, core::resolver::ConflictReason)`
    --> src/cargo\core\resolver\mod.rs:1228:53
     |
1228 |                             conflicting_activations.extend(conflicting.iter().filter(|&(p, _)| p == &pid));
     |                                                     ^^^^^^ expected reference, found struct `core::package_id::PackageId`
     |
     = note: expected type `(&core::package_id::PackageId, &core::resolver::ConflictReason)`
                found type `(core::package_id::PackageId, core::resolver::ConflictReason)`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reading the error message backwards. added a:
.map(|(p, r)| (p.clone(), r.clone()) and it works.

}
}
}
pub(super) fn get_dep_from_pid(&self, pid: &PackageId) -> Option<&HashSet<Dependency>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be named dependencies_conflicting_with?

) -> Option<&HashMap<PackageId, ConflictReason>> {
self.filter_conflicting(cx, dep, |_| true)
}
pub(super) fn insert(&mut self, dep: &Dependency, con: &HashMap<PackageId, ConflictReason>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could a doc comment be added here saying that we're adding a conflict where dep is known to be unresolvable if all the PackageId entries are activated?

}

impl ConflictCache {
pub(super) fn new() -> ConflictCache {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I think it's fine to just use pub here rather than pub(super)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops yes, removed from the methods. If I remove it from the type, I get:
private type `core::resolver::ConflictReason` in public interface on the methods. So I got confused.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, ok no worries!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I made a separate file, as IMO mod.rs is getting long. I'd prefer to move more out, but for now this seemed like a good chunk.

where
for<'r> F: FnMut(&'r &HashMap<PackageId, ConflictReason>) -> bool,
{
self.con_from_dep.get(dep).and_then(|past_bad| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be clearer to use ? to propagate the None from get

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

dep_from_pid: HashMap::new(),
}
}
pub(super) fn filter_conflicting<F>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this could be called find_conflicting with a doc comment saying it finds any known set of conflicts, if any, which are activated in cx and pass the filter specified?

Box::new(
self.remaining_siblings
.clone()
.map(move |(_, (d, _, _))| (self.parent.package_id().clone(), d)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we worry about perf at some point I think the clone() here could be deferred to a later point probably

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an Arc clone, and I could not convinced the borrow checker that it lived long enough without it.
edit: now it works. good call.

// ourselves are incompatible with that dep, so we know that deps
// parent conflict with us.
if !has_past_conflicting_dep {
if let Some(rel_deps) = past_conflicting_activations.get_dep_from_pid(&pid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rel_deps here may be better named something like known_bad_deps perhaps?

@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 23, 2018

Another round of suggestions incorporated.

If it looks good to you, I still need to fix the commit history/messages.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 23, 2018

The history is ugly, but I have not succeeded at cleaning it up. I am ok with either scwashing or leaving it. At this point I'd rather have it in the public record than look at it more. :-P

The odd rebase error is fixed. Part of one commit was placed in a totally different test, and I am sure it was in the correct test before the rebase.

@Eh2406 Eh2406 changed the title [WIP] Faster resolver: use a inverse-index to not activate the causes of conflict Faster resolver: use a inverse-index to not activate the causes of conflict Mar 23, 2018
Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me! Did you want to squash things a bit though?

}
}
}
pub fn dependencies_conflicting_with(&self, pid: &PackageId) -> Option<&HashSet<Dependency>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a random drive-by-comment, it might be neat to store an empty HashSet<Dependency> on this structure so you could do something like:

self.dep_from_pid.get(pid).unwrap_or(&self.empty)

use core::{Dependency, PackageId};
use core::resolver::{ConflictReason, Context};

pub(super) struct ConflictCache {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this pub(super) can change to pub like the ones below, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove it from the type, I get:
private type `core::resolver::ConflictReason` in public interface on the methods.
I could just make everything pub, but these are all just implementation details of core::resolver so I think pub(super) is appropriate.

@Eh2406
Copy link
Contributor Author

Eh2406 commented Mar 24, 2018

History is slightly cleaner. I am ok for merge.

When this hits nightly I will make an announcement on /r/rust, users, and internals asking people to test if cargo +nightly generate-lockfile gets the same result as cargo +stable generate-lockfile.

@alexcrichton
Copy link
Member

@bors: r+

Sounds great!

@bors
Copy link
Contributor

bors commented Mar 24, 2018

📌 Commit a7a1341 has been approved by alexcrichton

@bors
Copy link
Contributor

bors commented Mar 24, 2018

⌛ Testing commit a7a1341 with merge 93b0b12...

bors added a commit that referenced this pull request Mar 24, 2018
Faster resolver: use a inverse-index to not activate the causes of conflict

This adds a test for #4810 (comment) with two extensions that make it harder. It is the last reproducible and in the wild exponentially slow resolution (that I have found).

The problem in the test is `backtrack_trap0 = "*"` is a lot of ways of saying `constrained = ">=1.1.0, <=2.0.0"` but `constrained= "2.0.1"` is already picked. Only then to try and solve `constrained= "~1.0.0"` which is incompatible. Our parent knows that we have been poisoned, and wont try to activate us again.  Because of the order we evaluate deps we end up backtracking to where `constrained: 1.1.0` is set instead of our parent. And so the poisoning does not help. This is harder then #4810 (comment) because:

1. Having multiple semver compatible versions of constrained in play makes for a lot more bookkeeping. Specifically bookkeeping I forgot when I first submitted this PR.
2. The problematic dependencies are added deep in a combinatorial explosion of possibilities. So if we don't correctly handle caching that `backtrack_trap0 = "*"` is doomed then we will never finish looking thru the different possibilities for `level0 = "*"`

This PR also includes a proof of concept solution for the test, which proves that it does solve #4810 (comment). The added code is tricky to read. It also adds a `O(remaining_deps)` job to every activation on the happy path, slower if the `past_conflicting_activations` is not empty.

I'd like some brainstorming on better solutions.
@bors
Copy link
Contributor

bors commented Mar 24, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: alexcrichton
Pushing 93b0b12 to master...

@bors bors merged commit a7a1341 into rust-lang:master Mar 24, 2018
@Eh2406 Eh2406 deleted the faster_resolver branch April 4, 2018 02:25
bors added a commit that referenced this pull request Jun 27, 2018
remove all 4 RC clones in min_candidates. allowing it to be inlined

So I was looking at a profile, and noted that `DepsFrame::min_candidates` was taking ~10% of the runtime. The odd part is that it should be a thin wrapper around `Vec::len()`, and so should be completely inlined away. Also it is the key for the `BinaryHeap` so it gets called a lot! Looking into it `remaining_siblings.clone()` clones the RC in the `RcVecIter` then `.next()` clones `T` witch is a `DepInfo` each part of which is an RC that needs to be cloned. All 4 of these RC clones can be removed, but it is apparently too much for the optimizer. So I added a 'peek' method that uses a normal reference to the inner value instead of an RC clone. After this `DepsFrame::min_candidates` does not appear in the profile results. Probably as the name is inlined away. But is the inlined code faster?

before: 20000000 ticks, 104s, 192.308 ticks/ms
after:    20000000 ticks, 87s,   229.885 ticks/ms

So yes ~16% faster!

All profiling/benchmark was done by commenting out the code from #5213 so its test case would run for a long time. But this should improve the happy path as well.
@ehuss ehuss added this to the 1.26.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants