cargo could use a persistent hash map in it's resolver #6

Eh2406 · 2018-01-17T00:32:52Z

Hi,

I was reviewing the code in cargo's resolver when I came across this line:
"We should switch to persistent hash maps if we can..."
and thought didn't I just see a persistent hash map crate.

So if/when this crate is ready, perhaps a pr thare?

Just my 2c.

orium · 2018-03-30T01:28:34Z

I saw a reddit post about cargo's resolver and its performance and it reminded me of this. I decided to give this a go and I just now realized that you are the person posted that :)

Anyway, I decided to give it a try and it seems to work fine. You can take a look at it here.

I also did a very basic benchmark using rustc as an example and this is what I got:

Benchmark	Time (ms)
cargo master branch (4389f620)	3325
with rpds	3423
with rpds but using `Rc` instead of `Arc`	3172

This was measured with cd rust/src && CARGO_PROFILE=1 cargo generate-lockfile -v | grep resolving.

As you can see, in this basic benchmark, the performance of rpds is worse than before, but when using Rc instead of Arc the performance is better. It can very well be the case that for a more complex dependency graph the normal rpds (with Arc) is faster, but it doesn't happen in this case.

It is, however, planned for rpds to support both Rc and Arc in the future (see #7). There is also some potential gain with another performance improvement, described in #9.

I would like to perform a more complex benchmark, which I saw you did (by looking at your reddit post). Can you tell me what you methodology was?

Eh2406 · 2018-03-30T03:41:18Z

Thanks for looking it this! That diff looks manageable to me. There are 2 benchmarks. The first is a "large but straight forward" test usually servo or rustc. This one can get faster or slower it is all ok, but it does need to be checked as it is really easy to make it 10x slower witch is unacceptable. The other test is how fast can it get to 10M ticks for a graf that it cannot resolve. Unfortunately, with rust-lang/cargo#5213 we are fresh out of examples of things that do not terminate. One alternative is to comment out the if let Some(known_related_bad_deps) = part so that rust-lang/cargo#4810 (comment) is back to not terminateting.

Also I wonder if the Rc<Vec<Summary>> can be replaced with a persistent {vec, queue, set, something}.

Eh2406 · 2018-03-30T14:51:26Z

Also I usually got more targeted timing numbers with code like:

            if ticks % 1000000 == 0 {
                let e = start.elapsed();
                println!("{} ticks, {}s, {:.3} ticks/ms, {}, {}, {}",
                         ticks,
                         e.as_secs(),
                         ticks as f32 / (e.as_secs() as f32 * 1000.0),
                         cx.activations.iter().map(|x| x.1.len()).sum::<usize>(),
                         past_conflicting_activations.iter().map(|x| x.1.len()).sum::<usize>(),
                         remaining_deps.len(),
                );
            }

added after !printed && ticks % 1000 == 0

orium · 2018-03-30T18:48:31Z

Thank you for your tips!

I did some further benchmarking and I'm going to document it here. First thing I need to say it that my previous benchmark is meaningless because it included network i/o and it was a single run.

I added a new profiler to measure only the main resolver loop. I used this script to run the benchmark. I used 20 iterations.

Benchmarking servo	Median Time (ms)	Average Time (ms)
cargo master branch (4389f620)	1870	2640
with rpds	1695	2805
with rpds but using `Rc` instead of `Arc`	1339	3221

These results are discouraging but I followed @Eh2406 suggestion and comment out the if let Some(known_related_bad_deps) = part and added his code to output some stats. I then tested this with the Cargo.toml shown in rust-lang/cargo#4810 (comment). This was what I got:

Benchmark	Time @ 20 Mticks (s)
cargo master branch (4389f620)	61
with rpds	60
with rpds but using `Rc` instead of `Arc`	54

So, there was only a small improvement here with Rc (pending on #7).

I'm not ready to give up on this since there is at least on performance related improvements that might this worthwhile (#14). I will re-evaluate this when #14 lands.

Eh2406 · 2018-03-30T19:58:06Z

That is actually all very encouraging. As some background before rust-lang/cargo#5147 Activations::clone was most of any profile of the resolver, after it is just a part. So I am not surprised that now this is not a "night vs day" change. Also since then each tick dose more work so times are not comparable between now and then.

For the servo benchmark that +28% in Median Time is grate, the -22% on Average Time is concerning but the big question for the servo case is 10x regressions, so 3.2sec vs 2.6sec is probably ok.

For the other case that is a +11% improvement. That is nothing to sneeze at, given all the work that has gone into optimizing the resolver.

Sounds like it is worth talking to the cargo team (about adding the dependency) when #7 adds a way to use Rc.

Definitely worth continued exploration of questions. How much difrants dose #14 make? Are there more Rc<things> that should ues rpds structures? Are there things that get cloned into a BacktrackFrame that should ues rpds structures?

orium · 2018-03-30T20:33:01Z

How much difrants dose #14 make?

In general it is hard to say, because it will depend on the amount of sharing in the data structures. There is one place where it is definitly going to be significantly faster (not sure if this piece of code runs a lot of times).

Are there more Rc that should ues rpds structures? Are there things that get cloned into a BacktrackFrame that should ues rpds structures?

From a quick look it seems so! There is RemainingCandidates::conflicting_prev_active and BacktrackFrame::conflicting_activations. Tomorrow I will check if there is any speed improvements in making these maps persistent.

Eh2406 · 2018-03-30T20:58:59Z

I am excited by this progress. Unfortunately I will not have time to look at my computer this weekend.

Marwes · 2018-04-01T21:17:39Z

orium/cargo@4007c38#diff-2305770923daa439ab4f47bb3f1abf5cR372

Since the keys and values stored in the hashmaps are Arcs (PackageId/HashTrieSet) or small/cheaply cloneable (InternedString) I wouldn't be surprised if #7 (comment) would help (skip boxing keys and values in another Arc inside rpds). Depending on how much time is spent doing hashlookups the extra pointer chasing may end up fairly significant and avoiding the allocations would also be nice.

Eh2406 · 2018-04-02T17:18:38Z

Ok, hear is how I see it from reading this thread. @orium has demonstrated that the diff to mvp of resolver-with-persistent-maps is not big. That mvp with a solution to #7 (using RC) gets a positive change.

Once we have the mvp, there is low hanging fruit. #7 (comment), and #14 on the rpds side and moving more structures on the cargo side.

orium · 2018-04-02T17:23:58Z

Once we have the mvp, there is low hanging fruit. #7 (comment), and #14 on the rpds side and moving more structures on the cargo side.

I'm working on #14, in particular the HashTrieMap. I think we will be able to see how this affects cargo's performance soon.

orium · 2021-01-31T21:18:33Z

Closing since cargo is now using im.

Eh2406 mentioned this issue Jun 6, 2018

cargo could use a persistent hash map in it's resolver bodil/im-rs#26

Closed

orium closed this as completed Jan 31, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cargo could use a persistent hash map in it's resolver #6

cargo could use a persistent hash map in it's resolver #6

Eh2406 commented Jan 17, 2018

orium commented Mar 30, 2018 •

edited

Loading

Eh2406 commented Mar 30, 2018

Eh2406 commented Mar 30, 2018

orium commented Mar 30, 2018

Eh2406 commented Mar 30, 2018

orium commented Mar 30, 2018

Eh2406 commented Mar 30, 2018

Marwes commented Apr 1, 2018 •

edited

Loading

Eh2406 commented Apr 2, 2018

orium commented Apr 2, 2018

orium commented Jan 31, 2021

cargo could use a persistent hash map in it's resolver #6

cargo could use a persistent hash map in it's resolver #6

Comments

Eh2406 commented Jan 17, 2018

orium commented Mar 30, 2018 • edited Loading

Eh2406 commented Mar 30, 2018

Eh2406 commented Mar 30, 2018

orium commented Mar 30, 2018

Eh2406 commented Mar 30, 2018

orium commented Mar 30, 2018

Eh2406 commented Mar 30, 2018

Marwes commented Apr 1, 2018 • edited Loading

Eh2406 commented Apr 2, 2018

orium commented Apr 2, 2018

orium commented Jan 31, 2021

orium commented Mar 30, 2018 •

edited

Loading

Marwes commented Apr 1, 2018 •

edited

Loading