Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo could use a persistent hash map in it's resolver #6

Closed
Eh2406 opened this issue Jan 17, 2018 · 11 comments
Closed

cargo could use a persistent hash map in it's resolver #6

Eh2406 opened this issue Jan 17, 2018 · 11 comments

Comments

@Eh2406
Copy link

Eh2406 commented Jan 17, 2018

Hi,

I was reviewing the code in cargo's resolver when I came across this line:
"We should switch to persistent hash maps if we can..."
and thought didn't I just see a persistent hash map crate.

So if/when this crate is ready, perhaps a pr thare?

Just my 2c.

@orium
Copy link
Owner

orium commented Mar 30, 2018

I saw a reddit post about cargo's resolver and its performance and it reminded me of this. I decided to give this a go and I just now realized that you are the person posted that :)

Anyway, I decided to give it a try and it seems to work fine. You can take a look at it here.

I also did a very basic benchmark using rustc as an example and this is what I got:

Benchmark Time (ms)
cargo master branch (4389f620) 3325
with rpds 3423
with rpds but using Rc instead of Arc 3172

This was measured with cd rust/src && CARGO_PROFILE=1 cargo generate-lockfile -v | grep resolving.

As you can see, in this basic benchmark, the performance of rpds is worse than before, but when using Rc instead of Arc the performance is better. It can very well be the case that for a more complex dependency graph the normal rpds (with Arc) is faster, but it doesn't happen in this case.

It is, however, planned for rpds to support both Rc and Arc in the future (see #7). There is also some potential gain with another performance improvement, described in #9.

I would like to perform a more complex benchmark, which I saw you did (by looking at your reddit post). Can you tell me what you methodology was?

@Eh2406
Copy link
Author

Eh2406 commented Mar 30, 2018

Thanks for looking it this! That diff looks manageable to me. There are 2 benchmarks. The first is a "large but straight forward" test usually servo or rustc. This one can get faster or slower it is all ok, but it does need to be checked as it is really easy to make it 10x slower witch is unacceptable. The other test is how fast can it get to 10M ticks for a graf that it cannot resolve. Unfortunately, with rust-lang/cargo#5213 we are fresh out of examples of things that do not terminate. One alternative is to comment out the if let Some(known_related_bad_deps) = part so that rust-lang/cargo#4810 (comment) is back to not terminateting.

Also I wonder if the Rc<Vec<Summary>> can be replaced with a persistent {vec, queue, set, something}.

@Eh2406
Copy link
Author

Eh2406 commented Mar 30, 2018

Also I usually got more targeted timing numbers with code like:

            if ticks % 1000000 == 0 {
                let e = start.elapsed();
                println!("{} ticks, {}s, {:.3} ticks/ms, {}, {}, {}",
                         ticks,
                         e.as_secs(),
                         ticks as f32 / (e.as_secs() as f32 * 1000.0),
                         cx.activations.iter().map(|x| x.1.len()).sum::<usize>(),
                         past_conflicting_activations.iter().map(|x| x.1.len()).sum::<usize>(),
                         remaining_deps.len(),
                );
            }

added after !printed && ticks % 1000 == 0

@orium
Copy link
Owner

orium commented Mar 30, 2018

Thank you for your tips!

I did some further benchmarking and I'm going to document it here. First thing I need to say it that my previous benchmark is meaningless because it included network i/o and it was a single run.

I added a new profiler to measure only the main resolver loop. I used this script to run the benchmark. I used 20 iterations.

Benchmarking servo Median Time (ms) Average Time (ms)
cargo master branch (4389f620) 1870 2640
with rpds 1695 2805
with rpds but using Rc instead of Arc 1339 3221

These results are discouraging but I followed @Eh2406 suggestion and comment out the if let Some(known_related_bad_deps) = part and added his code to output some stats. I then tested this with the Cargo.toml shown in rust-lang/cargo#4810 (comment). This was what I got:

Benchmark Time @ 20 Mticks (s)
cargo master branch (4389f620) 61
with rpds 60
with rpds but using Rc instead of Arc 54

So, there was only a small improvement here with Rc (pending on #7).

I'm not ready to give up on this since there is at least on performance related improvements that might this worthwhile (#14). I will re-evaluate this when #14 lands.

@Eh2406
Copy link
Author

Eh2406 commented Mar 30, 2018

That is actually all very encouraging. As some background before rust-lang/cargo#5147 Activations::clone was most of any profile of the resolver, after it is just a part. So I am not surprised that now this is not a "night vs day" change. Also since then each tick dose more work so times are not comparable between now and then.

For the servo benchmark that +28% in Median Time is grate, the -22% on Average Time is concerning but the big question for the servo case is 10x regressions, so 3.2sec vs 2.6sec is probably ok.

For the other case that is a +11% improvement. That is nothing to sneeze at, given all the work that has gone into optimizing the resolver.

Sounds like it is worth talking to the cargo team (about adding the dependency) when #7 adds a way to use Rc.

Definitely worth continued exploration of questions. How much difrants dose #14 make? Are there more Rc<things> that should ues rpds structures? Are there things that get cloned into a BacktrackFrame that should ues rpds structures?

@orium
Copy link
Owner

orium commented Mar 30, 2018

How much difrants dose #14 make?

In general it is hard to say, because it will depend on the amount of sharing in the data structures. There is one place where it is definitly going to be significantly faster (not sure if this piece of code runs a lot of times).

Are there more Rc that should ues rpds structures? Are there things that get cloned into a BacktrackFrame that should ues rpds structures?

From a quick look it seems so! There is RemainingCandidates::conflicting_prev_active and BacktrackFrame::conflicting_activations. Tomorrow I will check if there is any speed improvements in making these maps persistent.

@Eh2406
Copy link
Author

Eh2406 commented Mar 30, 2018

I am excited by this progress. Unfortunately I will not have time to look at my computer this weekend.

@Marwes
Copy link
Contributor

Marwes commented Apr 1, 2018

orium/cargo@4007c38#diff-2305770923daa439ab4f47bb3f1abf5cR372

Since the keys and values stored in the hashmaps are Arcs (PackageId/HashTrieSet) or small/cheaply cloneable (InternedString) I wouldn't be surprised if #7 (comment) would help (skip boxing keys and values in another Arc inside rpds). Depending on how much time is spent doing hashlookups the extra pointer chasing may end up fairly significant and avoiding the allocations would also be nice.

@Eh2406
Copy link
Author

Eh2406 commented Apr 2, 2018

Ok, hear is how I see it from reading this thread. @orium has demonstrated that the diff to mvp of resolver-with-persistent-maps is not big. That mvp with a solution to #7 (using RC) gets a positive change.

Once we have the mvp, there is low hanging fruit. #7 (comment), and #14 on the rpds side and moving more structures on the cargo side.

@orium
Copy link
Owner

orium commented Apr 2, 2018

Once we have the mvp, there is low hanging fruit. #7 (comment), and #14 on the rpds side and moving more structures on the cargo side.

I'm working on #14, in particular the HashTrieMap. I think we will be able to see how this affects cargo's performance soon.

@orium
Copy link
Owner

orium commented Jan 31, 2021

Closing since cargo is now using im.

@orium orium closed this as completed Jan 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants