Multicore collapsing #128
Conversation
I'm pretty sure Travis is failing only because I don't know how to get it to run.
Should just be a matter of adding `script: cargo test --no-fail-fast --verbose --all -- --test-threads=1` just below. I'll try to get around to reviewing this some time this week :)
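For context, the suggested line would sit under the `script:` key of the repository's `.travis.yml`. A minimal sketch (the surrounding `language` key is an assumption about the rest of the file, not taken from this PR):

```yaml
language: rust
# run the whole workspace's tests on a single thread so the
# global test logger isn't trampled by concurrent tests
script: cargo test --no-fail-fast --verbose --all -- --test-threads=1
```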
Good stuff. Definitely take your time. In the meantime I'll try to fix the …
force-pushed from 4f3d473 to 762e213
FYI - Don't take a look at this yet. Still working on getting the commits split out. Will shoot you a note once that's finished.
force-pushed from 355517b to 1e7277d
Ok - I think that should be much better (although probably not perfect). The substantive commit is …
I'm still seeing basically all of those code blocks moved around in 3a3e669. Am I missing something?
Ah - missed the bins, I think. One sec.
force-pushed from 1e7277d to d3429d5
Ok - This should be better.
This is looking really good now! I left a few inline comments, but nothing too bad I think!
Co-Authored-By: Jon Gjengset <jon@thesquareplanet.com>
...to separate line.
Ok. I think I'm done working on the most recent comments (assuming the CI keeps passing). There are still a couple of open conversations (see above), but hopefully I addressed everything else at least somewhat adequately. Looking forward to your feedback.
Loving the detailed commits -- makes it so easy to review 👍
Changes all look good to me. I think the one case highlighted above should be made an error as you suggest, and I think we should probably not implement `Clone` as discussed in the other comment chain, but beyond that I think this is ready to land 🎉
Pending CI, I think all is incorporated!
Using your machines for the benchmarks, you should probably update the …
Merged! 🎉
So after a bit more time working on this, I got `collapse-perf`, `collapse-dtrace`, and `collapse-guess` to all work with multiple cores.

The speedup is pretty good. The collapse/perf benchmark on my machine went from 186 MiB/s throughput to 421 MiB/s (roughly a 2.3× speedup).
All the tests pass. That said, because we're testing log messages, you need to run the test suite with `cargo test -- --test-threads=1` to ensure the global test logger is not trampled by multiple tests running at the same time.

There are a lot of changes here, so I'll give you a chance to look over the code. Of course, if you have any questions at any time, please don't hesitate to reach out.
At a high level, what has changed is that we now have two code paths for each folder: a "single-threaded" code path, which hasn't changed, and a new "multi-threaded" code path. Which one is taken depends on a configuration option (`nthreads`).

In the multi-threaded code path, we do an initial pass over the data in order to figure out how to chunk it up. I've tried to make this initial pass as efficient as possible, since it's extra work that isn't needed in the single-threaded case.
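As a rough sketch of what such a chunking pass could look like (the function name and the exact splitting policy here are illustrative assumptions, not the PR's actual code): split the input into roughly `nthreads` pieces, extending each piece forward to the next newline so no record straddles two chunks.

```rust
/// Split `input` into about `nthreads` chunks, only ever cutting at
/// newline boundaries so each chunk contains whole lines.
/// (Hypothetical sketch, not the crate's real API.)
fn chunk_at_newlines(input: &[u8], nthreads: usize) -> Vec<&[u8]> {
    let approx = (input.len() / nthreads.max(1)).max(1);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < input.len() {
        // tentative chunk end, then advance until we sit just past a '\n'
        let mut end = (start + approx).min(input.len());
        while end < input.len() && input[end - 1] != b'\n' {
            end += 1;
        }
        chunks.push(&input[start..end]);
        start = end;
    }
    chunks
}

fn main() {
    let input = b"func_a;func_b 1\nfunc_a 2\nfunc_c 3\n";
    for chunk in chunk_at_newlines(input, 2) {
        println!("{:?}", std::str::from_utf8(chunk));
    }
}
```

Because only references into the original buffer are handed out, the pass itself allocates almost nothing beyond the chunk list.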
After that's done, we spin up several threads and send the chunked input data to each (actually, we just send a reference). Each thread then uses the single-threaded code path on its chunk.
To bring all the output together, instead of getting back output from each thread and then "reducing" those outputs, I elected to just use a concurrent hashmap, so there's no need to collate the results after each thread has done its work. It's sufficient to write out the contents of the shared hashmap in the same way we were writing out the contents of the single-threaded hashmap in the old code.
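A minimal sketch of this shared-map design, with two loud assumptions: a `Mutex<HashMap>` stands in for whatever concurrent hashmap the PR actually uses, and the per-line counting stands in for the real folding logic.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Fold every chunk into one shared map. Each thread writes its own
/// chunk's counts directly into the map, so no merge step is needed
/// once the threads finish. (Illustrative stand-in, not the PR's code.)
fn collapse_chunks(chunks: &[&str]) -> HashMap<String, u64> {
    let occurrences = Mutex::new(HashMap::new());
    thread::scope(|s| {
        let occ = &occurrences;
        for &chunk in chunks {
            s.spawn(move || {
                for line in chunk.lines() {
                    // stand-in for the real per-chunk folding work
                    *occ.lock().unwrap().entry(line.to_string()).or_insert(0) += 1;
                }
            });
        }
    });
    occurrences.into_inner().unwrap()
}

fn main() {
    let counts = collapse_chunks(&["a\nb\na\n", "a\nc\n"]);
    println!("{counts:?}");
}
```

With a real concurrent map, threads updating distinct keys would not contend on a single lock the way this `Mutex` stand-in does.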
It might be interesting to experiment later with a model that doesn't use a concurrent hashmap, but instead gives each thread its own "regular" hashmap and then reduces them at the end. I'm not sure whether that would be faster or slower.
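The alternative could be sketched like this (again with hypothetical names and per-line counting standing in for the real folding logic): each thread builds a private map over its chunk, and the per-thread maps are merged afterwards.

```rust
use std::collections::HashMap;
use std::thread;

/// Each thread folds its chunk into a private map; the maps are then
/// "reduced" into one. (Illustrative sketch, not the PR's code.)
fn collapse_then_reduce(chunks: &[&str]) -> HashMap<String, u64> {
    let partials: Vec<HashMap<String, u64>> = thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|&chunk| {
                s.spawn(move || {
                    let mut local = HashMap::new();
                    for line in chunk.lines() {
                        *local.entry(line.to_string()).or_insert(0) += 1;
                    }
                    local
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });
    // the reduce step: fold every per-thread map into one
    let mut merged = HashMap::new();
    for partial in partials {
        for (stack, count) in partial {
            *merged.entry(stack).or_insert(0) += count;
        }
    }
    merged
}

fn main() {
    let counts = collapse_then_reduce(&["a\nb\na\n", "a\nc\n"]);
    println!("{counts:?}");
}
```

The trade-off is lock-free writes during the parallel phase versus an extra sequential merge at the end, so which wins likely depends on how many distinct stacks the input contains.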