Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Experiment] Replace HashMap with OrderMap #45282

Closed
wants to merge 3 commits into from
Closed

[Experiment] Replace HashMap with OrderMap #45282

wants to merge 3 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Oct 14, 2017

Do not merge.

This is just a simple experiment where I modified FxHashMap to use OrderMap instead of HashMap, as explained in #45273. Don't expect the code to look good. :)

cc @Mark-Simulacrum - Can we please run performance tests to see how this PR impacts compile times?
cc @bluss @kennytm

@eddyb said on IRC that we shouldn't blindly swap the implementation just yet - let's investigate a bit further. If changing the hash map implementation affects performance a lot, then we can probably gain even more by using a different data structure.

@rust-highfive
Copy link
Collaborator

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@eddyb
Copy link
Member

eddyb commented Oct 14, 2017

cc @rust-lang/compiler

@kennytm
Copy link
Member

kennytm commented Oct 14, 2017

@bors try

Why are you copying the whole package instead of depending on the crates.io version though?

@bors
Copy link
Contributor

bors commented Oct 14, 2017

⌛ Trying commit 0f01c3b with merge 41825985ccde0f03bf3ea37052b1ad4679c82355...

@kennytm kennytm added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Oct 14, 2017
@ghost
Copy link
Author

ghost commented Oct 14, 2017

@kennytm No good reason. I wasn't sure how easy or hard it'd be to add ordermap as a dependency from crates.io, so I just copy-pasted the whole crate and called it a day. Last time I worked on rustc it wasn't easy to add crates.io dependencies. But that was long time ago...

Edit: I've pushed a commit that uses ordermap from crates.io.

@estebank
Copy link
Contributor

@bors try

(with the crates.io dep now)

@bors
Copy link
Contributor

bors commented Oct 14, 2017

⌛ Trying commit 9da5d99 with merge ddc21f8b46d56234290379ed208d9b58a9e1ea42...

@kennytm
Copy link
Member

kennytm commented Oct 14, 2017

@bors try

The previous try build failed due to Cargo.lock being outdated. Together with the new commit being pushed, the error is not returned. Let's try again.

EDIT: Oh no

@bors
Copy link
Contributor

bors commented Oct 14, 2017

⌛ Trying commit 5dbe2e4 with merge 43308d4...

bors added a commit that referenced this pull request Oct 14, 2017
[Experiment] Replace HashMap with OrderMap

Do not merge.

This is just a simple experiment where I modified `FxHashMap` to use `OrderMap` instead of `HashMap`, as explained in #45273. Don't expect the code to look good. :)

cc @Mark-Simulacrum - Can we please run performance tests to see how this PR impacts compile times?
cc @bluss @kennytm

@eddyb said on IRC that we shouldn't blindly swap the implementation just yet - let's investigate a bit further. If changing the hash map implementation affects performance a lot, then we can probably gain even more by using a different data structure.
@kennytm
Copy link
Member

kennytm commented Oct 14, 2017

@bors r- try- retry clean

Try build is successful but bors is not commenting.

cc @Mark-Simulacrum

Copy link
Member

@kennytm kennytm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The run-make/reproducible-build test failed in the CI.

[00:56:24] ---- [run-make] run-make/reproducible-build stdout ----
[00:56:24] 	
[00:56:24] error: make failed
[00:56:24] status: exit code: 2
[00:56:24] command: "make"
[00:56:24] stdout:
[00:56:24] ------------------------------------------
[00:56:24] make[1]: Entering directory '/checkout/src/test/run-make/reproducible-build'
[00:56:24] LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib:" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu  reproducible-build-aux.rs
[00:56:24] LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib:" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu  reproducible-build.rs -o"/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build1"
[00:56:24] LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib:" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu  reproducible-build.rs -o"/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build2"
[00:56:24] nm "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build1" | sort > "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build1.nm"
[00:56:24] nm "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build2" | sort > "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build2.nm"
[00:56:24] cmp "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build1.nm" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build2.nm" || exit 1
[00:56:24] /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build1.nm /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/reproducible-build.stage2-x86_64-unknown-linux-gnu/reproducible-build2.nm differ: char 5265, line 108
[00:56:24] Makefile:3: recipe for target 'all' failed
[00:56:24] make[1]: Leaving directory '/checkout/src/test/run-make/reproducible-build'
[00:56:24] 
[00:56:24] ------------------------------------------
[00:56:24] stderr:
[00:56:24] ------------------------------------------
[00:56:24] make[1]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
[00:56:24] warning: ignoring --out-dir flag due to -o flag.
[00:56:24] 
[00:56:24] warning: unused variable: `dropped`
[00:56:24]   --> reproducible-build.rs:80:9
[00:56:24]    |
[00:56:24] 80 |     let dropped = Struct {
[00:56:24]    |         ^^^^^^^
[00:56:24]    |
[00:56:24]    = note: #[warn(unused_variables)] on by default
[00:56:24]    = note: to avoid this warning, consider using `_dropped` instead
[00:56:24] 
[00:56:24] warning: unused variable: `pointer_shim`
[00:56:24]    --> reproducible-build.rs:123:9
[00:56:24]     |
[00:56:24] 123 |     let pointer_shim: &Fn(i32) = &regular_fn;
[00:56:24]     |         ^^^^^^^^^^^^
[00:56:24]     |
[00:56:24]     = note: to avoid this warning, consider using `_pointer_shim` instead
[00:56:24] 
[00:56:24] warning: ignoring --out-dir flag due to -o flag.
[00:56:24] 
[00:56:24] warning: unused variable: `dropped`
[00:56:24]   --> reproducible-build.rs:80:9
[00:56:24]    |
[00:56:24] 80 |     let dropped = Struct {
[00:56:24]    |         ^^^^^^^
[00:56:24]    |
[00:56:24]    = note: #[warn(unused_variables)] on by default
[00:56:24]    = note: to avoid this warning, consider using `_dropped` instead
[00:56:24] 
[00:56:24] warning: unused variable: `pointer_shim`
[00:56:24]    --> reproducible-build.rs:123:9
[00:56:24]     |
[00:56:24] 123 |     let pointer_shim: &Fn(i32) = &regular_fn;
[00:56:24]     |         ^^^^^^^^^^^^
[00:56:24]     |
[00:56:24]     = note: to avoid this warning, consider using `_pointer_shim` instead
[00:56:24] 
[00:56:24] make[1]: *** [all] Error 1
[00:56:24] 
[00:56:24] ------------------------------------------
[00:56:24] 
[00:56:24] thread '[run-make] run-make/reproducible-build' panicked at 'explicit panic', /checkout/src/tools/compiletest/src/runtest.rs:2467:8
[00:56:24] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[00:56:24] 
[00:56:24] 
[00:56:24] failures:
[00:56:24]     [run-make] run-make/reproducible-build
[00:56:24] 
[00:56:24] test result: FAILED. 160 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
[00:56:24] 
[00:56:24] thread 'main' panicked at 'Some tests failed', /checkout/src/tools/compiletest/src/main.rs:323:21

@kennytm
Copy link
Member

kennytm commented Oct 14, 2017

The numbers don’t look good. Most of them become slower.

@ghost
Copy link
Author

ghost commented Oct 15, 2017

This is just too bad. :(
So I think the difference in performance was simply the difference of using the nightly rustc from rustup and using my own compiled rustc.

@ghost ghost closed this Oct 15, 2017
@ghost ghost deleted the experiment-ordermap branch October 15, 2017 09:54
@kennytm
Copy link
Member

kennytm commented Oct 15, 2017

@Mark-Simulacrum I want to ensure I'm reading the report correctly, as different measures give conflicting results.

measure before (total) after (total) change
cpu-clock 372185.29 369019.14 -0.85%
cycles:u 1421498864061 1409937857765 -0.81%
faults 1511256 1510959 -0.02%
instructions:u 1699836885596 1723213735874 +1.38%
max-rss 16735048 16739924 +0.03%
task-clock 370849.03 367449.54 -0.92%
wall-time 202.36 203.21 +0.42%

If we judge by "instructions", there is a significant regression, but if we judge by "cpu-clock" or "cycles", there is a signficant improvement.

@Mark-Simulacrum
Copy link
Member

Yes, instructions is usually a good indicator that it's worth looking at timing data. I don't see any significant improvement though (usually anything <1% isn't too major -- the timing looks like it changed by ~0.5 seconds which isn't too meaningful).

@ghost
Copy link
Author

ghost commented Oct 15, 2017

Hmm, so what is it that makes my locally compiled rustc 5% to 20% faster than the one that came from rustup? Does ./x.py build --stage 1 optimize for the local machine or something?

My compile times:

crate rustc provided by rustup locally compiled rustc
crossbeam 31.94 secs 26.44 secs
rayon 9.40 secs 8.20 secs
pdqsort 0.23 secs 0.18 secs
chan 6.16 secs 5.26 secs
string-cache 10.13 secs 9.71 secs
parking_lot 5.45 secs 4.92 secs
timely-dataflow 19.69 secs 16.55 secs

@Mark-Simulacrum
Copy link
Member

What is your config.toml like? On most (all?) platforms nightly rustc is shipped with LLVM assertions enabled, whereas by default they are disabled in locally compiled builds.

@ghost
Copy link
Author

ghost commented Oct 15, 2017

I assume you meant .rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/multirust-config.toml:

config_version = "1"

[[components]]
pkg = "rustc"
target = "x86_64-unknown-linux-gnu"

[[components]]
pkg = "rust-std"
target = "x86_64-unknown-linux-gnu"

[[components]]
pkg = "cargo"
target = "x86_64-unknown-linux-gnu"

[[components]]
pkg = "rust-docs"
target = "x86_64-unknown-linux-gnu"

[[components]]
pkg = "rust-analysis"
target = "x86_64-unknown-linux-gnu"

[[components]]
pkg = "rust-src"
target = "*"

Is it out of the question to disable LLVM assertions in shipped rustc? I mean, this is a pretty nice speedup... :)

@bluss
Copy link
Member

bluss commented Oct 21, 2017

I just want to learn the thinking behind it, why is it needed to look at the instructions stat before cpu-clock or cycles?

@Mark-Simulacrum
Copy link
Member

LLVM assertions being disabled in nightly builds is actually a topic of ongoing discussion (and has been for a long time, to an extent). By config.toml, I meant the file in your locally compiled rustc -- or did you just run ./x.py build (without copying config.toml.example or running configure?) Do you perhaps have a config.mk still?

@bluss It's not required, but generally we see instructions as a far more stable measure of performance than wall time or cycles. If we see a regression or improvement in the number of instructions, then a look at wall time for actual impact on compile times is warranted. It's not necessarily clear that this is the best way of doing things, but it's how we approach the process for now. I'd be happy to hear suggestions if you have them!

@bluss
Copy link
Member

bluss commented Oct 21, 2017

@Mark-Simulacrum I haven't used the perf comparison website much in the past, I just saw the different measures now and seeing the reduced cpu-clock times certainly changed my perception of this benchmark.

On one hand, it's not surprising ordermap would use more instructions — its implementation doesn't have much micro optimizations, we have bounds checked vector accesses and other kinds of code that means more instructions are executed for the equivalent operations. So to see that it still runs fast makes us imagine something about the wonders of branch prediction :-)

Effectively straight line code with less branch prediction failures, can manage to run through more instructions in less time. (Like in this article and the 2x instructions vs 7x runtime difference; their benchmark is irrelevant to this discussion otherwise though.)

@ghost
Copy link
Author

ghost commented Oct 21, 2017

By config.toml, I meant the file in your locally compiled rustc -- or did you just run ./x.py build (without copying config.toml.example or running configure?) Do you perhaps have a config.mk still?

I just ran ./x.py build --stage 1 src/libtest without any additional fiddling. I don't have a config.mk.

@Mark-Simulacrum
Copy link
Member

Yep, that'll mean that you have a build without llvm assertions, which is expected to be faster.

@bluss Regarding instructions vs wall time, it's certainly true that more instructions may not mean worse results timing wise, but that's why we look at wall time as well. While it seems like OrderMap might be a win there, it's a small one, and may not even exist. It's interesting to know that it isn't a loss, but I don't think the results show enough impact to consider adding OrderMap at this point. I'd be happy to rerun the benchmarks later if OrderMap is optimized and we think it's worth another look.

@bluss
Copy link
Member

bluss commented Oct 21, 2017

It certainly warms my heart that it isn't even a loss. I didn't see those numbers until today. Mind you, I'm pretty convinced that OrderMap has slower lookup just like I'm convinced it has faster iteration.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants