Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Revert accidental inclusion of a part of #69218 #71996

Merged
merged 2 commits into from
May 27, 2020

Conversation

Marwes
Copy link
Contributor

@Marwes Marwes commented May 7, 2020

This was accidentally included in #69464 after a rebase and given
how much inflate and keccak stresses the obligation forest seems
like a likely culprit to the regression in those benchmarks.

(It is necessary in #69218 as obligation forest needs to accurately
track the root variables or unifications will get lost)

@rust-highfive
Copy link
Collaborator

r? @davidtwco

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 7, 2020
This was accidentally included in rust-lang#69494 after a rebase and given
how much `inflate` and `keccak` stresses the obligation forest seems
like a likely culprit to the regression in those benchmarks.

(It is necessary in rust-lang#69218 as obligation forest needs to accurately
track the root variables or unifications will get lost)
@jonas-schievink
Copy link
Contributor

#69494

That seems like the wrong PR

@jonas-schievink
Copy link
Contributor

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented May 7, 2020

⌛ Trying commit 1d8489c with merge b80dbeb9b6dc6b59230376c05f5d2dd14a757f54...

@bors
Copy link
Contributor

bors commented May 8, 2020

☀️ Try build successful - checks-azure
Build commit: b80dbeb9b6dc6b59230376c05f5d2dd14a757f54 (b80dbeb9b6dc6b59230376c05f5d2dd14a757f54)

@rust-timer
Copy link
Collaborator

Queued b80dbeb9b6dc6b59230376c05f5d2dd14a757f54 with parent a08c473, future comparison URL.

@nikomatsakis
Copy link
Contributor

@bors r+

the perf run results, however, aren't visible yet

@bors
Copy link
Contributor

bors commented May 8, 2020

📌 Commit 1d8489c has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 8, 2020
@nikomatsakis
Copy link
Contributor

@bors delegate+

@bors
Copy link
Contributor

bors commented May 8, 2020

✌️ @Marwes can now approve this pull request

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit b80dbeb9b6dc6b59230376c05f5d2dd14a757f54, comparison URL.

@RalfJung
Copy link
Member

@bors rollup=never (please do that for PRs that are suspected to impact perf)

@davidtwco
Copy link
Member

r? @nikomatsakis

@nikomatsakis
Copy link
Contributor

@bors r-

Hmm

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels May 11, 2020
@nikomatsakis
Copy link
Contributor

@Marwes doesn't seem to have much impact on keccak... any thoughts?

@Marwes
Copy link
Contributor Author

Marwes commented May 12, 2020

Trying to figure that out but I really can't find another place which would disproportionately affect keccak and inflate :S

Reclaims most of the regression in inflate
@Marwes
Copy link
Contributor Author

Marwes commented May 24, 2020

Adding some #[inline] fixed most of it with inflate only having regressed by 1.5% in my measurements.

@nnethercote I tried cachegrind but for whatever reason I can't get it to work. Tried with callgrind also and tried to view the output in kcachegrind (which I have used before) but I just get addresses displayed. I guess something makes valgrind/cachegrind/ etc not understand the debug information. (perf + hotspot works fine though, as usual :/)

@nnethercote
Copy link
Contributor

@Marwes: your comment made me realize that the docs are missing a crucial part about setting a config.toml correctly. I just added a new section explaining it. Could you try again with that? Cachegrind is great for detecting inlining issues, because an inlined function has no instruction counts on its first and last lines, while a non-inline function does.

@Marwes
Copy link
Contributor Author

Marwes commented May 25, 2020

Used debuginfo-level = 2 in my case which didn't work :/

@nnethercote
Copy link
Contributor

Didn't work how? Can you show me some example output?

@Marwes
Copy link
Contributor Author

Marwes commented May 25, 2020

Running valgrind --tool=cachegrind --cache-sim=no --branch-sim=no ~/Code/rust/build/x86_64-unknown-linux-gnu/stage1/bin/rustc --crate-type=lib src/lib.rs

And then running cg_diff on the output file just gives me a list of c/c++ functions, putting the same file into kcachegrind shows what I can only assume a are memory addresses display in lieu of debug info.

(Diffs a file with itself so everything is zero, still there are not rust functions in the diff)

fl=/build/glibc-OTsEL5/glibc-2.27/malloc/arena.c
fn=_int_free
0 0
fl=???
fn=std::ios_base::ios_base()
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/posix/../sysdeps/unix/sysv/linux/_exit.c
fn=_Exit
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/stdlib/msort.c
fn=qsort_r
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/elf/dl-lookup.c
fn=_dl_setup_hash
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/string/strdup.c
fn=strdup
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/nptl/../include/list.h
fn=__free_tcb
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/elf/../misc/sbrk.c
fn=sbrk
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/string/../sysdeps/x86_64/multiarch/../strlen.S
fn=strlen
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/string/../sysdeps/x86_64/multiarch/strcspn.c
fn=strcspn
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/math/../sysdeps/x86_64/fpu/multiarch/s_rint.c
fn=rint
0 0
fl=???
fn=std::_Rb_tree_decrement(std::_Rb_tree_node_base*)
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/csu/../sysdeps/unix/sysv/linux/x86_64/init-first.c
fn=_init
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/nptl/../sysdeps/unix/sysv/linux/pthread-pids.h
fn=__pthread_initialize_minimal
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/elf/dl-load.c
fn=_dl_map_object
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/io/../sysdeps/unix/sysv/linux/poll.c
fn=poll
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/wcsmbs/../sysdeps/x86_64/multiarch/ifunc-avx2.h
fn=wcslen
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/posix/../sysdeps/unix/syscall-template.S
fn=getpid
0 0
fl=???
fn=std::ostream::flush()
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/nptl/../sysdeps/unix/sysv/linux/x86_64/cancellation.S
fn=__libc_disable_asynccancel
0 0
fl=/build/glibc-OTsEL5/glibc-2.27/malloc/malloc.c
fn=_int_realloc
0 0
summary: 0

Screenshot from 2020-05-25 09-51-40

@Mark-Simulacrum
Copy link
Member

Note that I'm not sure we'll rebuild the compiler for you, so you may need to x.py clean and rebuild.

Also, debuginfo level of 1 should be sufficient I suspect and significantly lighter on disk usage - I've personally seen IIRC tens of gigabytes of space savings.

@nnethercote
Copy link
Contributor

I also recommend running Cachegrind via collector, rather than directly. I find it simpler that way.

@Marwes
Copy link
Contributor Author

Marwes commented May 26, 2020

Cleaning and rebuilding does not help, nor does just using level 1 or running via collector.

Could use a perf run to validate that inlining helps.

@nikomatsakis
Copy link
Contributor

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented May 26, 2020

⌛ Trying commit ebc7eda with merge ad74291237c3c0233ace59eec9db124ff1c17082...

@bors
Copy link
Contributor

bors commented May 26, 2020

☀️ Try build successful - checks-azure
Build commit: ad74291237c3c0233ace59eec9db124ff1c17082 (ad74291237c3c0233ace59eec9db124ff1c17082)

@rust-timer
Copy link
Collaborator

Queued ad74291237c3c0233ace59eec9db124ff1c17082 with parent aeca4d6, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit ad74291237c3c0233ace59eec9db124ff1c17082, comparison URL.

@Aaron1011
Copy link
Member

Great work @Marwes!

@Marwes
Copy link
Contributor Author

Marwes commented May 26, 2020

It doesn't recover all the performance but it is at least most of it (1.5% regression remaining locally). However with #69218 it probably doesn't matter anyway, since that improves keccak and inflate by so much that it won't really be noticed.

@nikomatsakis
Copy link
Contributor

@bors r+ rollup=neve

@bors
Copy link
Contributor

bors commented May 27, 2020

📌 Commit ebc7eda has been approved by nikomatsakis

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 27, 2020
@nikomatsakis
Copy link
Contributor

@bors rollup=never

@bors
Copy link
Contributor

bors commented May 27, 2020

⌛ Testing commit ebc7eda with merge 664fcd3...

@bors
Copy link
Contributor

bors commented May 27, 2020

☀️ Test successful - checks-azure
Approved by: nikomatsakis
Pushing 664fcd3 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 27, 2020
@bors bors merged commit 664fcd3 into rust-lang:master May 27, 2020
@Marwes Marwes deleted the detach_undo_log branch May 28, 2020 06:09
@nnethercote
Copy link
Contributor

The perf improvements from the landing are here. Nice work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.