Use more accurate estimate of generated LLVM IR with llvm-lines #1049
Impact of default MIR transforms on the rustc:
It would be nice to teach cargo-llvm-lines about LLVM bitcode, to avoid the disk overhead of the textual IR representation.
Thanks for investigating this. I'll test this next week :) When I wrote this section and experimented with different MIR optimization levels, I noticed some weird behavior, like functions appearing to be inlined although they shouldn't be. Hopefully these commands fix that. Just to be sure I understand what's going on:
But shouldn't the
Combining The two methods would be alternatives with slightly different trade-offs. The
`--emit=llvm-ir` emits optimized LLVM IR. For optimized builds, this is a highly inaccurate estimate of the amount of IR generated initially. While the inaccuracy can be somewhat reduced by disabling optimization, that in turn has other unintended consequences, since opt-level controls the emission of lifetime markers, sharing of generics between crates, instantiation of inline functions, etc. Use `-Csave-temps` and the `no-opt` bitcode as the basis for a more accurate estimate of the initial work handed off to LLVM.
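As a rough sketch of the workflow described above: after building with `-Csave-temps`, each `*.no-opt.bc` file is disassembled with `llvm-dis` so that cargo-llvm-lines can read it. The helper name `disassemble_no_opt` and its argument order are made up for illustration; the only behavior relied on is that `llvm-dis`, given an input ending in `.bc`, writes a `.ll` file next to it.

```shell
# Hypothetical helper (not part of any tool): disassemble every *.no-opt.bc
# file in a directory into textual LLVM IR.
# Arguments: $1 = llvm-dis command, $2 = directory holding the saved temporaries.
disassemble_no_opt() {
  local llvm_dis="$1" deps_dir="$2" f count=0
  for f in "$deps_dir"/*.no-opt.bc; do
    # An unmatched glob stays literal; skip it instead of failing.
    [ -e "$f" ] || continue
    # llvm-dis writes foo.no-opt.ll next to foo.no-opt.bc by default.
    "$llvm_dis" "$f" || return 1
    count=$((count + 1))
  done
  # Report how many bitcode files were converted.
  echo "$count"
}

# Possible usage, with paths taken from the commands in this PR:
#   env RUSTFLAGS=-Csave-temps ./x.py build --stage 0 compiler/rustc
#   deps=build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps
#   disassemble_no_opt ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis "$deps"
#   cargo llvm-lines --files "$deps"/*.ll > llvm-lines.txt
```

The function only prints the number of converted files, so it can be checked against the number of crates expected in the build.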
I finally got around to testing this today. Sorry for the long wait.
Besides my issue with `parallel`, this works great! Where in my test the old method showed 13 million total lines, this new one shows 47 million, and all the small functions like `Option<T>::map` are now shown. So yes, it is more accurate now.
src/profiling.md
Outdated
# Specify all crates of the compiler. (Relies on the glob support of your shell.)
cargo llvm-lines --files ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/debug/deps/*.ll > llvm-lines.txt
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/*.no-opt.bc
The `parallel` command doesn't work like this on my machine (Manjaro Arch Linux).
First, it wasn't installed by default, so writing how to install it would be helpful.
Second, `parallel` only accepts input via a pipe.
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/*.no-opt.bc
# On Arch Linux, install `parallel` with
sudo pacman -S parallel
...
ls ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/*.no-opt.bc | parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis
Replaced with a bash for loop. Only after the fact did I realize that there is another, more popular, `parallel` utility. In that case I would rather limit usage of external tools to a minimum.
Probably a good idea.
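For anyone who still wants the disassembly to run in parallel without depending on either `parallel` utility, one option might be `xargs`. This is only a sketch, not what the PR does: it assumes an `xargs` and `find` that support `-P`, `-0`, and `-print0` (the GNU and BSD versions do, but these are not required by POSIX), and `disassemble_parallel` is a hypothetical helper name.

```shell
# Hypothetical helper: run a disassembler command on every *.no-opt.bc file
# below a directory, several at a time.
# Arguments: $1 = number of parallel jobs, $2 = directory, rest = command to run.
disassemble_parallel() {
  local jobs="$1" deps_dir="$2"
  shift 2
  # NUL-separated names keep paths with spaces intact.
  # -n 1 passes one file per invocation; -P runs up to $jobs invocations at once.
  # Note: find descends into subdirectories, and GNU xargs runs the command
  # once even when no file matches, so check that the directory is populated.
  find "$deps_dir" -name '*.no-opt.bc' -print0 |
    xargs -0 -n 1 -P "$jobs" "$@"
}

# Possible usage, with paths taken from the commands in this PR:
#   disassemble_parallel 4 \
#     build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps \
#     ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis
```

The plain for loop remains the simpler choice for documentation, since it has no dependence on non-POSIX `xargs` options.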
src/profiling.md
Outdated
env RUSTFLAGS=-Csave-temps ./x.py build --stage 0 compiler/rustc

# Single crate, e.g., rustc_middle.
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc
ls ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc | parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis |
Like below.
The for loop works better. And nice description of what `save-temps` does. Thanks!
Now it needs someone with merge rights to review it. Maybe @jyn514?
cargo llvm-lines --files ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/debug/deps/*.ll > llvm-lines.txt
env RUSTFLAGS=-Csave-temps ./x.py build --stage 0 compiler/rustc

# Single crate, e.g., rustc_middle. (Relies on the glob support of your shell.)
# Single crate, e.g., rustc_middle. (Relies on the glob support of your shell.)
# Single crate, e.g. rustc_middle. (Relies on the glob support of your shell.)

# Single crate, e.g., rustc_middle. (Relies on the glob support of your shell.)
# Convert unoptimized LLVM bitcode into a human readable LLVM assembly accepted by cargo-llvm-lines.
for f in build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc; do
Do you think this should mention that `x86_64-unknown-linux-gnu` varies based on the target? Or is it self-evident?
I'd say it's easy enough to figure out what's wrong when you get file not found errors. Also, bash syntax is used here, so unfortunately copy-pasting wouldn't work on Windows anyway.
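If hard-coding the triple ever becomes a problem, the path could be built from the host triple instead. The sketch below is an assumption-laden illustration: `host_triple` and `deps_dir_for` are hypothetical helper names, the directory layout is copied from the commands in this PR, and it assumes the build and host triples are the same (no cross-compilation). `rustc -vV` does print a `host:` line, which is what the first helper extracts.

```shell
# Hypothetical helper: print the host triple reported by the installed rustc.
host_triple() {
  rustc -vV | sed -n 's/^host: //p'
}

# Hypothetical helper: build the deps directory path used in this PR.
# Arguments: $1 = target triple, $2 = stage number (defaults to 0).
deps_dir_for() {
  local triple="$1" stage="${2:-0}"
  echo "build/$triple/stage$stage-rustc/$triple/release/deps"
}

# Possible usage:
#   deps=$(deps_dir_for "$(host_triple)")
#   for f in "$deps"/rustc_middle-*.no-opt.bc; do
#     "./build/$(host_triple)/llvm/bin/llvm-dis" "$f"
#   done
```

This still uses bash syntax, so it does not address the Windows point raised above.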