Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use more accurate estimate of generated LLVM IR with llvm-lines #1049

Merged
merged 1 commit into from
Mar 7, 2021
Merged

Use more accurate estimate of generated LLVM IR with llvm-lines #1049

merged 1 commit into from
Mar 7, 2021

Conversation

tmiasko
Copy link
Contributor

@tmiasko tmiasko commented Feb 7, 2021

The --emit=llvm-ir emits an optimized LLVM IR. For optimized builds it will be
highly inaccurate estimate of the amount IR generated initially. While the
inaccuracy can be somewhat reduce after disabling the optimization, that in turn
has other unintended consequences, since opt-level controls the emission of
lifetime markers, sharing of generics between crates, instantiation of inline
functions, etc.

Use -Csave-temps and no-opt bitcode as a basis for more accurate estimate of
initial work handed of to the LLVM.

@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 7, 2021

Impact of default MIR transforms on the rustc:

MIR opt level IR Lines
0 53717761
1 45207720

It would be nice to teach cargo-llvm-lines about LLVM bitcode, to avoid disk overhead of IR textual representation.

@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 7, 2021

cc @Julian-Wollersberger

@Julian-Wollersberger
Copy link
Contributor

Thanks for investigating this. I'll test this next week :)

When I wrote this section and experimented with different MIR optimization levels, I noticed some weird behavior, like functions appearing to be inlined although they shouldn't be. Hopefully these commands fix that.

Just to be sure I understand what's going on:

  • -Csave-temps makes rustc emit the unoptimiced LLVM IR bitcode
  • The llvm-dis tool converts the bitcode into the textual representation
  • cargo-llvm-lines then analyzes that text

But shouldn't the -Cno-prepopulate-passes you mention in rust-lang/rust#81747 also be specified, or is that the same as -Zmir-opt-level=0? (The rustc docs are a bit short here)

@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 10, 2021

  • -Csave-temps preserves temporary files created during compilation and saves LLVM bitcode generated at different phases of compilation. In this case, it is used to access LLVM IR which was used as input to the LLVM optimization passes.
  • --emit=llvm-ir emits LLVM IR after LLVM optimizations have been already performed.
  • -Cno-prepopulate-passes uses an empty pass manager when running LLVM optimizations (it has not effect MIR pipeline).
  • The MIR optimizations controlled by -Zmir-opt-level are independent of all of above.

Combining --emit=llvm-ir & -Cno-prepopulate-passes together should be more or less equivalent to *.no-opt.bc files obtained from -Csave-temps. The --emit=llvm-ir does have an extra side effect of attaching human readable names to LLVM IR.

The two methods would be alternatives with slightly different trade-offs. The -Csave-temps is non-invasive, gives you access to LLVM IR at different phases of compilation (the insight into of optimized IR might be valuable still). The --emit=llvm-ir & -Cno-prepopulate-passes doesn't run optimizations so it expedient but invasive method, and produces toolchain impractical for further use.

The `--emit=llvm-ir` emits an optimized LLVM IR. For optimized builds it will be
highly inaccurate estimate of the amount IR generated initially.  While the
inaccuracy can be somewhat reduce after disabling the optimization, that in turn
has other unintended consequences, since opt-level controls the emission of
lifetime markers, sharing of generics between crates, instantiation of inline
functions, etc.

Use `-Csave-temps` and `no-opt` bitcode as a basis for more accurate estimate of
initial work handed of to the LLVM.
Copy link
Contributor

@Julian-Wollersberger Julian-Wollersberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally got around to testing this today. Sorry for the long wait.

Besides my issue with parallel, this works great! Where in my test the old method showed 13 million total lines, this new one shows 47 million, and all the small functions like Option<T>::map are now shown. So yes, it is more accurate now.

src/profiling.md Outdated
# Specify all crates of the compiler. (Relies on the glob support of your shell.)
cargo llvm-lines --files ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/debug/deps/*.ll > llvm-lines.txt
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/*.no-opt.bc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parallel command doesn't work like this on my machine (Manjaro Arch Linux).
First, it wasn't installed by default, so writing how to install it would be helpful.
Second, parallel only accepts input via a pipe.

Suggested change
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/*.no-opt.bc
# On Arch Linux, install `parallel` with
sudo pacman -S parallel
...
ls ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/*.no-opt.bc | parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with a bash for loop. Only after the fact did I realize that there is another, more popular, parallel utility. In that case I would rather limit usage of external tools to a minimum.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a good idea.

src/profiling.md Outdated
env RUSTFLAGS=-Csave-temps ./x.py build --stage 0 compiler/rustc

# Single crate, e.g., rustc_middle.
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis -- ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc
ls ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc | parallel ./build/x86_64-unknown-linux-gnu/llvm/bin/llvm-dis

Like below.

Copy link
Contributor

@Julian-Wollersberger Julian-Wollersberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The for loop works better. And nice description for what save-temps does. Thanks!

Now someone with merge rights to review it. Maybe @jyn514?

cargo llvm-lines --files ./build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/debug/deps/*.ll > llvm-lines.txt
env RUSTFLAGS=-Csave-temps ./x.py build --stage 0 compiler/rustc

# Single crate, e.g., rustc_middle. (Relies on the glob support of your shell.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Single crate, e.g., rustc_middle. (Relies on the glob support of your shell.)
# Single crate, e.g. rustc_middle. (Relies on the glob support of your shell.)


# Single crate, e.g., rustc_middle. (Relies on the glob support of your shell.)
# Convert unoptimized LLVM bitcode into a human readable LLVM assembly accepted by cargo-llvm-lines.
for f in build/x86_64-unknown-linux-gnu/stage0-rustc/x86_64-unknown-linux-gnu/release/deps/rustc_middle-*.no-opt.bc; do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this should mention that x86_64-unknown-linux-gnu varies based on the target? Or is it self-evident?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it's easy enough to figure out what's wrong when you get file not found errors. Also bash syntax is used here, so unfortunately copy-pasting wouldn't work for windows anyway.

@jyn514 jyn514 merged commit 8bb61fe into rust-lang:master Mar 7, 2021
@tmiasko tmiasko deleted the save-temps branch March 7, 2021 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants