Use ICF (identical code folding) for building rustc #99062
r? @jyn514 (rust-highfive has picked a reviewer for you, use r? to override)
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 89b5a07d384be9096bdffb981ba7785c1871f2e4 with merge a7e16562cc91c83495165f476a643fba7f508872...
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit dc494a1a51a37a4d237d68d1c50c8d6faa0d535f with merge 144f21cd0f517feabe97a817811901326c170fe6...
Looks like no, you'll have to change that too. What version of LLVM was this introduced in?
Indeed, I tried to do that now for the
If you're talking about ICF, it's not an LLVM thing, it's a linker pass that seems to be supported by multiple linkers (
☀️ Try build successful - checks-actions
Queued 144f21cd0f517feabe97a817811901326c170fe6 with parent 45263fc, future comparison URL.
Finished benchmarking commit (144f21cd0f517feabe97a817811901326c170fe6): comparison url.
[Collapsed result tables: Instruction count, Max RSS (memory usage), Cycles]
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Cycles look really promising. I'll now try to do @bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 29e5413535ccb0ec42c8be02c0c49afcfa198a8c with merge 986e50fc1e83d611bf485cf02723d7bd89a093b8...
☀️ Try build successful - checks-actions
Queued 986e50fc1e83d611bf485cf02723d7bd89a093b8 with parent 86b8dd5, future comparison URL.
@@ -122,7 +122,8 @@ ENV RUST_CONFIGURE_ARGS \
       --set target.x86_64-unknown-linux-gnu.ranlib=/rustroot/bin/llvm-ranlib \
       --set llvm.thin-lto=true \
       --set llvm.ninja=false \
-      --set rust.jemalloc
+      --set rust.jemalloc \
+      --set rust.use-lld=true
This only affects Linux; is that intentional? Or should we use lld on other Unix platforms too?
Indeed I wanted to only try it first for Linux x64, as the most common toolchain. I usually test CI compilation optimization flags and similar things for it first, since it tends to have the best support.
I'm guessing this option wouldn't work on Windows anyway, as we don't use a single section per function there: COFF has a 2^16 - something limit on the section count, and it is rather easy to hit this limit if every function has its own section.
@Kobzol btw, please use (I made this hard for you just now by commenting at different times, sorry - will try and only leave "review" comments in the future so they're all posted at the same time)
Ok, I'll try to remember! 😅 I'm used to using GH notifications heavily, but I understand that for you it's quite impractical. I'll try to use rustbot more.
@jyn514 Since this is perf-sensitive and if there are some issues with ICF, we might want to revert it, maybe it would be better to avoid rollup on this one?
@bors rollup=never But I suspect this is already true; perf should set it after each run.
Sorry, forgot about that.
☀️ Test successful - checks-actions
Finished benchmarking commit (246f66a): comparison url.
[Collapsed result tables: Instruction count, Max RSS (memory usage), Cycles]
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression
…imulacrum Revert "Use ICF (identical code folding) for building rustc" Reverts rust-lang#99062 Fixes: rust-lang#99440
@Kobzol the perf hit looks fairly small. Should we mark this as triaged?
I would mark it as triaged, since it was a nice win on cycles actually, but this PR got reverted immediately after landing (#99442) since there are some issues with
@rustbot label: +perf-regression-triaged
Make `[rust] use-lld=true` work on windows Before, it would fail with "error: ignoring unknown argument '-Wl,--icf=all'" This option was introduced in rust-lang#99062 (well, technically rust-lang#99680) See zulip thread: https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/rust-lld.3A.20error.3A.20ignoring.20unknown.20argument.20'-Wl.2C--icf.3Dall'
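The failure described in that commit message suggests the flag has to be guarded by target. The following is only a minimal sketch of what such a guard could look like, not the actual bootstrap code: the function name `lld_extra_link_args` and the `contains("windows")` check are assumptions made for illustration. Only the `-Wl,--icf=all` flag itself comes from the error message quoted above.

```rust
/// Hypothetical helper (not the real bootstrap API): extra linker arguments
/// to pass when linking with lld. `target` is a target triple such as
/// "x86_64-unknown-linux-gnu".
fn lld_extra_link_args(target: &str, use_lld: bool) -> Vec<&'static str> {
    let mut args = Vec::new();
    // Assumption: skip the ICF flag on Windows targets, where it was
    // reported as an unknown argument.
    if use_lld && !target.contains("windows") {
        args.push("-Wl,--icf=all");
    }
    args
}

fn main() {
    for target in ["x86_64-unknown-linux-gnu", "x86_64-pc-windows-msvc"] {
        println!("{target}: {:?}", lld_extra_link_args(target, true));
    }
}
```

With this shape, a Windows build with `use-lld=true` simply links without ICF instead of failing.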
It seems that ICF (identical code folding) is able to remove duplicated functions created by monomorphization from binaries, resulting in smaller binary size and better i-cache utilization. Let's see if it helps for `rustc`. I'm not sure if `lld` is even used for linking `rustc` on the Linux `dist` builder, let's see.
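To illustrate the kind of duplication ICF targets, here is a small sketch (the names `last_or`, `addr_u32`, and `addr_i32` are made up for this example). The generic function is instantiated once for `u32` and once for `i32`. Because the body only copies a 4-byte integer, the two instantiations will typically compile to the same machine code, which makes them candidates for folding. Whether the two addresses actually end up equal depends on the optimization level, compiler passes, and linker flags, so the program prints the result rather than assuming it.

```rust
/// Returns the last element of `items`, or `fallback` if the slice is empty.
/// The body never depends on what `T` means, only on its size and how it is copied.
fn last_or<T: Copy>(items: &[T], fallback: T) -> T {
    items.last().copied().unwrap_or(fallback)
}

/// Address of the `u32` instantiation (illustrative helper).
fn addr_u32() -> usize {
    let f: fn(&[u32], u32) -> u32 = last_or::<u32>;
    f as *const () as usize
}

/// Address of the `i32` instantiation (illustrative helper).
fn addr_i32() -> usize {
    let f: fn(&[i32], i32) -> i32 = last_or::<i32>;
    f as *const () as usize
}

fn main() {
    // Two monomorphized copies of `last_or` are requested here.
    println!("{}", last_or(&[1u32, 2, 3], 0));
    println!("{}", last_or(&[-1i32, -2, -3], 0));
    // May print true or false depending on how the binary was built and linked.
    println!("same address: {}", addr_u32() == addr_i32());
}
```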