Apply BOLT optimizations without rebuilding LLVM #107723
@bors try |
⌛ Trying commit e686fc13c0b16035839e374ead4ff9fef0d68cc1 with merge 5866a9a1a511dcca9566a367b0ccdf69e2b0aedf... |
I don't have time for reviews right now. r? @Mark-Simulacrum cc @nikic |
☀️ Try build successful - checks-actions |
Hmm, the build has worked, but BOLT was executed multiple times and the whole build wasn't very fast. We should probably add information about the individual bootstrap step durations into the CI timer first. |
Doesn't that already exist? I remember seeing RUSTC-TIMER log output or something like that |
Yes, it's printed during the build, and stored into |
e686fc1 to 250528f
@bors try |
⌛ Trying commit 250528f2590671e8d865d663c38aab0620a66916 with merge 6b1d08dcfe231d72d8f2310c341771dd724d43fa... |
☀️ Try build successful - checks-actions |
It seems that with this optimization (however it's implemented in the end), we can get to ~2h 5m Linux dist time. I'll check if perf hasn't regressed first though. @rust-timer build 6b1d08dcfe231d72d8f2310c341771dd724d43fa |
Finished benchmarking commit (6b1d08dcfe231d72d8f2310c341771dd724d43fa): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
|
these benchmarks are currently noisy |
Any thoughts on the variant described in #107521 (comment)? I think this approach (and #107521 as well) is going to be something of a dead end, because it leaves behind some problems that can't really be addressed while keeping the general approach of reusing LLVM artifacts from a previous build. Apart from the unnecessary final rustc rebuild (which is at least fairly fast), we also do the previous rustc build with a BOLT-instrumented LLVM, so that build ends up being slow. The other consideration is how this is going to generalize to optimizing rustc itself with BOLT, where sharing artifacts from previous builds would be less straightforward. |
I discussed your idea with @jyn514 (https://rust-lang.zulipchat.com/#narrow/stream/326414-t-infra.2Fbootstrap/topic/bootstrap.20LLVM.20postprocess.20step), and based on that discussion, I implemented a BOLT bootstrap step (I pushed it now) that applies BOLT changes on-the-fly when LLVM is copied to sysroot. But I'm not sure if it's exactly what you had in mind.
This could be solved simply by only performing the BOLT steps in stage2/dist, right? We want to apply BOLT when we copy LLVM to stage2 sysroot, but not before.
BOLT for |
@bors try @rust-timer queue |
This looks reasonable to me. Thanks for working on it! |
r=me unless these cleanups seem like improvements worth making
@bors try @rust-timer queue |
⌛ Trying commit 9aad2ad with merge 341579aa2fdf71f726bc9f49d02b0160e27a9edf... |
☀️ Try build successful - checks-actions |
Finished benchmarking commit (341579aa2fdf71f726bc9f49d02b0160e27a9edf): comparison URL.
Overall result: no relevant changes - no action needed
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
@bors rollup=never
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): This benchmark run did not return any relevant results for this metric.
Cycles: This benchmark run did not return any relevant results for this metric. |
Perf. looks good. Rebased on @rustbot ready |
@bors r+ Thanks! |
☀️ Test successful - checks-actions |
Finished benchmarking commit (35636f9): comparison URL.
Overall result: no relevant changes - no action needed
@rustbot label: -perf-regression
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles: This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
|
This PR adds an explicit BOLT bootstrap step which applies BOLT on the fly when LLVM artifacts are copied to a sysroot. It does this only once per bootstrap invocation; the result is cached. This avoids one LLVM rebuild in the Linux CI dist build.
r? @jyn514
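To illustrate the "apply once, then reuse the cached result" idea from the description, here is a minimal sketch. It is not the code from this PR: `Builder`, `run_bolt`, `llvm_for_sysroot`, the `bolt_cache` field and the `.bolt.so` naming are all hypothetical, and the `stage < 2` check only mirrors the suggestion in the thread to apply BOLT when copying LLVM into the stage2 sysroot, not before.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a bootstrap builder (not rustc's actual `Builder`).
struct Builder {
    /// Already post-processed artifacts, keyed by the original LLVM library path.
    bolt_cache: HashMap<PathBuf, PathBuf>,
    /// Counts how many times the expensive BOLT step actually ran.
    bolt_runs: usize,
}

impl Builder {
    fn new() -> Self {
        Builder { bolt_cache: HashMap::new(), bolt_runs: 0 }
    }

    /// Placeholder for the expensive step. A real step would invoke BOLT here;
    /// this sketch only derives an output path and bumps a counter.
    fn run_bolt(&mut self, input: &Path) -> PathBuf {
        self.bolt_runs += 1;
        input.with_extension("bolt.so")
    }

    /// Returns the path of the LLVM library that should be copied into the sysroot.
    /// Assumption: only stage 2 (and later) gets the BOLT-processed library.
    fn llvm_for_sysroot(&mut self, llvm_lib: &Path, stage: u32) -> PathBuf {
        if stage < 2 {
            // Earlier stages copy the library unchanged.
            return llvm_lib.to_path_buf();
        }
        if let Some(cached) = self.bolt_cache.get(llvm_lib) {
            // BOLT already ran for this library during this invocation.
            return cached.clone();
        }
        let optimized = self.run_bolt(llvm_lib);
        self.bolt_cache.insert(llvm_lib.to_path_buf(), optimized.clone());
        optimized
    }
}
```

With this shape, calling `llvm_for_sysroot` several times for the same library at stage 2 runs the placeholder BOLT step only once, which is the property the description relies on to avoid repeated work within one bootstrap invocation.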