Update logic around PGO staging #101744
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @jyn514 (or someone else) soon. Please see the contribution instructions for more information.
Overall your description makes sense to me, except this bit:
Because the LLVM libraries are compiled only once if they are PGO instrumented then the compiler must be PGO instrumented at every stage, including Stage 1.
Why does the compiler need to be instrumented if LLVM is instrumented? Isn't instrumenting stage 2 enough?
CI definitions will need to be modified to use the Stage 2 compiler when generating PGO profiles.
Can you please do that as part of this PR? I think src/ci/pgo.sh is the place to start looking.
@@ -678,13 +678,9 @@ impl Step for Rustc {
false
}
} else if let Some(path) = &builder.config.rust_profile_use {
if compiler.stage == 1 {
Why was this condition removed? Shouldn't it be the same as above, !link_shared() || stage >= 2?
If a profile is provided, there is no harm in using it. If there are checksum mismatches or missing functions, the result is the same as not using a profile; if there are signature matches, the Stage 1 compiler will be faster and the total build should complete sooner.
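The argument above can be sketched as a small helper. This is a minimal illustration rather than the bootstrap code itself: `profile_use_flags` is a hypothetical name, and only the two flag strings are taken from the diff in this PR. The helper applies the profile regardless of stage and leaves it to LLVM to skip functions whose checksums do not match.

```rust
/// Hypothetical helper (not part of bootstrap): builds the rustc flags for
/// profile-guided optimization whenever a profile path was supplied.
/// The stage is deliberately not consulted, mirroring the "no harm in
/// using it" argument.
fn profile_use_flags(rust_profile_use: Option<&str>) -> Vec<String> {
    match rust_profile_use {
        Some(path) => vec![
            // Optimize using the supplied profile data.
            format!("-Cprofile-use={}", path),
            // Ask LLVM to warn about functions missing from the profile.
            "-Cllvm-args=-pgo-warn-missing-function".to_string(),
        ],
        // No profile configured: add nothing.
        None => Vec::new(),
    }
}
```

Calling `profile_use_flags(Some("/tmp/rustc.profdata"))` (the path is made up for illustration) yields both flags, while `profile_use_flags(None)` yields an empty list.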
Please make sure we run perf to ensure equivalency of results before merging.
Sorry for taking so long with my response but I wanted to do some additional investigations to see if I could find an alternative way of addressing the issue below.
If the LLVM library is instrumented it needs to link against support libraries for functions related to writing the profile data. When LLVM is linked dynamically that isn't a problem as the profile related symbols can be resolved dynamically as well. However, when LLVM is instrumented and linked statically those symbols need to be resolved at link time. I have been unable to find a way to get an instrumented
Will do.
Regarding my previous comment, I now realize that several of my points rely on details of how we use the build system and not the build system itself. The bootstrap system doesn't provide functionality for PGO instrumenting the LLVM build; we achieve this by passing LLVM the appropriate flags, so it's entirely possible for others to build an instrumented Rust toolchain with static linkage against a non-instrumented LLVM library. I think the change is still worthwhile, though: with it, developers building the toolchain without a PGO-instrumented LLVM will get slightly slower builds, while without it, those building with a PGO-instrumented LLVM will have to make a similar change to the bootstrap system. Providing the profile arguments via
9e82f4f to 2071506
I looked at It looks like no changes need to be made?
What's the easiest way to invoke the PGO pipeline locally?
@bors try @rust-timer queue
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 2071506c1d60d52e72539f183b819f8554023c20 with merge 591c425928b66f3e73ef8a840c109792df09441a...
💔 Test failed - checks-actions
@chriswailes You can run
locally to run the dist CI pipeline locally. On CI, we gather both LLVM and Rust PGO profiles. It's done separately in two runs to keep the profiling runtimes from clashing. I want to revisit this decision soon to see if it's still needed. Removing this two-stage approach would shorten dist/perf CI times.
I wonder what exactly is this PR trying to achieve? 🤔 Currently we only instrument stage 2 compilers with PGO. if compiler.stage == 1 means that we only instrument stage 2, since the
ping from triage: Can you please post your status on this PR? FYI: when a PR is ready for review, send a message containing
2071506 to 7874e91
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
Some changes occurred in src/tools/cargo cc @ehuss
Sorry for the delay (I got distracted by other projects) and thanks for the additional context on the
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf
⌛ Trying commit 257ec3b with merge fa338c9ddc5996f71f5ee1cfd9b6bbc5d5318bf4...
How have you tested this change?
if compiler.stage == 1 {
cargo.rustflag(&format!("-Cprofile-use={}", path));
cargo.rustflag("-Cllvm-args=-pgo-warn-missing-function");
} else if let Some(path) = &builder.config.llvm_profile_generate {
Why are rust_profile_generate and llvm_profile_generate mutually exclusive?
From my understanding, this function is only responsible for building Rust crates, not the LLVM libraries. If llvm_profile_generate is set, the code that handles building LLVM will take care of that, and then this code will compile the crates appropriately.
This code now first makes the original check to see if Rust instrumentation was requested. If not, it will still instrument the Rust crates if the LLVM libraries are instrumented and linkage is set to static, as a way of ensuring that the necessary runtime libraries are present.
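The decision described in this reply can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `PgoConfig` and `rust_instrumentation_path` are hypothetical names, the fields only mirror the config options named in this thread, and falling back to LLVM's profile path is an assumption based on the `let Some(path) = &builder.config.llvm_profile_generate` line in the diff.

```rust
/// Hypothetical stand-in for the relevant bootstrap config options.
struct PgoConfig<'a> {
    rust_profile_generate: Option<&'a str>,
    llvm_profile_generate: Option<&'a str>,
    /// `true` when LLVM is linked dynamically, `false` for static linkage.
    llvm_link_shared: bool,
}

/// Returns the profile path to instrument the Rust crates with, if any.
fn rust_instrumentation_path<'a>(cfg: &PgoConfig<'a>) -> Option<&'a str> {
    // Original check: was Rust instrumentation requested directly?
    if let Some(path) = cfg.rust_profile_generate {
        return Some(path);
    }
    // Otherwise instrument only when LLVM is instrumented *and* statically
    // linked, so that the profiling runtime symbols are available at link time.
    match cfg.llvm_profile_generate {
        Some(path) if !cfg.llvm_link_shared => Some(path),
        _ => None,
    }
}

/// Turns the decision into a rustc flag (`-Cprofile-generate` is a real
/// rustc codegen option).
fn instrumentation_flag(cfg: &PgoConfig<'_>) -> Option<String> {
    rust_instrumentation_path(cfg).map(|p| format!("-Cprofile-generate={}", p))
}
```

With a dynamically linked LLVM and no Rust profile path, the sketch returns `None`, which matches the observation earlier in the thread that dynamic linkage lets the profiling symbols be resolved at load time.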
💔 Test failed - checks-actions
After some experimentation I've confirmed that when compiling both LLVM and the Rust toolchain with PGO instrumentation and static LLVM linkage, the build will fail if two different profile paths are used. Do people have a preference on how this is handled by the bootstrap system? Should an error be produced if the provided profile paths differ? Should the compiler silently (or with a warning) use Rust's profile path?
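To make the two options in this question concrete, here is a small sketch. All names (`MismatchPolicy`, `resolve_profile_path`) are hypothetical and nothing here is taken from bootstrap; it only shows how "error on differing paths" and "warn and use Rust's path" would differ for the static-linkage case.

```rust
/// The two behaviours proposed in the question (hypothetical names).
#[derive(Clone, Copy)]
enum MismatchPolicy {
    Error,
    WarnAndUseRustPath,
}

/// Returns the profile path to use for the Rust crates plus an optional
/// warning, or an error message when the configuration is rejected.
fn resolve_profile_path<'a>(
    rust_path: Option<&'a str>,
    llvm_path: Option<&'a str>,
    llvm_link_shared: bool,
    policy: MismatchPolicy,
) -> Result<(Option<&'a str>, Option<String>), String> {
    match (rust_path, llvm_path) {
        // Conflict: both are instrumented, LLVM is static, and the paths differ.
        (Some(r), Some(l)) if !llvm_link_shared && r != l => match policy {
            MismatchPolicy::Error => Err(format!(
                "static LLVM linkage with differing profile paths: {} vs {}",
                r, l
            )),
            MismatchPolicy::WarnAndUseRustPath => Ok((
                Some(r),
                Some(format!("ignoring LLVM profile path {}; using {}", l, r)),
            )),
        },
        // Everything else passes through unchanged.
        _ => Ok((rust_path, None)),
    }
}
```

Whether either policy preserves useful profile data is a separate question; the sketch only captures the control flow being asked about.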
Do we have some idea why it fails? Is this a bug that should be filed with LLVM, for example?
I don't think it's a bug. I believe it's due to conflicting entries in the two different profiles. I'll reproduce the exact error and post it here.
I guess what it sounds like to me is that when we share a common profile directory, it's likely that we're just clobbering one or both of the PGO data collections, which "fixes" the problem but gives us worse performance (less benefit). I recall that we landed a PR a while back which stopped sharing a profile directory intentionally and it was a significant (several percent) win in terms of performance. I wonder if it makes sense to "allow" static linkage during PGO collection - presumably we might be able to only do it for the final artifacts produced?
☔ The latest upstream changes (presumably #105525) made this pull request unmergeable. Please resolve the merge conflicts.
Hm, well, I can't say that just having both types of symbols is fully convincing that nothing is getting clobbered. But, I think our CI doesn't use static linkage for LLVM anywhere we do PGO, so it probably is fine to error on this case with a comment that it didn't work but may not be impossible to fix (or something to that effect).
I'd rather it not error in this configuration as it is one that we currently build and test on occasion. Would a warning be OK?
Hm, well, let's include an error on the specific scenario:
But I'm okay leaving the maybe working scenario as a warning. (Even if it's likely to be useless in practice, because no one reads a full log of PGO output...)
@chriswailes any updates on this?
Closing this as inactive. Feel free to reöpen this pr or create a new pr if you get the time to work on this. Thanks |
PGO instrumentation occurs fairly late in the code generation pipeline after many transformations and optimizations are applied. This means that profiles for the same source code can differ if the binary that generated them was compiled with a different compiler or options.
When bootstrapping the Rust compiler this is relevant because it means that profiles that are generated by the Stage 1 compiler may have entries with checksums that don't match the functions in the Stage 2 compiler. This isn't actively harmful, but it may prevent some functions in later stage compilers from receiving profile-guided optimizations.
This commit changes the build system to instrument the Stage 2 and 3 compilers (with one exception discussed below) and to use the provided profile to optimize the compiler at every stage.
Not instrumenting the Stage 1 compiler has two advantages:
Because the LLVM libraries are compiled only once, if they are PGO instrumented then the compiler must be PGO instrumented at every stage, including Stage 1.
CI definitions will need to be modified to use the Stage 2 compiler when generating PGO profiles.
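The staging rule in this description can be summarized in a short sketch. It is illustrative only: `StagePlan` and `plan_for_stage` are hypothetical names, `stage` refers to the compiler being produced, and the rule encodes the description as written (including the LLVM exception), not the later refinements discussed in the review.

```rust
/// What the build does for the compiler produced at a given stage.
#[derive(Debug, PartialEq)]
struct StagePlan {
    /// Build this compiler with PGO instrumentation.
    instrument: bool,
    /// Optimize this compiler with a previously generated profile.
    use_profile: bool,
}

/// Hypothetical summary of the rule in the description:
/// - instrument Stage 2 and later, or every stage when LLVM is instrumented;
/// - use a provided profile at every stage.
fn plan_for_stage(stage: u32, llvm_instrumented: bool, profile_provided: bool) -> StagePlan {
    StagePlan {
        instrument: stage >= 2 || llvm_instrumented,
        use_profile: profile_provided,
    }
}
```

Under these assumptions, `plan_for_stage(1, false, true)` leaves Stage 1 uninstrumented but still optimized with the profile, while `plan_for_stage(1, true, false)` instruments Stage 1 because of the LLVM exception.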