-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inline before merging cgus #112695
Inline before merging cgus #112695
Conversation
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit e33faed6f2008464fb05a2950a650618acc860c7 with merge 48e47fb2e55e4d8be243b32435a7df261fa7d5e2... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (48e47fb2e55e4d8be243b32435a7df261fa7d5e2): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 647.175s -> 650.881s (0.57%) |
I bet I can fix those terrible results for |
e33faed
to
1bf78f6
Compare
I changed the minimum size to 2500, which gave good results locally. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
⌛ Trying commit 1bf78f64646a33c92f51d291a2539ffc0ef06f77 with merge bf804a36bcebc22d1b7558fe1e8ed711ba78e36f... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (bf804a36bcebc22d1b7558fe1e8ed711ba78e36f): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 657.452s -> 659.079s (0.25%) |
1bf78f6
to
7448144
Compare
Let's try 1800. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
I added a couple of additional refactoring commits. |
Because CGU merging relies on CGU sizes, but the CGU sizes before inlining aren't accurate. This requires tweaking how the sizes are updated during merging: if CGU A and B both have an inlined function F, then `size(A + B)` will be a little less than `size(A) + size(B)`, because `A + B` will only have one copy of F. Also, the minimum CGU size is increased because it now has to account for inlined functions. This change doesn't have much effect on compile perf, but it makes follow-on changes that involve more sophisticated reasoning about CGU sizes much easier.
There's no longer any need for them to be separate, and putting them together reduces the amount of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
- Rename `create_size_estimate` as `compute_size_estimate`, because that makes more sense for the second and subsequent calls for each CGU. - Change `CodegenUnit::size_estimate` from `Option<usize>` to `usize`. We can still assert that `compute_size_estimate` is called first. - Move the size estimation for `place_mono_items` inside the function, for consistency with `merge_codegen_units`.
76e27ce
to
abde9ba
Compare
I addressed the nit. @bors r=wesleywiser |
⌛ Testing commit abde9ba with merge bb08fa4bf942bd019b0156a317f6d48291b52963... |
The job Click to see the possible cause of the failure (guessed by this bot)
|
💔 Test failed - checks-actions |
Not sure if this is an intermittent crash, or if my changes have triggered something latent. Lets try again @bors retry |
☀️ Test successful - checks-actions |
Finished benchmarking commit (fa06a37): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 659.903s -> 661.846s (0.29%) |
The instructions results here look bad, but this was expected. Walltimes are less bad, though a little worse than the pre-merge CI run, which is annoying. This change is a necessary prerequisite for follow-up work to improve CGU formation: CGU merging is entirely driven by CGU size estimates, but when inlining happens after merging, those estimates are unavoidably inaccurate. @rustbot label: +perf-regression-triaged |
Because CGU merging relies on CGU sizes, but the CGU sizes before inlining aren't accurate.
This change doesn't have much effect on compile perf, but it makes follow-on changes that involve more sophisticated reasoning about CGU sizes much easier.
r? @wesleywiser