-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
recursively evaluate the constants in everything that is 'mentioned' #122568
Conversation
883885b
to
80f6186
Compare
Okay so this should hopefully codegen nothing more than what was already codegen'd before this PR... only the collector and MIR passes are doing more work. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Draft: recursively evaluate the constants in everything that is 'mentioned' This is another attempt at fixing rust-lang#107503. The previous attempt at rust-lang#112879 seems stuck in figuring out where the [perf regression](https://perf.rust-lang.org/compare.html?start=c55d1ee8d4e3162187214692229a63c2cc5e0f31&end=ec8de1ebe0d698b109beeaaac83e60f4ef8bb7d1&stat=instructions:u) comes from. In rust-lang#122258 I learned some things, which informed the approach this PR is taking. r? `@ghost`
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
80f6186
to
ae2f685
Compare
Finished benchmarking commit (55a1404): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 669.355s -> 671.716s (0.35%) |
ae2f685
to
aa7dd32
Compare
Well that looks a lot better. :) And almost all regressions are for "incr". That makes sense, as collection is re-done when anything changes (right?) so making collection a bit slower would be a much bigger absolute change when the build is fast due to being mostly cached. |
e37145c
to
dd9cda6
Compare
I've added an optimization that should help with the common case of "most 'mentioned items' are also still used". Let's see if that makes a difference. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Draft: recursively evaluate the constants in everything that is 'mentioned' This is another attempt at fixing rust-lang#107503. The previous attempt at rust-lang#112879 seems stuck in figuring out where the [perf regression](https://perf.rust-lang.org/compare.html?start=c55d1ee8d4e3162187214692229a63c2cc5e0f31&end=ec8de1ebe0d698b109beeaaac83e60f4ef8bb7d1&stat=instructions:u) comes from. In rust-lang#122258 I learned some things, which informed the approach this PR is taking. r? `@ghost`
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (02d1d38): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 666.919s -> 670.307s (0.51%) |
Finished benchmarking commit (df8ac8f): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 669.21s -> 672.146s (0.44%) |
These are expected, we are doing more work -- because we were previously skipping work that shouldn't have been skipped. Regressions affect almost exclusively incr benchmarks, where collector changes show up disproportionally -- there are still tons of items but all the queries are already cached, so there's little work per item outside the collector. A large fraction of the regression is from just reading and writing the list of "mentioned items" for each MIR body. @rustbot label: +perf-regression-triaged |
Okay, the compiler is doing more work, fine. But why the massive regression in binary size? |
Eh, no idea. rlib files would get a bit bigger since they are now storing the "mentioned items" list. What kind of artifact is measured here, does it include Rust metadata? We should be sending the exact same stuff to codegen as before. (And we'd see a perf regression if we did more codegen.) |
This is all metadata (see https://perf.rust-lang.org/compare.html?start=47dd709bedda8127e8daec33327e0a9d0cdae845&end=df8ac8f1d74cffb96a93ae702d16e224f5b9ee8c&stat=size%3Acrate_metadata). If you uncheck libraries in the binary size metric, there are no changes ("binary size" for libraries is the size of the .rlib, including metadata). That being said, even regressions in metadata size are unfortunate, especially of such magnitude (not sure if they can be avoided here though). |
Yeah I think it's the "mentioned items". This here is a benchmark from early development of this patch when all I did was add the "mentioned items" list, but not do anything with it. That's actually a bigger regression, I later did some changes that made the list shorter.
We could strip the |
How does something like The rlib format could store all types in a table and then just index that table. But again, that's a big change. |
Even for live code we're now duplicating information. If a MIR body calls 100 functions, then it has 100 basic blocks for these calls, and 100 "mentioned items" with the types of the callees. |
I have opened an issue for this: #122936. If we can make the on-disk representation of "mentioned items" smaller, that would also fix a significant part of the perf regression here. I just don't know if it can be done. |
This is another attempt at fixing #107503. The previous attempt at #112879 seems stuck in figuring out where the perf regression comes from. In #122258 I learned some things, which informed the approach this PR is taking.
Quoting from the new collector docs, which explain the high-level idea:
And the
mentioned_items
MIR body field docs:Fixes #107503