Avoid codegen for Result::into_ok in lang_start #88988
| r? @yaahc (rust-highfive has picked a reviewer for you, use r? to override) | 
| @bors try @rust-timer queue | 
| Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf | 
| ⌛ Trying commit 76e71b012bd77dc310ce036e14fb56b2d9e0a43d with merge 526cb50d2abbf6ee2ee3c62cabe2ce1735313f63... | 
| ☀️ Try build successful - checks-actions | 
| Queued 526cb50d2abbf6ee2ee3c62cabe2ce1735313f63 with parent 2c7bc5e, future comparison URL. | 
I feel like we may want to open-code map_err in the function above as well, which would likely help too.
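For readers unfamiliar with the term, "open-coding" map_err means replacing the combinator call with an explicit match, so no generic helper has to be instantiated at the call site. A minimal sketch of the idea follows; the function names and the parsing example are hypothetical and are not the code in std:

```rust
// Hypothetical example: both functions behave identically.

// Uses the generic combinator `Result::map_err`, which must be
// instantiated for this closure and these types.
fn parse_with_map_err(s: &str) -> Result<i32, String> {
    s.parse::<i32>().map_err(|e| e.to_string())
}

// Open-coded equivalent: the same logic written as a plain match,
// with no call to a generic helper.
fn parse_open_coded(s: &str) -> Result<i32, String> {
    match s.parse::<i32>() {
        Ok(v) => Ok(v),
        Err(e) => Err(e.to_string()),
    }
}

fn main() {
    assert_eq!(parse_with_map_err("42"), parse_open_coded("42"));
    assert_eq!(parse_with_map_err("nope"), parse_open_coded("nope"));
}
```

Whether this matters for codegen depends on where the function is compiled, which is the point taken up in the reply further down.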
| Finished benchmarking commit (526cb50d2abbf6ee2ee3c62cabe2ce1735313f63): comparison url. Summary: This change led to large relevant improvements 🎉 in compiler performance. 
 If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. @bors rollup=never | 
Otherwise, we end up pulling in an extra module as part of codegen, and that costs us a sizeable amount of work (both in LLVM and outside).
76e71b0 to db5ecd5
| lang_start_internal is already codegen'd as part of std, so unless we're doing LTO or similar, its body doesn't really matter. If LTO is running, then based on some local testing it looks like we pull in tons of code from std anyway (and backtrace etc.), so it really doesn't matter whether we're using map_err or not. I also want to avoid doing too much in this particular PR (it makes it harder to compare things in perf, for example), so I am going to go ahead and @bors r=nagisa | 
| 📌 Commit db5ecd5 has been approved by nagisa | 
| ☀️ Test successful - checks-actions | 
| Finished benchmarking commit (6cdd42f): comparison url. Summary: This change led to large relevant improvements 🎉 in compiler performance. 
 If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression | 
This extra codegen seems to be the cause of the regressions in max-rss on #86034. While LLVM will certainly optimize the dead code away, avoiding its generation in the first place seems good, particularly when it is so simple.
#86034 produced this diff for a simple fn main() {}. With this PR, that diff becomes limited to just a few extra IR instructions -- no extra functions. Note that these are pre-optimization; LLVM surely will eliminate this during optimization. However, that optimization can end up generating more work and bump memory usage, and this eliminates that.
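The shape of the change can be sketched on stable Rust as follows. This is an illustration under stated assumptions, not the std source: `run_internal`, `into_ok_generic`, and both `start_*` functions are hypothetical stand-ins, and `std::convert::Infallible` is used in place of the never type so the example compiles on stable.

```rust
use std::convert::Infallible;

// Hypothetical stand-in for a non-generic runtime entry point
// whose error type has no values, so it can never return Err.
fn run_internal(main: &dyn Fn() -> i32) -> Result<isize, Infallible> {
    Ok(main() as isize)
}

// Hypothetical generic helper, similar in spirit to `Result::into_ok`.
// Each use from a generic caller means another function to instantiate.
fn into_ok_generic<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(v) => v,
        Err(never) => match never {},
    }
}

// Before: unwrap the infallible result through the generic helper.
fn start_with_helper(user_fn: fn() -> i32) -> isize {
    into_ok_generic(run_internal(&move || user_fn()))
}

// After: unwrap it inline, so no extra function is needed.
fn start_open_coded(user_fn: fn() -> i32) -> isize {
    match run_internal(&move || user_fn()) {
        Ok(v) => v,
        // The error type is uninhabited, so this arm is unreachable.
        Err(never) => match never {},
    }
}

fn user_main() -> i32 {
    0
}

fn main() {
    assert_eq!(start_with_helper(user_main), start_open_coded(user_main));
}
```

Both versions return the same value; the difference is only in how much code the compiler has to generate before optimization.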