Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emit #[inline] on derive(Debug) #117727

Merged
merged 1 commit into from
Nov 9, 2023
Merged

Conversation

saethlin
Copy link
Member

@saethlin saethlin commented Nov 8, 2023

While working on #116583 I noticed that the cross_crate_inlinable query identifies a lot of derived Debug impls as a MIR body that's little more than a call, which suggests they may be a good candidate for #[inline]. So here I've implemented that change specifically.

It seems to provide a nice improvement to build times.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 8, 2023
@saethlin saethlin added S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. and removed T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 8, 2023
@saethlin

This comment was marked as outdated.

@rust-timer

This comment was marked as outdated.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Nov 8, 2023
@bors

This comment was marked as outdated.

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 8, 2023
Emit #[inline] on derive(Debug)

Breaking out part of rust-lang#116583 (comment)

r? `@ghost`
@bors

This comment was marked as outdated.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Nov 8, 2023
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Nov 8, 2023
@saethlin
Copy link
Member Author

saethlin commented Nov 8, 2023

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Nov 8, 2023

⌛ Trying commit a317db1 with merge bd37a3c...

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 8, 2023
Emit #[inline] on derive(Debug)

Breaking out part of rust-lang#116583 (comment)

r? `@ghost`
@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Nov 9, 2023

☀️ Try build successful - checks-actions
Build commit: bd37a3c (bd37a3c11ef1592e6e8a9dfe6290477f7ffa2dc8)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (bd37a3c): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.3%, 0.8%] 4
Regressions ❌
(secondary)
2.7% [1.1%, 3.5%] 4
Improvements ✅
(primary)
-2.4% [-10.5%, -0.2%] 107
Improvements ✅
(secondary)
-3.4% [-25.1%, -0.2%] 163
All ❌✅ (primary) -2.3% [-10.5%, 0.8%] 111

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.2% [0.4%, 4.7%] 9
Regressions ❌
(secondary)
4.1% [2.4%, 5.8%] 2
Improvements ✅
(primary)
-2.3% [-5.3%, -0.6%] 17
Improvements ✅
(secondary)
-2.9% [-5.0%, -1.1%] 11
All ❌✅ (primary) -0.8% [-5.3%, 4.7%] 26

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.9% [0.4%, 1.4%] 2
Regressions ❌
(secondary)
3.0% [2.8%, 3.4%] 3
Improvements ✅
(primary)
-4.4% [-11.4%, -0.7%] 53
Improvements ✅
(secondary)
-5.5% [-24.5%, -1.6%] 89
All ❌✅ (primary) -4.2% [-11.4%, 1.4%] 55

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.9% [0.1%, 3.1%] 26
Regressions ❌
(secondary)
4.4% [0.2%, 7.1%] 6
Improvements ✅
(primary)
-2.7% [-6.9%, -0.1%] 113
Improvements ✅
(secondary)
-3.7% [-11.9%, -0.6%] 93
All ❌✅ (primary) -2.0% [-6.9%, 3.1%] 139

Bootstrap: 663.159s -> 673.479s (1.56%)
Artifact size: 308.75 MiB -> 308.99 MiB (0.08%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Nov 9, 2023
@saethlin saethlin added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. labels Nov 9, 2023
@saethlin saethlin marked this pull request as ready for review November 9, 2023 15:26
@Noratrieb
Copy link
Member

Noratrieb commented Nov 9, 2023

I suspect that the speedup is for a very simple reason: #[inline] causes functions to be codegened downstream, or in the case of dead code: never. Debug impls are frequently dead code.
Evidence for this lies in the hello world benchmark. Linking got significantly faster, so the linker has less to process. This makes sense, the standard library contains lots of Debug impls that hello world doesn't use. The same reasoning can extend to other benchmarks.

@nbdd0121
Copy link
Contributor

nbdd0121 commented Nov 9, 2023

In that case I think what we want is to only mono the function downstream, but not have it marked as inline by LLVM. Instead of inlining should we experiment something like #[rustc_late_mono]?

@saethlin
Copy link
Member Author

saethlin commented Nov 9, 2023

Inlining is still at LLVM's discretion, all this attribute does is apply the inlinehint attribute. It also makes functions eligible for MIR inlining and causes them to not only be lazily lowered, but also to perhaps be lowered multiple times. (it also possibly has neighbor effects on symbol visibility, including that of statics)

It does a lot. I am not sure to what degree all these effects are separable, but we're having this conversation because I am studying separating the codegen style and MIR inlining from the LLVM attribute and inferring the non-inlinehint effects. If you want to have a broader conversation about this we should do that outside this PR, maybe on Zulip, but in my opinion we are primarily short on information to feed a useful discussion.

@scottmcm
Copy link
Member

scottmcm commented Nov 9, 2023

@nnethercote My hypothesis here is that it's a happy consequence of the move to debug_tuple_field4_finish and friends -- those functions are not #[inline], so the fmt methods that're being inlined in this PR aren't the ones with lots of calls to DebugStruct::field and such, just doing the pointer projection and vtable pointer copying before making the single call to the appropriate fmt_helpers_for_derive.

@scottmcm
Copy link
Member

scottmcm commented Nov 9, 2023

Actually, one thing that puzzles me: how does this make a bunch of check builds so much faster? What are those check builds doing that cares enough about inlining of these things to show up? Naïvely I'd have though it would be impossible for inlining to matter for things like check that don't do codegen (or even optimized MIR, right?).

The usual possibility of "oh well it make rustc faster" feels like it can't possibly be the answer since I can't imagine that Debug-printing stuff is in the rustc critical path for check builds either...

@bors
Copy link
Contributor

bors commented Nov 9, 2023

☀️ Test successful - checks-actions
Approved by: nnethercote
Pushing 0f44eb3 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 9, 2023
@bors bors merged commit 0f44eb3 into rust-lang:master Nov 9, 2023
12 checks passed
@rustbot rustbot added this to the 1.75.0 milestone Nov 9, 2023
@saethlin saethlin deleted the inline-derived-fmt branch November 9, 2023 23:39
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (0f44eb3): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Warning ⚠: The following benchmark(s) failed to build:

  • cargo-0.60.0

cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.1% [0.3%, 13.7%] 5
Regressions ❌
(secondary)
1.8% [0.2%, 3.4%] 6
Improvements ✅
(primary)
-2.3% [-10.6%, -0.1%] 110
Improvements ✅
(secondary)
-3.2% [-25.2%, -0.1%] 177
All ❌✅ (primary) -2.1% [-10.6%, 13.7%] 115

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.7% [0.5%, 3.9%] 6
Regressions ❌
(secondary)
2.1% [0.9%, 4.6%] 7
Improvements ✅
(primary)
-2.1% [-6.4%, -0.4%] 27
Improvements ✅
(secondary)
-3.7% [-4.8%, -2.1%] 9
All ❌✅ (primary) -1.4% [-6.4%, 3.9%] 33

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.2% [0.6%, 9.9%] 4
Regressions ❌
(secondary)
3.3% [3.0%, 3.5%] 3
Improvements ✅
(primary)
-3.9% [-11.4%, -0.6%] 62
Improvements ✅
(secondary)
-5.5% [-24.2%, -1.4%] 89
All ❌✅ (primary) -3.4% [-11.4%, 9.9%] 66

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.9% [0.1%, 3.1%] 26
Regressions ❌
(secondary)
4.4% [0.2%, 7.1%] 6
Improvements ✅
(primary)
-2.8% [-6.9%, -0.1%] 108
Improvements ✅
(secondary)
-3.7% [-11.9%, -0.6%] 93
All ❌✅ (primary) -2.1% [-6.9%, 3.1%] 134

Bootstrap: 662.531s -> 673.197s (1.61%)
Artifact size: 308.76 MiB -> 311.10 MiB (0.76%)

@saethlin
Copy link
Member Author

Improvements outweigh regressions. And the big regression looks spurious, probably related to the collector slowness that started a few days ago.
@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Nov 10, 2023
@RalfJung
Copy link
Member

Would it be worth trying to only apply this attribute to structs with at most 5 fields, but not enums or larger structs? Those are exactly the ones where the body will be just a single function call.

bors added a commit to rust-lang-ci/rust that referenced this pull request Nov 18, 2023
…-debug, r=<try>

Try not applying #[inline] to big structs

Per rust-lang#117727 (comment)

r? `@ghost`
@saethlin
Copy link
Member Author

I tried: #118031 (comment) and it's a huge regression.

So I suspect the dead code effect and the MIR optimization effects are both important here.

@RalfJung
Copy link
Member

RalfJung commented Nov 18, 2023 via email

JelteF added a commit to JelteF/derive_more that referenced this pull request Jan 22, 2024
To be clear the `#[inline]` does not hint that inlining is beneficial,
but it does give the compiler the option to inline if the compiler
things it would be beneficial.

This starts adding the `#[inline]` attribute to:

1. `IsVariant`: it's expected that this is often beneficial since its
   body is tiny.
2. `Debug`: This is to stay in line with the `std` implementation of the
   `Debug` derive. rust-lang/rust#117727

It also explicitely doesn't add the attribute to the methods of `Error`,
since those are almost never called in hot code paths.

Fixes #317
JelteF added a commit to JelteF/derive_more that referenced this pull request Jan 22, 2024
To be clear the `#[inline]` does not hint that inlining is beneficial,
but it does give the compiler the option to inline if the compiler
things it would be beneficial.

This starts adding the `#[inline]` attribute to:

1. `IsVariant`: it's expected that this is often beneficial since its
   body is tiny.
2. `Debug`: This is to stay in line with the `std` implementation of the
   `Debug` derive. rust-lang/rust#117727

It also explicitely doesn't add the attribute to the methods of `Error`,
since those are almost never called in hot code paths.

Fixes #317
tyranron pushed a commit to JelteF/derive_more that referenced this pull request Jan 23, 2024
To be clear, the `#[inline]` does not hint that inlining is beneficial,
but it does give the compiler the option to inline if the compiler
things it would be beneficial.

This starts adding the `#[inline]` attribute to:

1. `IsVariant`: it's expected that this is often beneficial, since its
   body is tiny.
2. `Debug`: This is to stay in line with the `std` implementation of the
   `Debug` derive. rust-lang/rust#117727

It also explicitly doesn't add the attribute to the methods of `Error`,
since those are almost never called in hot code paths.
taiki-e added a commit to taiki-e/portable-atomic that referenced this pull request Mar 17, 2024
taiki-e added a commit to taiki-e/atomic-maybe-uninit that referenced this pull request Mar 19, 2024
celinval added a commit to celinval/rust-dev that referenced this pull request Jun 4, 2024
Update Rust toolchain from nightly-2023-11-09 to nightly-2023-11-10
without any other source changes.
This is an automatically generated pull request. If any of the CI checks
fail, manual intervention is required. In such a case, review the
changes at https://github.com/rust-lang/rust from
rust-lang@fdaaaf9
up to
rust-lang@0f44eb3.
The log for this commit range is:
rust-lang@0f44eb32f1 Auto merge of
rust-lang#117727 - saethlin:inline-derived-fmt, r=nnethercote
rust-lang@eae4135939 Auto merge of
rust-lang#117708 - onur-ozkan:x-setup, r=clubby789
rust-lang@4c8862b263 Auto merge of
rust-lang#117122 - ferrocene:pa-configure-git-diff, r=albertlarsan68
rust-lang@d32d9238cf Emit #[inline] on
derive(Debug)
rust-lang@b7583d38b7 Auto merge of
rust-lang#117712 - lcnr:expand-coroutine, r=jackh726
rust-lang@488dd9bc73 fmt
rust-lang@e7998aa21f Auto merge of
rust-lang#117734 - nnethercote:rm-Zstrip, r=davidtwco
rust-lang@287ae4db75 Auto merge of
rust-lang#117632 - Nilstrieb:icup, r=davidtwco
rust-lang@492e57c6ad Auto merge of
rust-lang#117736 - TaKO8Ki:rollup-fjrtmlb, r=TaKO8Ki
rust-lang@d1e26401bc chore(bootstrap):
capitalize {error, warning, info, note} tags
rust-lang@42fbf3ebf5 allow users to
override the existing configuration during x setup
rust-lang@3d6417fc7a check config file
before prompts on x setup
rust-lang@f5195c52bb Rollup merge of
rust-lang#117724 - Kobzol:shim-error-message, r=onur-ozkan
rust-lang@e603f4491c Rollup merge of
rust-lang#117723 - onur-ozkan:keep-bootstrap-on-x-clean, r=albertlarsan68
rust-lang@6533c62ce7 Rollup merge of
rust-lang#117705 - tshepang:patch-2, r=Nilstrieb
rust-lang@b4fa5b7004 Rollup merge of
rust-lang#117694 - jmillikin:core-io-borrowed-buf, r=m-ou-se
rust-lang@4cc549811f Rollup merge of
rust-lang#117645 - compiler-errors:auto-trait-subst, r=petrochenkov
rust-lang@a1a8d6fe9c Rollup merge of
rust-lang#116762 - WaffleLapkin:fixup_fromptr_docs, r=RalfJung
rust-lang@d8dbf7ca0e Auto merge of
rust-lang#117557 - Zoxc:panic-prio, r=petrochenkov
rust-lang@ecc936b155 Remove `-Z strip`.
rust-lang@57fb1e643a Auto merge of
rust-lang#117454 - shepmaster:github-actions-m1-tests,
r=GuillaumeGomez,onur-ozkan
rust-lang@341c85648c Move `BorrowedBuf`
and `BorrowedCursor` from `std:io` to `core::io`
rust-lang@92267c9794 update mir-opt tests
rust-lang@992d93f687 rename
`BorrowKind::Shallow` to `Fake`
rust-lang@a42eca42df generator layout:
ignore fake borrows
rust-lang@622be2d138 Restore rustc shim
error message
rust-lang@de0458af97 speed up `x clean`
rust-lang@6909992501 Run tests in CI for
aarch64-apple-darwin
rust-lang@64090536d4 Install tidy for
aarch64-apple-darwin
rust-lang@469d34b39b Mark Rustdoc test as
Linux-only
rust-lang@bf360d407e instrument
constituent types computation
rust-lang@03435e6fdd accept review
suggestion
rust-lang@769ad29c3e triagebot.toml: use
inclusive language
rust-lang@102384523a Document how rust
atomics work wrt mixed-sized and non-atomic accesses
rust-lang@c17d33f1df Extend builtin/auto
trait args with error when they have >1 argument
rust-lang@580fa0c1a9 rename
github_repository to git_repository
rust-lang@ffffc2038f Update ICU4X
rust-lang@ff1858e2aa Make
`FatalErrorMarker` lower priority than other panics
rust-lang@4a0873533f update suggest-tests
rust-lang@5a562d962e pass the correct
args to compiletest
rust-lang@545cc830e1 allow configuring
the parent GitHub repository

Co-authored-by: celinval <celinval@users.noreply.github.com>
liveseed added a commit to liveseed/derive_more that referenced this pull request Aug 20, 2024
To be clear, the `#[inline]` does not hint that inlining is beneficial,
but it does give the compiler the option to inline if the compiler
things it would be beneficial.

This starts adding the `#[inline]` attribute to:

1. `IsVariant`: it's expected that this is often beneficial, since its
   body is tiny.
2. `Debug`: This is to stay in line with the `std` implementation of the
   `Debug` derive. rust-lang/rust#117727

It also explicitly doesn't add the attribute to the methods of `Error`,
since those are almost never called in hot code paths.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants