Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline a bunch of trivial conditions in parser #122718

Merged
merged 1 commit into from
Mar 21, 2024

Conversation

workingjubilee
Copy link
Member

@workingjubilee workingjubilee commented Mar 19, 2024

It is often the case that these small, conditional functions, when inlined, reveal notable optimization opportunities to LLVM. While saethlin has done a lot of good work on making these kinds of small functions not need #[inline] tags as much, being clearer about what we want inlined will get both the MIR opts and LLVM to pursue it more aggressively.

On local perf runs, this seems fruitful. Let's see what rust-timer says.

r? @ghost

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Mar 19, 2024
@workingjubilee
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 19, 2024
@bors
Copy link
Collaborator

bors commented Mar 19, 2024

⌛ Trying commit a81d020 with merge 5fa0a16...

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 19, 2024
…, r=<try>

Inline a bunch of trivial conditions in parser

It is often the case that these small, conditional functions, when inlined, reveal notable optimization opportunities to LLVM. While saethlin has done a lot of good work on making these kinds of small functions not need `#[inline]` tags as much, being clearer about what we want inlined will get both the MIR opts and LLVM to pursue it more aggressively.

On local perf runs, this seems fruitful. Let's see what rust-timer says.

r? `@ghost`
@workingjubilee
Copy link
Member Author

After a bit of additional local testing, I think this may be slightly overzealous, so there's another patch on top that I'd like to try if this turns out to not be as exciting.

@bors
Copy link
Collaborator

bors commented Mar 19, 2024

☀️ Try build successful - checks-actions
Build commit: 5fa0a16 (5fa0a16cfda99b444c5448472ff942d3aeb705cb)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (5fa0a16): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.4% [-0.4%, -0.4%] 2
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.8% [-3.5%, -2.1%] 2
Improvements ✅
(secondary)
-2.5% [-2.8%, -2.0%] 6
All ❌✅ (primary) -2.8% [-3.5%, -2.1%] 2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.0% [2.0%, 2.0%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 668.321s -> 667.844s (-0.07%)
Artifact size: 312.76 MiB -> 312.76 MiB (-0.00%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 19, 2024
@workingjubilee
Copy link
Member Author

Not quite as exciting as I hoped but an acceptable result I guess. Let's try a slight variant to see if my intuition is correct:

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 19, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 19, 2024
…, r=<try>

Inline a bunch of trivial conditions in parser

It is often the case that these small, conditional functions, when inlined, reveal notable optimization opportunities to LLVM. While saethlin has done a lot of good work on making these kinds of small functions not need `#[inline]` tags as much, being clearer about what we want inlined will get both the MIR opts and LLVM to pursue it more aggressively.

On local perf runs, this seems fruitful. Let's see what rust-timer says.

r? `@ghost`
@bors
Copy link
Collaborator

bors commented Mar 19, 2024

⌛ Trying commit 0033de1 with merge a5bf49a...

@Noratrieb
Copy link
Member

I would be surprised if the PGO pipeline didn't already figure out the mostly optimal inlining here.

@bors
Copy link
Collaborator

bors commented Mar 19, 2024

☀️ Try build successful - checks-actions
Build commit: a5bf49a (a5bf49af1a1cd4234f0ba2e0d38abe6923b18df0)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (a5bf49a): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.2% [-0.3%, -0.2%] 2
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-5.8% [-5.9%, -5.7%] 2
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.3% [-1.3%, -1.3%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.3% [-1.3%, -1.3%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 669.309s -> 668.414s (-0.13%)
Artifact size: 312.82 MiB -> 312.74 MiB (-0.03%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 19, 2024
There are a bunch of small helper conditionals we use.
Inline them to get slightly better perf in a few cases,
especially when rustc is compiled without PGO.
@workingjubilee
Copy link
Member Author

I would be surprised if the PGO pipeline didn't already figure out the mostly optimal inlining here.

Yeah but we don't make the PGO we do easy to apply nor easily replicable, so many rustcs don't have it.

@workingjubilee workingjubilee force-pushed the eyeliner-for-contrast branch from 0033de1 to 140b4c6 Compare March 19, 2024 20:57
@workingjubilee
Copy link
Member Author

r? compiler

@lcnr
Copy link
Contributor

lcnr commented Mar 21, 2024

this doesn't result in significant improvements but also shouldn't hurt

@bors r+

@bors
Copy link
Collaborator

bors commented Mar 21, 2024

📌 Commit 140b4c6 has been approved by lcnr

It is now in the queue for this repository.

@bors bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 21, 2024
@bors bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Mar 21, 2024
@bors
Copy link
Collaborator

bors commented Mar 21, 2024

⌛ Testing commit 140b4c6 with merge 03994e4...

@bors
Copy link
Collaborator

bors commented Mar 21, 2024

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing 03994e4 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 21, 2024
@bors bors merged commit 03994e4 into rust-lang:master Mar 21, 2024
12 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Mar 21, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (03994e4): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.8% [-3.7%, -2.2%] 11
All ❌✅ (primary) - - 0

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 672.146s -> 674.027s (0.28%)
Artifact size: 314.90 MiB -> 314.83 MiB (-0.02%)

@workingjubilee workingjubilee deleted the eyeliner-for-contrast branch March 22, 2024 03:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants