Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable top down MIR inlining #105119

Merged
merged 1 commit into from
Dec 6, 2022
Merged

Conversation

JakobDegen
Copy link
Contributor

@JakobDegen JakobDegen commented Dec 1, 2022

The current MIR inliner has exponential behavior in some cases: https://godbolt.org/z/7jnWah4fE. The cause of this is top-down inlining, where we repeatedly do inlining like call_a() => { call_b(); call_b(); }. Each decision on its own seems to make sense, but the result is exponential.

Disabling top-down inlining fundamentally prevents this. Each call site in the original, unoptimized source code is now considered for inlining exactly one time, which means that the total growth in MIR size is limited to number of call sites * inlining threshold.

Top down inlining may be worth re-introducing at some point, but it needs to be accompanied with a principled way to prevent this kind of behavior.

@rustbot
Copy link
Collaborator

rustbot commented Dec 1, 2022

r? @wesleywiser

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 1, 2022
@JakobDegen JakobDegen changed the title [experiment] Disable top down MIR inlining Disable top down MIR inlining Dec 2, 2022
@JakobDegen JakobDegen marked this pull request as ready for review December 2, 2022 02:01
@rustbot
Copy link
Collaborator

rustbot commented Dec 2, 2022

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@JakobDegen
Copy link
Contributor Author

r? @cjgillot

@rustbot rustbot assigned cjgillot and unassigned wesleywiser Dec 2, 2022
@JakobDegen
Copy link
Contributor Author

Going to run perf on this just because, but we should merge it regardless, since it's basically an I-hang fix.

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 2, 2022
@bors
Copy link
Collaborator

bors commented Dec 2, 2022

⌛ Trying commit cfd5bf1404c89422ae78e43b7d67eb917b3438e3 with merge cb981251cfe10f6776a0ae1a831c476651314b24...

@JakobDegen
Copy link
Contributor Author

@bors try

@bors
Copy link
Collaborator

bors commented Dec 2, 2022

⌛ Trying commit f4f7777 with merge 986a5e14efc8b034defaa7a2f9099cc48b44bd73...

@bors
Copy link
Collaborator

bors commented Dec 2, 2022

☀️ Try build successful - checks-actions
Build commit: 986a5e14efc8b034defaa7a2f9099cc48b44bd73 (986a5e14efc8b034defaa7a2f9099cc48b44bd73)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (986a5e14efc8b034defaa7a2f9099cc48b44bd73): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.4%, 0.7%] 4
Regressions ❌
(secondary)
1.3% [0.8%, 2.3%] 11
Improvements ✅
(primary)
-0.4% [-1.0%, -0.2%] 8
Improvements ✅
(secondary)
-0.5% [-0.8%, -0.1%] 20
All ❌✅ (primary) -0.1% [-1.0%, 0.7%] 12

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
3.1% [0.5%, 7.0%] 8
Regressions ❌
(secondary)
2.6% [2.5%, 2.6%] 2
Improvements ✅
(primary)
-1.6% [-1.6%, -1.6%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.5% [-1.6%, 7.0%] 9

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.0% [1.8%, 2.1%] 2
Improvements ✅
(primary)
-1.0% [-1.0%, -1.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.0% [-1.0%, -1.0%] 1

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 2, 2022
@cjgillot
Copy link
Contributor

cjgillot commented Dec 4, 2022

As a lighter hammer, should we gate this recursion in a call to tcx.consider_optimizing? Or is it not enough?
Is there a difference in runtime of compiled code, or does LLVM make up for the difference?

@JakobDegen
Copy link
Contributor Author

consider_optimizing is always true unless the user happens to have passed the -Zfuel flag, so I don't think that would do anything

@cjgillot
Copy link
Contributor

cjgillot commented Dec 5, 2022

@bors r+
I'm a bit disappointed with that solution, but I don't have anything else to suggest.

@bors
Copy link
Collaborator

bors commented Dec 5, 2022

📌 Commit f4f7777 has been approved by cjgillot

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 5, 2022
@bors
Copy link
Collaborator

bors commented Dec 6, 2022

⌛ Testing commit f4f7777 with merge 226202d...

@JakobDegen
Copy link
Contributor Author

I'm a bit disappointed with that solution, but I don't have anything else to suggest.

Agreed. To be clear, it is definitely possible to do this better. The problem is just that based on my understanding, top-down inlining is not easy to get right, and I don't think we should be attempting it while our inlining logic is as naive as it is today

@bors
Copy link
Collaborator

bors commented Dec 6, 2022

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 226202d to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 6, 2022
@bors bors merged commit 226202d into rust-lang:master Dec 6, 2022
@rustbot rustbot added this to the 1.67.0 milestone Dec 6, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (226202d): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.4% [0.2%, 1.0%] 12
Regressions ❌
(secondary)
1.0% [0.2%, 2.1%] 15
Improvements ✅
(primary)
-0.4% [-1.0%, -0.2%] 8
Improvements ✅
(secondary)
-0.7% [-1.0%, -0.1%] 18
All ❌✅ (primary) 0.1% [-1.0%, 1.0%] 20

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.9% [0.5%, 6.2%] 9
Regressions ❌
(secondary)
7.0% [7.0%, 7.0%] 1
Improvements ✅
(primary)
-1.7% [-3.5%, -0.9%] 4
Improvements ✅
(secondary)
-1.5% [-1.5%, -1.5%] 1
All ❌✅ (primary) 1.5% [-3.5%, 6.2%] 13

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.1% [2.1%, 2.1%] 1
Improvements ✅
(primary)
-1.2% [-1.5%, -0.9%] 9
Improvements ✅
(secondary)
-2.7% [-3.4%, -2.4%] 8
All ❌✅ (primary) -1.2% [-1.5%, -0.9%] 9

@nnethercote
Copy link
Contributor

For instruction counts, there's a roughly even mix of wins and losses. For cycles, the wins outweigh the losses. All of that seems fine.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Dec 6, 2022
@JakobDegen JakobDegen deleted the inline-experiments branch December 8, 2022 06:33
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 2, 2023
…llot

Reenable limited top-down MIR inlining

Reverts most of rust-lang#105119 and uses an alternative strategy to prevent exponential blowup. Specifically, we allow doing top-down inlining up to depth at most five, and for at most one call site per nested body.

r? `@cjgillot`
Aaron1011 pushed a commit to Aaron1011/rust that referenced this pull request Jan 6, 2023
…illot

Disable top down MIR inlining

The current MIR inliner has exponential behavior in some cases: <https://godbolt.org/z/7jnWah4fE>. The cause of this is top-down inlining, where we repeatedly do inlining like `call_a() => { call_b(); call_b(); }`. Each decision on its own seems to make sense, but the result is exponential.

Disabling top-down inlining fundamentally prevents this. Each call site in the original, unoptimized source code is now considered for inlining exactly one time, which means that the total growth in MIR size is limited to number of call sites * inlining threshold.

Top down inlining may be worth re-introducing at some point, but it needs to be accompanied with a principled way to prevent this kind of behavior.
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 4, 2023
Only check inlining counter after recursing.

This PR aims to reduce the strength of rust-lang#105119 even more.

In the current implementation, we check the inline count before recursing. This means that we never actually reach inlining depth 3.

This PR checks the counter after recursion, to give a chance to inline at depth >= 3.

r? `@scottmcm`
cc `@JakobDegen`
RalfJung pushed a commit to RalfJung/miri that referenced this pull request Jun 11, 2023
Only check inlining counter after recursing.

This PR aims to reduce the strength of rust-lang/rust#105119 even more.

In the current implementation, we check the inline count before recursing. This means that we never actually reach inlining depth 3.

This PR checks the counter after recursion, to give a chance to inline at depth >= 3.

r? `@scottmcm`
cc `@JakobDegen`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants