Disable top down MIR inlining #105119

JakobDegen · 2022-12-01T08:23:27Z

The current MIR inliner has exponential behavior in some cases: https://godbolt.org/z/7jnWah4fE. The cause of this is top-down inlining, where we repeatedly do inlining like call_a() => { call_b(); call_b(); }. Each decision on its own seems to make sense, but the result is exponential.

Disabling top-down inlining fundamentally prevents this. Each call site in the original, unoptimized source code is now considered for inlining exactly one time, which means that the total growth in MIR size is limited to number of call sites * inlining threshold.

Top down inlining may be worth re-introducing at some point, but it needs to be accompanied with a principled way to prevent this kind of behavior.

rustbot · 2022-12-01T08:23:33Z

r? @wesleywiser

(rustbot has picked a reviewer for you, use r? to override)

rustbot · 2022-12-02T02:01:50Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

JakobDegen · 2022-12-02T02:01:56Z

r? @cjgillot

JakobDegen · 2022-12-02T02:02:29Z

Going to run perf on this just because, but we should merge it regardless, since it's basically an I-hang fix.

@bors try @rust-timer queue

bors · 2022-12-02T02:02:40Z

⌛ Trying commit cfd5bf1404c89422ae78e43b7d67eb917b3438e3 with merge cb981251cfe10f6776a0ae1a831c476651314b24...

JakobDegen · 2022-12-02T03:48:26Z

@bors try

bors · 2022-12-02T03:48:36Z

⌛ Trying commit f4f7777 with merge 986a5e14efc8b034defaa7a2f9099cc48b44bd73...

bors · 2022-12-02T06:14:21Z

☀️ Try build successful - checks-actions
Build commit: 986a5e14efc8b034defaa7a2f9099cc48b44bd73 (986a5e14efc8b034defaa7a2f9099cc48b44bd73)

rust-timer · 2022-12-02T07:35:52Z

Finished benchmarking commit (986a5e14efc8b034defaa7a2f9099cc48b44bd73): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.5%	[0.4%, 0.7%]	4
Regressions ❌ (secondary)	1.3%	[0.8%, 2.3%]	11
Improvements ✅ (primary)	-0.4%	[-1.0%, -0.2%]	8
Improvements ✅ (secondary)	-0.5%	[-0.8%, -0.1%]	20
All ❌✅ (primary)	-0.1%	[-1.0%, 0.7%]	12

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.1%	[0.5%, 7.0%]	8
Regressions ❌ (secondary)	2.6%	[2.5%, 2.6%]	2
Improvements ✅ (primary)	-1.6%	[-1.6%, -1.6%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.5%	[-1.6%, 7.0%]	9

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.0%	[1.8%, 2.1%]	2
Improvements ✅ (primary)	-1.0%	[-1.0%, -1.0%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.0%	[-1.0%, -1.0%]	1

cjgillot · 2022-12-04T16:41:28Z

As a lighter hammer, should we gate this recursion in a call to tcx.consider_optimizing? Or is it not enough?
Is there a difference in runtime of compiled code, or does LLVM make up for the difference?

JakobDegen · 2022-12-05T09:59:48Z

consider_optimizing is always true unless the user happens to have passed the -Zfuel flag, so I don't think that would do anything

cjgillot · 2022-12-05T21:06:44Z

@bors r+
I'm a bit disappointed with that solution, but I don't have anything else to suggest.

bors · 2022-12-05T21:06:46Z

📌 Commit f4f7777 has been approved by cjgillot

It is now in the queue for this repository.

bors · 2022-12-06T00:53:04Z

⌛ Testing commit f4f7777 with merge 226202d...

JakobDegen · 2022-12-06T01:51:20Z

I'm a bit disappointed with that solution, but I don't have anything else to suggest.

Agreed. To be clear, it is definitely possible to do this better. The problem is just that based on my understanding, top-down inlining is not easy to get right, and I don't think we should be attempting it while our inlining logic is as naive as it is today

bors · 2022-12-06T03:47:33Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 226202d to master...

rust-timer · 2022-12-06T05:07:40Z

Finished benchmarking commit (226202d): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.4%	[0.2%, 1.0%]	12
Regressions ❌ (secondary)	1.0%	[0.2%, 2.1%]	15
Improvements ✅ (primary)	-0.4%	[-1.0%, -0.2%]	8
Improvements ✅ (secondary)	-0.7%	[-1.0%, -0.1%]	18
All ❌✅ (primary)	0.1%	[-1.0%, 1.0%]	20

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.9%	[0.5%, 6.2%]	9
Regressions ❌ (secondary)	7.0%	[7.0%, 7.0%]	1
Improvements ✅ (primary)	-1.7%	[-3.5%, -0.9%]	4
Improvements ✅ (secondary)	-1.5%	[-1.5%, -1.5%]	1
All ❌✅ (primary)	1.5%	[-3.5%, 6.2%]	13

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.1%	[2.1%, 2.1%]	1
Improvements ✅ (primary)	-1.2%	[-1.5%, -0.9%]	9
Improvements ✅ (secondary)	-2.7%	[-3.4%, -2.4%]	8
All ❌✅ (primary)	-1.2%	[-1.5%, -0.9%]	9

nnethercote · 2022-12-06T06:07:00Z

For instruction counts, there's a roughly even mix of wins and losses. For cycles, the wins outweigh the losses. All of that seems fine.

@rustbot label: +perf-regression-triaged

…llot Reenable limited top-down MIR inlining Reverts most of rust-lang#105119 and uses an alternative strategy to prevent exponential blowup. Specifically, we allow doing top-down inlining up to depth at most five, and for at most one call site per nested body. r? `@cjgillot`

…illot Disable top down MIR inlining The current MIR inliner has exponential behavior in some cases: <https://godbolt.org/z/7jnWah4fE>. The cause of this is top-down inlining, where we repeatedly do inlining like `call_a() => { call_b(); call_b(); }`. Each decision on its own seems to make sense, but the result is exponential. Disabling top-down inlining fundamentally prevents this. Each call site in the original, unoptimized source code is now considered for inlining exactly one time, which means that the total growth in MIR size is limited to number of call sites * inlining threshold. Top down inlining may be worth re-introducing at some point, but it needs to be accompanied with a principled way to prevent this kind of behavior.

Only check inlining counter after recursing. This PR aims to reduce the strength of rust-lang#105119 even more. In the current implementation, we check the inline count before recursing. This means that we never actually reach inlining depth 3. This PR checks the counter after recursion, to give a chance to inline at depth >= 3. r? `@scottmcm` cc `@JakobDegen`

Only check inlining counter after recursing. This PR aims to reduce the strength of rust-lang/rust#105119 even more. In the current implementation, we check the inline count before recursing. This means that we never actually reach inlining depth 3. This PR checks the counter after recursion, to give a chance to inline at depth >= 3. r? `@scottmcm` cc `@JakobDegen`

rustbot assigned wesleywiser Dec 1, 2022

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 1, 2022

JakobDegen force-pushed the inline-experiments branch from acbb7c3 to cfd5bf1 Compare December 1, 2022 08:31

JakobDegen changed the title ~~[experiment] Disable top down MIR inlining~~ Disable top down MIR inlining Dec 2, 2022

JakobDegen marked this pull request as ready for review December 2, 2022 02:01

rustbot assigned cjgillot and unassigned wesleywiser Dec 2, 2022

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 2, 2022

Disable top-down inlining

f4f7777

JakobDegen force-pushed the inline-experiments branch from cfd5bf1 to f4f7777 Compare December 2, 2022 02:33

This comment has been minimized.

Sign in to view

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 2, 2022

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 5, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 6, 2022

bors merged commit 226202d into rust-lang:master Dec 6, 2022

rustbot added this to the 1.67.0 milestone Dec 6, 2022

rustbot added the perf-regression-triaged The performance regression has been triaged. label Dec 6, 2022

JakobDegen deleted the inline-experiments branch December 8, 2022 06:33

JakobDegen mentioned this pull request Jan 2, 2023

Reenable limited top-down MIR inlining #106364

Merged

Noratrieb mentioned this pull request Jan 7, 2023

crate fails to compile release build (v1.65+) with rustc crashing #106555

Closed

cjgillot mentioned this pull request Jun 3, 2023

Only check inlining counter after recursing. #112240

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disable top down MIR inlining #105119

Disable top down MIR inlining #105119

JakobDegen commented Dec 1, 2022 •

edited

Loading

rustbot commented Dec 1, 2022

rustbot commented Dec 2, 2022

JakobDegen commented Dec 2, 2022

JakobDegen commented Dec 2, 2022

This comment has been minimized.

bors commented Dec 2, 2022

JakobDegen commented Dec 2, 2022

bors commented Dec 2, 2022

bors commented Dec 2, 2022

This comment has been minimized.

rust-timer commented Dec 2, 2022

cjgillot commented Dec 4, 2022

JakobDegen commented Dec 5, 2022

cjgillot commented Dec 5, 2022

bors commented Dec 5, 2022

bors commented Dec 6, 2022

JakobDegen commented Dec 6, 2022

bors commented Dec 6, 2022

rust-timer commented Dec 6, 2022

nnethercote commented Dec 6, 2022

Disable top down MIR inlining #105119

Disable top down MIR inlining #105119

Conversation

JakobDegen commented Dec 1, 2022 • edited Loading

rustbot commented Dec 1, 2022

rustbot commented Dec 2, 2022

JakobDegen commented Dec 2, 2022

JakobDegen commented Dec 2, 2022

This comment has been minimized.

bors commented Dec 2, 2022

JakobDegen commented Dec 2, 2022

bors commented Dec 2, 2022

bors commented Dec 2, 2022

This comment has been minimized.

rust-timer commented Dec 2, 2022

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Instruction count

Max RSS (memory usage)

Cycles

cjgillot commented Dec 4, 2022

JakobDegen commented Dec 5, 2022

cjgillot commented Dec 5, 2022

bors commented Dec 5, 2022

bors commented Dec 6, 2022

JakobDegen commented Dec 6, 2022

bors commented Dec 6, 2022

rust-timer commented Dec 6, 2022

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Instruction count

Max RSS (memory usage)

Cycles

nnethercote commented Dec 6, 2022

JakobDegen commented Dec 1, 2022 •

edited

Loading