Rework MIR inlining costs #123179

scottmcm · 2024-03-28T23:52:39Z

A bunch of the current costs are surprising, probably accidentally, from not writing out the matches in full. For example, a runtime-length `memcpy` was treated as the same cost as an `Unreachable`.

This reworks things around two main ideas:

- Give everything a baseline cost, because even "free" things do take effort in the compiler (CPU & RAM) to MIR inline, and they're easy to calculate
- Then just penalize those things that are materially more than the baseline, like how `[foo; 123]` is far more work than `BinOp::AddUnchecked` in an `Rvalue`

By including costs for locals and vardebuginfo this makes some things overall more expensive, but because it also greatly reduces the cost for simple things like local variable addition, other things also become less expensive overall.
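To make the two ideas concrete, here is a minimal sketch of a "baseline plus penalty" cost function. It is not the code from this PR: `ToyRvalue`, `ToyBody`, `body_cost`, and both constants are hypothetical names and values chosen only for illustration.

```rust
// Toy model of "baseline cost plus penalties".
// None of these types or constants are rustc's; they are assumptions for illustration.

/// Hypothetical, simplified stand-in for a MIR rvalue.
#[derive(Debug, Clone, Copy)]
enum ToyRvalue {
    /// A cheap scalar operation, such as adding two locals.
    BinaryOp,
    /// `[value; count]`: the work grows with the repeat count.
    Repeat { count: usize },
}

/// Hypothetical, simplified stand-in for a body being considered for inlining.
struct ToyBody {
    locals: usize,
    var_debug_info: usize,
    assignments: Vec<ToyRvalue>,
}

// Assumed values, not the ones used by the compiler.
const BASELINE: usize = 1;
const REPEAT_PENALTY_PER_ELEM: usize = 1;

/// Extra cost on top of the baseline for rvalues that are materially more work.
fn rvalue_penalty(rv: ToyRvalue) -> usize {
    match rv {
        ToyRvalue::BinaryOp => 0,
        ToyRvalue::Repeat { count } => count * REPEAT_PENALTY_PER_ELEM,
    }
}

/// Every local, debug-info entry, and statement pays the baseline;
/// only the expensive rvalues pay a penalty as well.
fn body_cost(body: &ToyBody) -> usize {
    let fixed = (body.locals + body.var_debug_info) * BASELINE;
    let statements: usize = body
        .assignments
        .iter()
        .map(|&rv| BASELINE + rvalue_penalty(rv))
        .sum();
    fixed + statements
}
```

In this toy model a body with three locals, two debug-info entries, and one addition costs 6, while the same body with a `Repeat { count: 123 }` instead costs 129. The locals and debug info add a small fixed charge, and the simple statement stays cheap.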

r? ghost

scottmcm · 2024-03-29T00:13:16Z

@bors try @rust-timer queue

bors · 2024-03-29T00:14:27Z

⌛ Trying commit 47a5a7f with merge 8bbcded...

scottmcm · 2024-03-29T00:31:10Z

compiler/rustc_mir_transform/src/cost_checker.rs

                }
            }
-            TerminatorKind::Call { func: Operand::Constant(ref f), unwind, .. } => {


As an interesting example, calls to non-constants weren't given the CALL_PENALTY before, because they were hidden down in the `_ =>` arm.
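The effect described here can be reproduced with a small, self-contained sketch. The types `ToyCallee` and `ToyTerminator` are hypothetical simplifications, the value of `CALL_PENALTY` is assumed, and the cost charged by the wildcard arm is also an assumption. Only the shape of the two matches is the point.

```rust
// Toy illustration of a wildcard arm hiding a case that deserved its own cost.
// These are not rustc's types; the names and the penalty value are assumptions.

enum ToyCallee {
    /// The callee is a known constant function.
    Constant,
    /// The callee is only known at run time (for example, a function pointer in a local).
    Indirect,
}

enum ToyTerminator {
    Goto,
    Call { callee: ToyCallee },
    Unreachable,
}

const INSTR_COST: usize = 5; // the thread quotes 5 for instr_cost
const CALL_PENALTY: usize = 25; // assumed value for illustration

/// The pattern only names constant callees, so indirect calls
/// silently fall through to the catch-all arm.
fn cost_with_wildcard(t: &ToyTerminator) -> usize {
    match t {
        ToyTerminator::Call { callee: ToyCallee::Constant } => CALL_PENALTY,
        _ => INSTR_COST,
    }
}

/// Writing the match out in full makes the compiler check that every
/// variant was considered, and both kinds of call get the penalty.
fn cost_exhaustive(t: &ToyTerminator) -> usize {
    match t {
        ToyTerminator::Call { callee: ToyCallee::Constant | ToyCallee::Indirect } => CALL_PENALTY,
        ToyTerminator::Goto | ToyTerminator::Unreachable => INSTR_COST,
    }
}
```

With the wildcard version, an indirect call is charged the same as a `Goto`. With the exhaustive version, adding a new variant to either enum becomes a compile error until a cost is chosen for it.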

scottmcm · 2024-03-29T01:01:13Z

compiler/rustc_mir_transform/src/cost_checker.rs

+            | Rvalue::Len(..)
+            | Rvalue::Cast(..)
+            | Rvalue::BinaryOp(..)
+            | Rvalue::NullaryOp(..)


For example, _1 = sizeof(T) is now just cost 1 (statement baseline) instead of the previous 5 (instr_cost).

bors · 2024-03-29T01:53:17Z

☀️ Try build successful - checks-actions
Build commit: 8bbcded (8bbcdedb635214cbb8873b903a7772c97117adb9)

rust-timer · 2024-03-29T03:08:25Z

Finished benchmarking commit (8bbcded): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.0%	[0.6%, 1.7%]	6
Regressions ❌ (secondary)	0.2%	[0.2%, 0.2%]	2
Improvements ✅ (primary)	-0.5%	[-1.6%, -0.2%]	14
Improvements ✅ (secondary)	-0.8%	[-1.3%, -0.2%]	15
All ❌✅ (primary)	-0.1%	[-1.6%, 1.7%]	20

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.0%	[3.0%, 3.0%]	1
Regressions ❌ (secondary)	3.0%	[3.0%, 3.0%]	1
Improvements ✅ (primary)	-3.1%	[-5.6%, -0.1%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-2.1%	[-5.6%, 3.0%]	6

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.2%	[1.2%, 8.4%]	5
Improvements ✅ (primary)	-1.2%	[-1.3%, -1.2%]	2
Improvements ✅ (secondary)	-3.9%	[-4.5%, -3.2%]	6
All ❌✅ (primary)	-1.2%	[-1.3%, -1.2%]	2

Binary size

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 2.0%]	46
Regressions ❌ (secondary)	1.5%	[0.2%, 2.4%]	3
Improvements ✅ (primary)	-0.1%	[-0.8%, -0.0%]	27
Improvements ✅ (secondary)	-0.5%	[-1.5%, -0.0%]	23
All ❌✅ (primary)	0.1%	[-0.8%, 2.0%]	73

Bootstrap: 670.106s -> 665.367s (-0.71%)
Artifact size: 315.70 MiB -> 315.74 MiB (0.01%)

scottmcm · 2024-03-29T03:19:34Z

Ah, that looks way better than #123011 (comment)

Since this is basically a replacement for #123011
r? @wesleywiser
@rustbot ready

rustbot · 2024-03-29T03:19:38Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

scottmcm · 2024-03-30T19:46:18Z

Since #122975 made a pretty big difference to how MIR ends up after inlining, rebased to re-check perf.

@bors try @rust-timer queue

bors · 2024-03-30T19:47:27Z

⌛ Trying commit c02cff1 with merge c4f55ec...

bors · 2024-03-30T21:24:51Z

☀️ Try build successful - checks-actions
Build commit: c4f55ec (c4f55eca0906ae0794bf2dbadbee5038fe5c9242)

rust-timer · 2024-03-30T22:41:01Z

Finished benchmarking commit (c4f55ec): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.9%	[0.2%, 1.9%]	10
Regressions ❌ (secondary)	0.5%	[0.4%, 0.6%]	2
Improvements ✅ (primary)	-0.5%	[-0.9%, -0.2%]	14
Improvements ✅ (secondary)	-1.3%	[-1.7%, -0.6%]	11
All ❌✅ (primary)	0.1%	[-0.9%, 1.9%]	24

Max RSS (memory usage)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.8%	[0.8%, 5.5%]	3
Regressions ❌ (secondary)	4.9%	[4.1%, 5.9%]	3
Improvements ✅ (primary)	-3.8%	[-7.1%, -0.2%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.5%	[-7.1%, 5.5%]	7

Cycles

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.6%	[1.5%, 1.7%]	3
Regressions ❌ (secondary)	2.5%	[1.8%, 2.8%]	7
Improvements ✅ (primary)	-1.1%	[-1.1%, -1.1%]	1
Improvements ✅ (secondary)	-3.9%	[-5.6%, -1.0%]	9
All ❌✅ (primary)	0.9%	[-1.1%, 1.7%]	4

Binary size

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.4%	[0.0%, 3.9%]	30
Regressions ❌ (secondary)	1.2%	[0.0%, 2.4%]	4
Improvements ✅ (primary)	-0.2%	[-0.6%, -0.0%]	35
Improvements ✅ (secondary)	-0.5%	[-1.5%, -0.0%]	24
All ❌✅ (primary)	0.1%	[-0.6%, 3.9%]	65

Bootstrap: 667.603s -> 668.182s (0.09%)
Artifact size: 315.77 MiB -> 315.78 MiB (0.00%)

scottmcm · 2024-04-23T22:13:39Z

(Putting this aside for a while as I do things like #124188 first)

@rustbot author

Dylan-DPC · 2024-10-15T13:14:27Z

@scottmcm any updates on this? thanks

rustbot added the S-waiting-on-review and T-compiler labels Mar 28, 2024

rustbot added the S-waiting-on-perf label Mar 29, 2024

rustbot added the perf-regression label and removed the S-waiting-on-perf label Mar 29, 2024

scottmcm marked this pull request as ready for review March 29, 2024 03:19

rustbot assigned wesleywiser Mar 29, 2024

This was referenced Mar 29, 2024

Add a debug-info cost to MIR inlining #123011

Closed

UB Check blocks MIR inlining of Vec::deref #123174

Closed

scottmcm force-pushed the inlining-baseline-costs branch from 47a5a7f to c02cff1 Compare March 30, 2024 19:44

rustbot added the S-waiting-on-perf label Mar 30, 2024

rustbot removed the S-waiting-on-perf label Mar 30, 2024

scottmcm marked this pull request as draft April 23, 2024 22:13

rustbot added the S-waiting-on-author label and removed the S-waiting-on-review label Apr 23, 2024
