[WIP] mir-opt: promoting const read-only arrays #125916

tesuji · 2024-06-03T06:30:40Z

Modified from a copy of PromoteTemps. It's kind of a hack, so it is neither fancy nor easy to follow and review.
I'll try to reuse structures from PromoteTemps when there is consensus for this pass.

The compiler is doing more work now with this opt, so I don't think this pass improves compiler performance.
But anyway, for statistics, can I get a perf run?

cc #73825

r? ghost

Current status

Waiting for consensus.
Maybe rewrite to use GVN with mentoring from oli
~~ICE on unstable feature: tests/assembly/simd-intrinsic-mask-load.rs#x86-avx512.~~
In particular, Simd([literal array]) is now transformed to Simd(array_var). Maybe I should ignore arrays in constructors.
~~Fail test on nested arrays~~

rustbot · 2024-06-03T06:30:45Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Urgau · 2024-06-03T06:36:14Z

@bors try @rust-timer queue

bors · 2024-06-03T06:37:26Z

⌛ Trying commit 42d586c with merge 360f92e...

bors · 2024-06-03T06:49:58Z

💔 Test failed - checks-actions

oli-obk · 2024-06-03T09:30:12Z

compiler/rustc_mir_transform/src/lib.rs

-        &[&promote_pass, &simplify::SimplifyCfg::PromoteConsts, &coverage::InstrumentCoverage],
+        &[
+            &promote_pass,
+            &promote_array,


since this is something that should not have user-visible effects (e.g. affecting dropck, const eval UB or borrowck), it should be run as part of the regular runtime optimization pipeline

oli-obk · 2024-06-03T09:32:05Z

compiler/rustc_mir_transform/src/lib.rs

+    let array_promoted = promote_array.promoted_fragments.into_inner();
+    promoted.extend(array_promoted);


which does mean you won't be able to use the existing promotion scheme, but would need to start looking into create_def and query feeding, which is probably not ready to support this use case yet. I have not yet given much thought to what is needed to fully support that, but if you want we can look into this together.

then again, if all we're doing is creating non-generic static items, that already has precedent (we do that for nested statics), so likely you can do the same in an optimization.

Though in that case I would expect this to fall out of GVN or some similar optimization, not be its own separate path

With GVN or similar you don't even need to create new constants and MIR bodies, you can just stick the fully evaluated constant into a MIR constant
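To make the idea concrete, here is a minimal sketch on a toy IR. The `Operand` and `Rvalue` types and the `fold_array` function are hypothetical and are not rustc's MIR types or the GVN pass; they only illustrate replacing an array aggregate whose operands are all constants with a single fully evaluated constant, leaving it untouched otherwise.

```rust
// Toy IR, NOT rustc's MIR: all names here are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Const(i64),
    Local(usize),
}

#[derive(Debug, Clone, PartialEq)]
enum Rvalue {
    /// An array built element by element from operands.
    Array(Vec<Operand>),
    /// A fully evaluated constant array.
    ConstArray(Vec<i64>),
}

/// Replace an array aggregate with one constant when every operand is
/// already a constant; otherwise return the rvalue unchanged.
fn fold_array(rvalue: Rvalue) -> Rvalue {
    match rvalue {
        Rvalue::Array(ops) => {
            // Collecting into `Option<Vec<_>>` yields `None` as soon as
            // a non-constant operand is seen.
            let consts: Option<Vec<i64>> = ops
                .iter()
                .map(|op| match op {
                    Operand::Const(v) => Some(*v),
                    Operand::Local(_) => None,
                })
                .collect();
            match consts {
                Some(values) => Rvalue::ConstArray(values),
                None => Rvalue::Array(ops),
            }
        }
        other => other,
    }
}

fn main() {
    let all_const = Rvalue::Array(vec![Operand::Const(1), Operand::Const(2)]);
    assert_eq!(fold_array(all_const), Rvalue::ConstArray(vec![1, 2]));

    let mixed = Rvalue::Array(vec![Operand::Const(1), Operand::Local(3)]);
    assert_eq!(fold_array(mixed.clone()), mixed);
}
```

In this toy model no new item or body is created; the aggregate is simply rewritten in place, which is the property the comment above points at.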

lqd · 2024-06-04T14:28:16Z

@bors try @rust-timer queue

…=<try> [WIP] mir-opt: promoting const read-only arrays Modified from a copy of PromoteTemps. It's kind of a hack so nothing fancy or easy to follow and review. I'll to reuse structures from PromoteTemps when there is [consensus for this pass][zulip]. Compiler is doing more work now with this opt. So I don't think this pass improves compiler performance. But anyway, for statistics, can I get a perf run? cc rust-lang#73825 r? ghost ### Current status - Waiting for [consensus][zulip]. - Fail simd tests: tests/assembly/simd-intrinsic-mask-load.rs#x86-avx512 - *~Fail test on nested arrays~*: hack fix, may possibly fail on struct containings arrays. - Maybe rewrite to [use GVN with mentor from oli][mentor] [zulip]: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Could.20const.20read-only.20arrays.20be.20const.20promoted.3F [mentor]: rust-lang#125916 (comment)

bors · 2024-06-04T14:30:07Z

⌛ Trying commit 0a91619 with merge c415513...

bors · 2024-06-04T16:07:33Z

☀️ Try build successful - checks-actions
Build commit: c415513 (c4155130fd61e7fa5e1b138de6a817f1cfb4e2fb)

rust-timer · 2024-06-04T18:53:48Z

Finished benchmarking commit (c415513): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.3%] | 13 |
| Regressions ❌ (secondary) | 0.6% | [0.5%, 0.7%] | 9 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.1% | [-0.1%, -0.1%] | 1 |
| All ❌✅ (primary) | 0.3% | [0.2%, 0.3%] | 13 |

Max RSS (memory usage)

Results (primary -5.8%, secondary -2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -5.8% | [-5.8%, -5.8%] | 1 |
| Improvements ✅ (secondary) | -2.2% | [-2.2%, -2.2%] | 1 |
| All ❌✅ (primary) | -5.8% | [-5.8%, -5.8%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.0% | [0.0%, 0.0%] | 8 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.0% | [-0.0%, -0.0%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.0% | [-0.0%, 0.0%] | 9 |

Bootstrap: 673.596s -> 672.754s (-0.13%)
Artifact size: 318.88 MiB -> 318.85 MiB (-0.01%)

scottmcm · 2024-06-04T23:12:24Z

For the SIMD mention: the long-term goal is to stop allowing projections into repr(simd) types at all, just Transmute. So whatever fix is easiest is fine, as that situation will stop happening hopefully-soon.

tesuji · 2024-06-10T07:02:05Z

Can I get another perf run before switching to use GVN?

Kobzol · 2024-06-10T09:22:18Z

@bors try @rust-timer queue

bors · 2024-06-10T09:23:29Z

⌛ Trying commit dca7207 with merge 63ac52a...

bors · 2024-06-10T11:02:01Z

☀️ Try build successful - checks-actions
Build commit: 63ac52a (63ac52aeb8179ad1a9d0a60cc0cf82812d3ddb65)

rust-timer · 2024-06-10T12:17:24Z

Finished benchmarking commit (63ac52a): comparison URL.

Overall result: ❌ regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.3%] | 17 |
| Regressions ❌ (secondary) | 0.1% | [0.1%, 0.1%] | 3 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.3% | [0.2%, 0.3%] | 17 |

Max RSS (memory usage)

Results (primary -8.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -8.1% | [-8.1%, -8.1%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -8.1% | [-8.1%, -8.1%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.0% | [-0.0%, -0.0%] | 1 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | -0.0% | [-0.0%, -0.0%] | 1 |

Bootstrap: 673.707s -> 672.398s (-0.19%)
Artifact size: 319.82 MiB -> 319.84 MiB (0.01%)

[WIP] gvn: Promote/propagate const local array Rewriting of rust-lang#125916 which used PromoteTemps pass. Fix rust-lang#73825 ### Current status - [ ] Waiting for [consensus](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Could.20const.20read-only.20arrays.20be.20const.20promoted.3F). r? ghost

tesuji · 2024-06-14T13:01:32Z

Closing in favor of #126444.
But there may be clean-up commits for PromoteTemps.

promote_consts: some clean-up after experimenting. This is some clean-up after experimenting in rust-lang#125916. Prefer to review commit-by-commit.

gvn: Promote/propagate const local array

Rewriting of rust-lang#125916 which used `PromoteTemps` pass.

This allows promoting constant local arrays as anonymous constants, so that in codegen for a local array, rustc outputs `llvm.memcpy` (which is easy for LLVM to optimize) instead of a series of `store`s on the stack (a.k.a. in-place initialization). This makes rustc on par with clang in this specific case. See rust-lang#73825 or [zulip][opsem] for more info.

[Here is a simple micro benchmark][bench] that shows the performance difference between promoting arrays or not.

[Prior discussions on zulip][opsem].

This patch [saves about 600 KB][perf] (~0.5%) of `librustc_driver.so`.

![image](https://github.com/rust-lang/rust/assets/15225902/0e37559c-f5d9-4cdf-b7e3-a2956fd17bc1)

Fix rust-lang#73825

r? cjgillot

### Unresolved questions

- [ ] Should we ignore nested arrays? I think that promoting nested arrays is bloating codegen.
- [ ] Should stack_threshold be at least 32 bytes, as the benchmark showed? If yes, the test should be updated to make arrays larger than 32 bytes.
- [x] ~Is it concerning that `call(move _1)` is now `call(const [array])`?~ It reverted back to `call(move _1)`.

[opsem]: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Could.20const.20read-only.20arrays.20be.20const.20promoted.3F
[bench]: rust-lang/rust-clippy#12854 (comment)
[perf]: https://perf.rust-lang.org/compare.html?start=f9515fdd5aa132e27d9b580a35b27f4b453251c1&end=7e160d4b55bb5a27be0696f45db247ccc2e166d9&stat=size%3Alinked_artifact&tab=artifact-size
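As a source-level illustration of the case described above, the following sketch contrasts a read-only local array of constants with the same data promoted by hand into a `static`. The function names and the table contents are made up for this example; which instructions rustc actually emits for each form depends on the compiler version and optimization level, so treat this as the shape of the input rather than a statement about generated code.

```rust
/// Read-only local array whose elements are all constants. This is the
/// kind of local the description above proposes to promote.
fn fib_local(i: usize) -> u32 {
    let table = [1u32, 1, 2, 3, 5, 8, 13, 21, 34, 55];
    table[i % table.len()]
}

/// The same data promoted by hand into a `static`, for comparison.
fn fib_static(i: usize) -> u32 {
    static TABLE: [u32; 10] = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
    TABLE[i % TABLE.len()]
}

fn main() {
    // Both forms must be observably identical; promotion is purely an
    // optimization and may not change results.
    for i in 0..20 {
        assert_eq!(fib_local(i), fib_static(i));
    }
    println!("{}", fib_local(6)); // prints 13
}
```

Building both functions with `--emit=llvm-ir` or `--emit=mir` is one way to compare how each array is initialized on a given toolchain.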

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jun 3, 2024

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 3, 2024

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 3, 2024

oli-obk reviewed Jun 3, 2024

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 4, 2024

tesuji added 8 commits June 10, 2024 06:52

ignore const/static items

dfcae48

ignore simd temps

e53c231

bless

d11c9e2

ignore rvalue not array

1f40fa4

aos => soa

7c4dcca

fix ice nested array

ea443af

bless

514e4a9

use ccx

dca7207

tesuji force-pushed the mir-opt-const-array-locals branch from 32d9976 to dca7207 Compare June 10, 2024 06:57

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 10, 2024

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 10, 2024

This was referenced Jun 12, 2024

WIP: Trying to promote const local array using GVN pass tesuji/rustc#5

Closed

gvn: Promote/propagate const local array #126444

Closed

tesuji closed this Jun 14, 2024

tesuji deleted the mir-opt-const-array-locals branch June 16, 2024 09:34

tesuji mentioned this pull request Jun 16, 2024

promote_consts: some clean-up after experimenting #125853

Merged
