GVN: Elide more intermediate transmutes #151622

scottmcm · 2026-01-24T23:15:57Z

We already skipped intermediate steps like u32 or i32 that support any (initialized) value.

This extends that to also allow skipping intermediate steps whose values are a superset of either the source or destination type. Most importantly, that means that usize → NonZeroUsize → ptr::Alignment and ptr::Alignment → NonZeroUsize → usize can skip the middle because NonZeroUsize is a superset of Alignment.
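
As a rough illustration of the chain described above, here is a minimal sketch on stable Rust. `Pow2` is a hypothetical one-byte stand-in for `ptr::Alignment` (not the real type), with `NonZeroU8` playing the role of `NonZeroUsize`. Because every valid `Pow2` value is also a valid `NonZeroU8`, the two-step and one-step conversions accept the same inputs and produce the same results.

```rust
use std::mem::transmute;
use std::num::NonZeroU8;

/// Hypothetical stand-in for `ptr::Alignment`: a one-byte type whose valid
/// values are powers of two, all of which are also valid `NonZeroU8` values.
#[allow(dead_code)] // variants are only ever produced by transmuting
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Pow2 {
    P1 = 1,
    P2 = 2,
    P4 = 4,
    P8 = 8,
    P16 = 16,
    P32 = 32,
    P64 = 64,
    P128 = 128,
}

/// Two-step conversion, mirroring `usize -> NonZeroUsize -> Alignment`.
fn via_middle(raw: u8) -> Pow2 {
    assert!(raw.is_power_of_two());
    // SAFETY: a power of two is nonzero, so it is a valid `NonZeroU8`.
    let middle: NonZeroU8 = unsafe { transmute::<u8, NonZeroU8>(raw) };
    // SAFETY: every power of two that fits in a `u8` is a `Pow2` variant.
    unsafe { transmute::<NonZeroU8, Pow2>(middle) }
}

/// One-step conversion: the form the chain can be reduced to.
fn direct(raw: u8) -> Pow2 {
    assert!(raw.is_power_of_two());
    // SAFETY: every power of two that fits in a `u8` is a `Pow2` variant.
    unsafe { transmute::<u8, Pow2>(raw) }
}

fn main() {
    for shift in 0..8 {
        let raw = 1u8 << shift;
        assert_eq!(via_middle(raw), direct(raw));
    }
}
```

The assertions in `main` only check that both paths agree at runtime. Whether the optimizer actually removes the middle step is a property of the MIR pass and is not something this sketch can show.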

Then Alignment::as_usize is updated to take advantage of that and let us remove some more locals in a few places.
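
The library change itself is not shown in this thread, so the following is only a sketch of what "going through transmuting" could look like. `SmallAlign`, its variants, and both method bodies are hypothetical and are not the code in `library/core`.

```rust
use std::mem::transmute;
use std::num::NonZeroUsize;

/// Hypothetical stand-in for `ptr::Alignment`, limited to a few powers of two.
#[allow(dead_code)] // not every variant is used in this sketch
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(usize)]
enum SmallAlign {
    A1 = 1,
    A2 = 2,
    A4 = 4,
    A8 = 8,
    A16 = 16,
}

impl SmallAlign {
    /// First hop: alignment to its nonzero integer value.
    fn as_nonzero(self) -> NonZeroUsize {
        // SAFETY: same size as `usize`, and every variant is nonzero.
        unsafe { transmute::<SmallAlign, NonZeroUsize>(self) }
    }

    /// Second hop: `SmallAlign -> NonZeroUsize -> usize`. The middle type
    /// accepts every value the source can hold, so it adds no new constraint.
    fn as_usize(self) -> usize {
        // SAFETY: `NonZeroUsize` has the same layout as `usize`, and every
        // initialized bit pattern is a valid `usize`.
        unsafe { transmute::<NonZeroUsize, usize>(self.as_nonzero()) }
    }
}

fn main() {
    assert_eq!(SmallAlign::A8.as_usize(), 8);
    assert_eq!(SmallAlign::A16.as_nonzero().get(), 16);
}
```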

r? cjgillot

rustbot · 2026-01-24T23:16:00Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

scottmcm · 2026-01-25T07:54:40Z

I'm not expecting much, but just in case:
@bors try @rust-timer queue

rust-bors · 2026-01-25T10:12:33Z

☀️ Try build successful (CI)
Build commit: cd2b201 (cd2b201cee18477fa23de21b4a5bbee37815f3dd, parent: 75963ce795666bc1f961e5d60060809809f6bc68)

rust-timer · 2026-01-25T10:53:28Z

Finished benchmarking commit (cd2b201): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.3%, 0.3%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.5%	[-0.8%, -0.2%]	2
Improvements ✅ (secondary)	-0.1%	[-0.1%, -0.1%]	1
All ❌✅ (primary)	-0.2%	[-0.8%, 0.3%]	3

Max RSS (memory usage)

Results (primary -4.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-4.0%	[-5.7%, -2.2%]	3
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-4.0%	[-5.7%, -2.2%]	3

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.0%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.0%, 0.5%]	11
Regressions ❌ (secondary)	0.0%	[0.0%, 0.0%]	1
Improvements ✅ (primary)	-0.1%	[-1.1%, -0.0%]	13
Improvements ✅ (secondary)	-0.0%	[-0.1%, -0.0%]	6
All ❌✅ (primary)	-0.0%	[-1.1%, 0.5%]	24

Bootstrap: 471.746s -> 470.956s (-0.17%)
Artifact size: 383.60 MiB -> 383.53 MiB (-0.02%)

cjgillot · 2026-01-25T22:44:25Z

library/core/src/ptr/alignment.rs

+    // as it's just there to convey the validity invariant.
+    // (Hopefully it'll eventually be a pattern type instead.)
+    _inner_repr_trick: AlignmentEnum,
+}


Can this change be done in its own commit? This would make it easier to understand what changes in the mir-opt diff.

Absolutely. I split it into three commits:

1. Just the compiler change to the mir-opt pass
2. The change to Alignment::as_usize on its own
3. The field change in Alignment

That way all the library changes are separated.

cjgillot · 2026-01-25T22:45:57Z

compiler/rustc_mir_transform/src/gvn.rs

+                } else if let Ok(from_layout) = self.ecx.layout_of(from_ty)
+                    && !from_layout.uninhabited
+                    && from_layout.size == middle_layout.size
+                    && let BackendRepr::Scalar(from_a) = from_layout.backend_repr
+                    && let a_range = a.valid_range(&self.ecx)
+                    && let from_range = from_a.valid_range(&self.ecx)
+                    && a_range.contains_range(from_range, middle_layout.size)
+                {
+                    false
+                } else if let Ok(to_layout) = self.ecx.layout_of(to_ty)
+                    && !to_layout.uninhabited
+                    && to_layout.size == middle_layout.size
+                    && let BackendRepr::Scalar(to_a) = to_layout.backend_repr
+                    && let a_range = a.valid_range(&self.ecx)
+                    && let to_range = to_a.valid_range(&self.ecx)
+                    && a_range.contains_range(to_range, middle_layout.size)
+                {
+                    false


Do you mind commenting the logic? This looks good to me, but it took me a few frowns to understand, for instance, why the ranges are compared the way you wrote them.
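
One way to read the range comparison in the hunk above: in a chain `from -> middle -> to`, the middle type only matters if it rejects some value that both ends would accept. If the middle's valid range contains the source's range, the first transmute can never fail. If it contains the destination's range, any value the last transmute accepts was already acceptable to the middle. The sketch below models that with a simplified, hypothetical `ValidRange` type. It is not rustc's implementation, and it ignores the size, inhabitedness, and `BackendRepr::Scalar` checks that the real code also performs.

```rust
/// Hypothetical, simplified model of a scalar's valid range: the inclusive
/// set `start..=end` of bit patterns, wrapping around when `start > end`.
/// Both bounds are assumed to fit in the scalar's bit width.
#[derive(Clone, Copy, Debug)]
struct ValidRange {
    start: u64,
    end: u64,
}

impl ValidRange {
    /// True if every value in `other` is also in `self`, for a scalar that
    /// is `bits` wide.
    fn contains_range(self, other: ValidRange, bits: u32) -> bool {
        assert!((1..=64).contains(&bits));
        let modulus = 1u128 << bits;
        // Steps needed to walk upward from `from` to `to`, modulo 2^bits.
        let dist = |from: u64, to: u64| (to as u128 + modulus - from as u128) % modulus;
        // Measure everything relative to `self.start`, so wrapping ranges
        // need no special case.
        let self_len = dist(self.start, self.end);
        let other_offset = dist(self.start, other.start);
        let other_len = dist(other.start, other.end);
        other_offset + other_len <= self_len
    }
}

/// Model of the rule: the middle type of `from -> middle -> to` adds no
/// validity requirement if it accepts everything `from` or `to` accepts.
fn middle_is_elidable(from: ValidRange, middle: ValidRange, to: ValidRange, bits: u32) -> bool {
    middle.contains_range(from, bits) || middle.contains_range(to, bits)
}

fn main() {
    let any_u8 = ValidRange { start: 0, end: 255 };
    let nonzero = ValidRange { start: 1, end: 255 };
    let pow2 = ValidRange { start: 1, end: 128 };

    // any -> nonzero -> pow2: the middle is a superset of the destination.
    assert!(middle_is_elidable(any_u8, nonzero, pow2, 8));
    // pow2 -> nonzero -> any: the middle is a superset of the source.
    assert!(middle_is_elidable(pow2, nonzero, any_u8, 8));
    // any -> pow2 -> nonzero: the middle is narrower than both ends, so it
    // carries a real constraint and has to stay.
    assert!(!middle_is_elidable(any_u8, pow2, nonzero, 8));

    // A wrapping range such as 255..=1 (that is, -1..=1 as a signed byte).
    let around_zero = ValidRange { start: 255, end: 1 };
    assert!(around_zero.contains_range(ValidRange { start: 0, end: 1 }, 8));
    assert!(!around_zero.contains_range(ValidRange { start: 254, end: 0 }, 8));
}
```

Measuring offsets from `self.start` is a design choice of this sketch. It keeps the wrapping and non-wrapping cases on one code path, at the cost of doing the arithmetic in `u128`.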

rustbot · 2026-01-26T01:26:22Z

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

cjgillot · 2026-01-31T20:37:21Z

@bors r+

rust-bors · 2026-01-31T20:37:24Z

📌 Commit 9288c20 has been approved by cjgillot

It is now in the queue for this repository.

rust-bors · 2026-01-31T23:55:31Z

☀️ Test successful - CI
Approved by: cjgillot
Duration: 3h 12m 43s
Pushing 905b926 to main...

github-actions · 2026-01-31T23:58:41Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 8afe9ff (parent) -> 905b926 (this PR)

Test differences

22 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 905b9269674ced4b5239f485609a3bf0ab02d01b --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

aarch64-apple: 9476.0s -> 7804.0s (-17.6%)
pr-check-1: 1993.6s -> 1656.5s (-16.9%)
dist-x86_64-apple: 7539.3s -> 6330.4s (-16.0%)
x86_64-gnu-llvm-20-2: 5927.5s -> 5010.4s (-15.5%)
dist-aarch64-msvc: 5557.5s -> 6342.1s (+14.1%)
x86_64-rust-for-linux: 3084.6s -> 2653.0s (-14.0%)
dist-aarch64-apple: 6501.3s -> 7242.3s (+11.4%)
dist-powerpc-linux: 5792.5s -> 5145.3s (-11.2%)
x86_64-gnu-llvm-20-3: 6901.6s -> 6153.6s (-10.8%)
i686-gnu-nopt-1: 8236.5s -> 7369.9s (-10.5%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance that executed the job, system noise, invalidated caches, etc. The table above is provided mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2026-02-01T00:36:51Z

Finished benchmarking commit (905b926): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.7%, 0.7%]	1
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.3%	[-0.5%, -0.0%]	3
All ❌✅ (primary)	0.7%	[0.7%, 0.7%]	1

Max RSS (memory usage)

Results (primary 1.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	3.3%	[2.2%, 4.8%]	4
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.2%	[-3.5%, -0.9%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.4%	[-3.5%, 4.8%]	6

Cycles

Results (secondary 2.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.8%	[2.8%, 2.8%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-	-	0

Binary size

Results (primary -0.1%, secondary -0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.2%, 0.2%]	1
Regressions ❌ (secondary)	0.1%	[0.0%, 0.2%]	2
Improvements ✅ (primary)	-0.1%	[-0.5%, -0.0%]	11
Improvements ✅ (secondary)	-0.0%	[-0.1%, -0.0%]	5
All ❌✅ (primary)	-0.1%	[-0.5%, 0.2%]	12

Bootstrap: 476.06s -> 477.374s (0.28%)
Artifact size: 397.87 MiB -> 397.82 MiB (-0.01%)

rustbot added the S-waiting-on-review, T-compiler, and T-libs labels Jan 24, 2026

rustbot assigned cjgillot Jan 24, 2026

rust-bors bot pushed a commit that referenced this pull request Jan 25, 2026

Auto merge of #151622 - scottmcm:elide-more-transmutes, r=<try>

cd2b201

GVN: Elide more intermediate transmutes

rustbot added the S-waiting-on-perf label Jan 25, 2026

rustbot added perf-regression and removed S-waiting-on-perf labels Jan 25, 2026

cjgillot reviewed Jan 25, 2026

scottmcm added 3 commits January 25, 2026 17:14

GVN: Elide more intermediate transmutes

3a33ab0

Update ptr::Alignment to go through transmuting

929e280

Adjust Alignment to emphasize that we don't look at its field

9288c20

scottmcm force-pushed the elide-more-transmutes branch from 433703d to 9288c20 Compare January 26, 2026 01:26

rust-bors bot added S-waiting-on-bors and removed S-waiting-on-review labels Jan 31, 2026

rust-bors bot added merged-by-bors and removed S-waiting-on-bors labels Jan 31, 2026

rust-bors bot merged commit 905b926 into rust-lang:main Jan 31, 2026
12 checks passed

rustbot added this to the 1.95.0 milestone Jan 31, 2026

This was referenced Feb 1, 2026

GVN: Only propagate borrows from SSA locals #150485

Open

replace box_new with lower-level intrinsics #148190

Open

[perf] Start using pattern types in libcore #148537

Open

rustbot removed the perf-regression label Feb 1, 2026

scottmcm deleted the elide-more-transmutes branch February 1, 2026 00:57
