Extend GVN to perform local value numbering. #143333

cjgillot · 2025-07-02T15:56:21Z

Current GVN MIR opt can be easily extended to track values that change inside a basic block. This PR attempts that.

Most of the algorithm does not change. In particular, all value numbering and the simplifications still work.
The only particular point is handling non-SSA locals.

We separate locals into two groups:

SSA locals whose values are global and can be propagated everywhere in all bbs;
non-SSA locals for which we can only propagate inside a single block.

The maps that handle non-SSA locals are cleared at the beginning of each bb, to ensure we do not propagate across blocks.

FIXME: how do we handle derefs across blocks.

r? @ghost for perf

cjgillot · 2025-07-05T14:13:23Z

@bors try @rust-timer queue

[TOY] Extend GVN to perform local value numbering. This PR is more a toy than anything else, but I still think the implementation is sound. Current GVN MIR opt can be easily extended to track values that change inside a basic block. This PR attempts that. r? `@ghost` for perf

bors · 2025-07-05T14:14:39Z

⌛ Trying commit c3be10b with merge faceb52...

bors · 2025-07-05T17:10:23Z

☀️ Try build successful - checks-actions
Build commit: faceb52 (faceb52e72c06f94883ba351a73c83544ab4154b)

rust-timer · 2025-07-05T18:38:46Z

Finished benchmarking commit (faceb52): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	15.4%	[0.2%, 937.4%]	227
Regressions ❌ (secondary)	10.3%	[0.2%, 569.6%]	237
Improvements ✅ (primary)	-1.1%	[-1.8%, -0.4%]	2
Improvements ✅ (secondary)	-1.3%	[-4.1%, -0.3%]	16
All ❌✅ (primary)	15.3%	[-1.8%, 937.4%]	229

Max RSS (memory usage)

Results (primary 9.1%, secondary 2.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	71.9%	[1.6%, 377.4%]	14
Regressions ❌ (secondary)	28.4%	[2.2%, 134.5%]	15
Improvements ✅ (primary)	-2.0%	[-7.5%, -0.7%]	79
Improvements ✅ (secondary)	-2.5%	[-3.5%, -0.8%]	80
All ❌✅ (primary)	9.1%	[-7.5%, 377.4%]	93

Cycles

Results (primary 44.8%, secondary 29.7%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	46.5%	[2.3%, 299.4%]	27
Regressions ❌ (secondary)	40.6%	[2.4%, 263.2%]	28
Improvements ✅ (primary)	-2.1%	[-2.1%, -2.1%]	1
Improvements ✅ (secondary)	-4.2%	[-6.7%, -2.4%]	9
All ❌✅ (primary)	44.8%	[-2.1%, 299.4%]	28

Binary size

Results (primary -0.1%, secondary 0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 0.6%]	37
Regressions ❌ (secondary)	0.5%	[0.0%, 1.7%]	70
Improvements ✅ (primary)	-0.3%	[-1.3%, -0.0%]	54
Improvements ✅ (secondary)	-1.0%	[-12.8%, -0.0%]	24
All ❌✅ (primary)	-0.1%	[-1.3%, 0.6%]	91

Bootstrap: 458.797s -> 582.037s (26.86%)
Artifact size: 372.11 MiB -> 361.76 MiB (-2.78%)

cjgillot · 2025-07-06T08:54:20Z

@bors try @rust-timer queue

[TOY] Extend GVN to perform local value numbering. This PR is more a toy than anything else, but I still think the implementation is sound. Current GVN MIR opt can be easily extended to track values that change inside a basic block. This PR attempts that. r? `@ghost` for perf

bors · 2025-07-06T08:55:31Z

⌛ Trying commit e521268 with merge dc4aa90...

cjgillot · 2025-07-06T08:56:01Z

@bors try @rust-timer queue

[TOY] Extend GVN to perform local value numbering. This PR is more a toy than anything else, but I still think the implementation is sound. Current GVN MIR opt can be easily extended to track values that change inside a basic block. This PR attempts that. r? `@ghost` for perf

bors · 2025-07-06T08:57:11Z

⌛ Trying commit 0293254 with merge 4d16c54...

bors · 2025-07-06T11:13:52Z

☀️ Try build successful - checks-actions
Build commit: 4d16c54 (4d16c544e50b0f84cc5347467c108e93106aa8f9)

cjgillot · 2025-10-09T01:58:04Z

@bors try @rust-timer queue

Extend GVN to perform local value numbering.

rust-bors · 2025-10-09T04:09:59Z

☀️ Try build successful (CI)
Build commit: 9649fee (9649fee6194a4e52d4931ae2d580eab4394d788a, parent: b6f0945e4681bc4d2faa7c22c5f61dc36abf7dd2)

rust-timer · 2025-10-09T05:19:43Z

Finished benchmarking commit (9649fee): comparison URL.

Overall result: ❌✅ regressions and improvements - BENCHMARK(S) FAILED

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

❗ ❗ ❗ ❗ ❗
Warning ⚠️: The following benchmark(s) failed to build:

runtime:bufreader

❗ ❗ ❗ ❗ ❗

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.1%, 1.3%]	98
Regressions ❌ (secondary)	0.3%	[0.1%, 0.8%]	80
Improvements ✅ (primary)	-0.6%	[-0.6%, -0.5%]	3
Improvements ✅ (secondary)	-0.2%	[-0.2%, -0.0%]	6
All ❌✅ (primary)	0.3%	[-0.6%, 1.3%]	101

Max RSS (memory usage)

Results (primary -1.9%, secondary 1.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.4%	[1.2%, 1.5%]	2
Regressions ❌ (secondary)	1.6%	[1.6%, 1.6%]	1
Improvements ✅ (primary)	-3.2%	[-5.3%, -0.9%]	5
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.9%	[-5.3%, 1.5%]	7

Cycles

Results (primary -2.3%, secondary -1.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.2%	[2.1%, 2.3%]	2
Improvements ✅ (primary)	-2.3%	[-2.3%, -2.3%]	1
Improvements ✅ (secondary)	-3.3%	[-3.9%, -2.7%]	4
All ❌✅ (primary)	-2.3%	[-2.3%, -2.3%]	1

Binary size

Results (primary -0.0%, secondary -0.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 0.9%]	14
Regressions ❌ (secondary)	0.2%	[0.0%, 0.9%]	7
Improvements ✅ (primary)	-0.1%	[-0.4%, -0.0%]	51
Improvements ✅ (secondary)	-0.3%	[-7.2%, -0.0%]	24
All ❌✅ (primary)	-0.0%	[-0.4%, 0.9%]	65

Bootstrap: 479.914s -> 471.289s (-1.80%)
Artifact size: 388.45 MiB -> 390.40 MiB (0.50%)

cjgillot · 2025-10-09T21:39:46Z

@bors try @rust-timer queue

Extend GVN to perform local value numbering.

rust-bors · 2025-10-09T23:58:23Z

☀️ Try build successful (CI)
Build commit: 7398137 (739813722eadc687ef506096a76305cffba4954d, parent: b925a865e2c9a0aefe5a2877863cb4df796f2eaf)

rust-timer · 2025-10-10T01:58:13Z

Finished benchmarking commit (7398137): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

	mean	range	count
Regressions ❌ (primary)	0.3%	[0.1%, 1.8%]	97
Regressions ❌ (secondary)	0.3%	[0.1%, 0.8%]	86
Improvements ✅ (primary)	-0.7%	[-1.0%, -0.5%]	4
Improvements ✅ (secondary)	-0.2%	[-0.9%, -0.0%]	13
All ❌✅ (primary)	0.3%	[-1.0%, 1.8%]	101

Max RSS (memory usage)

Results (primary 1.7%, secondary 1.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	4.0%	[3.2%, 5.5%]	4
Regressions ❌ (secondary)	1.4%	[0.6%, 2.2%]	4
Improvements ✅ (primary)	-2.8%	[-3.0%, -2.5%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.7%	[-3.0%, 5.5%]	6

Cycles

Results (primary 2.6%, secondary 1.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	2.6%	[2.1%, 2.9%]	4
Regressions ❌ (secondary)	5.4%	[2.2%, 14.1%]	8
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-3.8%	[-5.6%, -1.8%]	6
All ❌✅ (primary)	2.6%	[2.1%, 2.9%]	4

Binary size

Results (primary -0.1%, secondary -0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	0.2%	[0.0%, 0.6%]	13
Regressions ❌ (secondary)	0.1%	[0.0%, 0.5%]	7
Improvements ✅ (primary)	-0.1%	[-0.7%, -0.0%]	55
Improvements ✅ (secondary)	-0.9%	[-7.2%, -0.0%]	9
All ❌✅ (primary)	-0.1%	[-0.7%, 0.6%]	68

Bootstrap: 472.277s -> 472.35s (0.02%)
Artifact size: 388.06 MiB -> 390.76 MiB (0.69%)

bors · 2025-10-18T13:19:27Z

☔ The latest upstream changes (presumably #147654) made this pull request unmergeable. Please resolve the merge conflicts.

dianqk · 2025-10-21T12:51:20Z

Should we introduce reaching definitions to resolve non-SSA locals, mutable borrows, and immutable borrows with its lifetime scope?

bors · 2025-10-22T13:54:13Z

☔ The latest upstream changes (presumably #147687) made this pull request unmergeable. Please resolve the merge conflicts.

oli-obk · 2025-10-31T10:44:46Z

There are some weird regressions like https://perf.rust-lang.org/detailed-query.html?commit=739813722eadc687ef506096a76305cffba4954d&benchmark=cargo-0.87.1-check&scenario=incr-full&base_commit=b925a865e2c9a0aefe5a2877863cb4df796f2eaf&sort=-timeDelta where borrowck and typeck got slower and I have no clue whether it's noise, but most other changes are just LLVM taking more time, which usually means it's finding more things to optimize

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 2, 2025