Optimize empty provenance range checks. #137704

nnethercote · 2025-02-27T07:13:39Z

Currently it gets the pointers in the range and checks if the result is empty, but it can be done faster if you combine those two steps.

r? @oli-obk

Currently it gets the pointers in the range and checks if the result is empty, but it can be done faster if you combine those two steps.

nnethercote · 2025-02-27T10:30:20Z

@bors try @rust-timer queue

bors · 2025-02-27T10:31:31Z

⌛ Trying commit a36d8ac with merge 9c095f0...

…cks, r=<try> Optimize empty provenance range checks. Currently it gets the pointers in the range and checks if the result is empty, but it can be done faster if you combine those two steps. r? `@ghost`

bors · 2025-02-27T12:35:58Z

☀️ Try build successful - checks-actions
Build commit: 9c095f0 (9c095f0622aab4642d603551c5bf12c10ea4942b)

rust-timer · 2025-02-27T13:52:56Z

Finished benchmarking commit (9c095f0): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.2%]	4
Improvements ✅ (secondary)	-1.8%	[-3.4%, -0.2%]	13
All ❌✅ (primary)	-0.3%	[-0.3%, -0.2%]	4

Max RSS (memory usage)

Results (primary -2.7%, secondary -3.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.7%	[-5.7%, -0.7%]	3
Improvements ✅ (secondary)	-3.0%	[-4.5%, -1.5%]	26
All ❌✅ (primary)	-2.7%	[-5.7%, -0.7%]	3

Cycles

Results (secondary -2.9%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.9%	[-3.8%, -2.2%]	4
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 770.531s -> 771.055s (0.07%)
Artifact size: 361.97 MiB -> 361.94 MiB (-0.01%)

rustbot · 2025-02-27T20:20:42Z

Some changes occurred to the CTFE / Miri interpreter

cc @rust-lang/miri, @rust-lang/wg-const-eval

nnethercote · 2025-02-27T20:22:24Z

Good perf results for coercions, smaller improvements for ucd and unicode-normalizations. All of which matches the results I saw locally.

oli-obk · 2025-02-27T20:44:34Z

Nice. Now I really wanna try representing the bytes and init map together with a single treemap of offset + Box pairs for only the defined bytes

oli-obk · 2025-02-27T21:20:33Z

@bors r+

bors · 2025-02-27T21:20:36Z

📌 Commit a36d8ac has been approved by oli-obk

It is now in the queue for this repository.

bors · 2025-02-27T21:20:36Z

🌲 The tree is currently closed for pull requests below priority 100. This pull request will be tested once the tree is reopened.

RalfJung · 2025-02-28T07:24:09Z

Nice. Now I really wanna try representing the bytes and init map together with a single treemap of offset + Box pairs for only the defined bytes

FWIW Miri's support for native calls and FFI relies on our memory representation of the actual data bytes being contiguous, and on pre-allocating that range so that we know its address and can use the same address in Miri's internal "virtual" address space. So making this work in Miri could be a bit more tricky (but it'd be a shame to have only const-eval benefit from such an optimization).

That said, we also don't have good benchmarks for this in the test suite. I'd appreciate if, once this lands, you could run the Miri benchmark suite with a rustc before and after this PR to ensure it doesn't have unexpected effects. (We have some nice flags to make comparisons easier: --save-baseline and --load-baseline.)

RalfJung · 2025-02-28T07:27:05Z

compiler/rustc_middle/src/mir/interpret/allocation/provenance_map.rs

@@ -117,7 +132,7 @@ impl<Prov: Provenance> ProvenanceMap<Prov> {
    /// limit access to provenance outside of the `Allocation` abstraction.
    ///
    pub fn range_empty(&self, range: AllocRange, cx: &impl HasDataLayout) -> bool {
-        self.range_get_ptrs(range, cx).is_empty() && self.range_get_bytes(range).is_empty()
+        self.range_get_ptrs_is_empty(range, cx) && self.range_get_bytes(range).is_empty()


We could do something similar for checking that the bytewise provenance is empty in this range, couldn't we?

We could, but for the benchmarks range_get_bytes wasn't hot at all, while range_get_ptrs was very hot, so I didn't bother.

Yeah the per-byte provenance is empty most of the time. Still it just seems oddly asymmetric now, making one wonder if there is a reason for the asymmetry.

To be more precise, it is always empty for CTFE. It will only ever be non-empty in Miri, but even there that should be rather rare.

jieyouxu · 2025-03-02T12:00:43Z

@bors p=5 (encouraging more rollup-nevers to go through before more rollups)

bors · 2025-03-02T22:41:20Z

⌛ Testing commit a36d8ac with merge daf5985...

bors · 2025-03-03T01:57:50Z

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing daf5985 to master...

rust-timer · 2025-03-03T03:27:43Z

Finished benchmarking commit (daf5985): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	3
Improvements ✅ (secondary)	-1.9%	[-3.4%, -0.2%]	13
All ❌✅ (primary)	-0.3%	[-0.3%, -0.3%]	3

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	8.4%	[7.4%, 9.4%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.6%	[-3.4%, -2.3%]	6
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 773.437s -> 773.006s (-0.06%)
Artifact size: 361.95 MiB -> 361.95 MiB (0.00%)

… r=oli-obk interpret/provenance_map: consistently use range_is_empty rust-lang#137704 started using this for per-ptr provenance; let's be consistent and use it also for the per-byte provenance check. Also rename the methods to avoid having both "get" and "is_empty" in the name. r? `@oli-obk`

… r=oli-obk interpret/provenance_map: consistently use range_is_empty rust-lang#137704 started using this for per-ptr provenance; let's be consistent and use it also for the per-byte provenance check. Also rename the methods to avoid having both "get" and "is_empty" in the name. r? ``@oli-obk``

… r=oli-obk interpret/provenance_map: consistently use range_is_empty rust-lang#137704 started using this for per-ptr provenance; let's be consistent and use it also for the per-byte provenance check. Also rename the methods to avoid having both "get" and "is_empty" in the name. r? ```@oli-obk```

… r=oli-obk interpret/provenance_map: consistently use range_is_empty rust-lang#137704 started using this for per-ptr provenance; let's be consistent and use it also for the per-byte provenance check. Also rename the methods to avoid having both "get" and "is_empty" in the name. r? ````@oli-obk````

Rollup merge of rust-lang#137920 - RalfJung:provenance-map-emptiness, r=oli-obk interpret/provenance_map: consistently use range_is_empty rust-lang#137704 started using this for per-ptr provenance; let's be consistent and use it also for the per-byte provenance check. Also rename the methods to avoid having both "get" and "is_empty" in the name. r? ````@oli-obk````

interpret/provenance_map: consistently use range_is_empty rust-lang/rust#137704 started using this for per-ptr provenance; let's be consistent and use it also for the per-byte provenance check. Also rename the methods to avoid having both "get" and "is_empty" in the name. r? ````@oli-obk````

Optimize empty provenance range checks.

a36d8ac

Currently it gets the pointers in the range and checks if the result is empty, but it can be done faster if you combine those two steps.

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 27, 2025

This comment has been minimized.

Sign in to view

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2025

This comment has been minimized.

Sign in to view

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 27, 2025

nnethercote marked this pull request as ready for review February 27, 2025 20:20

rustbot assigned oli-obk Feb 27, 2025

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 27, 2025

RalfJung reviewed Feb 28, 2025

View reviewed changes

bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 3, 2025

bors merged commit daf5985 into rust-lang:master Mar 3, 2025
7 checks passed

rustbot added this to the 1.87.0 milestone Mar 3, 2025

nnethercote deleted the opt-empty-prov-range-checks branch March 3, 2025 03:33

RalfJung mentioned this pull request Mar 3, 2025

interpret/provenance_map: consistently use range_is_empty #137920

Merged

Optimize empty provenance range checks. #137704

Optimize empty provenance range checks. #137704

Conversation

nnethercote commented Feb 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nnethercote commented Feb 27, 2025

Uh oh!

This comment has been minimized.

bors commented Feb 27, 2025

Uh oh!

bors commented Feb 27, 2025

Uh oh!

This comment has been minimized.

rust-timer commented Feb 27, 2025

Overall result: ✅ improvements - no action needed

Uh oh!

rustbot commented Feb 27, 2025

Uh oh!

nnethercote commented Feb 27, 2025

Uh oh!

oli-obk commented Feb 27, 2025

Uh oh!

oli-obk commented Feb 27, 2025

Uh oh!

bors commented Feb 27, 2025

Uh oh!

bors commented Feb 27, 2025

Uh oh!

RalfJung commented Feb 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RalfJung Mar 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jieyouxu commented Mar 2, 2025

Uh oh!

bors commented Mar 2, 2025

Uh oh!

bors commented Mar 3, 2025

Uh oh!

Uh oh!

rust-timer commented Mar 3, 2025

Overall result: ✅ improvements - no action needed

Uh oh!

Uh oh!

nnethercote commented Feb 27, 2025 •

edited

Loading

RalfJung commented Feb 28, 2025 •

edited

Loading

RalfJung Mar 3, 2025 •

edited

Loading