Enable instcombine for mutable reborrows #105274

saethlin · 2022-12-04T22:57:33Z

instcombine used to contain this comment, which is no longer accurate because it is fine to copy &mut _ in MIR:

// The dereferenced place must have type `&_`, so that we don't copy `&mut _`.

So let's try replacing that check with something much more permissive...
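For readers less familiar with the pattern, here is a minimal sketch of what a mutable reborrow looks like at the source level. The function names are made up for illustration, and the MIR in the comments is approximate rather than actual compiler output. The idea, as I understand the description above, is that the `&mut (*_1)` form can become a plain copy of `_1` when the types match.

```rust
/// Adds one through an explicit reborrow of `r`.
fn bump_via_reborrow(r: &mut u32) {
    // In MIR this is roughly `_2 = &mut (*_1)`: a new reference to the
    // place that `_1` points at. This is the shape instcombine looks for.
    let again: &mut u32 = &mut *r;
    *again += 1;
}

/// Adds one after moving the reference itself into a new binding.
fn bump_via_move(r: &mut u32) {
    // In MIR this is roughly `_2 = move _1`: no dereference and no new
    // borrow, which is close to what the simplified statement looks like.
    let again = r;
    *again += 1;
}
```

Both functions have the same observable behavior. The difference only shows up in the MIR statement that initializes `again`.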

rustbot · 2022-12-04T22:57:38Z

r? @fee1-dead

(rustbot has picked a reviewer for you, use r? to override)

saethlin · 2022-12-04T22:59:09Z

@bors try @rust-timer queue

bors · 2022-12-04T22:59:17Z

⌛ Trying commit d966a3b69baf8b6a60918783f95170f552b79db2 with merge ad3c92c23998265b250e84072921bcacbe08907e...

bors · 2022-12-05T01:39:43Z

☀️ Try build successful - checks-actions
Build commit: ad3c92c23998265b250e84072921bcacbe08907e (ad3c92c23998265b250e84072921bcacbe08907e)

rust-timer · 2022-12-05T06:17:57Z

Finished benchmarking commit (ad3c92c23998265b250e84072921bcacbe08907e): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.7%	[0.7%, 0.8%]	2
Regressions ❌ (secondary)	1.3%	[1.3%, 1.3%]	2
Improvements ✅ (primary)	-0.5%	[-0.7%, -0.3%]	7
Improvements ✅ (secondary)	-0.5%	[-0.7%, -0.3%]	6
All ❌✅ (primary)	-0.2%	[-0.7%, 0.8%]	9

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.4%	[2.4%, 4.4%]	2
Improvements ✅ (primary)	-1.8%	[-4.4%, -0.0%]	3
Improvements ✅ (secondary)	-1.5%	[-2.2%, -1.2%]	3
All ❌✅ (primary)	-1.8%	[-4.4%, -0.0%]	3

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.6%	[2.6%, 2.6%]	1
Improvements ✅ (primary)	-1.4%	[-1.6%, -0.8%]	12
Improvements ✅ (secondary)	-2.8%	[-3.5%, -1.6%]	15
All ❌✅ (primary)	-1.4%	[-1.6%, -0.8%]	12

saethlin · 2022-12-05T17:59:21Z

The regressions are in externs, which is noise, and in opt builds of ripgrep and cargo. I've confirmed locally that this PR definitely does alter the optimized codegen for ripgrep. But beyond that simple observation, I don't really have a way to quantify if that is good or not. Perhaps we are enabling more optimizations?

rustbot · 2022-12-05T17:59:30Z

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

fee1-dead · 2022-12-07T13:48:03Z

> I don't really have a way to quantify if that is good or not. Perhaps we are enabling more optimizations?

Hmm. Can we confirm that this generates better assembly? Otherwise LLVM might be doing unnecessary work.

fee1-dead · 2022-12-08T11:02:50Z

Let's get a second opinion on this. I would approve it if it did not regress, but now I don't know whether it is the best thing to do.

r? compiler

saethlin · 2022-12-08T14:12:12Z

To your previous comment, I have looked through things as best I can.

A diff of nm on a release build of ripgrep before and after this PR indicates that there have been some inlining changes after LLVM. But the changes are primarily in various drop_in_place, with a few changes elsewhere. The only symbols that looked perf-related were a few related to GlobSet. I ran the benchmarks for the sub-crate globset before and after, and the benchmarks are just too noisy to conclude anything 🤷 There could easily be a perf improvement of 1% in there, I just wouldn't know.

I also looked around the ecosystem at a few other crates. In a few cases I found microbenchmarks whose runtimes seem to have changed with this PR, but if I inspect the assembly for the benchmark, I see no changes at all. I wouldn't be surprised if these changes are wall time changes due to code layout shifts in the criterion or libtest runtime, perhaps perturbing the alignment of the benchmark loop.

The rustc-perf runtime benchmarks are exactly the same before and after this PR.

bjorn3 · 2022-12-11T17:59:48Z

Is this compatible with stacked borrows? AFAIK reborrows have a semantic meaning.

saethlin · 2022-12-11T19:17:32Z

That sounds like a good question for t-opsem 😉 (I am resisting the urge to put together an informal proof that this is valid, there is much else I would like to do)

The fact that the aliasing model cares about this doesn't necessarily mean we can't remove it in a MIR optimization. These optimizations can and do rely on the input program not executing UB, and are only obligated to not add UB.

I would be surprised if this optimization were legal for non-mutable reborrows but not mutable reborrows. Especially considering the comment this PR removes.
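To make the question concrete, here is a small sketch of the kind of code where the reborrow carries meaning for the aliasing model. This is an informal illustration, not a proof, and it assumes the usual reading of Stacked Borrows in which a reborrow creates a child pointer derived from its parent. The function name is hypothetical.

```rust
/// Uses a child reborrow, then the parent again.
fn child_then_parent(parent: &mut u32) -> u32 {
    // Reborrow: `child` is derived from `parent`.
    let child: &mut u32 = &mut *parent;
    *child += 1;
    // Using `parent` again ends the borrow held by `child`;
    // `child` is not used after this point.
    *parent += 1;
    *parent
}
```

If the reborrow is replaced by a copy of the parent pointer, accesses through `child` go through the parent's own permission. Under the assumption above, that should not introduce UB into a program that was already free of it, which is the only obligation the optimization has.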

saethlin · 2023-01-07T03:16:55Z

This change doesn't delete any MIR statements, so it kind of makes sense for it to be approximately perf-neutral on balance. But in combination with DestinationPropagation, a benefit should be visible. So it would make sense for this PR to wait for #105577.

@rustbot label +S-blocked

saethlin · 2023-01-16T18:05:37Z

Rebased away merge conflict, changing reviewer to Oli because you keep r?'ing yourself on my other MIR opt PRs.

r? @oli-obk

rustbot · 2023-01-16T18:05:39Z

Failed to set assignee to 'ing: invalid assignee

Note: Only org members, users with write permissions, or people who have commented on the PR may be assigned.

bors · 2023-02-16T13:44:28Z

⌛ Trying commit 1409cb5 with merge c5bb1d862806b46d162d2c26fc57fc5f2ef20fc8...

bors · 2023-02-16T16:25:00Z

☀️ Try build successful - checks-actions
Build commit: c5bb1d862806b46d162d2c26fc57fc5f2ef20fc8 (c5bb1d862806b46d162d2c26fc57fc5f2ef20fc8)

rust-timer · 2023-02-16T18:03:09Z

Finished benchmarking commit (c5bb1d862806b46d162d2c26fc57fc5f2ef20fc8): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.2%, 3.2%]	16
Regressions ❌ (secondary)	0.4%	[0.2%, 0.7%]	11
Improvements ✅ (primary)	-0.7%	[-2.4%, -0.2%]	50
Improvements ✅ (secondary)	-0.8%	[-1.7%, -0.3%]	23
All ❌✅ (primary)	-0.4%	[-2.4%, 3.2%]	66

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	3.2%	[0.1%, 7.3%]	4
Regressions ❌ (secondary)	3.4%	[1.2%, 5.6%]	2
Improvements ✅ (primary)	-2.6%	[-7.9%, -0.9%]	8
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.7%	[-7.9%, 7.3%]	12

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.0%	[2.0%, 2.0%]	1
Regressions ❌ (secondary)	3.6%	[3.6%, 3.6%]	2
Improvements ✅ (primary)	-2.1%	[-2.1%, -2.1%]	1
Improvements ✅ (secondary)	-2.2%	[-2.2%, -2.2%]	1
All ❌✅ (primary)	-0.1%	[-2.1%, 2.0%]	2

saethlin · 2023-02-16T18:28:18Z

Hmm, that's a lot more LLVM work in cranelift-codegen and some regressions in check builds. I think the regressions in opt-full builds are due to extra MIR inlining.

Probably the LLVM inlining got bumped as well. I'll study the cachegrind diffs and MIR diffs about 7 hours from now. But I think there may be an argument for merging as-is; it seems unlikely that the big regression is actionable.

saethlin · 2023-02-16T23:25:24Z

I cannot find any MIR inlining differences in cranelift-codegen. Though several reports of inlining now report that they are being inlined at different scopes, so something different has happened to the caller context in a few places. I know that the MIR inlining in the standard library has changed, so I suspect the regression is related to that.

cachegrind diffs for the check regressions point at a smattering of functions. I've looked into types_may_unify and hash_stable for Span. It looks like the code in types_may_unify is rearranged, but I have no idea why it is slower. I can't find any difference in hash_stable for Span. In both cases I cannot use any traditional profiling tools, so I propose we just accept those regressions because on balance this PR is an improvement.

saethlin · 2023-02-16T23:32:12Z

@cjgillot With this PR, there are some copies of &mut that look unnecessary to me. I feel like CopyProp is supposed to delete these. Can you explain why it doesn't? For example, I feel like _3 should become _1:

fn str::<impl at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:136:1: 136:9>::make_ascii_lowercase(_1: &mut str) -> () {
    debug self => _1;                    // in scope 0 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2444:33: 2444:42
    let mut _0: ();                      // return place in scope 0 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2444:44: 2444:44
    let mut _2: &mut [u8];               // in scope 0 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2446:27: 2446:46
    let mut _3: &mut str;                // in scope 0 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2446:27: 2446:46
    scope 1 {
        debug me => _2;                  // in scope 1 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2446:13: 2446:15
    }
    scope 2 {
        scope 3 (inlined str::<impl str>::as_bytes_mut) { // at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2446:32: 2446:46
            debug self => _3;            // in scope 3 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:369:32: 369:41
            let mut _4: *mut [u8];       // in scope 3 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:24: 374:55
            let mut _5: *mut str;        // in scope 3 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:25: 374:41
            scope 4 {
            }
        }
    }

    bb0: {
        StorageLive(_3);                 // scope 2 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2446:27: 2446:46
        _3 = _1;                         // scope 2 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2446:27: 2446:46
        StorageLive(_4);                 // scope 4 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:24: 374:55
        StorageLive(_5);                 // scope 4 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:25: 374:41
        _5 = &raw mut (*_3);             // scope 4 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:25: 374:29
        _4 = move _5 as *mut [u8] (PtrToPtr); // scope 4 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:24: 374:55
        StorageDead(_5);                 // scope 4 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:54: 374:55
        _2 = &mut (*_4);                 // scope 4 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:374:18: 374:55
        StorageDead(_4);                 // scope 3 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:375:5: 375:6
        StorageDead(_3);                 // scope 2 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2446:45: 2446:46
        _0 = slice::ascii::<impl [u8]>::make_ascii_lowercase(_2) -> bb1; // scope 1 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2447:9: 2447:34
                                         // mir::Constant
                                         // + span: /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2447:12: 2447:32
                                         // + literal: Const { ty: for<'a> fn(&'a mut [u8]) {slice::ascii::<impl [u8]>::make_ascii_lowercase}, val: Value(<ZST>) }
    }

    bb1: {
        return;                          // scope 0 at /home/ben/rust/build/x86_64-unknown-linux-gnu/stage1/lib/rustlib/src/rust/library/core/src/str/mod.rs:2448:6: 2448:6
    }
}

cjgillot · 2023-02-17T17:05:28Z

@saethlin which function's MIR are you showing? I can't read the span or its name; both were truncated by the copy-paste.

saethlin · 2023-02-17T17:20:14Z

Wow yeah I really truncated that didn't I. I updated the comment, and the function is str::make_ascii_lowercase.

cjgillot · 2023-02-17T17:45:29Z

It's a shortcoming of the SSA analysis. Based on a visitor, it checks for PlaceContext::MutatingUse(_). But in the default implementation, a &raw mut * corresponds to a PlaceContext::MutatingUse(MutatingUseContext::Projection) although it does not mutate the local. d8b8371 should solve this.
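A toy model of the shortcoming described above may help. It does not use rustc's real `PlaceContext` or visitor types: every name below (`Use`, `is_write_cautious`, `is_write_refined`, `assigned_once`) is hypothetical, and the "assigned exactly once" rule is a simplification of what an SSA-style analysis would check.

```rust
/// How a statement touches a local (toy classification, not rustc's).
#[derive(Clone, Copy)]
enum Use {
    /// `_x = ...`: a direct assignment to the local.
    Store,
    /// `... = _x`: the local is only read.
    Read,
    /// `&raw mut (*_x)` or `&mut (*_x)`: the local is read to find the
    /// pointee; the local itself is not changed.
    DerefForMutBorrow,
}

/// Overly cautious rule: a mutable borrow through a deref counts as a
/// write to the local.
fn is_write_cautious(u: Use) -> bool {
    matches!(u, Use::Store | Use::DerefForMutBorrow)
}

/// Refined rule: only direct assignments count as writes.
fn is_write_refined(u: Use) -> bool {
    matches!(u, Use::Store)
}

/// In this toy model a local qualifies for copy propagation only if it
/// is written exactly once.
fn assigned_once(uses: &[Use], is_write: fn(Use) -> bool) -> bool {
    uses.iter().filter(|&&u| is_write(u)).count() == 1
}
```

In the `make_ascii_lowercase` MIR above, `_3` is assigned once (`_3 = _1`) and then used in `_5 = &raw mut (*_3)`. With the cautious rule the model counts two writes and rejects `_3`; with the refined rule it counts one and accepts it.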

saethlin · 2023-02-17T17:48:34Z

That makes sense. I've always thought the PlaceContext didn't categorize things correctly for MIR opts.

cjgillot · 2023-02-17T18:42:34Z

> I cannot find any MIR inlining differences in cranelift-codegen. Though several reports of inlining now report that they are being inlined at different scopes, so something different has happened to the caller context in a few places. I know that the MIR inlining in the standard library has changed, so I suspect the regression is related to that.
>
> cachegrind diffs for the check regressions point at a smattering of functions. I've looked into types_may_unify and hash_stable for Span. It looks like the code in types_may_unify is rearranged, but I have no idea why it is slower. I can't find any difference in hash_stable for Span. In both cases I cannot use any traditional profiling tools, so I propose we just accept those regressions because on balance this PR is an improvement.

I agree with merging this PR as is. In addition, we have a gain of up to 3% in binary size and metadata size. I'll propose some improvements to CopyProp in a separate PR.

@bors r+

bors · 2023-02-17T18:42:35Z

📌 Commit 1409cb5 has been approved by cjgillot

It is now in the queue for this repository.

bors · 2023-02-17T20:50:15Z

⌛ Testing commit 1409cb5 with merge 231bcd1...

bors · 2023-02-18T00:20:43Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 231bcd1 to master...

rust-timer · 2023-02-18T01:26:49Z

Finished benchmarking commit (231bcd1): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.6%	[0.2%, 3.1%]	16
Regressions ❌ (secondary)	0.4%	[0.2%, 0.5%]	11
Improvements ✅ (primary)	-0.7%	[-2.3%, -0.3%]	32
Improvements ✅ (secondary)	-1.0%	[-1.7%, -0.3%]	17
All ❌✅ (primary)	-0.3%	[-2.3%, 3.1%]	48

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.9%	[0.1%, 5.5%]	6
Regressions ❌ (secondary)	4.2%	[4.2%, 4.2%]	1
Improvements ✅ (primary)	-3.1%	[-8.2%, -1.5%]	6
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.1%	[-8.2%, 5.5%]	12

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.0%	[1.6%, 2.4%]	2
Regressions ❌ (secondary)	5.9%	[5.6%, 6.6%]	4
Improvements ✅ (primary)	-1.6%	[-2.2%, -1.0%]	2
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	0.2%	[-2.2%, 2.4%]	4

Do not consider `&mut *x` as mutating `x` in `CopyProp`

This PR removes an unfortunate, overly cautious case from the current implementation. Found by rust-lang#105274. cc `@saethlin`

rustbot assigned fee1-dead Dec 4, 2022

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 4, 2022

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 4, 2022

rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Dec 5, 2022

saethlin marked this pull request as ready for review December 5, 2022 17:59

rustbot assigned wesleywiser and unassigned fee1-dead Dec 8, 2022

rustbot added the S-blocked Status: Blocked on something else such as an RFC or other implementation work. label Jan 7, 2023

saethlin force-pushed the instcombine-mut-ref branch from d966a3b to f946ab9 Compare January 16, 2023 18:05

rustbot assigned oli-obk Jan 16, 2023

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 16, 2023

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 17, 2023

cjgillot added the perf-regression-triaged The performance regression has been triaged. label Feb 17, 2023

cjgillot mentioned this pull request Feb 17, 2023

Do not consider &mut *x as mutating x in CopyProp #108178

Merged

bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 18, 2023

bors merged commit 231bcd1 into rust-lang:master Feb 18, 2023

rustbot added this to the 1.69.0 milestone Feb 18, 2023

saethlin deleted the instcombine-mut-ref branch March 15, 2023 00:33
