Skip to content

Conversation

@ashivaram23
Copy link
Contributor

@ashivaram23 ashivaram23 commented Jun 19, 2025

This allows ElaborateDrops to remove drops when a match wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the MaybeUninitializedPlaces and MaybeInitializedPlaces data for a block reached through an otherwise edge.

Fixes #142705.

@rustbot
Copy link
Collaborator

rustbot commented Jun 19, 2025

r? @petrochenkov

rustbot has assigned @petrochenkov.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 19, 2025
@rust-log-analyzer

This comment has been minimized.

@petrochenkov
Copy link
Contributor

r? compiler

@rustbot rustbot assigned fee1-dead and unassigned petrochenkov Jun 19, 2025
@workingjubilee
Copy link
Member

hm.
@bors2 try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors
Copy link

rust-bors bot commented Jun 19, 2025

⌛ Trying commit 5837955 with merge e3d7e41

To cancel the try build, run the command @bors2 try cancel.

rust-bors bot added a commit that referenced this pull request Jun 19, 2025
Update `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` for `otherwise`

<!-- homu-ignore:start -->
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.

This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using

    r? <reviewer name>
-->
<!-- homu-ignore:end -->

This allows `ElaborateDrops` to remove drops when a `match` wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` data for a block reached through an `otherwise` edge.

This appears to fix #142705, but I don't know for sure if it's actually correct (are there cases where this would be wrong and break things?). I also haven't tested that it actually improves compile times, or machine code output outside of the examples in the issue.
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 19, 2025
@rust-bors
Copy link

rust-bors bot commented Jun 19, 2025

☀️ Try build successful (CI)
Build commit: e3d7e41 (e3d7e41d7e8c513f55398f6e34eb2af7fb7df49f, parent: 8de4c7234dd9b97c9d76b58671343fdbbc9a433e)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e3d7e41): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
1.1% [0.6%, 1.5%] 2
Regressions ❌
(secondary)
0.6% [0.4%, 0.8%] 4
Improvements ✅
(primary)
-0.5% [-1.2%, -0.2%] 32
Improvements ✅
(secondary)
-0.7% [-1.3%, -0.2%] 19
All ❌✅ (primary) -0.4% [-1.2%, 1.5%] 34

Max RSS (memory usage)

Results (primary -0.2%, secondary 2.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
3.5% [2.3%, 4.7%] 3
Regressions ❌
(secondary)
2.8% [2.2%, 3.5%] 2
Improvements ✅
(primary)
-2.4% [-3.6%, -0.5%] 5
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.2% [-3.6%, 4.7%] 8

Cycles

Results (primary 0.9%, secondary -5.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.7% [2.6%, 2.9%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-1.0% [-1.2%, -0.8%] 2
Improvements ✅
(secondary)
-5.0% [-5.0%, -5.0%] 1
All ❌✅ (primary) 0.9% [-1.2%, 2.9%] 4

Binary size

Results (primary -0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.1% [0.0%, 0.2%] 7
Regressions ❌
(secondary)
0.0% [0.0%, 0.1%] 7
Improvements ✅
(primary)
-0.1% [-0.5%, -0.0%] 64
Improvements ✅
(secondary)
-0.1% [-0.1%, -0.0%] 50
All ❌✅ (primary) -0.1% [-0.5%, 0.2%] 71

Bootstrap: 691.904s -> 692.572s (0.10%)
Artifact size: 372.02 MiB -> 371.98 MiB (-0.01%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jun 20, 2025
@ashivaram23
Copy link
Contributor Author

Hmm that's unfortunate. It might help to avoid paying the cost of the extra clone and checks when it won't make a difference, like in the trivial otherwise targets when there's no wildcard. Maybe the otherwise could be handled in a separate pass after collecting variants to kill/add in a SmallVec, in case that helps somehow.

What's more concerning is runtime benchmarks being worse. There are only two, and I don't know how noisy they are, but 8% wall time increase for css-parse-fb is pretty bad. There are also some compile time benchmarks whose graphs show significant increases in wall time spent in LLVM. Does this just not play well with later optimizations?

@rustbot
Copy link
Collaborator

rustbot commented Jun 21, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

@ashivaram23
Copy link
Contributor Author

The last commit makes it so that there's an option in the MaybeUninitializedPlaces and MaybeInitializedPlaces builders for whether or not to update the otherwise block's lattice element, and it's set to only do so for MaybeInitializedPlaces in RemoveUninitDrops and ElaborateDrops. That should be enough to clean up some drops while hopefully being faster and not messing with other passes in ways that cause runtime slowdowns.

@workingjubilee
Copy link
Member

the runtime benchmarks are unfortunately very noisy

enum_place: mir::Place<'tcx>,
active_variant: VariantIdx,
mut handle_inactive_variant: impl FnMut(MovePathIndex),
mut handle_active_variant: Option<impl FnMut(MovePathIndex)>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it will be better, perf-wise to make this not take an option but just use a no-op for callers that do not want to handle it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's definitely possible, since I think in that case on_all_children_bits could be monomorphized into an empty function. I made the change and it does reduce stage2 librustc_driver.so size a bit.

@fee1-dead
Copy link
Member

this part of MIR is a little above my level of confidence for review, so

r? compiler

@rustbot rustbot assigned lcnr and unassigned fee1-dead Jun 22, 2025
@lcnr
Copy link
Contributor

lcnr commented Jun 23, 2025

r? wg-mir-opt

@rustbot rustbot assigned dianqk and unassigned lcnr Jun 23, 2025
@rust-timer

This comment has been minimized.

@rust-bors
Copy link

rust-bors bot commented Jul 8, 2025

⌛ Trying commit 881c62f with merge 237f435

To cancel the try build, run the command @bors2 try cancel.

rust-bors bot added a commit that referenced this pull request Jul 8, 2025
Apply effects to `otherwise` edge in dataflow analysis

<!-- homu-ignore:start -->
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.

This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using

    r? <reviewer name>
-->
<!-- homu-ignore:end -->

This allows `ElaborateDrops` to remove drops when a `match` wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` data for a block reached through an `otherwise` edge.

This appears to fix #142705, but I don't know for sure if it's actually correct (are there cases where this would be wrong and break things?). I also haven't tested that it actually improves compile times, or machine code output outside of the examples in the issue.
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 8, 2025
@rust-bors
Copy link

rust-bors bot commented Jul 8, 2025

☀️ Try build successful (CI)
Build commit: 237f435 (237f435668e899ed46d943ff30ee6f097b6b03fb, parent: 45b80ac21a454d343833aad763ef604510c88375)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (237f435): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
0.7% [0.7%, 0.7%] 1
Regressions ❌
(secondary)
0.6% [0.2%, 0.9%] 13
Improvements ✅
(primary)
-0.4% [-1.1%, -0.1%] 15
Improvements ✅
(secondary)
-0.3% [-0.5%, -0.1%] 7
All ❌✅ (primary) -0.3% [-1.1%, 0.7%] 16

Max RSS (memory usage)

Results (primary 2.7%, secondary 3.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
4.8% [3.5%, 6.2%] 2
Regressions ❌
(secondary)
3.2% [3.2%, 3.2%] 1
Improvements ✅
(primary)
-1.7% [-1.7%, -1.7%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.7% [-1.7%, 6.2%] 3

Cycles

Results (primary 1.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.4% [2.4%, 2.5%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.0% [-2.0%, -2.0%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.0% [-2.0%, 2.5%] 3

Binary size

Results (primary -0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.3% [0.0%, 1.4%] 10
Regressions ❌
(secondary)
0.0% [0.0%, 0.1%] 4
Improvements ✅
(primary)
-0.1% [-1.5%, -0.0%] 63
Improvements ✅
(secondary)
-0.1% [-0.2%, -0.0%] 51
All ❌✅ (primary) -0.1% [-1.5%, 1.4%] 73

Bootstrap: 467.247s -> 463.296s (-0.85%)
Artifact size: 372.36 MiB -> 372.30 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 8, 2025
@ashivaram23
Copy link
Contributor Author

Squashed and force pushed, the only changes since the last commit are addressing the test file review comments.

Regarding the perf run results, there are 8 cases this could affect:

  1. MaybeInitializedPlaces in MIR type check liveness analysis
  2. MaybeUninitializedPlaces in MIR borrow check
  3. MaybeInitializedPlaces in SanityCheck
  4. MaybeUninitializedPlaces in SanityCheck
  5. MaybeInitializedPlaces in lint_tail_expr_drop_order
  6. MaybeInitializedPlaces in RemoveUninitDrops
  7. MaybeInitializedPlaces in ElaborateDrops
  8. MaybeUninitializedPlaces in ElaborateDrops

The first perf run handled otherwise edges everywhere, the second did it only for 6 and 7, and the third did it for 6-8.

The first has the strongest results, with a lot of improvements and a few regressions. (Also a runtime test regression, which seems to follow a step function where it fluctuates between either exactly 0 or exactly 1.9% relative to the baseline over the past 30 days and went back to 1.9% here, not sure what that means)

The second still shows an overall improvement in compile times but applies to much fewer tests. It also has a different runtime regression with the same binary/step pattern.

The third has more regressions and fewer improvements, and an overall regression in compile times for all benchmarks (mostly due to the secondary benchmarks especially tt-muncher, it's a small overall improvement for the primary ones only). But there are no runtime test regressions and bootstrap times are four seconds faster.

Should I just enable the change for all 8 cases again? Or make it as limited as possible (number 7 alone is enough to fix the wildcard problem), or try other combinations?

@dianqk
Copy link
Member

dianqk commented Jul 9, 2025

These can be subsequent PRs. Thanks!
@bors r+

@bors
Copy link
Collaborator

bors commented Jul 9, 2025

📌 Commit c7ef03a has been approved by dianqk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 9, 2025
@bors
Copy link
Collaborator

bors commented Jul 9, 2025

⌛ Testing commit c7ef03a with merge d350797...

@bors
Copy link
Collaborator

bors commented Jul 9, 2025

☀️ Test successful - checks-actions
Approved by: dianqk
Pushing d350797 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 9, 2025
@bors bors merged commit d350797 into rust-lang:master Jul 9, 2025
12 checks passed
@rustbot rustbot added this to the 1.90.0 milestone Jul 9, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Jul 9, 2025

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 34097a3 (parent) -> d350797 (this PR)

Test differences

Show 12 test diffs

Stage 1

  • [mir-opt] tests/mir-opt/otherwise_drops.rs: [missing] -> pass (J1)

Stage 2

  • [mir-opt] tests/mir-opt/otherwise_drops.rs: [missing] -> pass (J0)

Additionally, 10 doctest diffs were found. These are ignored, as they are noisy.

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard d350797b7e1a5b6952f5035cbad2a21613b32f6c --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. x86_64-apple-2: 4201.0s -> 5646.5s (34.4%)
  2. aarch64-apple: 5870.2s -> 3930.0s (-33.1%)
  3. x86_64-apple-1: 6297.3s -> 7782.0s (23.6%)
  4. pr-check-2: 2186.0s -> 2662.0s (21.8%)
  5. dist-apple-various: 5540.8s -> 6233.3s (12.5%)
  6. dist-x86_64-apple: 10344.9s -> 9286.7s (-10.2%)
  7. i686-gnu-2: 5414.8s -> 5928.7s (9.5%)
  8. pr-check-1: 1492.8s -> 1621.1s (8.6%)
  9. i686-gnu-1: 7288.5s -> 7894.3s (8.3%)
  10. x86_64-gnu-debug: 5423.9s -> 5855.6s (8.0%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (d350797): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
0.5% [0.4%, 0.7%] 2
Regressions ❌
(secondary)
0.6% [0.1%, 1.0%] 15
Improvements ✅
(primary)
-0.4% [-1.3%, -0.1%] 18
Improvements ✅
(secondary)
-0.3% [-0.5%, -0.0%] 10
All ❌✅ (primary) -0.3% [-1.3%, 0.7%] 20

Max RSS (memory usage)

Results (primary 2.7%, secondary 0.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
3.7% [1.9%, 5.0%] 5
Regressions ❌
(secondary)
2.7% [2.4%, 2.9%] 2
Improvements ✅
(primary)
-2.2% [-2.2%, -2.2%] 1
Improvements ✅
(secondary)
-3.7% [-3.7%, -3.7%] 1
All ❌✅ (primary) 2.7% [-2.2%, 5.0%] 6

Cycles

Results (secondary -0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.2% [2.2%, 2.2%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-3.4% [-3.4%, -3.4%] 1
All ❌✅ (primary) - - 0

Binary size

Results (primary -0.1%, secondary -0.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.3% [0.0%, 1.4%] 10
Regressions ❌
(secondary)
0.0% [0.0%, 0.1%] 4
Improvements ✅
(primary)
-0.1% [-1.5%, -0.0%] 70
Improvements ✅
(secondary)
-0.1% [-0.2%, -0.0%] 52
All ❌✅ (primary) -0.1% [-1.5%, 1.4%] 80

Bootstrap: 467.127s -> 465.667s (-0.31%)
Artifact size: 372.35 MiB -> 372.26 MiB (-0.02%)

@Kobzol
Copy link
Member

Kobzol commented Jul 9, 2025

More improvements than regressions, and improvements are mostly in primary benchmarks, while regressions are in secondary benchmarks. This comment mentions a plan for possible future improvements.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jul 9, 2025
rust-bors bot added a commit that referenced this pull request Jul 12, 2025
Collect `SwitchInt` target `VariantIdx`s while building `MaybePlacesSwitchIntData`

- Filter discriminants involved in a `SwitchInt` and map their values to `VariantIdx` while building `MaybePlacesSwitchIntData` instead of in `apply_effects_in_block` using `next_discr`. This is easier after #143769.
- Use that `Vec` when handling the `otherwise` target instead of making a new `SmallVec`.

This could make `MaybeInitializedPlaces`/`MaybeUninitializedPlaces` analysis slightly more efficient, particularly after #142707.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Drop is called for match wildcards even when all matched variants have no Drop