Skip to content

Introduce let...else #87688

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Sep 1, 2021
Merged

Introduce let...else #87688

merged 13 commits into from
Sep 1, 2021

Conversation

camsteffen
Copy link
Contributor

@camsteffen camsteffen commented Aug 2, 2021

Tracking issue: #87335

The trickiest part for me was enforcing the diverging else block with clear diagnostics. Perhaps the obvious solution is to expand to let _: ! = .., but I decided against this because, when a "mismatched type" error is found in typeck, there is no way to trace where in the HIR the expected type originated, AFAICT. In order to pass down this information, I believe we should introduce Expectation::LetElseNever(HirId) or maybe add HirId to Expectation::HasType, but I left that as a future enhancement. For now, I simply assert that the block is ! with a custom ObligationCauseCode, and I think this is clear enough, at least to start. The downside here is that the error points at the entire block rather than the specific expression with the wrong type. I left a todo to this effect.

Overall, I believe this PR is feature-complete with regard to the RFC.

@rust-highfive
Copy link
Contributor

Some changes occurred in src/tools/clippy.

cc @rust-lang/clippy

Some changes occurred in src/tools/rustfmt.

cc @calebcartwright

@rust-highfive
Copy link
Contributor

r? @davidtwco

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 2, 2021
@rust-log-analyzer

This comment has been minimized.

@cjgillot cjgillot self-assigned this Aug 2, 2021
@bors
Copy link
Collaborator

bors commented Aug 2, 2021

☔ The latest upstream changes (presumably #87698) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Contributor

@cjgillot cjgillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @camsteffen for this implementation! I left a few comments on the lowering part: I am not convinced that you chose the simplest way to express this in HIR. I am interested in your reasonning.

@cjgillot
Copy link
Contributor

cjgillot commented Aug 3, 2021

Also, how does this feature articulate with the refactoring in #80357 ?

@camsteffen
Copy link
Contributor Author

Also, how does this feature articulate with the refactoring in #80357 ?

It don't really see any conflict, surprisingly, because that PR touches every let-containing syntax except for let statements. @c410-f3r do you agree with that statement?

@camsteffen
Copy link
Contributor Author

camsteffen commented Aug 4, 2021

Rebased and...

  • Changed the lowered HIR structure to let val: $ty = $init; match val { $pat => { $trailing_stmts }, _ => $else_blk }
  • Added tests for type annotation and lint attributes
  • Corrected the rustfmt fix

@c410-f3r
Copy link
Contributor

c410-f3r commented Aug 4, 2021

Also, how does this feature articulate with the refactoring in #80357 ?

It don't really see any conflict, surprisingly, because that PR touches every let-containing syntax except for let statements. @c410-f3r do you agree with that statement?

#80357 touches more than +2k lines of code and I have doubts about no potential conflicts. My free-time is mostly limited to weekends so a confirmation, if it is worth something, will have to wait.

It's been 8 months of endless rebases and ppl are waiting years for #53667. Personally, one more possible conflicting PR or one less possible conflicting PR doesn't make a difference any more

@rust-log-analyzer

This comment has been minimized.

@camsteffen
Copy link
Contributor Author

#80357 touches more than +2k lines of code and I have doubts about no potential conflicts. My free-time is mostly limited to weekends so a confirmation, if it is worth something, will have to wait.

No rush. I'm not wondering if there are any potential merge conflicts. Just wondering if there is a fundamental conflict with how the two PR's change the AST/HIR. And I don't think there is from my own cursory scan.

@c410-f3r
Copy link
Contributor

c410-f3r commented Aug 8, 2021

After cherry-picking, it looked like that most conflicts were more related to missing MatchSource variants than logical stuff although I am not completely sure.

But as stated before with all personal intrinsic reasons, don't mind #80357.

@davidtwco davidtwco removed their assignment Aug 12, 2021
@camsteffen
Copy link
Contributor Author

@cjgillot Do you think this is ready to merge after addressing the comments (minor changes) or are you still reviewing?

@cjgillot
Copy link
Contributor

@camsteffen This will be ready to merge once the comments have been addressed.

@bors
Copy link
Collaborator

bors commented Aug 16, 2021

☔ The latest upstream changes (presumably #80357) made this pull request unmergeable. Please resolve the merge conflicts.

@camsteffen
Copy link
Contributor Author

Is now based on #88187.

After rebasing over #80357, it seemed right to change the desugared form (again) to use the new HIR form of if let. So now the desugared form is let value: type = init; if let pat = value { trailing_stmts } else { diverging_block }.

@cjgillot
Copy link
Contributor

@bors r+ rollup=never

@bors
Copy link
Collaborator

bors commented Aug 31, 2021

📌 Commit 3ff1d6b has been approved by cjgillot

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 31, 2021
@bors
Copy link
Collaborator

bors commented Sep 1, 2021

⌛ Testing commit 3ff1d6b with merge c2a4088...

@bors
Copy link
Collaborator

bors commented Sep 1, 2021

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing c2a4088 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 1, 2021
@bors bors merged commit c2a4088 into rust-lang:master Sep 1, 2021
@rustbot rustbot added this to the 1.56.0 milestone Sep 1, 2021
@camsteffen camsteffen deleted the let-else branch September 1, 2021 12:50
@pnkfelix
Copy link
Member

pnkfelix commented Sep 8, 2021

During weekly performance triage, this was flagged as causing small regressions (approximately 1%) in instruction counts on several of the *-doc tests, found via human eye.

Might be worth double-checking to see if that slowdown was expected.

@pnkfelix pnkfelix added the perf-regression Performance regression. label Sep 8, 2021
@Mark-Simulacrum
Copy link
Member

I've so far failed to arrive at any conclusive evidence with this change -- going to keep investigating, though.

The regression seems real -- cachegrind etc do reproduce it -- but it traces to functions like decode read_str and such, which seems rather unhelpful. I'm working now on building a debuginfo-enabled compiler for both before/after to try and get a better sense of the causes here.

@Mark-Simulacrum
Copy link
Member

This is beginning to look fairly similar to #88881. There's around 20-30k additional instructions executed (for await-call-tree-doc), mostly in Span decoding -- but Span decoding is a very complex piece of code.

I've tried a number of things to try and figure out what the underlying cause for this is, but my current belief is that we're just seeing the effects of different optimizer decisions -- no actual difference in behavior -- but it is both unclear why that is or what's causing it. I've looked at the disassembly before/after and LLVM IR (manually dumped out of rustc with bootstrap adjustments locally), but none of that really yielded any clear indicators of what we can do to fix this.

I'll post further updates as I continue my investigation...

@Mark-Simulacrum
Copy link
Member

It looks like the Span decoding function has worse codegen with this PR. As best as I can tell, this isn't readily "fixable" and seems not directly connected to this PR, but adjustments to the LocalKind Encode impl do seem to fix it (I can't explain why changes to encoding affect decoding). I can reproduce the regression locally with and without PGO, so it doesn't seem related to that either.

AFAICT, there are some pretty weird choices made by LLVM in the assembly for this function -- I've uploaded disassembly here. For example, we can see this diff near the top of the function:

-                       mov     rsi, qword ptr [r13 + 8]
-                       mov     rdi, qword ptr [r13 + 16]
-                       cmp     rdi, rsi
+                       mov     r15, qword ptr [rbp + 8]
+                       mov     rdi, qword ptr [rbp + 16]
+                       mov     rsi, rdi
+                       sub     rsi, r15

If we ignore the register renaming, the meaningful difference is the cmp is replaced with a mov and sub. However, there's no real reason for LLVM to do that. rsi is not used again before being overwritten I think, so this is just wasting a mov for no real reason. It seems like the majority of the changes to this function are similar "weird" small losses, not anything terribly material. There's also some extra stack spilling (note that the stack size increases by 32 bytes).

Replacing the LocalKind Encode impl with empty stubs or otherwise adjusting it seems to fix the codegen here. The LocalKind Encode impl is not invoked by any typical compilation (tested via inserting a panic!). I'm fairly certain now that the indirect effects on Span decoding are due to shuffling of CGU partitions for the rustc_metadata crate

I instrumented the CGU partitioning (-Zprint-mono-items=lazy) on the minimal change which replaces the Encode impl for LocalKind with an empty stub, and locally it looks like the only two crate which have changed are rustc_interface and rustc_metadata.

For rustc_metadata, if we ignore changes in which CGU gets which mono item, the only additions are the extra emit_enum/emit_enum_variant, etc. along with the closures for those:

@@ -3895,0 +3896 @@
+MONO_ITEM fn <rmeta::encoder::EncodeContext as rustc_serialize::Encoder>::emit_enum::<[closure@rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for >
@@ -4311,0 +4313,3 @@
+MONO_ITEM fn <rmeta::encoder::EncodeContext as rustc_serialize::Encoder>::emit_enum_variant::<[closure@rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeConte>
+MONO_ITEM fn <rmeta::encoder::EncodeContext as rustc_serialize::Encoder>::emit_enum_variant::<[closure@rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeConte>
+MONO_ITEM fn <rmeta::encoder::EncodeContext as rustc_serialize::Encoder>::emit_enum_variant::<[closure@rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeConte>
@@ -5977,0 +5982,3 @@
+MONO_ITEM fn <rmeta::encoder::EncodeContext as rustc_serialize::Encoder>::emit_enum_variant_arg::<[closure@rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeC>
+MONO_ITEM fn <rmeta::encoder::EncodeContext as rustc_serialize::Encoder>::emit_enum_variant_arg::<[closure@rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeC>
+MONO_ITEM fn <rmeta::encoder::EncodeContext as rustc_serialize::Encoder>::emit_enum_variant_arg::<[closure@rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeC>
@@ -7817 +7823,0 @@
-MONO_ITEM fn <rustc_ast::LocalKind as rustc_serialize::Encodable<rmeta::encoder::EncodeContext>>::encode
@@ -19645,0 +19652,8 @@
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode::{closure#0}
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode::{closure#0}::{closure#0}
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode::{closure#0}::{closure#1}
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode::{closure#0}::{closure#1}::{closure#0}
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode::{closure#0}::{closure#2}
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode::{closure#0}::{closure#2}::{closure#0}
+MONO_ITEM fn rustc_ast::ast::_DERIVE_rustc_serialize_Encodable_E_FOR_LocalKind::<impl rustc_serialize::Encodable<rmeta::encoder::EncodeContext> for rustc_ast::LocalKind>::encode::{closure#0}::{closure#2}::{closure#1}

This is pretty expected, since all of these are getting instantiated with rmeta::encoder::EncodeContext for the first time (it's defined in rustc_metadata). This all mostly comes about from a pretty deep callstack, rooted in attribute encoding (since those contain arbitrary expressions and paths, we get essentially all AST types serialize and deserialize impl's codegened).

The partitioning has changed for CGUs 0, 3, and 4.

   8622 after-with-empty-encode/rustc-metadata/rustc_metadata.f82b8cc0-cgu.0
   2824 after-with-empty-encode/rustc-metadata/rustc_metadata.f82b8cc0-cgu.3
   6021 after-with-empty-encode/rustc-metadata/rustc_metadata.f82b8cc0-cgu.4
   8634 after-mono/rustc-metadata/rustc_metadata.f82b8cc0-cgu.0
   4466 after-mono/rustc-metadata/rustc_metadata.f82b8cc0-cgu.3
   4377 after-mono/rustc-metadata/rustc_metadata.f82b8cc0-cgu.4

Generally it seems like CGU 3 is now much larger -- presumably around 1,000 items moved from being put into CGU 4 into CGU 3.

This included a move of fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_enum_variant_arg::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> st... from CGU 4 to 3. I believe (though have no hard evidence for it) that this likely caused transitive effects making the codegen worse. Notably other bits of that impl didn't move, so probably inlining is suffering here. (The below is a grep for rustc_span::Span as rustc_serialize::Decodable<.*>>::decode in rustc_metadata CGUs).

--- empty-encode-span	2021-10-11 14:42:56.626182306 -0400
+++ full-encode-span	2021-10-11 14:42:36.389748828 -0400
@@ -2,9 +2,9 @@
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.15:fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_enum_variant_arg::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}> Internal
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.15:fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_struct_field::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}> Internal
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.3:fn <for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode} as std::ops::FnOnce<(&mut rmeta::decoder::DecodeContext,)>>::call_once - shim(for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}) Internal
+rustc-metadata/rustc_metadata.f82b8cc0-cgu.3:fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_enum_variant_arg::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}> Internal
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.3:fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_struct_field::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}> Internal
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.4:fn <for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode} as std::ops::FnOnce<(&mut rmeta::decoder::DecodeContext,)>>::call_once - shim(for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}) Internal
-rustc-metadata/rustc_metadata.f82b8cc0-cgu.4:fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_enum_variant_arg::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}> Internal
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.4:fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_struct_field::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}> Internal
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.7:fn <for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode} as std::ops::FnOnce<(&mut rmeta::decoder::DecodeContext,)>>::call_once - shim(for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}) Internal
 rustc-metadata/rustc_metadata.f82b8cc0-cgu.7:fn <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::read_struct_field::<rustc_span::Span, for<'r> fn(&'r mut rmeta::decoder::DecodeContext) -> std::result::Result<rustc_span::Span, <rmeta::decoder::DecodeContext as rustc_serialize::Decoder>::Error> {<rustc_span::Span as rustc_serialize::Decodable<rmeta::decoder::DecodeContext>>::decode}> Internal

Overall I think this means we probably can't really do much about the hit here -- it's just more CGU partitioning sometimes leading to suboptimal results. It's possible that adding some targeted #[inline] annotations or otherwise shifting around code might help with this, but generally that seems brittle and not particularly warranted. I'm going to mark this regression as triaged, because I don't think we can "fix" it at this time. Hopefully we'll eventually have a nicer partitioning algorithm that avoids this kind of trouble.

In the meantime, it's possible we can help by avoiding so much transitive codegen being necessary in rustc_metadata for all the serialize impls for rustc_ast and such. But that's a hard project and not one with obvious solutions, so I think we shouldn't block triaging this PR on it.

@rustbot label +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Oct 11, 2021
@cormacrelf cormacrelf mentioned this pull request Feb 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.