Notes about MIR generation for match expressions

I wrote these up while working on https://github.com/rust-lang/rust/issues/72680

I'm not an expert on this code (this is the fist time I'm reading it), so take this with a grain of salt.

# Background

First, my assumption in the last post is incorrect. or-patterns aren't desugared into different patterns. IIUC, this is to avoid exponential blowup both in the size of the MIR and of the generated code, as well as the computational time to check compile. Here is a quick overview of the code as it stands today (to the best of my understanding).

- All of the relevant code is in `rustc_mir_build::build::matches`. The comments throughout this module are pretty helpful, but kind of high-level.

- We have a `match` expression, which consists of a `place` to match on (the "scrutinee") and collection of `Arm`s. Each arm has a pattern and possible a guard expression, and each pattern may be composed of arbitrarily nested subpatterns. Given this, we want to generate the MIR that checks all of the relevant patterns to find the first matching arm.

- The main entry point is `match_expr` https://github.com/rust-lang/rust/blob/0256d065d4901b63def6c04013da4f781d0752bb/compiler/rustc_mir_build/src/build/matches/mod.rs#L87 The comments on this function are helpful. They talk about the overall algorithm.

- The heart of the algorithm is the `match_candidates` method, which also has great comments. https://github.com/rust-lang/rust/blob/0256d065d4901b63def6c04013da4f781d0752bb/compiler/rustc_mir_build/src/build/matches/mod.rs#L890 This is where the MIR is actually built for the matching itself.

- `match_candidates` takes a list of `Candidate`s and a few references to `BasicBlock`s. The `BasicBlock`s are where we branch to in different cases, and we will be generating more blocks as we match on the scrutinee. We will recursively call `match_candidates` as we evaluate the candidates.

- The argument (of `match_candidates`) `candidates: &mut [&mut Candidate]` is roughly equivalent to an arm (or part of an arm). Specifically: https://github.com/rust-lang/rust/blob/0256d065d4901b63def6c04013da4f781d0752bb/compiler/rustc_mir_build/src/build/matches/mod.rs#L687-L712 A particular `Candidate` is satisfied if all of its `match_pairs` are satisfied. Each `MatchPair` in `match_pairs` is a single pattern match against a single place. We would extend the `match_pairs` of the `Candidate` recursively to get nested matches. So for example if we had an arm like `Foo { x: Some(0), y: false }`, we would first have a `MatchPair` matching the scrutinee place with `Foo`; then, we would add a couple more to match the places `scrutinee.x` and `scrutinee.y` with `Some` and `false`, respectively; then, we would add one more to match place `scrutinee.x.0` to `0`. So by the end of this process, all of these `MatchPair`s need to match for the `Candidate` to match.

- Notably, throughout the algorithm, there are recursive calls to `match_candidates` with subsets of `candidates`. This simplifies the logic significantly, but it means that `candidates` may be non-exhaustive (even though Rust `match`es are exhaustive).

- `match_candidates` does a bunch of stuff, but most relevant for this issue is that it calls `test_candidates_with_or`. In the terminology of the code, a "test" is a check of whether a place matches part of a pattern (e.g. `scrutinee.x` matches `Some`). (Note: we are not actually executing this test right now; it's still compile time! Rather we are generating the code that does the test into the user's program. For some reason, this mindset was hard for me to stay in). `test_candidates_with_or` generates the MIR for the tests for each candidate in `candidates` by recursively calling `match_candidates`. Specifically, we pop the first candidate from `candidates` and generate it's MIR; then, we recursively `match_candidates` on the tail.

- Or-patterns at the top-level of a pattern are handled specially in `test_candidates_with_or`. If the first pattern in `candidates` is an or-pattern, we call `test_or_pattern` https://github.com/rust-lang/rust/blob/0256d065d4901b63def6c04013da4f781d0752bb/compiler/rustc_mir_build/src/build/matches/mod.rs#L1202 This method splits the candidate into two or more `Candidate`s temporarily and calls `match_candidates` on this small list of temporary candidates. Afterwards, if the candidates are all simple, we can generate some glue MIR that OR's them all together; otherwise, we just leave them as a collection of different candidates and we just have a larger MIR.

- We continue recursively calling the above chain of methods until all or-patterns have been removed from the top level patterns. So we are left with only non-or-patterns at the top level.

- When we have only non-or-patterns, `test_candidates_with_or` calls `test_candidates` https://github.com/rust-lang/rust/blob/0256d065d4901b63def6c04013da4f781d0752bb/compiler/rustc_mir_build/src/build/matches/mod.rs#L1377 `test_candidates` has a lot of vary useful comments explaining what's going on. Basically, we only try to process one candidate, but in processing it, we may be able to share work for some of the other candidates. Processing a candidate requires figuring out what kind of test to apply for each `MatchPair` in a candidate. For example, if we are matching against a variant, we do a `Switch` test; if we are matching against an integer, we do a `SwitchInt` test, etc...

- During `test_candidates`, we may add more `MatchPair`s to a candidate to test for nested subpatterns (more in the next few bullets). At the end of `test_candidates`, we recursively call `match_candidates` on the remaining candidates, which ensures that we process all subpatterns.

- `test_candidates` calls `test::test` to generate an actual test for a `MatchPair`. https://github.com/rust-lang/rust/blob/0256d065d4901b63def6c04013da4f781d0752bb/compiler/rustc_mir_build/src/build/matches/test.rs#L27 This function assumes that it will not have to generate a test for an or-pattern (they should all have been dealt with by `test_or_pattern`), and this is where the ICE is (more later).

- `test_candidates` then tries to share the test with all candidates it can. For example, if more than one candidate need a test for `scrutinee.x` matches `Some`, then both candidates can branch from the basic block where that test is done, effectively combining two (or more) basic blocks in an otherwise exponential set of basic blocks. To determine which candidates can share the test, `test_candidates` calls `test::sort_candidates` https://github.com/rust-lang/rust/blob/0256d065d4901b63def6c04013da4f781d0752bb/compiler/rustc_mir_build/src/build/matches/test.rs#L485 (which also has good comments).

- `sort_candidates` will find all candidates that can share the test and simplify them by adding `MatchPair`s to the relevant `Candidates` for the immediate child subpatterns of the top-level pattern (if there are subpatterns). These subpatterns are later evaluated whenever `match_candidates` is recursively called later (and they may expand to more subpatterns, etc) until we have processed all the `MatchPair`s in all the candidates and generated tests for them.



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Notes about MIR generation for match expressions #943

Background

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Notes about MIR generation for match expressions #943

Description

Background

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions