Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix, document, and test parser and pretty-printer edge cases related to braced macro calls #119427

Merged
merged 17 commits into from
May 12, 2024

Conversation

dtolnay
Copy link
Member

@dtolnay dtolnay commented Dec 30, 2023

Review note: this is a deceptively small PR because it comes with 145 lines of docs and 196 lines of tests, and only 25 lines of compiler code changed. However, I recommend reviewing it 1 commit at a time because much of the effect of the code changes is non-local i.e. affecting code that is not visible in the final state of the PR. I have paid attention that reviewing the PR one commit at a time is as easy as I can make it. All of the code you need to know about is touched in those commits, even if some of those changes disappear by the end of the stack.

This is a follow-up to #119105. One case that is not relevant to -Zunpretty=expanded, but which came up as I'm porting #119105 and #118726 into syn's printer and prettyplease's printer where it is relevant, and is also relevant to rustc's stringify!, is statement boundaries in the vicinity of braced macro calls.

Rustc's AST pretty-printer produces invalid syntax for statements that begin with a braced macro call:

macro_rules! stringify_item {
    ($i:item) => {
        stringify!($i)
    };
}

macro_rules! repro {
    ($e:expr) => {
        stringify_item!(fn main() { $e + 1; })
    };
}

fn main() {
    println!("{}", repro!(m! {}));
}

Before this PR: output is not valid Rust syntax.

fn main() { m! {} + 1; }
error: leading `+` is not supported
 --> <anon>:1:19
  |
1 | fn main() { m! {} + 1; }
  |                   ^ unexpected `+`
  |
help: try removing the `+`
  |
1 - fn main() { m! {} + 1; }
1 + fn main() { m! {}  1; }
  |

After this PR: valid syntax.

fn main() { (m! {}) + 1; }

@rustbot
Copy link
Collaborator

rustbot commented Dec 30, 2023

r? @WaffleLapkin

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 30, 2023
@dtolnay dtolnay added the A-pretty Area: Pretty printing (including `-Z unpretty`) label Dec 30, 2023
@rust-log-analyzer

This comment has been minimized.

@dtolnay dtolnay force-pushed the maccall branch 2 times, most recently from 584de16 to 48ac127 Compare December 30, 2023 03:38
@dtolnay dtolnay added the A-parser Area: The parsing of Rust source code to an AST label Dec 30, 2023
Copy link
Member

@WaffleLapkin WaffleLapkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with the idea, but I'm not sure about the implementation. Is there a way to make this nicer?

| TryBlock(..)
| ConstBlock(..) => false,

MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed, if all the call sites are not calling this if e ≈ MacCall?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the commit that adds this MacCall case inside expr_requires_semi_to_be_stmt, I change all the call sites to preserve their pre-existing behavior. So this case starts out as not reachable. But after that, one at a time for each call site, I update each call site that should be reaching this, together with adding documentation and tests for that call site.

tests/ui/macros/stringify.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/expr.rs Outdated Show resolved Hide resolved
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 2, 2024
@bors
Copy link
Contributor

bors commented Jan 19, 2024

☔ The latest upstream changes (presumably #120121) made this pull request unmergeable. Please resolve the merge conflicts.

@WaffleLapkin
Copy link
Member

r? compiler

@rustbot rustbot assigned b-naber and unassigned WaffleLapkin Jan 20, 2024
@fmease
Copy link
Member

fmease commented Mar 6, 2024

#120690
r? compiler

@rustbot rustbot assigned michaelwoerister and unassigned b-naber Mar 6, 2024
@dtolnay
Copy link
Member Author

dtolnay commented Mar 6, 2024

This is waiting on me to incorporate Waffle's insightful feedback, and to rebase.

jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 20, 2024
Give a name to each distinct manipulation of pretty-printer FixupContext

There are only 7 distinct ways that the AST pretty-printer interacts with FixupContext: 3 constructors (including Default), 2 transformations, and 2 queries.

This PR turns these into associated functions which can be documented with examples.

This PR unblocks rust-lang#119427 (comment). In order to improve the pretty-printer's behavior regarding parenthesization of braced macro calls in match arms, which have different grammar than macro calls in statements, FixupContext needs to be extended with 2 new fields. In the previous approach, that would be onerous. In the new approach, all it entails is 1 new constructor (`FixupContext::new_match_arm()`).
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 20, 2024
Give a name to each distinct manipulation of pretty-printer FixupContext

There are only 7 distinct ways that the AST pretty-printer interacts with FixupContext: 3 constructors (including Default), 2 transformations, and 2 queries.

This PR turns these into associated functions which can be documented with examples.

This PR unblocks rust-lang#119427 (comment). In order to improve the pretty-printer's behavior regarding parenthesization of braced macro calls in match arms, which have different grammar than macro calls in statements, FixupContext needs to be extended with 2 new fields. In the previous approach, that would be onerous. In the new approach, all it entails is 1 new constructor (`FixupContext::new_match_arm()`).
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2024
Rollup merge of rust-lang#124191 - dtolnay:fixup, r=compiler-errors

Give a name to each distinct manipulation of pretty-printer FixupContext

There are only 7 distinct ways that the AST pretty-printer interacts with FixupContext: 3 constructors (including Default), 2 transformations, and 2 queries.

This PR turns these into associated functions which can be documented with examples.

This PR unblocks rust-lang#119427 (comment). In order to improve the pretty-printer's behavior regarding parenthesization of braced macro calls in match arms, which have different grammar than macro calls in statements, FixupContext needs to be extended with 2 new fields. In the previous approach, that would be onerous. In the new approach, all it entails is 1 new constructor (`FixupContext::new_match_arm()`).
@dtolnay
Copy link
Member Author

dtolnay commented Apr 21, 2024

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 21, 2024
@michaelwoerister
Copy link
Member

Thanks for the PR, @dtolnay, looks great! However, after looking through all the commits, I think I don't understand enough about the parsing subtleties involved here to review this.

r? parser

@dtolnay
Copy link
Member Author

dtolnay commented May 11, 2024

compiler/rustc_ast_pretty/src/pprust/state.rs Outdated Show resolved Hide resolved
compiler/rustc_lint/src/unused.rs Outdated Show resolved Hide resolved
compiler/rustc_parse/src/parser/expr.rs Show resolved Hide resolved
self.restrictions.contains(Restrictions::STMT_EXPR)
&& !classify::expr_requires_semi_to_be_stmt(e)
&& !classify::expr_requires_comma_to_be_match_arm(e)
Copy link
Member

@compiler-errors compiler-errors May 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is maybe my only nit in the PR: that expr_is_complete uses expr_requires_comma_to_be_match_arm "feels" a bit less self-descriptive than it could be, since we use expr_is_complete for things like parse_expr_dot_or_call_with_ which doesn't really have to do with match arms.

Not really sure how to make this actionable though, since I'm not sure if a rename seems tough. AFAICT this just preserves the old behavior, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is preserving old behavior. The exact same behavior that used to be incorrectly called expr_requires_semi_to_be_stmt is now called expr_requires_comma_to_be_match_arm.

Regarding naming, one possibility is that instead of distinguishing a "commas in match arms" and "semicolons in statements" case (and the implicit 3rd case, expressions that are neither statement nor match arm, such as a function arg), we could rename things along the following lines:

  • "parse the longest possible expression"
  • "parse an expression using earlier boundary rule" (for a match arm, range, or whatever else looks at expr_is_complete)
  • "parse an expression until the earliest possible boundary" (for a statement)

Another possibility is to omit this specific commit ("Add classify::expr_requires_comma_to_be_match_arm") and keep using match e { MacCall(_) => …, _ => expr_requires_semi_to_be_stmt(e) } in the cases where expr_requires_semi_to_be_stmt is not the desired behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more option — instead of having what's currently in the PR:

// compiler/rustc_ast/src/util/classify.rs

pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
    match &e.kind {
        If(..)
        | Match(..)
        | Block(..)
        | While(..)
        | Loop(..)
        | ForLoop { .. }
        | TryBlock(..)
        | ConstBlock(..) => false,

        MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace,

        _ => true,
    }
}

pub fn expr_requires_comma_to_be_match_arm(e: &ast::Expr) -> bool {
    match &e.kind {
        MacCall(_) => true,
        _ => expr_requires_semi_to_be_stmt(e),
    }
}

we could have:

pub fn expr_is_complete(e: &ast::Expr) -> bool {
    // this is negative of the old "expr_requires_semi_to_be_stmt"
    match &e.kind {
        If(..)
        | Match(..)
        | Block(..)
        | While(..)
        | Loop(..)
        | ForLoop { .. }
        | TryBlock(..)
        | ConstBlock(..) => true,

        MacCall(_) | _ => false,
    }
}

pub fn expr_requires_semi_to_be_stmt(e: &ast::Expr) -> bool {
    match &e.kind {
        MacCall(mac_call) => mac_call.args.delim != Delimiter::Brace,
        _ => !expr_is_complete(e),
    }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer either the final option or the renaming you suggested first above. I'll leave it up to you. Otherwise this PR looks good to me.

Copy link
Member Author

@dtolnay dtolnay May 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed 5d5954c and added 10227ea.

Copy link
Member

@compiler-errors compiler-errors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🆒

@compiler-errors
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented May 12, 2024

📌 Commit a9ec581 has been approved by compiler-errors

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 12, 2024
@rust-log-analyzer

This comment has been minimized.

@dtolnay
Copy link
Member Author

dtolnay commented May 12, 2024

@bors r=compiler-errors

@bors
Copy link
Contributor

bors commented May 12, 2024

📌 Commit 78c8dc1 has been approved by compiler-errors

It is now in the queue for this repository.

@bors
Copy link
Contributor

bors commented May 12, 2024

⌛ Testing commit 78c8dc1 with merge 8cc6f34...

@bors
Copy link
Contributor

bors commented May 12, 2024

☀️ Test successful - checks-actions
Approved by: compiler-errors
Pushing 8cc6f34 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label May 12, 2024
@bors bors merged commit 8cc6f34 into rust-lang:master May 12, 2024
7 checks passed
@rustbot rustbot added this to the 1.80.0 milestone May 12, 2024
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (8cc6f34): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
4.1% [4.1%, 4.1%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.5% [-2.5%, -2.5%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.8% [-2.5%, 4.1%] 2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.8% [2.8%, 5.6%] 3
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 675.636s -> 678.17s (0.38%)
Artifact size: 315.95 MiB -> 315.89 MiB (-0.02%)

@dtolnay dtolnay deleted the maccall branch May 12, 2024 09:48
jhpratt added a commit to jhpratt/rust that referenced this pull request Nov 26, 2024
Inline ExprPrecedence::order into Expr::precedence

The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427).

Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:

1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`

As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.

The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:

```rust
match self.kind {
    ExprKind::Closure(..) => ExprPrecedence::Closure,
    ...
}
```

although there were exceptions of both many-to-one, and one-to-many:

```rust
    ExprKind::Underscore => ExprPrecedence::Path,
    ExprKind::Path(..) => ExprPrecedence::Path,
    ...
    ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
    ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```

Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.

You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:

- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.

- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.

- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.

This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Nov 26, 2024
Inline ExprPrecedence::order into Expr::precedence

The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427).

Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:

1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`

As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.

The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:

```rust
match self.kind {
    ExprKind::Closure(..) => ExprPrecedence::Closure,
    ...
}
```

although there were exceptions of both many-to-one, and one-to-many:

```rust
    ExprKind::Underscore => ExprPrecedence::Path,
    ExprKind::Path(..) => ExprPrecedence::Path,
    ...
    ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
    ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```

Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.

You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:

- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.

- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.

- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.

This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Nov 27, 2024
Rollup merge of rust-lang#133140 - dtolnay:precedence, r=fmease

Inline ExprPrecedence::order into Expr::precedence

The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427).

Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:

1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`

As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.

The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:

```rust
match self.kind {
    ExprKind::Closure(..) => ExprPrecedence::Closure,
    ...
}
```

although there were exceptions of both many-to-one, and one-to-many:

```rust
    ExprKind::Underscore => ExprPrecedence::Path,
    ExprKind::Path(..) => ExprPrecedence::Path,
    ...
    ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
    ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```

Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.

You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:

- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.

- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.

- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.

This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
flip1995 pushed a commit to flip1995/rust that referenced this pull request Nov 28, 2024
Inline ExprPrecedence::order into Expr::precedence

The representation of expression precedence in rustc_ast has been an obstacle to further improvements in the pretty-printer (continuing from rust-lang#119105 and rust-lang#119427).

Previously the operation of *"does this expression have lower precedence than that one"* (relevant for parenthesis insertion in macro-generated syntax trees) consisted of 3 steps:

1. Convert `Expr` to `ExprPrecedence` using `.precedence()`
2. Convert `ExprPrecedence` to `i8` using `.order()`
3. Compare using `<`

As far as I can guess, the reason for the separation between `precedence()` and `order()` was so that both `rustc_ast::Expr` and `rustc_hir::Expr` could convert as straightforwardly as possible to the same `ExprPrecedence` enum, and then the more finicky logic performed by `order` could be present just once.

The mapping between `Expr` and `ExprPrecedence` was intended to be as straightforward as possible:

```rust
match self.kind {
    ExprKind::Closure(..) => ExprPrecedence::Closure,
    ...
}
```

although there were exceptions of both many-to-one, and one-to-many:

```rust
    ExprKind::Underscore => ExprPrecedence::Path,
    ExprKind::Path(..) => ExprPrecedence::Path,
    ...
    ExprKind::Match(_, _, MatchKind::Prefix) => ExprPrecedence::Match,
    ExprKind::Match(_, _, MatchKind::Postfix) => ExprPrecedence::PostfixMatch,
```

Where the nature of `ExprPrecedence` becomes problematic is when a single expression kind might be associated with multiple different precedence levels depending on context (outside the expression) and contents (inside the expression). For example consider what is the precedence of an ExprKind::Closure `$closure`. Well, on the left-hand side of a binary operator it would need parentheses in order to avoid the trailing binary operator being absorbed into the closure body: `($closure) + Rhs`, so the precedence is something lower than that of `+`. But on the right-hand side of a binary operator, a closure is just a straightforward prefix expression like a unary op, which is a relatively high precedence level, higher than binops but lower than method calls: `Lhs + $closure` is fine without parens but `($closure).method()` needs them. But as a third case, if the closure contains an explicit return type, then the precedence is an even higher level than that, never needing parenthesization even in a binop left-hand side or method call: `|| -> bool { false } + Rhs` or `|| -> bool { false }.method()`.

You can see that trying to capture all of this resolution about expressions into `ExprPrecedence` violates the intention of `ExprPrecedence` being a straightforward one-to-one correspondence from each AST and HIR `ExprKind` variant. It would be possible to attempt that by doing stuff like `ExprPrecedence::Closure(Side::Leading, ReturnType::No)`, but I don't foresee the original envisioned benefit of the `precedence()`/`order()` distinction being retained in this approach. Instead I want to move toward a model that Syn has been using successfully. In Syn, there is a Precedence enum but it differs from rustc in the following ways:

- There are [relatively few variants](https://github.com/dtolnay/syn/blob/2.0.87/src/precedence.rs#L11-L47) compared to rustc's `ExprPrecedence`. For example there is no distinction at the precedence level between returns and closures, or between loops and method calls.

- We distinguish between [leading](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L293) and [trailing](https://github.com/dtolnay/syn/blob/2.0.87/src/fixup.rs#L309) precedence, taking into account an expression's context such as what token follows it (for various syntactic bail-outs in Rust's grammar, like ambiguities around break-with-value) and how it relates to operators from the surrounding syntax tree.

- There are no hardcoded mysterious integer quantities like rustc's `PREC_CLOSURE = -40`. All precedence comparisons are performed via PartialOrd on a C-like enum.

This PR is just a first step in these changes. As you can tell from Syn, I definitely think there is value in having a dedicated type to represent precedence, instead of what `order()` is doing with `i8`. But that is a whole separate adventure because rustc_ast doesn't even agree consistently on `i8` being the type for precedence order; `AssocOp::precedence` instead uses `usize` and there are casts in both directions. It is likely that a type called `ExprPrecedence` will re-appear, but it will look substantially different from the one that existed before this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-parser Area: The parsing of Rust source code to an AST A-pretty Area: Pretty printing (including `-Z unpretty`) merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.