Skip to content

Commit

Permalink
Avoid error handling duplication for starred, yield, lambda expressio…
Browse files Browse the repository at this point in the history
…ns (#10809)

## Summary

This PR updates the error handling logic for certain expressions in a
way to either perform it automatically or provide an option for the
user. The expression in discussion here are `lambda`, starred and
`yield` expression.

### Problem

The current parser allows these expressions at arbitrary context. This
is because the mentioned expressions are parsed using
`parse_lhs_expression` which is part of other higher level grammar
rules. This means that the caller needs to validate the parsed
expression and report an error if it isn't allowed in that context. This
can get quite cumbersome to do so as it needs to be done for all of the
call sites for following methods:

1. `parse_expression_list`: 14 references
2. `parse_star_expression_list`: 4 references
3. `parse_star_expression_or_higher`: 8 references
4. `parse_named_expression_or_higher`: 10 references
5. `parse_conditional_expression_or_higher`: 25 references
6. `parse_simple_expression`: 4 references

The numbers corresponding to the methods are the number of references as
of today. This list is also in the correct hierarchy of grammar
precedence. For example, `parse_expression_list` calls into
`parse_conditional_expression_or_higher` but not the other way around.

### Solution

We'll take the above expression one at a time to understand the
solution:

#### Lambda expression

Lambda expressions are only allowed in `expression` grammar rule which
corresponds to `parse_conditional_expression_or_higher`. This means that
this expression is only allowed when using either of the following
functions:

1. `parse_expression_list`
2. `parse_star_expression_list`
3. `parse_star_expression_or_higher`
4. `parse_named_expression_or_higher`
5. `parse_conditional_expression_or_higher`

The solution is to move the error handling in `parse_simple_expression`
and parameterize it where any of the above listed function would always
use `AllowLambdaExpression::Yes`.

#### Starred expression

There are two grammar rules related to starred expression:
1. `star_expression` which corresponds to
`parse_star_expression_or_higher`
2. `starred_expression` which is parsed in LHS parsing

Remember that LHS parsing isn't accessed directly but only via any of
the above listed functions in the problem section. Now, starred
expressions are allowed in a lot of places but sometimes in a limited
capacity. For example, an assignment target can have a starred
expression but only if it is a name node (`*x`).

The solution here is to adopt the one used in star pattern matching
which is to use a parameter. The following functions are parameterized:
1. `parse_expression_list`
2. `parse_named_expression_or_higher`
3. `parse_conditional_expression_or_higher`

Now, `parse_star_expression_list` and `parse_star_expression_or_higher`
aren't parameterized because they handle the `star_expression` grammar
which means that the caller wants to parse a starred expression but with
a limited precedence.

#### Yield expression

Yield expressions are only allowed in the following context:
1. Top level as yield statement
2. Parenthesized
3. F-string expression
4. Assignment (including annotated and augmented) value

We could parameterize it similar to starred expression but that seems
like a waste given the limited number of locations they're allowed.

The solution is to add a `parse_yield_expression_or_else` method which
parses a yield expression if the parser is at `yield` token or else
calls the given method to parse the expression. The call site would
like:

```rs
// (yield_expr | named_expression)
self.try_parse_yield_expression()
  .unwrap_or_else(|| self.parse_named_expression_or_higher())

// (yield_expr | star_expressions)
self.try_parse_yield_expression()
  .unwrap_or_else(|| self.parse_star_expression_list())
```

An added benefit for this is that the call site looks exactly like the
grammar.

## Review

* The reviewer would mainly just look at the de-duplication logic.
* The reviewer doesn't really need to verify the call sites as they're
verified by existing test cases. For nodes which aren't yet tested, they
will be done so in their own PR.

## Test Plan

Run existing test cases and verify the snapshot updates.

Additional test cases will be added when working on specific nodes.
  • Loading branch information
dhruvmanila committed Apr 16, 2024
1 parent a585856 commit 7673a5a
Show file tree
Hide file tree
Showing 20 changed files with 330 additions and 319 deletions.
10 changes: 10 additions & 0 deletions crates/ruff_python_parser/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,10 @@ pub enum ParseErrorType {
UnparenthesizedNamedExpression,
/// An unparenthesized tuple expression was found where it is not allowed.
UnparenthesizedTupleExpression,
/// An invalid usage of a lambda expression was found.
InvalidLambdaExpressionUsage,
/// An invalid usage of a yield expression was found.
InvalidYieldExpressionUsage,

/// An invalid expression was found in the assignment `target`.
InvalidAssignmentTarget,
Expand Down Expand Up @@ -208,6 +212,12 @@ impl std::fmt::Display for ParseErrorType {
ParseErrorType::UnparenthesizedTupleExpression => {
write!(f, "unparenthesized tuple expression cannot be used here")
}
ParseErrorType::InvalidYieldExpressionUsage => {
write!(f, "yield expression cannot be used here")
}
ParseErrorType::InvalidLambdaExpressionUsage => {
write!(f, "`lambda` expression cannot be used here")
}
ParseErrorType::StarredExpressionUsage => {
write!(f, "starred expression cannot be used here")
}
Expand Down
409 changes: 167 additions & 242 deletions crates/ruff_python_parser/src/parser/expression.rs

Large diffs are not rendered by default.

4 changes: 3 additions & 1 deletion crates/ruff_python_parser/src/parser/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,8 @@ use crate::{
Mode, ParseError, ParseErrorType, Tok, TokenKind,
};

use self::expression::AllowStarredExpression;

mod expression;
mod helpers;
mod pattern;
Expand Down Expand Up @@ -177,7 +179,7 @@ impl<'src> Parser<'src> {
pub(crate) fn parse_program(mut self) -> Program {
let ast = if self.mode == Mode::Expression {
let start = self.node_start();
let parsed_expr = self.parse_expression_list();
let parsed_expr = self.parse_expression_list(AllowStarredExpression::No);
let mut progress = ParserProgress::default();

// TODO: How should error recovery work here? Just truncate after the expression?
Expand Down
113 changes: 72 additions & 41 deletions crates/ruff_python_parser/src/parser/statement.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ use crate::parser::{
use crate::token_set::TokenSet;
use crate::{Mode, ParseErrorType, Tok, TokenKind};

use super::expression::AllowStarredExpression;
use super::Parenthesized;

/// Tokens that can appear after an expression.
Expand Down Expand Up @@ -195,7 +196,8 @@ impl<'src> Parser<'src> {
}
_ => {
let start = self.node_start();
let parsed_expr = self.parse_star_expression_list();
let parsed_expr =
self.parse_yield_expression_or_else(Parser::parse_star_expression_list);

if self.eat(TokenKind::Equal) {
Stmt::Assign(self.parse_assign_statement(parsed_expr, start))
Expand Down Expand Up @@ -253,7 +255,10 @@ impl<'src> Parser<'src> {
let targets = self.parse_comma_separated_list_into_vec(
RecoveryContextKind::DeleteTargets,
|parser| {
let mut target = parser.parse_conditional_expression_or_higher();
// Allow starred expression to raise a better error message for
// an invalid delete target later.
let mut target =
parser.parse_conditional_expression_or_higher(AllowStarredExpression::Yes);
helpers::set_expr_ctx(&mut target.expr, ExprContext::Del);

if !helpers::is_valid_del_target(&target.expr) {
Expand Down Expand Up @@ -329,7 +334,7 @@ impl<'src> Parser<'src> {
// raise *x
// raise yield x
// raise x := 1
let exc = self.parse_expression_list();
let exc = self.parse_expression_list(AllowStarredExpression::No);

if let Some(ast::ExprTuple {
parenthesized: false,
Expand All @@ -352,7 +357,7 @@ impl<'src> Parser<'src> {
// raise x from *y
// raise x from yield y
// raise x from y := 1
let cause = self.parse_expression_list();
let cause = self.parse_expression_list(AllowStarredExpression::No);

if let Some(ast::ExprTuple {
parenthesized: false,
Expand Down Expand Up @@ -654,7 +659,7 @@ impl<'src> Parser<'src> {
// assert assert x
// assert yield x
// assert x := 1
let test = self.parse_conditional_expression_or_higher();
let test = self.parse_conditional_expression_or_higher(AllowStarredExpression::No);

let msg = if self.eat(TokenKind::Comma) {
if self.at_expr() {
Expand All @@ -664,7 +669,10 @@ impl<'src> Parser<'src> {
// assert False, assert x
// assert False, yield x
// assert False, x := 1
Some(Box::new(self.parse_conditional_expression_or_higher().expr))
Some(Box::new(
self.parse_conditional_expression_or_higher(AllowStarredExpression::No)
.expr,
))
} else {
// test_err assert_empty_msg
// assert x,
Expand Down Expand Up @@ -800,7 +808,7 @@ impl<'src> Parser<'src> {

self.expect(TokenKind::Equal);

let value = self.parse_conditional_expression_or_higher();
let value = self.parse_conditional_expression_or_higher(AllowStarredExpression::No);

ast::StmtTypeAlias {
name: Box::new(name),
Expand All @@ -826,13 +834,17 @@ impl<'src> Parser<'src> {
/// See: <https://docs.python.org/3/reference/simple_stmts.html#grammar-token-python-grammar-assignment_stmt>
fn parse_assign_statement(&mut self, target: ParsedExpr, start: TextSize) -> ast::StmtAssign {
let mut targets = vec![target.expr];
let mut value = self.parse_expression_list();
let mut value = self.parse_yield_expression_or_else(|p| {
p.parse_expression_list(AllowStarredExpression::Yes)
});

if self.at(TokenKind::Equal) {
self.parse_list(RecoveryContextKind::AssignmentTargets, |parser| {
parser.bump(TokenKind::Equal);

let mut parsed_expr = parser.parse_expression_list();
let mut parsed_expr = parser.parse_yield_expression_or_else(|p| {
p.parse_expression_list(AllowStarredExpression::Yes)
});

std::mem::swap(&mut value, &mut parsed_expr);

Expand Down Expand Up @@ -885,7 +897,7 @@ impl<'src> Parser<'src> {

// Annotation is actually just an `expression` but we use `expressions`
// for better error recovery in case tuple isn't parenthesized.
let annotation = self.parse_expression_list();
let annotation = self.parse_expression_list(AllowStarredExpression::No);

if matches!(
annotation.expr,
Expand All @@ -900,9 +912,14 @@ impl<'src> Parser<'src> {
);
}

let value = self
.eat(TokenKind::Equal)
.then(|| Box::new(self.parse_expression_list().expr));
let value = self.eat(TokenKind::Equal).then(|| {
Box::new(
self.parse_yield_expression_or_else(|p| {
p.parse_expression_list(AllowStarredExpression::Yes)
})
.expr,
)
});

ast::StmtAnnAssign {
target: Box::new(target.expr),
Expand Down Expand Up @@ -938,7 +955,9 @@ impl<'src> Parser<'src> {

helpers::set_expr_ctx(&mut target.expr, ExprContext::Store);

let value = self.parse_expression_list();
let value = self.parse_yield_expression_or_else(|p| {
p.parse_expression_list(AllowStarredExpression::Yes)
});

ast::StmtAugAssign {
target: Box::new(target.expr),
Expand All @@ -953,7 +972,7 @@ impl<'src> Parser<'src> {
let if_start = self.node_start();
self.bump(TokenKind::If);

let test = self.parse_named_expression_or_higher();
let test = self.parse_named_expression_or_higher(AllowStarredExpression::No);
self.expect(TokenKind::Colon);

let body = self.parse_body(Clause::If);
Expand All @@ -974,7 +993,7 @@ impl<'src> Parser<'src> {
let elif_start = p.node_start();
p.bump(TokenKind::Elif);

let test = p.parse_named_expression_or_higher();
let test = p.parse_named_expression_or_higher(AllowStarredExpression::No);
p.expect(TokenKind::Colon);

let body = p.parse_body(Clause::ElIf);
Expand Down Expand Up @@ -1026,7 +1045,7 @@ impl<'src> Parser<'src> {
let type_ = if p.at(TokenKind::Colon) && !is_star {
None
} else {
let parsed_expr = p.parse_expression_list();
let parsed_expr = p.parse_expression_list(AllowStarredExpression::No);
if matches!(
parsed_expr.expr,
Expr::Tuple(ast::ExprTuple {
Expand Down Expand Up @@ -1099,14 +1118,14 @@ impl<'src> Parser<'src> {
self.bump(TokenKind::For);

let saved_context = self.set_ctx(ParserCtxFlags::FOR_TARGET);
let mut target = self.parse_expression_list();
let mut target = self.parse_expression_list(AllowStarredExpression::Yes);
self.restore_ctx(ParserCtxFlags::FOR_TARGET, saved_context);

helpers::set_expr_ctx(&mut target.expr, ExprContext::Store);

self.expect(TokenKind::In);

let iter = self.parse_expression_list();
let iter = self.parse_expression_list(AllowStarredExpression::Yes);

self.expect(TokenKind::Colon);

Expand Down Expand Up @@ -1134,7 +1153,7 @@ impl<'src> Parser<'src> {
let while_start = self.node_start();
self.bump(TokenKind::While);

let test = self.parse_named_expression_or_higher();
let test = self.parse_named_expression_or_higher(AllowStarredExpression::No);
self.expect(TokenKind::Colon);

let body = self.parse_body(Clause::While);
Expand Down Expand Up @@ -1171,7 +1190,7 @@ impl<'src> Parser<'src> {
parameters.range = self.node_range(parameters_start);

let returns = self.eat(TokenKind::Rarrow).then(|| {
let returns = self.parse_expression_list();
let returns = self.parse_expression_list(AllowStarredExpression::No);
if !returns.is_parenthesized && matches!(returns.expr, Expr::Tuple(_)) {
self.add_error(
ParseErrorType::OtherError(
Expand Down Expand Up @@ -1545,7 +1564,9 @@ impl<'src> Parser<'src> {
fn parse_with_item(&mut self, state: WithItemParsingState) -> ParsedWithItem {
let start = self.node_start();

let parsed_expr = self.parse_conditional_expression_or_higher();
let parsed_expr = self.parse_yield_expression_or_else(|p| {
p.parse_conditional_expression_or_higher(AllowStarredExpression::Yes)
});
let mut used_ambiguous_lpar = false;

// While parsing a with item after an ambiguous `(` token, we need to check
Expand Down Expand Up @@ -1656,7 +1677,7 @@ impl<'src> Parser<'src> {
fn parse_with_item_optional_vars(&mut self) -> ParsedExpr {
self.bump(TokenKind::As);

let mut target = self.parse_conditional_expression_or_higher();
let mut target = self.parse_conditional_expression_or_higher(AllowStarredExpression::Yes);

// This has the same semantics as an assignment target.
if !helpers::is_valid_assignment_target(&target.expr) {
Expand All @@ -1681,14 +1702,12 @@ impl<'src> Parser<'src> {
self.bump(TokenKind::Match);

let subject_start = self.node_start();
let subject = self.parse_named_expression_or_higher();
let subject = self.parse_named_expression_or_higher(AllowStarredExpression::No);
let subject = if self.at(TokenKind::Comma) {
let tuple = self.parse_tuple_expression(
subject.expr,
subject_start,
Parenthesized::No,
Parser::parse_named_expression_or_higher,
);
let tuple =
self.parse_tuple_expression(subject.expr, subject_start, Parenthesized::No, |p| {
p.parse_named_expression_or_higher(AllowStarredExpression::No)
});

Expr::Tuple(tuple).into()
} else {
Expand Down Expand Up @@ -1752,9 +1771,12 @@ impl<'src> Parser<'src> {
self.bump(TokenKind::Case);
let pattern = self.parse_match_patterns();

let guard = self
.eat(TokenKind::If)
.then(|| Box::new(self.parse_named_expression_or_higher().expr));
let guard = self.eat(TokenKind::If).then(|| {
Box::new(
self.parse_named_expression_or_higher(AllowStarredExpression::No)
.expr,
)
});

self.expect(TokenKind::Colon);
let body = self.parse_body(Clause::Match);
Expand Down Expand Up @@ -1812,7 +1834,7 @@ impl<'src> Parser<'src> {
let decorator_start = self.node_start();
self.bump(TokenKind::At);

let parsed_expr = self.parse_named_expression_or_higher();
let parsed_expr = self.parse_named_expression_or_higher(AllowStarredExpression::No);
decorators.push(ast::Decorator {
expression: parsed_expr.expr,
range: self.node_range(decorator_start),
Expand Down Expand Up @@ -1887,7 +1909,10 @@ impl<'src> Parser<'src> {
// this is the `lambda`'s body, don't try to parse as an annotation.
let annotation = if function_kind == FunctionKind::FunctionDef && self.eat(TokenKind::Colon)
{
Some(Box::new(self.parse_conditional_expression_or_higher().expr))
Some(Box::new(
self.parse_conditional_expression_or_higher(AllowStarredExpression::Yes)
.expr,
))
} else {
None
};
Expand All @@ -1906,9 +1931,12 @@ impl<'src> Parser<'src> {
let start = self.node_start();
let parameter = self.parse_parameter(function_kind);

let default = self
.eat(TokenKind::Equal)
.then(|| Box::new(self.parse_conditional_expression_or_higher().expr));
let default = self.eat(TokenKind::Equal).then(|| {
Box::new(
self.parse_conditional_expression_or_higher(AllowStarredExpression::No)
.expr,
)
});

ast::ParameterWithDefault {
range: self.node_range(start),
Expand Down Expand Up @@ -2053,9 +2081,12 @@ impl<'src> Parser<'src> {
})
} else {
let name = self.parse_identifier();
let bound = self
.eat(TokenKind::Colon)
.then(|| Box::new(self.parse_conditional_expression_or_higher().expr));
let bound = self.eat(TokenKind::Colon).then(|| {
Box::new(
self.parse_conditional_expression_or_higher(AllowStarredExpression::No)
.expr,
)
});

ast::TypeParam::TypeVar(ast::TypeParamTypeVar {
range: self.node_range(start),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -181,7 +181,7 @@ Module(

|
1 | call(**yield x)
| ^^^^^^^ Syntax Error: Unparenthesized yield expression cannot be used here
| ^^^^^^^ Syntax Error: yield expression cannot be used here
2 | call(** *x)
3 | call(***x)
|
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -187,13 +187,13 @@ Module(
2 | call(x := 1 = 1)
3 |
4 | call(yield x)
| ^^^^^^^ Syntax Error: unparenthesized yield expression cannot be used here
| ^^^^^^^ Syntax Error: yield expression cannot be used here
5 | call(yield from x)
|


|
4 | call(yield x)
5 | call(yield from x)
| ^^^^^^^^^^^^ Syntax Error: unparenthesized yield expression cannot be used here
| ^^^^^^^^^^^^ Syntax Error: yield expression cannot be used here
|
Original file line number Diff line number Diff line change
Expand Up @@ -197,7 +197,7 @@ Module(

|
1 | call(x = yield y)
| ^^^^^^^ Syntax Error: Unparenthesized yield expression cannot be used here
| ^^^^^^^ Syntax Error: yield expression cannot be used here
2 | call(x = yield from y)
3 | call(x = *y)
|
Expand All @@ -206,7 +206,7 @@ Module(
|
1 | call(x = yield y)
2 | call(x = yield from y)
| ^^^^^^^^^^^^ Syntax Error: Unparenthesized yield expression cannot be used here
| ^^^^^^^^^^^^ Syntax Error: yield expression cannot be used here
3 | call(x = *y)
4 | call(x = (*y))
|
Expand Down
Loading

0 comments on commit 7673a5a

Please sign in to comment.