Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement rules around star expressions with different precedence #10623

Merged
merged 1 commit into from
Mar 29, 2024

Conversation

dhruvmanila
Copy link
Member

Summary

This PR adds implementations around the following grammar rules to support the other expression and statement parsing. This PR doesn't update the related expressions and statements as they will be done in their own PRs.

  • star_expressions
  • star_expression
  • '*' bitwise_or

Problem

There are two grammar entries around star expressions:

star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression

The main difference is the precedence at which the expression should be parsed if a * token is encountered. Currently, we parse both rules at the same precedence which makes the parser more lenient. There are various expressions and statements which uses either one of the above rules.

Solutions

Current solution

Add the following new methods to the parser:

  1. parse_star_expression_list which would be similar to parse_expression_list and would match the star_expressions grammar rule
  2. parse_star_expression_or_higher which would match the star_expression or star_named_expression grammar rule
  3. parse_expression_with_bitwise_or_precedence which would match the bitwise_or grammar rule

Here, we could possibly merge the parse_expression_list and parse_star_expression_list into a single function as it only differs in the method it delegates the parsing to. The parameter could be called allow_star_expression. It could be confusing because even if it’s false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to follow in the parser code. It doesn't require a lot of verification by the caller as the precedence parser makes sure to not allow that.

Other solution

Add a new verify_bitwise_or_precedence method which reports an error if the grammar expected a starred expression with bitwise_or precedence:

pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}

For this solution, the parser will need to verify the precedence of a starred expression wherever grammar expected the star_expressions, star_expression or star_named_expression rule.

We could possibly adopt this solution for better error recovery but I think for now having a strict parser is better suited because we don't really know what complete error recovery looks like.

Follow-up

There's a related follow-up issue which I'm working on and is related to starred expressions.
The starred expression is only allowed in certain expressions and context which the parser needs to make sure and report an error if found otherwise.

Test Plan

Tested this locally by changing various expression and statement to use these methods. As mentioned they will be in their own PR.

@dhruvmanila dhruvmanila added the parser Related to the parser label Mar 27, 2024
@dhruvmanila dhruvmanila marked this pull request as draft March 27, 2024 08:57
@dhruvmanila

This comment was marked as outdated.

Copy link
Contributor

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@dhruvmanila dhruvmanila marked this pull request as ready for review March 27, 2024 09:01
@@ -83,7 +83,7 @@ impl<'src> Parser<'src> {
/// Matches the `expressions` rule in the [Python grammar].
///
/// [Python grammar]: https://docs.python.org/3/reference/grammar.html
pub(super) fn parse_expressions(&mut self) -> ParsedExpr {
pub(super) fn parse_expression_list(&mut self) -> ParsedExpr {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: i find the postfix list a bit confusing because we aren't parsing a list (and the term is overloaded in the Parser context where we have parse_list and parse_list_expression) but I don't have a better name recommendation.

@dhruvmanila
Copy link
Member Author

For future reference, this change is going to be updated to use the "other" solution mentioned in the PR description. Not all will be reverted but instead of having a parse_expression_with_bitwise_or_precedence, we'll allow the starred expression to be parsed with the highest precedence and limit it later by checking if it's allowed as per "bitwise or" precedence and report an error.

I don't want to add the change here as it'll require updating all of the PRs in the stack. I'll link the PR when it's opened.

@dhruvmanila dhruvmanila merged commit 499d412 into dhruv/parser Mar 29, 2024
17 checks passed
@dhruvmanila dhruvmanila deleted the dhruv/star-expression-list branch March 29, 2024 07:48
dhruvmanila added a commit that referenced this pull request Mar 31, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Mar 31, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 2, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 4, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 7, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 11, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 15, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 16, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 16, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 17, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 18, 2024
…0623)

## Summary

This PR adds implementations around the following grammar rules to
support the other expression and statement parsing. This PR doesn't
update the related expressions and statements as they will be done in
their own PRs.

* `star_expressions`
* `star_expression`
* `'*' bitwise_or`

### Problem

There are two grammar entries around star expressions:

```
star_expression:
    | '*' bitwise_or 
    | expression

starred_expression:
    '*' expression
```

The main difference is the precedence at which the expression should be
parsed if a `*` token is encountered. Currently, we parse both rules at
the same precedence which makes the parser more lenient. There are
various expressions and statements which uses either one of the above
rules.

### Solutions

#### Current solution

Add the following new methods to the parser:
1. `parse_star_expression_list` which would be similar to
`parse_expression_list` and would match the `star_expressions` grammar
rule
2. `parse_star_expression_or_higher` which would match the
`star_expression` or `star_named_expression` grammar rule
3. `parse_expression_with_bitwise_or_precedence` which would match the
`bitwise_or` grammar rule

Here, we could possibly merge the `parse_expression_list` and
`parse_star_expression_list` into a single function as it only differs
in the method it delegates the parsing to. The parameter could be called
`allow_star_expression`. It could be confusing because even if it’s
false, we would allow starred expression but at a higher binding power.

This solution closely resembles the actual grammar and is easier to
follow in the parser code. It doesn't require a lot of verification by
the caller as the precedence parser makes sure to not allow that.

#### Other solution

Add a new `verify_bitwise_or_precedence` method which reports an error
if the grammar expected a starred expression with `bitwise_or`
precedence:

```rs
pub(super) fn verify_star_expressions(
    &mut self,
    expr: &Expr,
    allow_named_expression: AllowNamedExpression,
) {
    match expr {
        Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.verify_star_expressions(expr, allow_named_expression);
            }
        }
        Expr::Starred(ast::ExprStarred { value, .. }) => {
            let expr_kind = match &**value {
                Expr::Compare(_) => "comparison",
                Expr::UnaryOp(ast::ExprUnaryOp {
                    op: ast::UnaryOp::Not,
                    ..
                }) => "unary `not`",
                Expr::BoolOp(_) => "boolean",
                Expr::If(_) => "`if-else`",
                Expr::Lambda(_) => "lambda",
                Expr::Named(_) if allow_named_expression.is_no() => "named",
                _ => return,
            };

            self.add_error(
                ParseErrorType::OtherError(format!(
                    "{expr_kind} expression cannot be used here",
                )),
                expr.range(),
            );
        }
        _ => {}
    }
}
```

For this solution, the parser will need to verify the precedence of a
starred expression wherever grammar expected the `star_expressions`,
`star_expression` or `star_named_expression` rule.

We could possibly adopt this solution for better error recovery but I
think for now having a strict parser is better suited because we don't
really know what complete error recovery looks like.

### Follow-up

There's a related follow-up issue which I'm working on and is related to
starred expressions.
The starred expression is only allowed in certain expressions and
context which the parser needs to make sure and report an error if found
otherwise.

## Test Plan

Tested this locally by changing various expression and statement to use
these methods. As mentioned they will be in their own PR.
dhruvmanila added a commit that referenced this pull request Apr 23, 2024
## Summary

This PR adds a new `ExpressionContext` struct which is used in
expression parsing.

This solves the following problem:
1. Allowing starred expression with different precedence
2. Allowing yield expression in certain context
3. Remove ambiguity with `in` keyword when parsing a `for ... in`
statement

For context, (1) was solved by adding `parse_star_expression_list` and
`parse_star_expression_or_higher` in #10623, (2) was solved by by adding
`parse_yield_expression_or_else` in #10809, and (3) was fixed in #11009.
All of the mentioned functions have been removed in favor of the context
flags.

As mentioned in #11009, an ideal solution would be to implement an
expression context which is what this PR implements. This is passed
around as function parameter and the call stack is used to automatically
reset the context.

### Recovery

How should the parser recover if the target expression is invalid when
an expression can consume the `in` keyword?

1. Should the `in` keyword be part of the target expression?
2. Or, should the expression parsing stop as soon as `in` keyword is
encountered, no matter the expression?

For example:
```python
for yield x in y: ...

# Here, should this be parsed as
for (yield x) in (y): ...
# Or
for (yield x in y): ...
# where the `in iter` part is missing
```

Or, for binary expression parsing:
```python
for x or y in z: ...

# Should this be parsed as
for (x or y) in z: ...
# Or
for (x or y in z): ...
# where the `in iter` part is missing
```

This need not be solved now, but is very easy to change. For context
this PR does the following:
* For binary, comparison, and unary expressions, stop at `in`
* For lambda, yield expressions, consume the `in`

## Test Plan

1. Add test cases for the `for ... in` statement and verify the
snapshots
2. Make sure the existing test suite pass
3. Run the fuzzer for around 3000 generated source code
4. Run the updated logic on a dozen or so open source repositories
(codename "parser-checkouts")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parser Related to the parser
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants