Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Double-store assertions when encountering invalid syntax #14313

Closed
sharkdp opened this issue Nov 13, 2024 · 2 comments · Fixed by #14317
Closed

[red-knot] Double-store assertions when encountering invalid syntax #14313

sharkdp opened this issue Nov 13, 2024 · 2 comments · Fixed by #14317
Assignees
Labels
bug Something isn't working red-knot Multi-file analysis & type inference

Comments

@sharkdp
Copy link
Contributor

sharkdp commented Nov 13, 2024

During type inference on some invalid syntax examples, we encounter situations where the exact same expression appears in multiple places inside the AST. The smallest example I could find is this:

for

When parsing it, we get the following AST. Note that the same Name(…) expression (same range) appears both as target and as iter field on StmtFor, once with Store context, once with Invalid context:

Module(
    ModModule {
        range: 0..3,
        body: [
            For(
                StmtFor {
                    range: 0..3,
                    is_async: false,
                    target: Name(
                        ExprName {
                            range: 3..3,
                            id: Name(""),
                            ctx: Store,
                        },
                    ),
                    iter: Name(
                        ExprName {
                            range: 3..3,
                            id: Name(""),
                            ctx: Invalid,
                        },
                    ),
                    body: [],
                    orelse: [],
                },
            ),
        ],
    },
)

This leads to a subsequent failing double-store assertion in store_expression_type because we infer types for both the target and the iter expressions:

fn store_expression_type(
&mut self,
expression: &impl HasScopedAstId<Id = ScopedExpressionId>,
ty: Type<'db>,
) {
let expr_id = expression.scoped_ast_id(self.db, self.scope());
let previous = self.types.expressions.insert(expr_id, ty);
assert_eq!(previous, None);
}

I'm not sure how to fix this.

Skipping storage when encountering expressions with Invalid context would work for this example, but there are other cases (:x=) where such an invalid expression is not "backed up" by a second valid expression.

Ignoring double-storage for Invalid expressions is also not an option, since in this example here, we run inference on iter first.

@sharkdp sharkdp added bug Something isn't working red-knot Multi-file analysis & type inference labels Nov 13, 2024
@MichaReiser
Copy link
Member

MichaReiser commented Nov 13, 2024

Hmm, this is interesting. We could consider using the ptr address instead of Range + Kind

@sharkdp
Copy link
Contributor Author

sharkdp commented Nov 13, 2024

A similar example is

x if $z

Mentioning it here because it leads to a slightly different panic (Calling self.infer_expression on a standalone-expression is not allowed…). But the underlying issue is the same, I believe.

@sharkdp sharkdp self-assigned this Nov 13, 2024
sharkdp added a commit that referenced this issue Nov 13, 2024
## Summary

Use the memory address to uniquely identify AST nodes, instead of
relying on source range and kind. The latter fails for ASTs resulting
from invalid syntax examples. See #14313 for details.

Also results in a 1-2% speedup
(https://codspeed.io/astral-sh/ruff/runs/67349cf55f36b36baa211360)

closes #14313 

## Review

Here are the places where we use `NodeKey` directly or indirectly (via
`ExpressionNodeKey` or `DefinitionNodeKey`):

```rs
// semantic_index.rs
pub(crate) struct SemanticIndex<'db> { 
    // [...]
    /// Map expressions to their corresponding scope.
    scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>,

    /// Map from a node creating a definition to its definition.
    definitions_by_node: FxHashMap<DefinitionNodeKey, Definition<'db>>,

    /// Map from a standalone expression to its [`Expression`] ingredient.
    expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>,
    // [...]
}

// semantic_index/builder.rs
pub(super) struct SemanticIndexBuilder<'db> {
    // [...]
    scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>,
    definitions_by_node: FxHashMap<ExpressionNodeKey, Definition<'db>>,
    expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>,
}

// semantic_index/ast_ids.rs
pub(crate) struct AstIds {
    /// Maps expressions to their expression id.
    expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>,
    /// Maps expressions which "use" a symbol (that is, [`ast::ExprName`]) to a use id.
    uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>,
}

pub(super) struct AstIdsBuilder {
    expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>,
    uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>,
}
```

## Test Plan

Added two failing examples to the corpus.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants