[red-knot] Double-store assertions when encountering invalid syntax #14313

sharkdp · 2024-11-13T10:57:55Z

During type inference on some invalid syntax examples, we encounter situations where the exact same expression appears in multiple places inside the AST. The smallest example I could find is this:

for

When parsing it, we get the following AST. Note that the same Name(…) expression (same range) appears both as target and as iter field on StmtFor, once with Store context, once with Invalid context:

Module(
    ModModule {
        range: 0..3,
        body: [
            For(
                StmtFor {
                    range: 0..3,
                    is_async: false,
                    target: Name(
                        ExprName {
                            range: 3..3,
                            id: Name(""),
                            ctx: Store,
                        },
                    ),
                    iter: Name(
                        ExprName {
                            range: 3..3,
                            id: Name(""),
                            ctx: Invalid,
                        },
                    ),
                    body: [],
                    orelse: [],
                },
            ),
        ],
    },
)

This leads to a subsequent failing double-store assertion in store_expression_type because we infer types for both the target and the iter expressions:

ruff/crates/red_knot_python_semantic/src/types/infer.rs

Lines 2176 to 2184 in f789b12

    
           fn store_expression_type( 
        
               &mut self, 
        
               expression: &impl HasScopedAstId<Id = ScopedExpressionId>, 
        
               ty: Type<'db>, 
        
           ) { 
        
               let expr_id = expression.scoped_ast_id(self.db, self.scope()); 
        
               let previous = self.types.expressions.insert(expr_id, ty); 
        
               assert_eq!(previous, None); 
        
           }

I'm not sure how to fix this.

Skipping storage when encountering expressions with Invalid context would work for this example, but there are other cases (:x=) where such an invalid expression is not "backed up" by a second valid expression.

Ignoring double-storage for Invalid expressions is also not an option, since in this example here, we run inference on iter first.

The text was updated successfully, but these errors were encountered:

MichaReiser · 2024-11-13T11:06:21Z

Hmm, this is interesting. We could consider using the ptr address instead of Range + Kind

sharkdp · 2024-11-13T11:54:32Z

A similar example is

x if $z

Mentioning it here because it leads to a slightly different panic (Calling self.infer_expression on a standalone-expression is not allowed…). But the underlying issue is the same, I believe.

## Summary Use the memory address to uniquely identify AST nodes, instead of relying on source range and kind. The latter fails for ASTs resulting from invalid syntax examples. See #14313 for details. Also results in a 1-2% speedup (https://codspeed.io/astral-sh/ruff/runs/67349cf55f36b36baa211360) closes #14313 ## Review Here are the places where we use `NodeKey` directly or indirectly (via `ExpressionNodeKey` or `DefinitionNodeKey`): ```rs // semantic_index.rs pub(crate) struct SemanticIndex<'db> { // [...] /// Map expressions to their corresponding scope. scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>, /// Map from a node creating a definition to its definition. definitions_by_node: FxHashMap<DefinitionNodeKey, Definition<'db>>, /// Map from a standalone expression to its [`Expression`] ingredient. expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>, // [...] } // semantic_index/builder.rs pub(super) struct SemanticIndexBuilder<'db> { // [...] scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>, definitions_by_node: FxHashMap<ExpressionNodeKey, Definition<'db>>, expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>, } // semantic_index/ast_ids.rs pub(crate) struct AstIds { /// Maps expressions to their expression id. expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>, /// Maps expressions which "use" a symbol (that is, [`ast::ExprName`]) to a use id. uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>, } pub(super) struct AstIdsBuilder { expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>, uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>, } ``` ## Test Plan Added two failing examples to the corpus.

sharkdp added bug Something isn't working red-knot Multi-file analysis & type inference labels Nov 13, 2024

sharkdp mentioned this issue Nov 13, 2024

[red-knot] Use memory address as AST node key #14317

Merged

sharkdp self-assigned this Nov 13, 2024

sharkdp closed this as completed in #14317 Nov 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[red-knot] Double-store assertions when encountering invalid syntax #14313

[red-knot] Double-store assertions when encountering invalid syntax #14313

sharkdp commented Nov 13, 2024 •

edited

Loading

MichaReiser commented Nov 13, 2024 •

edited

Loading

sharkdp commented Nov 13, 2024

[red-knot] Double-store assertions when encountering invalid syntax #14313

[red-knot] Double-store assertions when encountering invalid syntax #14313

Comments

sharkdp commented Nov 13, 2024 • edited Loading

MichaReiser commented Nov 13, 2024 • edited Loading

sharkdp commented Nov 13, 2024

sharkdp commented Nov 13, 2024 •

edited

Loading

MichaReiser commented Nov 13, 2024 •

edited

Loading