Separate syntactic and semantic nodes?

I have been thinking about this for a year or so, and it became more apparent while working on the type alias refactoring (and planning the module refactoring). Two different things are mixed together in the mypy code: syntactic nodes and "semantic" nodes. This is most obvious in `fixup.py`, where we use the `NodeVisitor` interface (intended for traversing the syntactic tree) to patch symbol tables (which carry purely semantic info). As a result, there are various weird things like `visit_var`, `visit_type_info`, etc. The idea is to have a clear separation between syntactic and semantic nodes, with separate visitors for each. We already have this partially: `TypeInfo` vs `ClassDef`, and `Var` vs `AssignmentStmt`.

Here is a _short (and approximate)_ summary of the proposal:
* The semantic nodes are: `Var`, `Function`, `Class` (non-leaf, has symbol table), `Module` (non-leaf, ditto), `TypeVar`, `TypeAlias`, `ConditionalNode` (see below).
* The above nodes inherit from `SymbolNode`, while syntactic nodes inherit from `Node`; both of these inherit from `Context`.
* Only the above nodes (and types) are serialized in cache and deserialized in incremental runs.
* Semantic nodes can have attributes that point to the relevant (defining) syntactic nodes (for example, a variable can point to its defining assignment, or to a function statement if it is a property), but we should limit this to minimize cache size.
* The above nodes will have their own visitor, so that in total we have three visitors: for the AST, for symbol tables, and for types.
* `ConditionalNode` exists to avoid having `.nodes` instead of `.node` in `SymbolTableNode`s (the latter are thin wrappers around semantic nodes), while still supporting certain conditional definitions like conditional imports. This node will have an attribute that is a list of other semantic nodes.
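
To make the list above more concrete, here is a rough Python sketch of how the hierarchy and the symbol-table visitor could fit together. This is not mypy's actual code: the class names come from the proposal, but `accept`, `SymbolVisitor`, `NameCollector`, and the `names` and `alternatives` attributes are hypothetical names picked for illustration, and `TypeVar` and `TypeAlias` are left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Context:
    """Source position shared by syntactic and semantic nodes."""
    line: int = -1
    column: int = -1


@dataclass
class Node(Context):
    """Base for syntactic (AST) nodes; not serialized in this sketch."""


@dataclass
class SymbolNode(Context):
    """Base for semantic nodes; only these would go into the cache."""
    name: str = ""

    def accept(self, visitor: SymbolVisitor) -> None:  # hypothetical API
        raise NotImplementedError


@dataclass
class SymbolTableNode:
    """Thin wrapper holding exactly one semantic node (`.node`, not `.nodes`)."""
    node: SymbolNode


@dataclass
class Var(SymbolNode):
    def accept(self, visitor: SymbolVisitor) -> None:
        visitor.visit_var(self)


@dataclass
class Module(SymbolNode):
    # Non-leaf node: owns a symbol table ("names" is an assumed attribute name).
    names: dict[str, SymbolTableNode] = field(default_factory=dict)

    def accept(self, visitor: SymbolVisitor) -> None:
        visitor.visit_module(self)


@dataclass
class ConditionalNode(SymbolNode):
    # All possible definitions of one name, e.g. from a conditional import.
    alternatives: list[SymbolNode] = field(default_factory=list)

    def accept(self, visitor: SymbolVisitor) -> None:
        visitor.visit_conditional(self)


class SymbolVisitor:
    """Visitor for symbol tables only, separate from AST and type visitors."""

    def visit_var(self, node: Var) -> None:
        pass

    def visit_module(self, node: Module) -> None:
        for entry in node.names.values():
            entry.node.accept(self)

    def visit_conditional(self, node: ConditionalNode) -> None:
        for alt in node.alternatives:
            alt.accept(self)


class NameCollector(SymbolVisitor):
    """Example visitor: records every semantic node it reaches."""

    def __init__(self) -> None:
        self.seen: list[tuple[str, str]] = []

    def visit_var(self, node: Var) -> None:
        self.seen.append(("Var", node.name))

    def visit_module(self, node: Module) -> None:
        self.seen.append(("Module", node.name))
        super().visit_module(node)


# A module "m" with a variable and a conditionally imported module.
json_impl = ConditionalNode(
    name="json", alternatives=[Module(name="ujson"), Module(name="json")]
)
mod = Module(
    name="m",
    names={
        "x": SymbolTableNode(Var(name="x")),
        "json": SymbolTableNode(json_impl),
    },
)
collector = NameCollector()
mod.accept(collector)
```

With this shape, something like `fixup.py` would subclass the symbol visitor rather than `NodeVisitor`, and the conditional import stays behind a single `.node` entry in the symbol table.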

This is a very large refactoring, but IMO it will add robustness and clarity, and will simplify the addition of new features (e.g. conditional imports). This is probably a low-priority (long-term) thing. We just discussed this with @JukkaL; he likes the idea but is concerned about the size of the refactoring. A possible way forward is to split it into several separate PRs.

Separate syntactic and semantic nodes? #5159
