Add ast_node_id field to all ast nodes #2818
Comments
cc @Boshen |
Let's say we are in the process of visiting an expression, Is there any way that can get that expression's For doing such a thing 2 things come to my mind.
Without any one of these, we have to basically intercept the visit at a much higher level than necessary to gain knowledge of the surrounding nodes. I believe that having |
It's a 👍 for adding a node id because rustc has them as well https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/node_id/struct.NodeId.html, so the arguments are not theoretical but practical. Since this requires a lot of effort, I think we need a few real world scenarios that can motivate this change, to decide whether we want to do this right now, or later when it's absolutely necessary. |
Well, let me try my best to explain a real-world situation in which this change can help make things easier. Let's say we want to transform a given node based on its parent. Right now this is achieved by monitoring all nodes and keeping a stack of them. I'm going to take the
impl<'a> VisitMut<'a> for ArrowFunctions<'a> {
fn enter_node(&mut self, kind: AstType) {
self.nodes.push(kind);
}
fn leave_node(&mut self, _kind: AstType) {
self.nodes.pop();
}
fn visit_jsx_identifier(&mut self, ident: &mut JSXIdentifier<'a>) {
let parent_kind = self.nodes.last().unwrap();
let parent_parent_kind = self.nodes[self.nodes.len() - 2];
// ...
walk_jsx_identifier_mut(self, ident);
}
}
If we go for this change, it can be simplified into something more akin to this:
impl<'a> VisitMut<'a> for ArrowFunctions<'a> {
fn visit_jsx_identifier(&mut self, ident: &mut JSXIdentifier<'a>) {
let parent = self.ctx.semantics().nodes().parent(ident.ast_node_id());
let parent_kind = parent.kind();
let parent_parent_kind = self.ctx.semantics().nodes().parent_kind(parent.ast_node_id());
// ...
walk_jsx_identifier_mut(self, ident);
}
}
Keep in mind that the example above only uses the parent's kind. If we want the actual parent node in |
Even in I guess it makes sense that Rust does it, since it needs complex analysis. If you are worried about having extra fields in the AST, we may be able to make it optional via features if we make the |
In some cases we can intercept the parent using a visit on a higher node. For example, instead of detecting a function parameter and then getting its parent function node, we can detect functions and search for a viable function parameter. I strongly believe that these cases should be written/ported from Babel in a manner that keeps the data flow local. We should use semantic information only if it is absolutely necessary; however, that doesn't mean we won't need it to implement some of these cases. Edit: In other situations, it may result in cleaner code, which could be favored over data locality. |
Just want to point out an interesting approach. If we implement the ast_node_id method via the
impl<'a> VisitMut<'a> for ArrowFunctions<'a> {
fn visit_jsx_identifier(&mut self, ident: &mut JSXIdentifier<'a>) {
let parent = self.ctx.semantics().nodes().parent(ident);
let parent_kind = parent.kind();
let parent_parent_kind = self.ctx.semantics().nodes().parent_kind(parent);
// ...
walk_jsx_identifier_mut(self, ident);
}
}
The |
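To make the idea of passing the node itself to the lookup more concrete, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration only: `AstNodeId`, `DemoNode`, `ParentTable`, and `parent_id` are hypothetical stand-ins and not the actual oxc API. It only shows how a lookup can accept anything that implements an `AstNode` trait instead of a raw ID.

```rust
/// Hypothetical ID type; the real `AstNodeId` in oxc may look different.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AstNodeId(pub usize);

/// The trait discussed in this thread: every AST node can report its own ID.
pub trait AstNode {
    fn ast_node_id(&self) -> AstNodeId;
}

/// Stand-in for a real AST node such as `JSXIdentifier`.
pub struct DemoNode {
    pub id: AstNodeId,
}

impl AstNode for DemoNode {
    fn ast_node_id(&self) -> AstNodeId {
        self.id
    }
}

/// Implementing the trait for the ID itself lets a lookup result be fed
/// straight back into another lookup.
impl AstNode for AstNodeId {
    fn ast_node_id(&self) -> AstNodeId {
        *self
    }
}

/// Hypothetical parent table: `parents[i]` is the parent of node `i`,
/// and `None` marks the root.
pub struct ParentTable {
    pub parents: Vec<Option<AstNodeId>>,
}

impl ParentTable {
    /// Accepts any node (or ID) instead of requiring `node.ast_node_id()`
    /// at every call site.
    pub fn parent_id(&self, node: &impl AstNode) -> Option<AstNodeId> {
        self.parents.get(node.ast_node_id().0).copied().flatten()
    }
}

/// Small usage example: node 0 is the root, 1 is its child, 2 is a child of 1.
pub fn demo() -> (Option<AstNodeId>, Option<AstNodeId>) {
    let table = ParentTable {
        parents: vec![None, Some(AstNodeId(0)), Some(AstNodeId(1))],
    };
    let ident = DemoNode { id: AstNodeId(2) };
    let parent = table.parent_id(&ident);
    let grandparent = parent.and_then(|p| table.parent_id(&p));
    (parent, grandparent)
}
```

In this sketch the call site never mentions the ID explicitly; whether that convenience is worth a generic parameter on every lookup is a design choice for the real implementation.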
Just to make it clear, this approach is necessary if we want to replace the node instead of mutating it in place. So I believe that it makes sense to default to this when appropriate, at least for the minifier and transformers! Linters have more leeway for having declarative implementations. |
How do we deal with enum types? The current approach has enum types in the parent pointing tree, so it's going to be impractical to migrate it (I think). |
Can you give an example of these enums? Several types are enums, but I can't see why it would conflict with them.
pub enum Declaration<'a> {
VariableDeclaration(Box<'a, VariableDeclaration<'a>>),
FunctionDeclaration(Box<'a, Function<'a>>),
ClassDeclaration(Box<'a, Class<'a>>),
UsingDeclaration(Box<'a, UsingDeclaration<'a>>),
TSTypeAliasDeclaration(Box<'a, TSTypeAliasDeclaration<'a>>),
TSInterfaceDeclaration(Box<'a, TSInterfaceDeclaration<'a>>),
TSEnumDeclaration(Box<'a, TSEnumDeclaration<'a>>),
TSModuleDeclaration(Box<'a, TSModuleDeclaration<'a>>),
TSImportEqualsDeclaration(Box<'a, TSImportEqualsDeclaration<'a>>),
}
As you can see, all of these enum variants contain a node at the end, so we can use the underlying node's ID. Here's how we can implement this trait for enums:
trait AstNode {
fn ast_node_id(&self) -> AstNodeId;
}
impl<'a> AstNode for Declaration<'a> {
fn ast_node_id(&self) -> AstNodeId {
match self {
// Variants must be path-qualified, otherwise the patterns do not resolve.
Self::VariableDeclaration(node) => node.ast_node_id(),
Self::FunctionDeclaration(node) => node.ast_node_id(),
Self::ClassDeclaration(node) => node.ast_node_id(),
Self::UsingDeclaration(node) => node.ast_node_id(),
// ....
}
}
} |
/// Array Expression Element
#[derive(Debug, Hash)]
#[cfg_attr(feature = "serialize", derive(Serialize))]
#[cfg_attr(feature = "serialize", serde(untagged))]
pub enum ArrayExpressionElement<'a> {
SpreadElement(Box<'a, SpreadElement<'a>>),
Expression(Expression<'a>),
/// Array hole for sparse arrays
/// <https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Trailing_commas#arrays>
// Serialize as `null`. `serialize_elision` in serialize.rs.
#[cfg_attr(
feature = "serialize",
serde(serialize_with = "ArrayExpressionElement::serialize_elision")
)]
Elision(Span),
}
The
Edit: We can also rewrite it as such:
Elision(Span, AstNodeId), |
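For illustration, here is a rough sketch of how an enum with a node-less variant could still satisfy the trait once `Elision` carries its own ID. The types are simplified assumptions: std `Box` replaces the arena `Box<'a, T>`, and `Span`, `SpreadElement`, and `Expression` are hypothetical minimal stand-ins rather than the real oxc definitions.

```rust
/// Hypothetical ID type, standing in for the proposed `AstNodeId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AstNodeId(pub usize);

/// Simplified stand-in for the real `Span`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: u32,
    pub end: u32,
}

pub trait AstNode {
    fn ast_node_id(&self) -> AstNodeId;
}

/// Minimal stand-ins for struct nodes that store their own ID.
pub struct SpreadElement {
    pub id: AstNodeId,
    pub span: Span,
}

pub struct Expression {
    pub id: AstNodeId,
    pub span: Span,
}

impl AstNode for SpreadElement {
    fn ast_node_id(&self) -> AstNodeId {
        self.id
    }
}

impl AstNode for Expression {
    fn ast_node_id(&self) -> AstNodeId {
        self.id
    }
}

pub enum ArrayExpressionElement {
    SpreadElement(Box<SpreadElement>),
    Expression(Expression),
    /// Array hole: there is no inner node, so the ID lives in the variant.
    Elision(Span, AstNodeId),
}

impl AstNode for ArrayExpressionElement {
    fn ast_node_id(&self) -> AstNodeId {
        match self {
            // Variants that wrap a node delegate to the wrapped node.
            Self::SpreadElement(node) => node.ast_node_id(),
            Self::Expression(node) => node.ast_node_id(),
            // The hole returns the ID stored next to its span.
            Self::Elision(_, id) => *id,
        }
    }
}
```

With this shape, `Elision` is the only variant that stores an ID directly; every other variant keeps a single source of truth in the wrapped node.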
Feel free to create another issue for this change. You may have missed the thread on Discord https://discord.com/channels/1079625926024900739/1222938985224081528 I'll create an umbrella issue for progress reports and a milestone to officially kick off the new transformer implementation. |
I think it is a part of this issue and just something we have to do in order to get this working. It won't have any concerning effect on the other parts. We almost always ignore elision elements; they are just there to fill a space in the array.
Yes, I saw this after commenting here. I don't have a habit of checking my Discord servers (I still have notifications for direct messages). I guess it's decided then. I'll start by creating a PR for this change. As soon as it gets to a working state I'll mention you guys for feedback. |
FYI I created the AST builder back in the day envisioning that we are going to add node ids ... I think this is the day. Good luck! |
@overlookmotel @Boshen I still think there is a use for this ID. If we have the node ID along with our nodes, we can traverse through things like scopes with no issue (both in linters and transformers). It makes rules like #2637 simpler (since we can rely on scope data without having to do a lot of iterations). Let me know what you think. Since we are making a lot of changes to the AST, maybe it is a good time to also sneak this in. |
I do not have a clear idea at present of best way to implement scope binding lookups in transformers. You may be right that an AST node ID is the best way. On the other hand, we may be able to find a more efficient way using the kind of "pointer tricks" that So... er... don't know! But I've made a comment on #2861 to not delete the branch in case we want to come back to this. Could you also please make sure you keep a local copy of the branch too, just in case? |
I'll fork the branch; Graphite might cause me to delete the local branch by accident. |
We should have a field similar to IdentifierReference's reference_id that would keep the AstNodeId of each AST node.
Can be written manually, using a declarative ast_node macro for definition or a procedural derive macro.
I guess we can also remove AstNode as it can be changed into SOA in the AstNodes structure.
If we remove the AstNode struct we can add a trait with the same name which would implement a function called ast_node_id that would give us the information for getting things that AstNode would've provided before. With this trait, we can use any ast node with the help of generics, since with the ast_node_id alone we can infer the node's kind, parents, scope, control flow, and anything that gets added down the road.
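As a rough illustration of the SOA idea, the sketch below stores each per-node property in its own vector indexed by the node ID. All names here (`AstNodes`, `AstKind`, `ScopeId`, `add`, `ancestors`) are simplified assumptions made for the example; the real structures in oxc carry far more data and may be organized differently.

```rust
/// Hypothetical ID types used as indices into the parallel vectors.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AstNodeId(pub usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ScopeId(pub usize);

/// Heavily simplified stand-in for the real node kind enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AstKind {
    Program,
    Function,
    Identifier,
}

/// Structure of arrays: one vector per property, all indexed by `AstNodeId`.
#[derive(Default)]
pub struct AstNodes {
    kinds: Vec<AstKind>,
    parents: Vec<Option<AstNodeId>>,
    scopes: Vec<ScopeId>,
}

impl AstNodes {
    /// Registers a node and returns the ID that the AST node would store.
    pub fn add(&mut self, kind: AstKind, parent: Option<AstNodeId>, scope: ScopeId) -> AstNodeId {
        let id = AstNodeId(self.kinds.len());
        self.kinds.push(kind);
        self.parents.push(parent);
        self.scopes.push(scope);
        id
    }

    pub fn kind(&self, id: AstNodeId) -> AstKind {
        self.kinds[id.0]
    }

    pub fn parent(&self, id: AstNodeId) -> Option<AstNodeId> {
        self.parents[id.0]
    }

    pub fn scope(&self, id: AstNodeId) -> ScopeId {
        self.scopes[id.0]
    }

    /// Walks from the direct parent up to the root.
    pub fn ancestors(&self, id: AstNodeId) -> impl Iterator<Item = AstNodeId> + '_ {
        std::iter::successors(self.parent(id), move |&p| self.parent(p))
    }
}
```

Under these assumptions, the ID alone is enough to recover a node's kind, parent chain, and scope, and new per-node data (for example control flow information) would be one more vector rather than one more field on every node.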