-
-
Notifications
You must be signed in to change notification settings - Fork 496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Codegen AST related code #3819
Comments
The AST also includes some types which are in (I've never been clear why these types are in |
No, let's not scope creep. There are other traits and functions that depend on these common operators. |
Are we able to manually inline |
Can you elaborate on it? Wouldn't we lose some information if we don't express the inheritance in the type system? While the actual code would behave the same, It might be useful for the user to see this as the definition: inherit_variants! {
/// Argument
///
/// Inherits variants from [`Expression`]. See [`ast` module docs] for explanation of inheritance.
///
/// [`ast` module docs]: `super`
#[visited_node]
#[repr(C, u8)]
#[derive(Debug, Hash)]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[cfg_attr(feature = "serialize", serde(untagged))]
pub enum Argument<'a> {
SpreadElement(Box<'a, SpreadElement<'a>>) = 64,
// `Expression` variants added here by `inherit_variants!` macro
@inherit Expression
}
} Instead of this: /// Argument
///
/// Inherits variants from [`Expression`]. See [`ast` module docs] for explanation of inheritance.
///
/// [`ast` module docs]: `super`
#[visited_node]
#[repr(C, u8)]
#[derive(Debug, Hash)]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[cfg_attr(feature = "serialize", serde(untagged))]
pub enum Argument<'a> {
SpreadElement(Box<'a, SpreadElement<'a>>) = 64,
BooleanLiteral(Box<'a, BooleanLiteral>) = 0,
NullLiteral(Box<'a, NullLiteral>) = 1,
NumericLiteral(Box<'a, NumericLiteral<'a>>) = 2,
BigIntLiteral(Box<'a, BigIntLiteral<'a>>) = 3,
RegExpLiteral(Box<'a, RegExpLiteral<'a>>) = 4,
StringLiteral(Box<'a, StringLiteral<'a>>) = 5,
TemplateLiteral(Box<'a, TemplateLiteral<'a>>) = 6,
Identifier(Box<'a, IdentifierReference<'a>>) = 7,
MetaProperty(Box<'a, MetaProperty<'a>>) = 8,
Super(Box<'a, Super>) = 9,
ArrayExpression(Box<'a, ArrayExpression<'a>>) = 10,
ArrowFunctionExpression(Box<'a, ArrowFunctionExpression<'a>>) = 11,
AssignmentExpression(Box<'a, AssignmentExpression<'a>>) = 12,
AwaitExpression(Box<'a, AwaitExpression<'a>>) = 13,
BinaryExpression(Box<'a, BinaryExpression<'a>>) = 14,
CallExpression(Box<'a, CallExpression<'a>>) = 15,
ChainExpression(Box<'a, ChainExpression<'a>>) = 16,
// .
// .
// .
} On another note, if we get the macro caching to work (related to this) we can have this without any compilation penalty which I think is much superior(And I guess @overlookmotel would agree). /// Argument
///
/// Inherits variants from [`Expression`]. See [`ast` module docs] for explanation of inheritance.
///
/// [`ast` module docs]: `super`
#[inheritance]
#[visited_node]
#[repr(C, u8)]
#[derive(Debug, Hash)]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[cfg_attr(feature = "serialize", serde(untagged))]
pub enum Argument<'a> {
SpreadElement(Box<'a, SpreadElement<'a>>) = 64,
#[inherit] // or maybe `#[inline]` since it is more rust-like?
Expression(Expression<'a>),
} |
If we want to have the inheritance expanded we should do the expansion with the codegen to keep it consistent and schema close to the ECMAspec. But IMHO, manually inlining the variants is going to be a maintenance nightmare. because they would easily drift apart with some time and careless reviewing(which always happens no matter how many people review the same piece of code). |
@Boshen @overlookmotel I think we should remove this one from our milestones and here's why. The performance gain from pre-expanding builtin derives is going to chop off a handful of seconds from our build time and even that is an optimistic estimate. It might at first feel unintuitive but let me explain myself. Syntax ExtensionsThe macro calls(including the macro_rules itself), attributes, and derives are among syntax extension kinds, A syntax extension is the last form of a macro(usually) before turning into the desired AST. If you look at the implementation of SyntaxExtensionKind you'd find that there are 2 kinds of any macro kind, There are 2 sets What's the difference?In a nutshell, legacy kinds have What happens if they phase out the legacy kinds?Well again it might seem intuitive that with this change a pre-expand can improve the build time by a good margin but the truth is that for dead simple macros such as these 2 if written as a procedural macro you are going to mostly wait on the initial build of the macro crate and then spend a good amount of time on the overhead of the proc_macro_server, So if they were to phase out the legacy expanders they still can implement the MultiItemModifier differently to short-circuit to the builtin implementation avoiding all of the shortcomings of normal procedural macros. That's exactly why declarative macros are fast to build. ConclusionI guess my conclusion is in the form of a question. Linkshttps://github.com/rust-lang/rust/blob/32e692681e457b06efedd05c5a4be0d718f39292/compiler/rustc_builtin_macros/src/derive.rs#L1 |
You make a strong case for why performance wouldn't be a good reason to codegen these 2 traits. Thanks for all the background info. I was completely unaware of the legacy macro kinds, and Rust's macro machinery in general. Really interesting. But, just to explain, my rationale for putting these 2 on the list wasn't for perf reasons. The reasons were:
|
impl Hash for Span { | |
fn hash<H: Hasher>(&self, _state: &mut H) { | |
// hash to nothing so all ast spans can be comparable with hash | |
} | |
} |
I saw this as a bit of a hack, and thought it would be more "proper" solution to implement Hash
on all AST types with the span
field explicitly skipped. That would be laborious and error-prone to do by hand, but much easier with codegen.
PartialEq
The reason why we are implementing Hash
on AST types is for (if I remember right) comparing expressions in a lint rule to identify pointless if
branches e.g. if (x) {} else if (x) { /* unreachable */ }
. But actually Hash
is not ideal for this use - PartialEq
would be more efficient.
But it's hard to implement PartialEq
because, again, we need to skip span
fields, and the no-op hash hack isn't viable here, so we can't use #[derive(PartialEq)]
.
So I thought we could do that with codegen.
Conclusion
I'm not at all arguing against what you've said @rzvxa. I just wanted to explain that there was another angle.
Thanks for the explanation, Now I see the full picture for this one. |
We can defer the implementation of As for |
Consider this is done, we can migrate |
Background
For keeping compilation speed down for ourselves and our downstream crate users, I have avoided using macros to generate boilerplate code at all cost.
But due to the amount of AST related supporting code has expanded over the last few months, and our desire to experiment with different ideas with the AST, our current approach of manually updating everything is no longer functional.
The goal of this task is to provide code generation for all AST node related code.
Requirements
crates/oxc_ast
should be the source of truthWorkflow
#[visited_node]
struct ASTModel { name: String, attributes: Vec<Attributes> }
Code to generate
ASTKind
(Milestone 1)GetSpan
(Milestone 2)Hash
,PartialEq
implementations (Milestone 3)ASTBuilder
(Milestone 4)Visit
(Milestone 5)VisitMut
(Milestone 5)Traverse
, remove current js code (Milestone 6)typescript definition for napi and wasmmoving the scope to AST transfer.Every milestone should be a stack of PRs, since they will involve changing other files as well.
The text was updated successfully, but these errors were encountered: