Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: codegen AST related codes #4134

Closed
Boshen opened this issue Jun 20, 2024 · 45 comments
Closed

RFC: codegen AST related codes #4134

Boshen opened this issue Jun 20, 2024 · 45 comments

Comments

@Boshen
Copy link
Member

Boshen commented Jun 20, 2024

Requirements

  • no build.rs published to the user
  • all generated code are checked into git
  • no nightly
  • Rust code is source of truth, need to read parse #[visited_node]

Workflow

  • a user changes the oxc_crate
  • a watch change picks it up
  • parse all #[visited_node]
  • saves them into a model, e.g. struct ASTModel { name: String, attributes: Vec<Attributes> }
  • generate code and save file

Code to generate

  • ASTKind (Milestone 1)
  • GetSpan (Milestone 2)
  • Hash, PartialEq implementations (Milestone 3)
  • ASTBuilder (Milestone 4)
  • Visit (Milestone 5)
  • VisitMut (Milestone 5)
  • Traverse, remove current js code (Milestone 6)
  • typescript definition for napi and wasm moving the scope to AST transfer.

Every milestone should be a stack of PRs, since they will involve changing other files as well.


Things we need to discuss:

  • Rust or JavaScript based code generation? traverse currently uses JavaScript, I think it's fine, but does it scale to supporting these many code patterns?

update: The consensus is stay with our current approach and parse #[visited_node] as the source of truth.

Topic not part of this RFC:

  • Change or reduce AST visit patterns
@Boshen
Copy link
Member Author

Boshen commented Jun 20, 2024

@overlookmotel @rzvxa @Dunqing

I think we can start prototyping once we decide on the generation method.

I vote for all Rust, but I'm not sure about its implications or downsides.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

👍for all Rust, Now that we know our requirements and are done with prototyping we could leverage syn for our code generation.

It is harder to iterate on but we won't need to parse and generate rust code ourselves, We can treat it like a global macro if you are willing to use nightly for codegen, We quite literally can use !#[code_gen] at the top of the AST definition file, macro expand it for the build and call it a day.

If we want to break our ast definition into separate files or don't use nightly at all; We can manually parse the string using syn and vice versa.

Nice to have: we might want to generate the AST itself from our definition file, So it is easy to do AST-wide refactors in the future.

@Boshen
Copy link
Member Author

Boshen commented Jun 20, 2024

Good catch, we should move all ast implementations to another file, so we can parse ast nodes easily.

Or maybe we should scan the file using syn, and extract everything marked as #[visited_node].

On nightly: hard no, sorry.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

Good catch, we should move all ast implementations to another file, so we can parse ast nodes easily.

Or maybe we should scan the file using syn, and extract everything marked as #[visited_node].

I'm more in favor of having the definition file(s) for ast containing only the struct/enum and its attribute/derives, With this we are not only simplifying the build process(since any item in the global scope of definition file is an AST type) but also provide a way to add to the codegen, For example, if you remember for ast_node_id I wanted to add an ast_node attribute that would add the id and implement its trait, Which slows down the build process. But if we can easily add such things as a part of the code gen process we get the change fast and free (in terms of build times).
But if we only process the types and use it to generate the visitors/traversals we won't be able to change ast types themselves.
On the other hand, if we read these structs and modify them before generating the actual AST types we get to do magical things such as reducing all inherit macros in the generated code, It would be a much better experience if the actual type is there, Both in terms of LSP and documentation, Not to mention easier to understand if the resulting code is there instead of an inherit.

@Boshen
Copy link
Member Author

Boshen commented Jun 20, 2024

I wanted to follow how React Compiler does this

but @overlookmotel prefers writing it in Rust because it feels more natural.

I don't mind either.

Also good catch on wanting to add common attributes to the nodes.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

I think in the end both the rust syntax and these JSON files are the same thing. Having it as rust syntax and parsing it to get our serialized data is more steps for sure but I have to agree with @overlookmotel, It is much easier to understand and reason about our schema this way, Unless we want to share these schemas with other projects I think having ast_def.rs is > than ast_def.json.

@overlookmotel
Copy link
Contributor

Rust or JS?

I vote Rust too. I only wrote Traverse codegen in JS to be able to bang it out quickly. It's a hack, and brittle, as it doesn't parse the Rust code properly - it's basically using regexes - so it could easily break. My intent was always to "do it properly" and rewrite in Rust using syn etc, once the impl was stable.

Design

I propose we do this in 2 parts:

  1. A build script to build a schema of AST types.
  2. Other scripts to generate code from that schema.

The schema would be generated by:

  • Parsing all relevant .rs files (NB: includes files in oxc_span and oxc_syntax, not just oxc_ast).
  • Generating a StructDefinition / EnumDefinition for each type.
  • Linking up the types (i.e. resolve Bar in struct Foo { bar: Bar } to enum Bar defined elsewhere).

Schema types could be something like https://github.com/overlookmotel/layout_inspect/blob/master/layout_inspect/src/defs.rs
(we don't need align and size at present - we can miss those parts out)

This schema can then be output as both:

  1. JSON
  2. Rust types data serialized with rkyv which can be loaded into memory as a static (see Pre-compile data as binary blobs oxc-browserslist#23)

Why?

2 reasons:

Firstly, AST transfer needs a schema just like this - so we can kill 2 birds with 1 stone.

Secondly, the schema can also be used in proc macros. Fixing #4297 requires replacing the dodgy inherit_variants! macro with a proc macro. Problem is that proc macros only have access to the section of code they're applied to, but inherited enums need to know what variants other enums have. The #[inherit_variants] proc macro can get that info from the schema.

(see "Option 3" in #4297)

We don't need to implement rkyv serialization initially. We can just use JSON to serialize the schema to start with. rkyv is an optimization to avoid having serde and serde_json as dependencies of the macro crate, which will massively reduce the compile time impact of those macros.

Generating AST types

Personally I am anti generating the AST types themselves. I think we should write the AST types in Rust (as at present) and then generate everything around them.

I laid out my reasoning for this in #4297 (at length, sorry probably way too much length).

When we want to alter the AST types themselves, we can use proc macros. This does have a downside in terms of compile time, but the rkyv-serialized schema thing is designed to reduce that impact to a minimal level.

These are just my thoughts... feel free to disagree!

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

Generating AST types

Personally I am anti generating the AST types themselves. I think we should write the AST types in Rust (as at present) and then generate everything around them.

I laid out my reasoning for this in #4297 (at length, sorry probably way too much length).

When we want to alter the AST types themselves, we can use proc macros. This does have a downside in terms of compile time, but the rkyv-serialized schema thing is designed to reduce that impact to a minimal level.

These are just my thoughts... feel free to disagree!

What if we have definitions (schema) written as rust struct/enums? Then if we even need to output a JSON schema we can create it from the syntax data instead of json information.

Look at this and tell me it isn't possible to write all of this in rust https://github.com/facebook/react/blob/main/compiler/crates/react_estree_codegen/src/ecmascript.json.

What are the differences between these 2 pieces of string other than deserialization method?

        "StringLiteral": {
            "fields": {
                "value": {
                    "type": "String",
                    "hermes_convert_with": "convert_string_value"
                }
            }
        },
struct StringLiteral {
    #[hermes_convert_with("convert_string_value")]
    value: String,
}

My point is that we can have our cake and eat it too, We can write definitions in the rust itself, yet still use this information and build upon it to generate the actual AST types.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

Example usecase:
codegen/ast_def.rs

#[magic_attr]
struct StringLiteral<'a> {
    #[hermes_convert_with("convert_string_value")]
    pub value: Atom<'a>,
}

ast/ast.rs

#[hermes_stuff]
struct StringLiteral<'a> {
    #[hermes_convert_with("convert_string_value")]
    pub value: Atom<'a>,
    magical_field: MagicType<'a>,
}

impl<'a, 'm> BlackMagic<'a, 'm>for StringLiteral<'a> {
    fn cast(&'m self) -> MagicMisile<'a, 'm> {
        self.magical_field.cast(MagicKind::Black).into()
    }
}

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

AST definition file should have its own prelude so we are confined to the use of known types in our ast definitions(happens in the proc macro/codegen). This way we can disallow the use statements or path qualifiers. With this imposed limitation on our ast definition we no longer have to resolve and link types, We just use a hashmap since they are all known beforehand. Ast uses other AST types, Box, Vec, Atom, str, and a handful of other types but there is no random type there(as it should be).

@overlookmotel
Copy link
Contributor

overlookmotel commented Jun 20, 2024

What are the differences between these 2 pieces of string other than deserialization method?

Agreed. No difference. Except... then we have 2 .rs files which both look like they are the AST definitions file. Which do you edit? As I said under option "2c" on #4297:

However, there's a major DX problem. IDEs' "jump to definition" would take you to the built folder, not src. I am sure people would end up editing the files in built and then be surprised when all the code they wrote is lost when the build script re-runs and overwrites these files.

One of our goals is for the codebase to be accessible to new contributors. The AST is a central pillar of Oxc, probably the first place you go when you start exploring the codebase. So I think it's particularly important that the AST "just makes sense". Entering a world of confusion on your very first step could be a massive turn-off.

To summarize: as I understand it, the major pluses and minuses are:

  • Generate AST from schema: Minimize compile time as work happens in build script, rather than in macros.
  • Write AST as Rust types and alter them with proc macros as required (as now): Less magic to understand.

So, my view is:

  1. In this particular case, increased approachability of codebase is more important than compile time.
  2. By avoiding using serde in macros, we can minimize the compile time impact.

But... I've said my piece now. It's a preference rather than a strongly held view - I can see both sides of the argument. If you both feel strongly that you prefer to codegen AST, please feel free to overrule me. I won't be upset!

Only thing I ask is please do try to go through my thinking in #4297, and try to follow my reasoning. If you get my arguments, but don't agree with them, no problem. But please do consider them.

MagicKind::Black

🤣🤣🤣 Love it!

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

One of our goals is for the codebase to be accessible to new contributors. The AST is a central pillar of Oxc, probably the first place you go when you start exploring the codebase. So I think it's particularly important that the AST "just makes sense". Entering a world of confusion on your very first step could be a massive turn-off.

It is ironic that it is my exact point for going in the opposite direction😆
I feel since AST doesn't change often after we hit stable release, it wouldn't result in confusion it is almost impossible for a first-time contributor to have to change it if it is all up to spec, All of our traits and impls are in a separate file that isn't generated. What we are generating with the AST are additions that would happen to the structures such as inherits and adding fields or macros.

I think since these are the actual things existing on the type, jumping to the generated definition is self-documenting because of no magic code, basically, what you see is what you get.

so I can see oxc_ast with this structure(I think the generated directory communicates well enough that these files shouldn't be edited directly, And if it isn't enough we can always add a giant docs comment at the top of the file stating this fact):

oxc_ast/
|_src/
   |_______generated/
                  |_______ast.rs(or js.rs,ts.rs,...)
                  |_______also contains visit/VisitMut,Visit and traverse/Traverse, etc.
   |_______ast_impl.rs
   |_______lib.rs
   |_______....
oxc_ast_def(or maybe oxc_ast_gen)/
|_src/
   |______ast_def.rs(or js_def.rs,ts_def.rs,...)

Then we manually export these generated codes in our lib.rs.

@overlookmotel
Copy link
Contributor

I think for you to make the call on this @Boshen!

@Boshen
Copy link
Member Author

Boshen commented Jun 20, 2024

Haha, can we start somewhere small?

I propose:

Gather the AST nodes somehow (we don't need the attributes yet),

and layout the backbone of codegen for

  • ASTKind
  • GetSpan
  • Visit
  • VisitMut
  • Traverse
  • ASTBuilder
  • typescript definition for napi and wasm

once this is in place, we can experiment with different ast definition strategies and pick the one that fits my requirements.

If parsing #[visit_node] works for the time being, then we can stick with it and iterate later.

The pipeline is

Definition -> Model in Rust -> an easy mechanism for writing out boilerplates -> code.

The first milestone will just be the backbone of the [Model in Rust -> an easy mechanism for writing out boilerplates -> code] part.

@overlookmotel
Copy link
Contributor

I'm going to go for a walk and mull this over. Give me an hour and I'll come back with some comments.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

To back my point, This is a common pattern in our current AST definition:

inherit_variants! {
/// Expression
///
/// Inherits variants from [`MemberExpression`]. See [`ast` module docs] for explanation of inheritance.
///
/// [`ast` module docs]: `super`
#[visited_node]
#[repr(C, u8)]
#[derive(Debug, Hash)]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[cfg_attr(feature = "serialize", serde(untagged))]
pub enum Expression<'a> {
    BooleanLiteral(Box<'a, BooleanLiteral>) = 0,
    NullLiteral(Box<'a, NullLiteral>) = 1,
    NumericLiteral(Box<'a, NumericLiteral<'a>>) = 2,
    BigintLiteral(Box<'a, BigIntLiteral<'a>>) = 3,
    RegExpLiteral(Box<'a, RegExpLiteral<'a>>) = 4,
    StringLiteral(Box<'a, StringLiteral<'a>>) = 5,
    TemplateLiteral(Box<'a, TemplateLiteral<'a>>) = 6,

    Identifier(Box<'a, IdentifierReference<'a>>) = 7,

    MetaProperty(Box<'a, MetaProperty<'a>>) = 8,
    Super(Box<'a, Super>) = 9,

    ArrayExpression(Box<'a, ArrayExpression<'a>>) = 10,
    ArrowFunctionExpression(Box<'a, ArrowFunctionExpression<'a>>) = 11,
    AssignmentExpression(Box<'a, AssignmentExpression<'a>>) = 12,
    AwaitExpression(Box<'a, AwaitExpression<'a>>) = 13,
    BinaryExpression(Box<'a, BinaryExpression<'a>>) = 14,
    CallExpression(Box<'a, CallExpression<'a>>) = 15,
    ChainExpression(Box<'a, ChainExpression<'a>>) = 16,
    ClassExpression(Box<'a, Class<'a>>) = 17,
    ConditionalExpression(Box<'a, ConditionalExpression<'a>>) = 18,
    FunctionExpression(Box<'a, Function<'a>>) = 19,
    ImportExpression(Box<'a, ImportExpression<'a>>) = 20,
    LogicalExpression(Box<'a, LogicalExpression<'a>>) = 21,
    NewExpression(Box<'a, NewExpression<'a>>) = 22,
    ObjectExpression(Box<'a, ObjectExpression<'a>>) = 23,
    ParenthesizedExpression(Box<'a, ParenthesizedExpression<'a>>) = 24,
    SequenceExpression(Box<'a, SequenceExpression<'a>>) = 25,
    TaggedTemplateExpression(Box<'a, TaggedTemplateExpression<'a>>) = 26,
    ThisExpression(Box<'a, ThisExpression>) = 27,
    UnaryExpression(Box<'a, UnaryExpression<'a>>) = 28,
    UpdateExpression(Box<'a, UpdateExpression<'a>>) = 29,
    YieldExpression(Box<'a, YieldExpression<'a>>) = 30,
    PrivateInExpression(Box<'a, PrivateInExpression<'a>>) = 31,

    JSXElement(Box<'a, JSXElement<'a>>) = 32,
    JSXFragment(Box<'a, JSXFragment<'a>>) = 33,

    TSAsExpression(Box<'a, TSAsExpression<'a>>) = 34,
    TSSatisfiesExpression(Box<'a, TSSatisfiesExpression<'a>>) = 35,
    TSTypeAssertion(Box<'a, TSTypeAssertion<'a>>) = 36,
    TSNonNullExpression(Box<'a, TSNonNullExpression<'a>>) = 37,
    TSInstantiationExpression(Box<'a, TSInstantiationExpression<'a>>) = 38,

    // `MemberExpression` variants added here by `inherit_variants!` macro
    @inherit MemberExpression
}
}

It is only this way because it isn't feasible to copy all of these manually and keep track of them when one changes.

So with code-gening the AST we still have this syntax for our definitions to keep it consistent,

/// Expression
///
/// Inherits variants from [`MemberExpression`]. See [`ast` module docs] for explanation of inheritance.
///
/// [`ast` module docs]: `super`
#[visited_node]
#[repr(C, u8)]
#[derive(Debug, Hash)]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[cfg_attr(feature = "serialize", serde(untagged))]
pub enum Expression<'a> {
    BooleanLiteral(Box<'a, BooleanLiteral>) = 0,
    NullLiteral(Box<'a, NullLiteral>) = 1,
    NumericLiteral(Box<'a, NumericLiteral<'a>>) = 2,
    BigintLiteral(Box<'a, BigIntLiteral<'a>>) = 3,
    RegExpLiteral(Box<'a, RegExpLiteral<'a>>) = 4,
    StringLiteral(Box<'a, StringLiteral<'a>>) = 5,
    TemplateLiteral(Box<'a, TemplateLiteral<'a>>) = 6,

    Identifier(Box<'a, IdentifierReference<'a>>) = 7,

    MetaProperty(Box<'a, MetaProperty<'a>>) = 8,
    Super(Box<'a, Super>) = 9,

    ArrayExpression(Box<'a, ArrayExpression<'a>>) = 10,
    ArrowFunctionExpression(Box<'a, ArrowFunctionExpression<'a>>) = 11,
    AssignmentExpression(Box<'a, AssignmentExpression<'a>>) = 12,
    AwaitExpression(Box<'a, AwaitExpression<'a>>) = 13,
    BinaryExpression(Box<'a, BinaryExpression<'a>>) = 14,
    CallExpression(Box<'a, CallExpression<'a>>) = 15,
    ChainExpression(Box<'a, ChainExpression<'a>>) = 16,
    ClassExpression(Box<'a, Class<'a>>) = 17,
    ConditionalExpression(Box<'a, ConditionalExpression<'a>>) = 18,
    FunctionExpression(Box<'a, Function<'a>>) = 19,
    ImportExpression(Box<'a, ImportExpression<'a>>) = 20,
    LogicalExpression(Box<'a, LogicalExpression<'a>>) = 21,
    NewExpression(Box<'a, NewExpression<'a>>) = 22,
    ObjectExpression(Box<'a, ObjectExpression<'a>>) = 23,
    ParenthesizedExpression(Box<'a, ParenthesizedExpression<'a>>) = 24,
    SequenceExpression(Box<'a, SequenceExpression<'a>>) = 25,
    TaggedTemplateExpression(Box<'a, TaggedTemplateExpression<'a>>) = 26,
    ThisExpression(Box<'a, ThisExpression>) = 27,
    UnaryExpression(Box<'a, UnaryExpression<'a>>) = 28,
    UpdateExpression(Box<'a, UpdateExpression<'a>>) = 29,
    YieldExpression(Box<'a, YieldExpression<'a>>) = 30,
    PrivateInExpression(Box<'a, PrivateInExpression<'a>>) = 31,

    JSXElement(Box<'a, JSXElement<'a>>) = 32,
    JSXFragment(Box<'a, JSXFragment<'a>>) = 33,

    TSAsExpression(Box<'a, TSAsExpression<'a>>) = 34,
    TSSatisfiesExpression(Box<'a, TSSatisfiesExpression<'a>>) = 35,
    TSTypeAssertion(Box<'a, TSTypeAssertion<'a>>) = 36,
    TSNonNullExpression(Box<'a, TSNonNullExpression<'a>>) = 37,
    TSInstantiationExpression(Box<'a, TSInstantiationExpression<'a>>) = 38,

    @inherit MemberExpression
}

But we give the end user this well-documented code to read:

/// Expression
///
/// Inherits variants from [`MemberExpression`]. See [`ast` module docs] for explanation of inheritance.
///
/// [`ast` module docs]: `super`
#[visited_node]
#[repr(C, u8)]
#[derive(Debug, Hash)]
#[cfg_attr(feature = "serialize", derive(Serialize, Tsify))]
#[cfg_attr(feature = "serialize", serde(untagged))]
pub enum Expression<'a> {
    BooleanLiteral(Box<'a, BooleanLiteral>) = 0,
    NullLiteral(Box<'a, NullLiteral>) = 1,
    NumericLiteral(Box<'a, NumericLiteral<'a>>) = 2,
    BigintLiteral(Box<'a, BigIntLiteral<'a>>) = 3,
    RegExpLiteral(Box<'a, RegExpLiteral<'a>>) = 4,
    StringLiteral(Box<'a, StringLiteral<'a>>) = 5,
    TemplateLiteral(Box<'a, TemplateLiteral<'a>>) = 6,

    Identifier(Box<'a, IdentifierReference<'a>>) = 7,

    MetaProperty(Box<'a, MetaProperty<'a>>) = 8,
    Super(Box<'a, Super>) = 9,

    ArrayExpression(Box<'a, ArrayExpression<'a>>) = 10,
    ArrowFunctionExpression(Box<'a, ArrowFunctionExpression<'a>>) = 11,
    AssignmentExpression(Box<'a, AssignmentExpression<'a>>) = 12,
    AwaitExpression(Box<'a, AwaitExpression<'a>>) = 13,
    BinaryExpression(Box<'a, BinaryExpression<'a>>) = 14,
    CallExpression(Box<'a, CallExpression<'a>>) = 15,
    ChainExpression(Box<'a, ChainExpression<'a>>) = 16,
    ClassExpression(Box<'a, Class<'a>>) = 17,
    ConditionalExpression(Box<'a, ConditionalExpression<'a>>) = 18,
    FunctionExpression(Box<'a, Function<'a>>) = 19,
    ImportExpression(Box<'a, ImportExpression<'a>>) = 20,
    LogicalExpression(Box<'a, LogicalExpression<'a>>) = 21,
    NewExpression(Box<'a, NewExpression<'a>>) = 22,
    ObjectExpression(Box<'a, ObjectExpression<'a>>) = 23,
    ParenthesizedExpression(Box<'a, ParenthesizedExpression<'a>>) = 24,
    SequenceExpression(Box<'a, SequenceExpression<'a>>) = 25,
    TaggedTemplateExpression(Box<'a, TaggedTemplateExpression<'a>>) = 26,
    ThisExpression(Box<'a, ThisExpression>) = 27,
    UnaryExpression(Box<'a, UnaryExpression<'a>>) = 28,
    UpdateExpression(Box<'a, UpdateExpression<'a>>) = 29,
    YieldExpression(Box<'a, YieldExpression<'a>>) = 30,
    PrivateInExpression(Box<'a, PrivateInExpression<'a>>) = 31,

    JSXElement(Box<'a, JSXElement<'a>>) = 32,
    JSXFragment(Box<'a, JSXFragment<'a>>) = 33,

    TSAsExpression(Box<'a, TSAsExpression<'a>>) = 34,
    TSSatisfiesExpression(Box<'a, TSSatisfiesExpression<'a>>) = 35,
    TSTypeAssertion(Box<'a, TSTypeAssertion<'a>>) = 36,
    TSNonNullExpression(Box<'a, TSNonNullExpression<'a>>) = 37,
    TSInstantiationExpression(Box<'a, TSInstantiationExpression<'a>>) = 38,

    // begin inherit from `MemberExpression` (if you know what inherit means in this context good for you, Otherwise you still get all the information without needing to jump 2 or more times first to the inherit_variants and then to the MemberExpression to understand it.)

    /// `MemberExpression[?Yield, ?Await] [ Expression[+In, ?Yield, ?Await] ]`
    ComputedMemberExpression(Box<'a, ComputedMemberExpression<'a>>) = 48,
    /// `MemberExpression[?Yield, ?Await] . IdentifierName`
    StaticMemberExpression(Box<'a, StaticMemberExpression<'a>>) = 49,
    /// `MemberExpression[?Yield, ?Await] . PrivateIdentifier`
    PrivateFieldExpression(Box<'a, PrivateFieldExpression<'a>>) = 50,

    // end inherit from `MemberExpression`
}

We would also generate the relevant match macros and those are also there to read(without nested declarative macro calls which can confuse newcomers) and maybe even omit the discriminates(in the definition file) since now we can calculate them at comptime (but having them explicitly defined is better in my opinion).

I can't help but find a generated file more understandable than how @inherit and inherit_variants works.

We can call our definition file ast.d.rs to further communicate that it isn't a normal rust file(and there are things like @inherit available).

If we don't want to have magical @inherit we can use this syntax in our definition:

#[inheritance]
enum E {
   A(A),
   #[inherit]
   B(B),
}

@overlookmotel
Copy link
Contributor

overlookmotel commented Jun 20, 2024

Not going to go into the detail of your comment, Rez, because this thread is getting too long. But I agree inherit_variants! is a mess as it is now. There are various ways to fix it, and one of those ways is what you've suggested.

More generally, I've had a think, and have a few comments:

To gen or not to gen?

I have loosened my opposition to generating the AST types. I think what I was really opposed to is defining the schema as JSON (like Hermes/React Compiler does) which is verbose, ugly, and hard to read. Defining the schema as Rust types is much better. If we do the following, I think it would answer all my criticisms:

  • Name the "schema" Rust files as .def.rs or something to indicate they are "special".
  • Name all generated files .gen.rs, or put them in a generated directory.
  • Link the "schema" Rust files into a crate somehow so Rust Analyser gives you "jump to definition" and all that good stuff (jump to definition within the schema, not from outside).
  • In codegen-ed files, precede every type with a comment // This file is generated. Do not edit. Instead edit source at ../codegen/js.def.rs:321.
  • Split up codegen-ed AST types into multiple files, not one massive long one.

(not claiming all these ideas as my own, just summarizing in one place)

How to begin

As Boshen says, we don't need to jump to doing this straight away. We can initially write our codegen for e.g. Visit by reading the existing .rs files in oxc_ast, like Traverse's codegen does.

If we're agreed that in the end our "schema" will be written as Rust types, making the change later to "schema files" should not involve a huge change to the codegen.

Schema

AST transfer needs a schema of all the AST types as JSON. So could we please build our codegen foundation along these lines:

  1. Parse definition Rust files.
  2. Generate schema (like layout_inspect's definition types).
  3. Generate output Rust code from the schema.

If we do it like that, we can kill 2 birds with one stone, rather than having to build a 2nd parser/codegen for AST transfer later.

(I think this is the plan anyway, but just repeating to make sure)

Visit and VisitMut

We should factor in #3392

TypeScript definitions

TS definitions will be trickier to generate from Rust types schema because there isn't an exact correspondence between Rust types and JS types, due to the modifications we do with custom serde serializers to conform to ESTree format.

I suggest making generating TS defs a separate effort. I was intending to tackle that as part of AST transfer impl - probably easier because AST transfer deals in the JS types not the Rust types.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 20, 2024

I am glad that you've backed off a little bit on your opposition, I believe all of your points are addressable(And I have a vague idea of how to achieve the rust-analyzer compatibility with preluding types and have a mock macro for inheritance to prevent syntax errors).

Since you've written the Traverse codegen, Would you like to also give this one a go?

I can start working on it next week so it is up to you to decide.

Just a note: Even if we don't want to have rust schema we might want to consider starting with extracting type definitions from impls in our ast files, Having separate files like js.rs and js_impl.rs can help to simplify the rust parsing part, We no longer have to parse a bunch of random items and check if they are AST type or not. I think private ast fields all can be pub(crate) as a compromise to get them separated into 2 files.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 21, 2024

@Boshen
Copy link
Member Author

Boshen commented Jun 21, 2024

Apparently we can't transfer an issue from a private repo to a public repo.

I'll chat with @rzvxa on discord and set the milestones once he finishes all cfg related code.

I'll create the issues in oxc when we have a consensus on what to build - the final version is in this issue's description.

@Boshen
Copy link
Member Author

Boshen commented Jun 21, 2024

Question: what should be done with code surrounded by inherit_variants!? Feels like a blocker if we want #[visited_node] to be the source of truth.

@overlookmotel
Copy link
Contributor

overlookmotel commented Jun 21, 2024

We could also codegen for all AST types:

NB: CloneIn could also be implemented on AST types with #[derive(CloneIn)], but it may improve compile time if we codegen the impls, to avoid macro expansion at compile time.

@overlookmotel
Copy link
Contributor

overlookmotel commented Jun 21, 2024

Apparently we can't transfer an issue from a private repo to a public repo.

Oh dear! It would be ideal for all of the conversation in this issue to be made public, so people can see not only what decisions were reached, but also the reasons for reaching them.

Personally, I think this is quite important. When I find code in Oxc which I don't understand the rationale for, I often look at git blame to find out in what PR that code was introduced, and then I read the surrounding issues. This helps me to understand why things have been designed the way they have.

As a workaround, can we temporarily make backlog repo public, transfer the issue to oxc repo, then quickly make backlog repo private again??

@overlookmotel
Copy link
Contributor

Question: what should be done with code surrounded by inherit_variants!? Feels like a blocker if we want #[visited_node] to be the source of truth.

The codegen will need to parse and understand the @inherits Expression syntax. This is what Traverse's current codegen does.

We can use that understanding to codegen all the impls which inherit_variants! macro currently generates, so inherit_variants! becomes much slimmer than it is now.

@overlookmotel
Copy link
Contributor

Sorry @rzvxa I didn't answer your question:

Since you've written the Traverse codegen, Would you like to also give this one a go?

I'd be happy to, but I only have a couple of days before I go away, and am keen to finish off correcting spans and scopes in the transformer before I do. So if you're willing to work on this, by all means go ahead.

@rzvxa
Copy link
Contributor

rzvxa commented Jun 23, 2024

ASTKind (Milestone 1)

I think this one should be done after Visit and VisitMut Since without those there are going to be a lot of errors that we have to mark as todo. So we have to do a bunch of manual refactoring just to redo it again after generating the new visitors.

Otherwise, it is the easiest to generate and has already been added to the ast_codegen.

@Boshen
Copy link
Member Author

Boshen commented Jun 23, 2024

ASTKind (Milestone 1)

I think this one should be done after Visit and VisitMut Since without those there are going to be a lot of errors that we have to mark as todo. So we have to do a bunch of manual refactoring just to redo it again after generating the new visitors.

Otherwise, it is the easiest to generate and has already been added to the ast_codegen.

Oh ok, pick the easiest one to have it codegened so we can get it reviewed, merged, and replaced 😅

@rzvxa
Copy link
Contributor

rzvxa commented Jun 23, 2024

ASTKind (Milestone 1)

I think this one should be done after Visit and VisitMut Since without those there are going to be a lot of errors that we have to mark as todo. So we have to do a bunch of manual refactoring just to redo it again after generating the new visitors.
Otherwise, it is the easiest to generate and has already been added to the ast_codegen.

Oh ok, pick the easiest one to have it codegened so we can get it reviewed, merged, and replaced 😅

I've already did, GetSpan is ready to replace the old one.
#3852

Didn't mark it for review so you guys can give a green light to the underlying architecture first.

Edit: We are basically done with both AstKind and GetSpan generators, We just hold onto the AstKind till the visitor generators get ready.

@Dunqing
Copy link
Member

Dunqing commented Jul 9, 2024

These visit functions are outdated, do we have a better way to reimplement them here? It is now difficult to manually update

fn visit_program(&mut self, program: &Program<'a>) {
let kind = AstKind::Program(self.alloc(program));
self.enter_scope({
let mut flags = ScopeFlags::Top;
if program.is_strict() {
flags |= ScopeFlags::StrictMode;
}
flags
});
program.scope_id.set(Some(self.current_scope_id));
/* cfg */
let error_harness = control_flow!(|self, cfg| {
let error_harness = cfg.attach_error_harness(ErrorEdgeKind::Implicit);
let _program_basic_block = cfg.new_basic_block_normal();
error_harness
});
/* cfg - must be above directives as directives are in cfg */
self.enter_node(kind);
for directive in &program.directives {
self.visit_directive(directive);
}
self.visit_statements(&program.body);
/* cfg */
control_flow!(|self, cfg| cfg.release_error_harness(error_harness));
/* cfg */
self.leave_node(kind);
self.leave_scope();
}
fn visit_block_statement(&mut self, stmt: &BlockStatement<'a>) {
let kind = AstKind::BlockStatement(self.alloc(stmt));
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
self.visit_statements(&stmt.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_break_statement(&mut self, stmt: &BreakStatement<'a>) {
let kind = AstKind::BreakStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
if let Some(ref break_target) = stmt.label {
self.visit_label_identifier(break_target);
}
/* cfg */
control_flow!(
|self, cfg| cfg.append_break(node_id, stmt.label.as_ref().map(|it| it.name.as_str()))
);
/* cfg */
self.leave_node(kind);
}
fn visit_continue_statement(&mut self, stmt: &ContinueStatement<'a>) {
let kind = AstKind::ContinueStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
if let Some(continue_target) = &stmt.label {
self.visit_label_identifier(continue_target);
}
/* cfg */
control_flow!(|self, cfg| cfg
.append_continue(node_id, stmt.label.as_ref().map(|it| it.name.as_str())));
/* cfg */
self.leave_node(kind);
}
fn visit_debugger_statement(&mut self, stmt: &DebuggerStatement) {
let kind = AstKind::DebuggerStatement(self.alloc(stmt));
self.enter_node(kind);
self.leave_node(kind);
}
fn visit_do_while_statement(&mut self, stmt: &DoWhileStatement<'a>) {
let kind = AstKind::DoWhileStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let (before_do_while_stmt_graph_ix, start_body_graph_ix) = control_flow!(|self, cfg| {
let before_do_while_stmt_graph_ix = cfg.current_node_ix;
let start_body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
(before_do_while_stmt_graph_ix, start_body_graph_ix)
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg - condition basic block */
let (after_body_graph_ix, start_of_condition_graph_ix) = control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let start_of_condition_graph_ix = cfg.new_basic_block_normal();
(after_body_graph_ix, start_of_condition_graph_ix)
});
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
control_flow!(|self, cfg| {
cfg.append_condition_to(start_of_condition_graph_ix, test_node);
let end_of_condition_graph_ix = cfg.current_node_ix;
let end_do_while_graph_ix = cfg.new_basic_block_normal();
// before do while to start of body basic block
cfg.add_edge(before_do_while_stmt_graph_ix, start_body_graph_ix, EdgeType::Normal);
// body of do-while to start of condition
cfg.add_edge(after_body_graph_ix, start_of_condition_graph_ix, EdgeType::Normal);
// end of condition to after do while
cfg.add_edge(end_of_condition_graph_ix, end_do_while_graph_ix, EdgeType::Normal);
// end of condition to after start of body
cfg.add_edge(end_of_condition_graph_ix, start_body_graph_ix, EdgeType::Backedge);
cfg.ctx(None)
.mark_break(end_do_while_graph_ix)
.mark_continue(start_of_condition_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
}
fn visit_expression_statement(&mut self, stmt: &ExpressionStatement<'a>) {
let kind = AstKind::ExpressionStatement(self.alloc(stmt));
self.enter_node(kind);
self.visit_expression(&stmt.expression);
self.leave_node(kind);
}
fn visit_logical_expression(&mut self, expr: &LogicalExpression<'a>) {
// logical expressions are short-circuiting, and therefore
// also represent control flow.
// For example, in:
// foo && bar();
// the bar() call will only be executed if foo is truthy.
let kind = AstKind::LogicalExpression(self.alloc(expr));
self.enter_node(kind);
self.visit_expression(&expr.left);
/* cfg */
let (left_expr_end_ix, right_expr_start_ix) = control_flow!(|self, cfg| {
let left_expr_end_ix = cfg.current_node_ix;
let right_expr_start_ix = cfg.new_basic_block_normal();
(left_expr_end_ix, right_expr_start_ix)
});
/* cfg */
self.visit_expression(&expr.right);
/* cfg */
control_flow!(|self, cfg| {
let right_expr_end_ix = cfg.current_node_ix;
let after_logical_expr_ix = cfg.new_basic_block_normal();
cfg.add_edge(left_expr_end_ix, right_expr_start_ix, EdgeType::Normal);
cfg.add_edge(left_expr_end_ix, after_logical_expr_ix, EdgeType::Normal);
cfg.add_edge(right_expr_end_ix, after_logical_expr_ix, EdgeType::Normal);
});
/* cfg */
self.leave_node(kind);
}
fn visit_assignment_expression(&mut self, expr: &AssignmentExpression<'a>) {
// assignment expressions can include an operator, which
// can be used to determine the control flow of the expression.
// For example, in:
// foo &&= super();
// the super() call will only be executed if foo is truthy.
let kind = AstKind::AssignmentExpression(self.alloc(expr));
self.enter_node(kind);
self.visit_assignment_target(&expr.left);
/* cfg */
let cfg_ixs = control_flow!(|self, cfg| {
if expr.operator.is_logical() {
let target_end_ix = cfg.current_node_ix;
let expr_start_ix = cfg.new_basic_block_normal();
Some((target_end_ix, expr_start_ix))
} else {
None
}
});
/* cfg */
self.visit_expression(&expr.right);
/* cfg */
control_flow!(|self, cfg| {
if let Some((target_end_ix, expr_start_ix)) = cfg_ixs {
let expr_end_ix = cfg.current_node_ix;
let after_assignment_ix = cfg.new_basic_block_normal();
cfg.add_edge(target_end_ix, expr_start_ix, EdgeType::Normal);
cfg.add_edge(target_end_ix, after_assignment_ix, EdgeType::Normal);
cfg.add_edge(expr_end_ix, after_assignment_ix, EdgeType::Normal);
}
});
/* cfg */
self.leave_node(kind);
}
fn visit_conditional_expression(&mut self, expr: &ConditionalExpression<'a>) {
let kind = AstKind::ConditionalExpression(self.alloc(expr));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_conditional_graph_ix, start_of_condition_graph_ix) =
control_flow!(|self, cfg| {
let before_conditional_graph_ix = cfg.current_node_ix;
let start_of_condition_graph_ix = cfg.new_basic_block_normal();
(before_conditional_graph_ix, start_of_condition_graph_ix)
});
/* cfg */
self.record_ast_nodes();
self.visit_expression(&expr.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (after_condition_graph_ix, before_consequent_expr_graph_ix) =
control_flow!(|self, cfg| {
cfg.append_condition_to(start_of_condition_graph_ix, test_node);
let after_condition_graph_ix = cfg.current_node_ix;
// conditional expression basic block
let before_consequent_expr_graph_ix = cfg.new_basic_block_normal();
(after_condition_graph_ix, before_consequent_expr_graph_ix)
});
/* cfg */
self.visit_expression(&expr.consequent);
/* cfg */
let (after_consequent_expr_graph_ix, start_alternate_graph_ix) =
control_flow!(|self, cfg| {
let after_consequent_expr_graph_ix = cfg.current_node_ix;
let start_alternate_graph_ix = cfg.new_basic_block_normal();
(after_consequent_expr_graph_ix, start_alternate_graph_ix)
});
/* cfg */
self.visit_expression(&expr.alternate);
/* cfg */
control_flow!(|self, cfg| {
let after_alternate_graph_ix = cfg.current_node_ix;
/* bb after conditional expression joins consequent and alternate */
let after_conditional_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(
before_conditional_graph_ix,
start_of_condition_graph_ix,
EdgeType::Normal,
);
cfg.add_edge(
after_consequent_expr_graph_ix,
after_conditional_graph_ix,
EdgeType::Normal,
);
cfg.add_edge(after_condition_graph_ix, before_consequent_expr_graph_ix, EdgeType::Jump);
cfg.add_edge(after_condition_graph_ix, start_alternate_graph_ix, EdgeType::Normal);
cfg.add_edge(after_alternate_graph_ix, after_conditional_graph_ix, EdgeType::Normal);
});
/* cfg */
self.leave_node(kind);
}
fn visit_for_statement(&mut self, stmt: &ForStatement<'a>) {
let kind = AstKind::ForStatement(self.alloc(stmt));
let is_lexical_declaration =
stmt.init.as_ref().is_some_and(ForStatementInit::is_lexical_declaration);
if is_lexical_declaration {
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
if let Some(init) = &stmt.init {
self.visit_for_statement_init(init);
}
/* cfg */
let (before_for_graph_ix, test_graph_ix) = control_flow!(|self, cfg| {
let before_for_graph_ix = cfg.current_node_ix;
let test_graph_ix = cfg.new_basic_block_normal();
(before_for_graph_ix, test_graph_ix)
});
/* cfg */
if let Some(test) = &stmt.test {
self.record_ast_nodes();
self.visit_expression(test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
control_flow!(|self, cfg| cfg.append_condition_to(test_graph_ix, test_node));
/* cfg */
}
/* cfg */
let (after_test_graph_ix, update_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
if let Some(update) = &stmt.update {
self.visit_expression(update);
}
/* cfg */
let before_body_graph_ix = control_flow!(|self, cfg| {
let before_body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
before_body_graph_ix
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let after_for_stmt = cfg.new_basic_block_normal();
cfg.add_edge(before_for_graph_ix, test_graph_ix, EdgeType::Normal);
cfg.add_edge(after_test_graph_ix, before_body_graph_ix, EdgeType::Jump);
cfg.add_edge(after_body_graph_ix, update_graph_ix, EdgeType::Backedge);
cfg.add_edge(update_graph_ix, test_graph_ix, EdgeType::Backedge);
cfg.add_edge(after_test_graph_ix, after_for_stmt, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_for_stmt)
.mark_continue(update_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
if is_lexical_declaration {
self.leave_scope();
}
}
fn visit_for_statement_init(&mut self, init: &ForStatementInit<'a>) {
let kind = AstKind::ForStatementInit(self.alloc(init));
self.enter_node(kind);
match init {
ForStatementInit::UsingDeclaration(decl) => {
self.visit_using_declaration(decl);
}
ForStatementInit::VariableDeclaration(decl) => {
self.visit_variable_declaration(decl);
}
match_expression!(ForStatementInit) => self.visit_expression(init.to_expression()),
}
self.leave_node(kind);
}
fn visit_for_in_statement(&mut self, stmt: &ForInStatement<'a>) {
let kind = AstKind::ForInStatement(self.alloc(stmt));
let is_lexical_declaration = stmt.left.is_lexical_declaration();
if is_lexical_declaration {
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
self.visit_for_statement_left(&stmt.left);
/* cfg */
let (before_for_stmt_graph_ix, start_prepare_cond_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal(),));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.right);
let right_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix) =
control_flow!(|self, cfg| {
let end_of_prepare_cond_graph_ix = cfg.current_node_ix;
let iteration_graph_ix = cfg.new_basic_block_normal();
cfg.append_iteration(right_node, IterationInstructionKind::In);
let body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
(end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix)
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let end_of_body_graph_ix = cfg.current_node_ix;
let after_for_graph_ix = cfg.new_basic_block_normal();
// connect before for statement to the iterable expression
cfg.add_edge(before_for_stmt_graph_ix, start_prepare_cond_graph_ix, EdgeType::Normal);
// connect the end of the iterable expression to the basic block with back edge
cfg.add_edge(end_of_prepare_cond_graph_ix, iteration_graph_ix, EdgeType::Normal);
// connect the basic block with back edge to the start of the body
cfg.add_edge(iteration_graph_ix, body_graph_ix, EdgeType::Jump);
// connect the end of the body back to the basic block
// with back edge for the next iteration
cfg.add_edge(end_of_body_graph_ix, iteration_graph_ix, EdgeType::Backedge);
// connect the basic block with back edge to the basic block after the for loop
// for when there are no more iterations left in the iterable
cfg.add_edge(iteration_graph_ix, after_for_graph_ix, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_for_graph_ix)
.mark_continue(iteration_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
if is_lexical_declaration {
self.leave_scope();
}
}
fn visit_for_of_statement(&mut self, stmt: &ForOfStatement<'a>) {
let kind = AstKind::ForOfStatement(self.alloc(stmt));
let is_lexical_declaration = stmt.left.is_lexical_declaration();
if is_lexical_declaration {
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
self.visit_for_statement_left(&stmt.left);
/* cfg */
let (before_for_stmt_graph_ix, start_prepare_cond_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.right);
let right_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix) =
control_flow!(|self, cfg| {
let end_of_prepare_cond_graph_ix = cfg.current_node_ix;
let iteration_graph_ix = cfg.new_basic_block_normal();
cfg.append_iteration(right_node, IterationInstructionKind::Of);
let body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
(end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix)
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let end_of_body_graph_ix = cfg.current_node_ix;
let after_for_graph_ix = cfg.new_basic_block_normal();
// connect before for statement to the iterable expression
cfg.add_edge(before_for_stmt_graph_ix, start_prepare_cond_graph_ix, EdgeType::Normal);
// connect the end of the iterable expression to the basic block with back edge
cfg.add_edge(end_of_prepare_cond_graph_ix, iteration_graph_ix, EdgeType::Normal);
// connect the basic block with back edge to the start of the body
cfg.add_edge(iteration_graph_ix, body_graph_ix, EdgeType::Jump);
// connect the end of the body back to the basic block
// with back edge for the next iteration
cfg.add_edge(end_of_body_graph_ix, iteration_graph_ix, EdgeType::Backedge);
// connect the basic block with back edge to the basic block after the for loop
// for when there are no more iterations left in the iterable
cfg.add_edge(iteration_graph_ix, after_for_graph_ix, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_for_graph_ix)
.mark_continue(iteration_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
if is_lexical_declaration {
self.leave_scope();
}
}
fn visit_if_statement(&mut self, stmt: &IfStatement<'a>) {
let kind = AstKind::IfStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_if_stmt_graph_ix, start_of_condition_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal(),));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (after_test_graph_ix, before_consequent_stmt_graph_ix) = control_flow!(|self, cfg| {
cfg.append_condition_to(start_of_condition_graph_ix, test_node);
(cfg.current_node_ix, cfg.new_basic_block_normal())
});
/* cfg */
self.visit_statement(&stmt.consequent);
/* cfg */
let after_consequent_stmt_graph_ix = control_flow!(|self, cfg| cfg.current_node_ix);
/* cfg */
let else_graph_ix = if let Some(alternate) = &stmt.alternate {
/* cfg */
let else_graph_ix = control_flow!(|self, cfg| cfg.new_basic_block_normal());
/* cfg */
self.visit_statement(alternate);
control_flow!(|self, cfg| Some((else_graph_ix, cfg.current_node_ix)))
} else {
None
};
/* cfg - bb after if statement joins consequent and alternate */
control_flow!(|self, cfg| {
let after_if_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_if_stmt_graph_ix, start_of_condition_graph_ix, EdgeType::Normal);
cfg.add_edge(after_consequent_stmt_graph_ix, after_if_graph_ix, EdgeType::Normal);
cfg.add_edge(after_test_graph_ix, before_consequent_stmt_graph_ix, EdgeType::Jump);
if let Some((start_of_alternate_stmt_graph_ix, after_alternate_stmt_graph_ix)) =
else_graph_ix
{
cfg.add_edge(
before_if_stmt_graph_ix,
start_of_alternate_stmt_graph_ix,
EdgeType::Normal,
);
cfg.add_edge(after_alternate_stmt_graph_ix, after_if_graph_ix, EdgeType::Normal);
} else {
cfg.add_edge(before_if_stmt_graph_ix, after_if_graph_ix, EdgeType::Normal);
}
});
/* cfg */
self.leave_node(kind);
}
fn visit_labeled_statement(&mut self, stmt: &LabeledStatement<'a>) {
let kind = AstKind::LabeledStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let label = &stmt.label.name;
control_flow!(|self, cfg| {
let ctx = cfg.ctx(Some(label.as_str())).default().allow_break();
if stmt.body.is_iteration_statement() {
ctx.allow_continue();
}
});
/* cfg */
self.visit_label_identifier(&stmt.label);
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let after_labeled_stmt_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(after_body_graph_ix, after_labeled_stmt_graph_ix, EdgeType::Normal);
cfg.ctx(Some(label.as_str())).mark_break(after_labeled_stmt_graph_ix).resolve();
});
/* cfg */
self.leave_node(kind);
}
fn visit_return_statement(&mut self, stmt: &ReturnStatement<'a>) {
let kind = AstKind::ReturnStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
let ret_kind = if let Some(arg) = &stmt.argument {
self.visit_expression(arg);
ReturnInstructionKind::NotImplicitUndefined
} else {
ReturnInstructionKind::ImplicitUndefined
};
/* cfg */
control_flow!(|self, cfg| {
cfg.push_return(ret_kind, node_id);
cfg.append_unreachable();
});
/* cfg */
self.leave_node(kind);
}
fn visit_switch_statement(&mut self, stmt: &SwitchStatement<'a>) {
let kind = AstKind::SwitchStatement(self.alloc(stmt));
self.enter_node(kind);
self.visit_expression(&stmt.discriminant);
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
/* cfg */
let discriminant_graph_ix = control_flow!(|self, cfg| {
let discriminant_graph_ix = cfg.current_node_ix;
cfg.ctx(None).default().allow_break();
discriminant_graph_ix
});
let mut switch_case_graph_spans = vec![];
let mut have_default_case = false;
/* cfg */
for case in &stmt.cases {
let before_case_graph_ix = control_flow!(|self, cfg| cfg.new_basic_block_normal());
self.visit_switch_case(case);
if case.is_default_case() {
have_default_case = true;
}
control_flow!(|self, cfg| switch_case_graph_spans
.push((before_case_graph_ix, cfg.current_node_ix)));
}
/* cfg */
// for each switch case
control_flow!(|self, cfg| {
for i in 0..switch_case_graph_spans.len() {
let case_graph_span = switch_case_graph_spans[i];
// every switch case condition can be skipped,
// so there's a possible jump from it to the next switch case condition
for y in switch_case_graph_spans.iter().skip(i + 1) {
cfg.add_edge(case_graph_span.0, y.0, EdgeType::Normal);
}
// connect the end of each switch statement to
// the condition of the next switch statement
if switch_case_graph_spans.len() > i + 1 {
let (_, end_of_switch_case) = switch_case_graph_spans[i];
let (next_switch_statement_condition, _) = switch_case_graph_spans[i + 1];
cfg.add_edge(
end_of_switch_case,
next_switch_statement_condition,
EdgeType::Normal,
);
}
cfg.add_edge(discriminant_graph_ix, case_graph_span.0, EdgeType::Normal);
}
let end_of_switch_case_statement = cfg.new_basic_block_normal();
if let Some(last) = switch_case_graph_spans.last() {
cfg.add_edge(last.1, end_of_switch_case_statement, EdgeType::Normal);
}
// if we don't have a default case there should be an edge from discriminant to the end of
// the statement.
if !have_default_case {
cfg.add_edge(discriminant_graph_ix, end_of_switch_case_statement, EdgeType::Normal);
}
cfg.ctx(None).mark_break(end_of_switch_case_statement).resolve();
});
/* cfg */
self.leave_scope();
self.leave_node(kind);
}
fn visit_switch_case(&mut self, case: &SwitchCase<'a>) {
let kind = AstKind::SwitchCase(self.alloc(case));
self.enter_node(kind);
if let Some(expr) = &case.test {
self.record_ast_nodes();
self.visit_expression(expr);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
control_flow!(|self, cfg| cfg.append_condition_to(cfg.current_node_ix, test_node));
}
/* cfg */
control_flow!(|self, cfg| {
let after_test_graph_ix = cfg.current_node_ix;
let statements_in_switch_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(after_test_graph_ix, statements_in_switch_graph_ix, EdgeType::Jump);
});
/* cfg */
self.visit_statements(&case.consequent);
self.leave_node(kind);
}
fn visit_throw_statement(&mut self, stmt: &ThrowStatement<'a>) {
let kind = AstKind::ThrowStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
self.visit_expression(&stmt.argument);
/* cfg */
control_flow!(|self, cfg| cfg.append_throw(node_id));
/* cfg */
self.leave_node(kind);
}
fn visit_try_statement(&mut self, stmt: &TryStatement<'a>) {
let kind = AstKind::TryStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let (
before_try_statement_graph_ix,
error_harness,
before_finalizer_graph_ix,
before_try_block_graph_ix,
) = control_flow!(|self, cfg| {
let before_try_statement_graph_ix = cfg.current_node_ix;
let error_harness =
stmt.handler.as_ref().map(|_| cfg.attach_error_harness(ErrorEdgeKind::Explicit));
let before_finalizer_graph_ix = stmt.finalizer.as_ref().map(|_| cfg.attach_finalizer());
let before_try_block_graph_ix = cfg.new_basic_block_normal();
(
before_try_statement_graph_ix,
error_harness,
before_finalizer_graph_ix,
before_try_block_graph_ix,
)
});
/* cfg */
self.visit_block_statement(&stmt.block);
/* cfg */
let after_try_block_graph_ix = control_flow!(|self, cfg| cfg.current_node_ix);
/* cfg */
let catch_block_end_ix = if let Some(handler) = &stmt.handler {
/* cfg */
control_flow!(|self, cfg| {
let Some(error_harness) = error_harness else {
unreachable!("we always create an error harness if we have a catch block.");
};
cfg.release_error_harness(error_harness);
let catch_block_start_ix = cfg.new_basic_block_normal();
cfg.add_edge(error_harness, catch_block_start_ix, EdgeType::Normal);
});
/* cfg */
self.visit_catch_clause(handler);
/* cfg */
control_flow!(|self, cfg| {
let catch_block_end_ix = cfg.current_node_ix;
// TODO: we shouldn't directly change the current node index.
cfg.current_node_ix = after_try_block_graph_ix;
Some(catch_block_end_ix)
})
/* cfg */
} else {
None
};
let finally_block_end_ix = if let Some(finalizer) = &stmt.finalizer {
/* cfg */
control_flow!(|self, cfg| {
let Some(before_finalizer_graph_ix) = before_finalizer_graph_ix else {
unreachable!("we always create a finalizer when there is a finally block.");
};
cfg.release_finalizer(before_finalizer_graph_ix);
let start_finally_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_finalizer_graph_ix, start_finally_graph_ix, EdgeType::Normal);
});
/* cfg */
self.visit_finally_clause(finalizer);
/* cfg */
control_flow!(|self, cfg| {
let finally_block_end_ix = cfg.current_node_ix;
// TODO: we shouldn't directly change the current node index.
cfg.current_node_ix = after_try_block_graph_ix;
Some(finally_block_end_ix)
})
/* cfg */
} else {
None
};
/* cfg */
control_flow!(|self, cfg| {
let after_try_statement_block_ix = cfg.new_basic_block_normal();
cfg.add_edge(
before_try_statement_graph_ix,
before_try_block_graph_ix,
EdgeType::Normal,
);
if let Some(catch_block_end_ix) = catch_block_end_ix {
if finally_block_end_ix.is_none() {
cfg.add_edge(
after_try_block_graph_ix,
after_try_statement_block_ix,
EdgeType::Normal,
);
cfg.add_edge(
catch_block_end_ix,
after_try_statement_block_ix,
EdgeType::Normal,
);
}
}
if let Some(finally_block_end_ix) = finally_block_end_ix {
if catch_block_end_ix.is_some() {
cfg.add_edge(
finally_block_end_ix,
after_try_statement_block_ix,
EdgeType::Normal,
);
} else {
cfg.add_edge(
finally_block_end_ix,
after_try_statement_block_ix,
if cfg.basic_block(after_try_block_graph_ix).unreachable {
EdgeType::Unreachable
} else {
EdgeType::Join
},
);
}
}
});
/* cfg */
self.leave_node(kind);
}
fn visit_catch_clause(&mut self, clause: &CatchClause<'a>) {
let kind = AstKind::CatchClause(self.alloc(clause));
self.enter_scope(ScopeFlags::empty());
clause.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
if let Some(param) = &clause.param {
self.visit_catch_parameter(param);
}
self.visit_statements(&clause.body.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_finally_clause(&mut self, clause: &BlockStatement<'a>) {
let kind = AstKind::FinallyClause(self.alloc(clause));
self.enter_scope(ScopeFlags::empty());
clause.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
self.visit_statements(&clause.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_while_statement(&mut self, stmt: &WhileStatement<'a>) {
let kind = AstKind::WhileStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_while_stmt_graph_ix, condition_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg - body basic block */
let body_graph_ix = control_flow!(|self, cfg| {
cfg.append_condition_to(condition_graph_ix, test_node);
let body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
body_graph_ix
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg - after body basic block */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let after_while_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_while_stmt_graph_ix, condition_graph_ix, EdgeType::Normal);
cfg.add_edge(condition_graph_ix, body_graph_ix, EdgeType::Jump);
cfg.add_edge(after_body_graph_ix, condition_graph_ix, EdgeType::Backedge);
cfg.add_edge(condition_graph_ix, after_while_graph_ix, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_while_graph_ix)
.mark_continue(condition_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
}
fn visit_with_statement(&mut self, stmt: &WithStatement<'a>) {
let kind = AstKind::WithStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_with_stmt_graph_ix, condition_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
self.visit_expression(&stmt.object);
/* cfg - body basic block */
let body_graph_ix = control_flow!(|self, cfg| cfg.new_basic_block_normal());
/* cfg */
self.visit_statement(&stmt.body);
/* cfg - after body basic block */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_with_stmt_graph_ix, condition_graph_ix, EdgeType::Normal);
cfg.add_edge(condition_graph_ix, body_graph_ix, EdgeType::Normal);
cfg.add_edge(body_graph_ix, after_body_graph_ix, EdgeType::Normal);
cfg.add_edge(condition_graph_ix, after_body_graph_ix, EdgeType::Normal);
});
/* cfg */
self.leave_node(kind);
}
fn visit_function(&mut self, func: &Function<'a>, flags: Option<ScopeFlags>) {
let kind = AstKind::Function(self.alloc(func));
self.enter_scope({
let mut flags = flags.unwrap_or(ScopeFlags::empty()) | ScopeFlags::Function;
if func.is_strict() {
flags |= ScopeFlags::StrictMode;
}
flags
});
func.scope_id.set(Some(self.current_scope_id));
/* cfg */
let (before_function_graph_ix, error_harness, function_graph_ix) =
control_flow!(|self, cfg| {
let before_function_graph_ix = cfg.current_node_ix;
cfg.push_finalization_stack();
let error_harness = cfg.attach_error_harness(ErrorEdgeKind::Implicit);
let function_graph_ix = cfg.new_basic_block_function();
cfg.ctx(None).new_function();
(before_function_graph_ix, error_harness, function_graph_ix)
});
/* cfg */
// We add a new basic block to the cfg before entering the node
// so that the correct cfg_ix is associated with the ast node.
self.enter_node(kind);
/* cfg */
control_flow!(|self, cfg| cfg.add_edge(
before_function_graph_ix,
function_graph_ix,
EdgeType::NewFunction
));
/* cfg */
if let Some(ident) = &func.id {
self.visit_binding_identifier(ident);
}
self.visit_formal_parameters(&func.params);
if let Some(body) = &func.body {
self.visit_function_body(body);
}
/* cfg */
control_flow!(|self, cfg| {
cfg.ctx(None).resolve_expect(CtxFlags::FUNCTION);
cfg.release_error_harness(error_harness);
cfg.pop_finalization_stack();
let after_function_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_function_graph_ix, after_function_graph_ix, EdgeType::Normal);
});
/* cfg */
if let Some(parameters) = &func.type_parameters {
self.visit_ts_type_parameter_declaration(parameters);
}
if let Some(annotation) = &func.return_type {
self.visit_ts_type_annotation(annotation);
}
self.leave_node(kind);
self.leave_scope();
}
fn visit_class(&mut self, class: &Class<'a>) {
// Class level decorators are transpiled as functions outside of the class taking the class
// itself as argument. They should be visited before class is entered. E.g., they inherit
// strict mode from the enclosing scope rather than from class.
for decorator in &class.decorators {
self.visit_decorator(decorator);
}
let kind = AstKind::Class(self.alloc(class));
// FIXME(don): Should we enter a scope when visiting class declarations?
let is_class_expr = class.r#type == ClassType::ClassExpression;
if is_class_expr {
// Class expressions create a temporary scope with the class name as its only variable
// E.g., `let c = class A { foo() { console.log(A) } }`
self.enter_scope(ScopeFlags::empty());
class.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
if let Some(id) = &class.id {
self.visit_binding_identifier(id);
}
if let Some(parameters) = &class.type_parameters {
self.visit_ts_type_parameter_declaration(parameters);
}
if let Some(super_class) = &class.super_class {
self.visit_class_heritage(super_class);
}
if let Some(super_parameters) = &class.super_type_parameters {
self.visit_ts_type_parameter_instantiation(super_parameters);
}
self.visit_class_body(&class.body);
self.leave_node(kind);
if is_class_expr {
self.leave_scope();
}
}
fn visit_static_block(&mut self, block: &StaticBlock<'a>) {
let kind = AstKind::StaticBlock(self.alloc(block));
self.enter_scope(ScopeFlags::ClassStaticBlock);
block.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
self.visit_statements(&block.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_arrow_function_expression(&mut self, expr: &ArrowFunctionExpression<'a>) {
let kind = AstKind::ArrowFunctionExpression(self.alloc(expr));
self.enter_scope(ScopeFlags::Function | ScopeFlags::Arrow);
expr.scope_id.set(Some(self.current_scope_id));
/* cfg */
let (current_node_ix, error_harness, function_graph_ix) = control_flow!(|self, cfg| {
let current_node_ix = cfg.current_node_ix;
cfg.push_finalization_stack();
let error_harness = cfg.attach_error_harness(ErrorEdgeKind::Implicit);
let function_graph_ix = cfg.new_basic_block_function();
cfg.ctx(None).new_function();
(current_node_ix, error_harness, function_graph_ix)
});
/* cfg */
// We add a new basic block to the cfg before entering the node
// so that the correct cfg_ix is associated with the ast node.
self.enter_node(kind);
self.visit_formal_parameters(&expr.params);
/* cfg */
control_flow!(|self, cfg| cfg.add_edge(
current_node_ix,
function_graph_ix,
EdgeType::NewFunction
));
/* cfg */
self.visit_function_body(&expr.body);
/* cfg */
control_flow!(|self, cfg| {
cfg.ctx(None).resolve_expect(CtxFlags::FUNCTION);
cfg.release_error_harness(error_harness);
cfg.pop_finalization_stack();
cfg.current_node_ix = current_node_ix;
});
/* cfg */
if let Some(parameters) = &expr.type_parameters {
self.visit_ts_type_parameter_declaration(parameters);
}
self.leave_node(kind);
self.leave_scope();
}
fn visit_ts_enum_declaration(&mut self, decl: &TSEnumDeclaration<'a>) {
let kind = AstKind::TSEnumDeclaration(self.alloc(decl));
self.enter_node(kind);
self.visit_binding_identifier(&decl.id);
self.enter_scope(ScopeFlags::empty());
decl.scope_id.set(Some(self.current_scope_id));
for member in &decl.members {
self.visit_ts_enum_member(member);
}
self.leave_scope();
self.leave_node(kind);
}
fn visit_ts_module_declaration(&mut self, decl: &TSModuleDeclaration<'a>) {
let kind = AstKind::TSModuleDeclaration(self.alloc(decl));
self.enter_node(kind);
match &decl.id {
TSModuleDeclarationName::Identifier(ident) => self.visit_identifier_name(ident),
TSModuleDeclarationName::StringLiteral(lit) => self.visit_string_literal(lit),
}
self.enter_scope(ScopeFlags::TsModuleBlock);
decl.scope_id.set(Some(self.current_scope_id));
match &decl.body {
Some(TSModuleDeclarationBody::TSModuleDeclaration(decl)) => {
self.visit_ts_module_declaration(decl);
}
Some(TSModuleDeclarationBody::TSModuleBlock(block)) => {
self.visit_ts_module_block(block);
}
None => {}
}
self.leave_scope();
self.leave_node(kind);
}
fn visit_ts_type_parameter(&mut self, ty: &TSTypeParameter<'a>) {
let kind = AstKind::TSTypeParameter(self.alloc(ty));
self.enter_scope(ScopeFlags::empty());
ty.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
if let Some(constraint) = &ty.constraint {
self.visit_ts_type(constraint);
}
if let Some(default) = &ty.default {
self.visit_ts_type(default);
}
self.leave_node(kind);
self.leave_scope();
}

@overlookmotel
Copy link
Contributor

These visit functions are outdated

Outdated as in field visitation order is wrong?

@overlookmotel overlookmotel transferred this issue from oxc-project/backlog Jul 9, 2024
@overlookmotel
Copy link
Contributor

I've moved this issue to main oxc repo as we're currently working on this.

@Dunqing
Copy link
Member

Dunqing commented Jul 9, 2024

These visit functions are outdated

Outdated as in field visitation order is wrong?

or missing visits to some fields

@Boshen
Copy link
Member Author

Boshen commented Jul 9, 2024

These visit functions are outdated, do we have a better way to reimplement them here? It is now difficult to manually update

fn visit_program(&mut self, program: &Program<'a>) {
let kind = AstKind::Program(self.alloc(program));
self.enter_scope({
let mut flags = ScopeFlags::Top;
if program.is_strict() {
flags |= ScopeFlags::StrictMode;
}
flags
});
program.scope_id.set(Some(self.current_scope_id));
/* cfg */
let error_harness = control_flow!(|self, cfg| {
let error_harness = cfg.attach_error_harness(ErrorEdgeKind::Implicit);
let _program_basic_block = cfg.new_basic_block_normal();
error_harness
});
/* cfg - must be above directives as directives are in cfg */
self.enter_node(kind);
for directive in &program.directives {
self.visit_directive(directive);
}
self.visit_statements(&program.body);
/* cfg */
control_flow!(|self, cfg| cfg.release_error_harness(error_harness));
/* cfg */
self.leave_node(kind);
self.leave_scope();
}
fn visit_block_statement(&mut self, stmt: &BlockStatement<'a>) {
let kind = AstKind::BlockStatement(self.alloc(stmt));
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
self.visit_statements(&stmt.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_break_statement(&mut self, stmt: &BreakStatement<'a>) {
let kind = AstKind::BreakStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
if let Some(ref break_target) = stmt.label {
self.visit_label_identifier(break_target);
}
/* cfg */
control_flow!(
|self, cfg| cfg.append_break(node_id, stmt.label.as_ref().map(|it| it.name.as_str()))
);
/* cfg */
self.leave_node(kind);
}
fn visit_continue_statement(&mut self, stmt: &ContinueStatement<'a>) {
let kind = AstKind::ContinueStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
if let Some(continue_target) = &stmt.label {
self.visit_label_identifier(continue_target);
}
/* cfg */
control_flow!(|self, cfg| cfg
.append_continue(node_id, stmt.label.as_ref().map(|it| it.name.as_str())));
/* cfg */
self.leave_node(kind);
}
fn visit_debugger_statement(&mut self, stmt: &DebuggerStatement) {
let kind = AstKind::DebuggerStatement(self.alloc(stmt));
self.enter_node(kind);
self.leave_node(kind);
}
fn visit_do_while_statement(&mut self, stmt: &DoWhileStatement<'a>) {
let kind = AstKind::DoWhileStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let (before_do_while_stmt_graph_ix, start_body_graph_ix) = control_flow!(|self, cfg| {
let before_do_while_stmt_graph_ix = cfg.current_node_ix;
let start_body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
(before_do_while_stmt_graph_ix, start_body_graph_ix)
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg - condition basic block */
let (after_body_graph_ix, start_of_condition_graph_ix) = control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let start_of_condition_graph_ix = cfg.new_basic_block_normal();
(after_body_graph_ix, start_of_condition_graph_ix)
});
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
control_flow!(|self, cfg| {
cfg.append_condition_to(start_of_condition_graph_ix, test_node);
let end_of_condition_graph_ix = cfg.current_node_ix;
let end_do_while_graph_ix = cfg.new_basic_block_normal();
// before do while to start of body basic block
cfg.add_edge(before_do_while_stmt_graph_ix, start_body_graph_ix, EdgeType::Normal);
// body of do-while to start of condition
cfg.add_edge(after_body_graph_ix, start_of_condition_graph_ix, EdgeType::Normal);
// end of condition to after do while
cfg.add_edge(end_of_condition_graph_ix, end_do_while_graph_ix, EdgeType::Normal);
// end of condition to after start of body
cfg.add_edge(end_of_condition_graph_ix, start_body_graph_ix, EdgeType::Backedge);
cfg.ctx(None)
.mark_break(end_do_while_graph_ix)
.mark_continue(start_of_condition_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
}
fn visit_expression_statement(&mut self, stmt: &ExpressionStatement<'a>) {
let kind = AstKind::ExpressionStatement(self.alloc(stmt));
self.enter_node(kind);
self.visit_expression(&stmt.expression);
self.leave_node(kind);
}
fn visit_logical_expression(&mut self, expr: &LogicalExpression<'a>) {
// logical expressions are short-circuiting, and therefore
// also represent control flow.
// For example, in:
// foo && bar();
// the bar() call will only be executed if foo is truthy.
let kind = AstKind::LogicalExpression(self.alloc(expr));
self.enter_node(kind);
self.visit_expression(&expr.left);
/* cfg */
let (left_expr_end_ix, right_expr_start_ix) = control_flow!(|self, cfg| {
let left_expr_end_ix = cfg.current_node_ix;
let right_expr_start_ix = cfg.new_basic_block_normal();
(left_expr_end_ix, right_expr_start_ix)
});
/* cfg */
self.visit_expression(&expr.right);
/* cfg */
control_flow!(|self, cfg| {
let right_expr_end_ix = cfg.current_node_ix;
let after_logical_expr_ix = cfg.new_basic_block_normal();
cfg.add_edge(left_expr_end_ix, right_expr_start_ix, EdgeType::Normal);
cfg.add_edge(left_expr_end_ix, after_logical_expr_ix, EdgeType::Normal);
cfg.add_edge(right_expr_end_ix, after_logical_expr_ix, EdgeType::Normal);
});
/* cfg */
self.leave_node(kind);
}
fn visit_assignment_expression(&mut self, expr: &AssignmentExpression<'a>) {
// assignment expressions can include an operator, which
// can be used to determine the control flow of the expression.
// For example, in:
// foo &&= super();
// the super() call will only be executed if foo is truthy.
let kind = AstKind::AssignmentExpression(self.alloc(expr));
self.enter_node(kind);
self.visit_assignment_target(&expr.left);
/* cfg */
let cfg_ixs = control_flow!(|self, cfg| {
if expr.operator.is_logical() {
let target_end_ix = cfg.current_node_ix;
let expr_start_ix = cfg.new_basic_block_normal();
Some((target_end_ix, expr_start_ix))
} else {
None
}
});
/* cfg */
self.visit_expression(&expr.right);
/* cfg */
control_flow!(|self, cfg| {
if let Some((target_end_ix, expr_start_ix)) = cfg_ixs {
let expr_end_ix = cfg.current_node_ix;
let after_assignment_ix = cfg.new_basic_block_normal();
cfg.add_edge(target_end_ix, expr_start_ix, EdgeType::Normal);
cfg.add_edge(target_end_ix, after_assignment_ix, EdgeType::Normal);
cfg.add_edge(expr_end_ix, after_assignment_ix, EdgeType::Normal);
}
});
/* cfg */
self.leave_node(kind);
}
fn visit_conditional_expression(&mut self, expr: &ConditionalExpression<'a>) {
let kind = AstKind::ConditionalExpression(self.alloc(expr));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_conditional_graph_ix, start_of_condition_graph_ix) =
control_flow!(|self, cfg| {
let before_conditional_graph_ix = cfg.current_node_ix;
let start_of_condition_graph_ix = cfg.new_basic_block_normal();
(before_conditional_graph_ix, start_of_condition_graph_ix)
});
/* cfg */
self.record_ast_nodes();
self.visit_expression(&expr.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (after_condition_graph_ix, before_consequent_expr_graph_ix) =
control_flow!(|self, cfg| {
cfg.append_condition_to(start_of_condition_graph_ix, test_node);
let after_condition_graph_ix = cfg.current_node_ix;
// conditional expression basic block
let before_consequent_expr_graph_ix = cfg.new_basic_block_normal();
(after_condition_graph_ix, before_consequent_expr_graph_ix)
});
/* cfg */
self.visit_expression(&expr.consequent);
/* cfg */
let (after_consequent_expr_graph_ix, start_alternate_graph_ix) =
control_flow!(|self, cfg| {
let after_consequent_expr_graph_ix = cfg.current_node_ix;
let start_alternate_graph_ix = cfg.new_basic_block_normal();
(after_consequent_expr_graph_ix, start_alternate_graph_ix)
});
/* cfg */
self.visit_expression(&expr.alternate);
/* cfg */
control_flow!(|self, cfg| {
let after_alternate_graph_ix = cfg.current_node_ix;
/* bb after conditional expression joins consequent and alternate */
let after_conditional_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(
before_conditional_graph_ix,
start_of_condition_graph_ix,
EdgeType::Normal,
);
cfg.add_edge(
after_consequent_expr_graph_ix,
after_conditional_graph_ix,
EdgeType::Normal,
);
cfg.add_edge(after_condition_graph_ix, before_consequent_expr_graph_ix, EdgeType::Jump);
cfg.add_edge(after_condition_graph_ix, start_alternate_graph_ix, EdgeType::Normal);
cfg.add_edge(after_alternate_graph_ix, after_conditional_graph_ix, EdgeType::Normal);
});
/* cfg */
self.leave_node(kind);
}
fn visit_for_statement(&mut self, stmt: &ForStatement<'a>) {
let kind = AstKind::ForStatement(self.alloc(stmt));
let is_lexical_declaration =
stmt.init.as_ref().is_some_and(ForStatementInit::is_lexical_declaration);
if is_lexical_declaration {
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
if let Some(init) = &stmt.init {
self.visit_for_statement_init(init);
}
/* cfg */
let (before_for_graph_ix, test_graph_ix) = control_flow!(|self, cfg| {
let before_for_graph_ix = cfg.current_node_ix;
let test_graph_ix = cfg.new_basic_block_normal();
(before_for_graph_ix, test_graph_ix)
});
/* cfg */
if let Some(test) = &stmt.test {
self.record_ast_nodes();
self.visit_expression(test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
control_flow!(|self, cfg| cfg.append_condition_to(test_graph_ix, test_node));
/* cfg */
}
/* cfg */
let (after_test_graph_ix, update_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
if let Some(update) = &stmt.update {
self.visit_expression(update);
}
/* cfg */
let before_body_graph_ix = control_flow!(|self, cfg| {
let before_body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
before_body_graph_ix
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let after_for_stmt = cfg.new_basic_block_normal();
cfg.add_edge(before_for_graph_ix, test_graph_ix, EdgeType::Normal);
cfg.add_edge(after_test_graph_ix, before_body_graph_ix, EdgeType::Jump);
cfg.add_edge(after_body_graph_ix, update_graph_ix, EdgeType::Backedge);
cfg.add_edge(update_graph_ix, test_graph_ix, EdgeType::Backedge);
cfg.add_edge(after_test_graph_ix, after_for_stmt, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_for_stmt)
.mark_continue(update_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
if is_lexical_declaration {
self.leave_scope();
}
}
fn visit_for_statement_init(&mut self, init: &ForStatementInit<'a>) {
let kind = AstKind::ForStatementInit(self.alloc(init));
self.enter_node(kind);
match init {
ForStatementInit::UsingDeclaration(decl) => {
self.visit_using_declaration(decl);
}
ForStatementInit::VariableDeclaration(decl) => {
self.visit_variable_declaration(decl);
}
match_expression!(ForStatementInit) => self.visit_expression(init.to_expression()),
}
self.leave_node(kind);
}
fn visit_for_in_statement(&mut self, stmt: &ForInStatement<'a>) {
let kind = AstKind::ForInStatement(self.alloc(stmt));
let is_lexical_declaration = stmt.left.is_lexical_declaration();
if is_lexical_declaration {
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
self.visit_for_statement_left(&stmt.left);
/* cfg */
let (before_for_stmt_graph_ix, start_prepare_cond_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal(),));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.right);
let right_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix) =
control_flow!(|self, cfg| {
let end_of_prepare_cond_graph_ix = cfg.current_node_ix;
let iteration_graph_ix = cfg.new_basic_block_normal();
cfg.append_iteration(right_node, IterationInstructionKind::In);
let body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
(end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix)
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let end_of_body_graph_ix = cfg.current_node_ix;
let after_for_graph_ix = cfg.new_basic_block_normal();
// connect before for statement to the iterable expression
cfg.add_edge(before_for_stmt_graph_ix, start_prepare_cond_graph_ix, EdgeType::Normal);
// connect the end of the iterable expression to the basic block with back edge
cfg.add_edge(end_of_prepare_cond_graph_ix, iteration_graph_ix, EdgeType::Normal);
// connect the basic block with back edge to the start of the body
cfg.add_edge(iteration_graph_ix, body_graph_ix, EdgeType::Jump);
// connect the end of the body back to the basic block
// with back edge for the next iteration
cfg.add_edge(end_of_body_graph_ix, iteration_graph_ix, EdgeType::Backedge);
// connect the basic block with back edge to the basic block after the for loop
// for when there are no more iterations left in the iterable
cfg.add_edge(iteration_graph_ix, after_for_graph_ix, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_for_graph_ix)
.mark_continue(iteration_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
if is_lexical_declaration {
self.leave_scope();
}
}
fn visit_for_of_statement(&mut self, stmt: &ForOfStatement<'a>) {
let kind = AstKind::ForOfStatement(self.alloc(stmt));
let is_lexical_declaration = stmt.left.is_lexical_declaration();
if is_lexical_declaration {
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
self.visit_for_statement_left(&stmt.left);
/* cfg */
let (before_for_stmt_graph_ix, start_prepare_cond_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.right);
let right_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix) =
control_flow!(|self, cfg| {
let end_of_prepare_cond_graph_ix = cfg.current_node_ix;
let iteration_graph_ix = cfg.new_basic_block_normal();
cfg.append_iteration(right_node, IterationInstructionKind::Of);
let body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
(end_of_prepare_cond_graph_ix, iteration_graph_ix, body_graph_ix)
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let end_of_body_graph_ix = cfg.current_node_ix;
let after_for_graph_ix = cfg.new_basic_block_normal();
// connect before for statement to the iterable expression
cfg.add_edge(before_for_stmt_graph_ix, start_prepare_cond_graph_ix, EdgeType::Normal);
// connect the end of the iterable expression to the basic block with back edge
cfg.add_edge(end_of_prepare_cond_graph_ix, iteration_graph_ix, EdgeType::Normal);
// connect the basic block with back edge to the start of the body
cfg.add_edge(iteration_graph_ix, body_graph_ix, EdgeType::Jump);
// connect the end of the body back to the basic block
// with back edge for the next iteration
cfg.add_edge(end_of_body_graph_ix, iteration_graph_ix, EdgeType::Backedge);
// connect the basic block with back edge to the basic block after the for loop
// for when there are no more iterations left in the iterable
cfg.add_edge(iteration_graph_ix, after_for_graph_ix, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_for_graph_ix)
.mark_continue(iteration_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
if is_lexical_declaration {
self.leave_scope();
}
}
fn visit_if_statement(&mut self, stmt: &IfStatement<'a>) {
let kind = AstKind::IfStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_if_stmt_graph_ix, start_of_condition_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal(),));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg */
let (after_test_graph_ix, before_consequent_stmt_graph_ix) = control_flow!(|self, cfg| {
cfg.append_condition_to(start_of_condition_graph_ix, test_node);
(cfg.current_node_ix, cfg.new_basic_block_normal())
});
/* cfg */
self.visit_statement(&stmt.consequent);
/* cfg */
let after_consequent_stmt_graph_ix = control_flow!(|self, cfg| cfg.current_node_ix);
/* cfg */
let else_graph_ix = if let Some(alternate) = &stmt.alternate {
/* cfg */
let else_graph_ix = control_flow!(|self, cfg| cfg.new_basic_block_normal());
/* cfg */
self.visit_statement(alternate);
control_flow!(|self, cfg| Some((else_graph_ix, cfg.current_node_ix)))
} else {
None
};
/* cfg - bb after if statement joins consequent and alternate */
control_flow!(|self, cfg| {
let after_if_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_if_stmt_graph_ix, start_of_condition_graph_ix, EdgeType::Normal);
cfg.add_edge(after_consequent_stmt_graph_ix, after_if_graph_ix, EdgeType::Normal);
cfg.add_edge(after_test_graph_ix, before_consequent_stmt_graph_ix, EdgeType::Jump);
if let Some((start_of_alternate_stmt_graph_ix, after_alternate_stmt_graph_ix)) =
else_graph_ix
{
cfg.add_edge(
before_if_stmt_graph_ix,
start_of_alternate_stmt_graph_ix,
EdgeType::Normal,
);
cfg.add_edge(after_alternate_stmt_graph_ix, after_if_graph_ix, EdgeType::Normal);
} else {
cfg.add_edge(before_if_stmt_graph_ix, after_if_graph_ix, EdgeType::Normal);
}
});
/* cfg */
self.leave_node(kind);
}
fn visit_labeled_statement(&mut self, stmt: &LabeledStatement<'a>) {
let kind = AstKind::LabeledStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let label = &stmt.label.name;
control_flow!(|self, cfg| {
let ctx = cfg.ctx(Some(label.as_str())).default().allow_break();
if stmt.body.is_iteration_statement() {
ctx.allow_continue();
}
});
/* cfg */
self.visit_label_identifier(&stmt.label);
self.visit_statement(&stmt.body);
/* cfg */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let after_labeled_stmt_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(after_body_graph_ix, after_labeled_stmt_graph_ix, EdgeType::Normal);
cfg.ctx(Some(label.as_str())).mark_break(after_labeled_stmt_graph_ix).resolve();
});
/* cfg */
self.leave_node(kind);
}
fn visit_return_statement(&mut self, stmt: &ReturnStatement<'a>) {
let kind = AstKind::ReturnStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
let ret_kind = if let Some(arg) = &stmt.argument {
self.visit_expression(arg);
ReturnInstructionKind::NotImplicitUndefined
} else {
ReturnInstructionKind::ImplicitUndefined
};
/* cfg */
control_flow!(|self, cfg| {
cfg.push_return(ret_kind, node_id);
cfg.append_unreachable();
});
/* cfg */
self.leave_node(kind);
}
fn visit_switch_statement(&mut self, stmt: &SwitchStatement<'a>) {
let kind = AstKind::SwitchStatement(self.alloc(stmt));
self.enter_node(kind);
self.visit_expression(&stmt.discriminant);
self.enter_scope(ScopeFlags::empty());
stmt.scope_id.set(Some(self.current_scope_id));
/* cfg */
let discriminant_graph_ix = control_flow!(|self, cfg| {
let discriminant_graph_ix = cfg.current_node_ix;
cfg.ctx(None).default().allow_break();
discriminant_graph_ix
});
let mut switch_case_graph_spans = vec![];
let mut have_default_case = false;
/* cfg */
for case in &stmt.cases {
let before_case_graph_ix = control_flow!(|self, cfg| cfg.new_basic_block_normal());
self.visit_switch_case(case);
if case.is_default_case() {
have_default_case = true;
}
control_flow!(|self, cfg| switch_case_graph_spans
.push((before_case_graph_ix, cfg.current_node_ix)));
}
/* cfg */
// for each switch case
control_flow!(|self, cfg| {
for i in 0..switch_case_graph_spans.len() {
let case_graph_span = switch_case_graph_spans[i];
// every switch case condition can be skipped,
// so there's a possible jump from it to the next switch case condition
for y in switch_case_graph_spans.iter().skip(i + 1) {
cfg.add_edge(case_graph_span.0, y.0, EdgeType::Normal);
}
// connect the end of each switch statement to
// the condition of the next switch statement
if switch_case_graph_spans.len() > i + 1 {
let (_, end_of_switch_case) = switch_case_graph_spans[i];
let (next_switch_statement_condition, _) = switch_case_graph_spans[i + 1];
cfg.add_edge(
end_of_switch_case,
next_switch_statement_condition,
EdgeType::Normal,
);
}
cfg.add_edge(discriminant_graph_ix, case_graph_span.0, EdgeType::Normal);
}
let end_of_switch_case_statement = cfg.new_basic_block_normal();
if let Some(last) = switch_case_graph_spans.last() {
cfg.add_edge(last.1, end_of_switch_case_statement, EdgeType::Normal);
}
// if we don't have a default case there should be an edge from discriminant to the end of
// the statement.
if !have_default_case {
cfg.add_edge(discriminant_graph_ix, end_of_switch_case_statement, EdgeType::Normal);
}
cfg.ctx(None).mark_break(end_of_switch_case_statement).resolve();
});
/* cfg */
self.leave_scope();
self.leave_node(kind);
}
fn visit_switch_case(&mut self, case: &SwitchCase<'a>) {
let kind = AstKind::SwitchCase(self.alloc(case));
self.enter_node(kind);
if let Some(expr) = &case.test {
self.record_ast_nodes();
self.visit_expression(expr);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
control_flow!(|self, cfg| cfg.append_condition_to(cfg.current_node_ix, test_node));
}
/* cfg */
control_flow!(|self, cfg| {
let after_test_graph_ix = cfg.current_node_ix;
let statements_in_switch_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(after_test_graph_ix, statements_in_switch_graph_ix, EdgeType::Jump);
});
/* cfg */
self.visit_statements(&case.consequent);
self.leave_node(kind);
}
fn visit_throw_statement(&mut self, stmt: &ThrowStatement<'a>) {
let kind = AstKind::ThrowStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let node_id = self.current_node_id;
/* cfg */
self.visit_expression(&stmt.argument);
/* cfg */
control_flow!(|self, cfg| cfg.append_throw(node_id));
/* cfg */
self.leave_node(kind);
}
fn visit_try_statement(&mut self, stmt: &TryStatement<'a>) {
let kind = AstKind::TryStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg */
let (
before_try_statement_graph_ix,
error_harness,
before_finalizer_graph_ix,
before_try_block_graph_ix,
) = control_flow!(|self, cfg| {
let before_try_statement_graph_ix = cfg.current_node_ix;
let error_harness =
stmt.handler.as_ref().map(|_| cfg.attach_error_harness(ErrorEdgeKind::Explicit));
let before_finalizer_graph_ix = stmt.finalizer.as_ref().map(|_| cfg.attach_finalizer());
let before_try_block_graph_ix = cfg.new_basic_block_normal();
(
before_try_statement_graph_ix,
error_harness,
before_finalizer_graph_ix,
before_try_block_graph_ix,
)
});
/* cfg */
self.visit_block_statement(&stmt.block);
/* cfg */
let after_try_block_graph_ix = control_flow!(|self, cfg| cfg.current_node_ix);
/* cfg */
let catch_block_end_ix = if let Some(handler) = &stmt.handler {
/* cfg */
control_flow!(|self, cfg| {
let Some(error_harness) = error_harness else {
unreachable!("we always create an error harness if we have a catch block.");
};
cfg.release_error_harness(error_harness);
let catch_block_start_ix = cfg.new_basic_block_normal();
cfg.add_edge(error_harness, catch_block_start_ix, EdgeType::Normal);
});
/* cfg */
self.visit_catch_clause(handler);
/* cfg */
control_flow!(|self, cfg| {
let catch_block_end_ix = cfg.current_node_ix;
// TODO: we shouldn't directly change the current node index.
cfg.current_node_ix = after_try_block_graph_ix;
Some(catch_block_end_ix)
})
/* cfg */
} else {
None
};
let finally_block_end_ix = if let Some(finalizer) = &stmt.finalizer {
/* cfg */
control_flow!(|self, cfg| {
let Some(before_finalizer_graph_ix) = before_finalizer_graph_ix else {
unreachable!("we always create a finalizer when there is a finally block.");
};
cfg.release_finalizer(before_finalizer_graph_ix);
let start_finally_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_finalizer_graph_ix, start_finally_graph_ix, EdgeType::Normal);
});
/* cfg */
self.visit_finally_clause(finalizer);
/* cfg */
control_flow!(|self, cfg| {
let finally_block_end_ix = cfg.current_node_ix;
// TODO: we shouldn't directly change the current node index.
cfg.current_node_ix = after_try_block_graph_ix;
Some(finally_block_end_ix)
})
/* cfg */
} else {
None
};
/* cfg */
control_flow!(|self, cfg| {
let after_try_statement_block_ix = cfg.new_basic_block_normal();
cfg.add_edge(
before_try_statement_graph_ix,
before_try_block_graph_ix,
EdgeType::Normal,
);
if let Some(catch_block_end_ix) = catch_block_end_ix {
if finally_block_end_ix.is_none() {
cfg.add_edge(
after_try_block_graph_ix,
after_try_statement_block_ix,
EdgeType::Normal,
);
cfg.add_edge(
catch_block_end_ix,
after_try_statement_block_ix,
EdgeType::Normal,
);
}
}
if let Some(finally_block_end_ix) = finally_block_end_ix {
if catch_block_end_ix.is_some() {
cfg.add_edge(
finally_block_end_ix,
after_try_statement_block_ix,
EdgeType::Normal,
);
} else {
cfg.add_edge(
finally_block_end_ix,
after_try_statement_block_ix,
if cfg.basic_block(after_try_block_graph_ix).unreachable {
EdgeType::Unreachable
} else {
EdgeType::Join
},
);
}
}
});
/* cfg */
self.leave_node(kind);
}
fn visit_catch_clause(&mut self, clause: &CatchClause<'a>) {
let kind = AstKind::CatchClause(self.alloc(clause));
self.enter_scope(ScopeFlags::empty());
clause.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
if let Some(param) = &clause.param {
self.visit_catch_parameter(param);
}
self.visit_statements(&clause.body.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_finally_clause(&mut self, clause: &BlockStatement<'a>) {
let kind = AstKind::FinallyClause(self.alloc(clause));
self.enter_scope(ScopeFlags::empty());
clause.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
self.visit_statements(&clause.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_while_statement(&mut self, stmt: &WhileStatement<'a>) {
let kind = AstKind::WhileStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_while_stmt_graph_ix, condition_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
self.record_ast_nodes();
self.visit_expression(&stmt.test);
let test_node = self.retrieve_recorded_ast_nodes().into_iter().next();
/* cfg - body basic block */
let body_graph_ix = control_flow!(|self, cfg| {
cfg.append_condition_to(condition_graph_ix, test_node);
let body_graph_ix = cfg.new_basic_block_normal();
cfg.ctx(None).default().allow_break().allow_continue();
body_graph_ix
});
/* cfg */
self.visit_statement(&stmt.body);
/* cfg - after body basic block */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.current_node_ix;
let after_while_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_while_stmt_graph_ix, condition_graph_ix, EdgeType::Normal);
cfg.add_edge(condition_graph_ix, body_graph_ix, EdgeType::Jump);
cfg.add_edge(after_body_graph_ix, condition_graph_ix, EdgeType::Backedge);
cfg.add_edge(condition_graph_ix, after_while_graph_ix, EdgeType::Normal);
cfg.ctx(None)
.mark_break(after_while_graph_ix)
.mark_continue(condition_graph_ix)
.resolve_with_upper_label();
});
/* cfg */
self.leave_node(kind);
}
fn visit_with_statement(&mut self, stmt: &WithStatement<'a>) {
let kind = AstKind::WithStatement(self.alloc(stmt));
self.enter_node(kind);
/* cfg - condition basic block */
let (before_with_stmt_graph_ix, condition_graph_ix) =
control_flow!(|self, cfg| (cfg.current_node_ix, cfg.new_basic_block_normal()));
/* cfg */
self.visit_expression(&stmt.object);
/* cfg - body basic block */
let body_graph_ix = control_flow!(|self, cfg| cfg.new_basic_block_normal());
/* cfg */
self.visit_statement(&stmt.body);
/* cfg - after body basic block */
control_flow!(|self, cfg| {
let after_body_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_with_stmt_graph_ix, condition_graph_ix, EdgeType::Normal);
cfg.add_edge(condition_graph_ix, body_graph_ix, EdgeType::Normal);
cfg.add_edge(body_graph_ix, after_body_graph_ix, EdgeType::Normal);
cfg.add_edge(condition_graph_ix, after_body_graph_ix, EdgeType::Normal);
});
/* cfg */
self.leave_node(kind);
}
fn visit_function(&mut self, func: &Function<'a>, flags: Option<ScopeFlags>) {
let kind = AstKind::Function(self.alloc(func));
self.enter_scope({
let mut flags = flags.unwrap_or(ScopeFlags::empty()) | ScopeFlags::Function;
if func.is_strict() {
flags |= ScopeFlags::StrictMode;
}
flags
});
func.scope_id.set(Some(self.current_scope_id));
/* cfg */
let (before_function_graph_ix, error_harness, function_graph_ix) =
control_flow!(|self, cfg| {
let before_function_graph_ix = cfg.current_node_ix;
cfg.push_finalization_stack();
let error_harness = cfg.attach_error_harness(ErrorEdgeKind::Implicit);
let function_graph_ix = cfg.new_basic_block_function();
cfg.ctx(None).new_function();
(before_function_graph_ix, error_harness, function_graph_ix)
});
/* cfg */
// We add a new basic block to the cfg before entering the node
// so that the correct cfg_ix is associated with the ast node.
self.enter_node(kind);
/* cfg */
control_flow!(|self, cfg| cfg.add_edge(
before_function_graph_ix,
function_graph_ix,
EdgeType::NewFunction
));
/* cfg */
if let Some(ident) = &func.id {
self.visit_binding_identifier(ident);
}
self.visit_formal_parameters(&func.params);
if let Some(body) = &func.body {
self.visit_function_body(body);
}
/* cfg */
control_flow!(|self, cfg| {
cfg.ctx(None).resolve_expect(CtxFlags::FUNCTION);
cfg.release_error_harness(error_harness);
cfg.pop_finalization_stack();
let after_function_graph_ix = cfg.new_basic_block_normal();
cfg.add_edge(before_function_graph_ix, after_function_graph_ix, EdgeType::Normal);
});
/* cfg */
if let Some(parameters) = &func.type_parameters {
self.visit_ts_type_parameter_declaration(parameters);
}
if let Some(annotation) = &func.return_type {
self.visit_ts_type_annotation(annotation);
}
self.leave_node(kind);
self.leave_scope();
}
fn visit_class(&mut self, class: &Class<'a>) {
// Class level decorators are transpiled as functions outside of the class taking the class
// itself as argument. They should be visited before class is entered. E.g., they inherit
// strict mode from the enclosing scope rather than from class.
for decorator in &class.decorators {
self.visit_decorator(decorator);
}
let kind = AstKind::Class(self.alloc(class));
// FIXME(don): Should we enter a scope when visiting class declarations?
let is_class_expr = class.r#type == ClassType::ClassExpression;
if is_class_expr {
// Class expressions create a temporary scope with the class name as its only variable
// E.g., `let c = class A { foo() { console.log(A) } }`
self.enter_scope(ScopeFlags::empty());
class.scope_id.set(Some(self.current_scope_id));
}
self.enter_node(kind);
if let Some(id) = &class.id {
self.visit_binding_identifier(id);
}
if let Some(parameters) = &class.type_parameters {
self.visit_ts_type_parameter_declaration(parameters);
}
if let Some(super_class) = &class.super_class {
self.visit_class_heritage(super_class);
}
if let Some(super_parameters) = &class.super_type_parameters {
self.visit_ts_type_parameter_instantiation(super_parameters);
}
self.visit_class_body(&class.body);
self.leave_node(kind);
if is_class_expr {
self.leave_scope();
}
}
fn visit_static_block(&mut self, block: &StaticBlock<'a>) {
let kind = AstKind::StaticBlock(self.alloc(block));
self.enter_scope(ScopeFlags::ClassStaticBlock);
block.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
self.visit_statements(&block.body);
self.leave_node(kind);
self.leave_scope();
}
fn visit_arrow_function_expression(&mut self, expr: &ArrowFunctionExpression<'a>) {
let kind = AstKind::ArrowFunctionExpression(self.alloc(expr));
self.enter_scope(ScopeFlags::Function | ScopeFlags::Arrow);
expr.scope_id.set(Some(self.current_scope_id));
/* cfg */
let (current_node_ix, error_harness, function_graph_ix) = control_flow!(|self, cfg| {
let current_node_ix = cfg.current_node_ix;
cfg.push_finalization_stack();
let error_harness = cfg.attach_error_harness(ErrorEdgeKind::Implicit);
let function_graph_ix = cfg.new_basic_block_function();
cfg.ctx(None).new_function();
(current_node_ix, error_harness, function_graph_ix)
});
/* cfg */
// We add a new basic block to the cfg before entering the node
// so that the correct cfg_ix is associated with the ast node.
self.enter_node(kind);
self.visit_formal_parameters(&expr.params);
/* cfg */
control_flow!(|self, cfg| cfg.add_edge(
current_node_ix,
function_graph_ix,
EdgeType::NewFunction
));
/* cfg */
self.visit_function_body(&expr.body);
/* cfg */
control_flow!(|self, cfg| {
cfg.ctx(None).resolve_expect(CtxFlags::FUNCTION);
cfg.release_error_harness(error_harness);
cfg.pop_finalization_stack();
cfg.current_node_ix = current_node_ix;
});
/* cfg */
if let Some(parameters) = &expr.type_parameters {
self.visit_ts_type_parameter_declaration(parameters);
}
self.leave_node(kind);
self.leave_scope();
}
fn visit_ts_enum_declaration(&mut self, decl: &TSEnumDeclaration<'a>) {
let kind = AstKind::TSEnumDeclaration(self.alloc(decl));
self.enter_node(kind);
self.visit_binding_identifier(&decl.id);
self.enter_scope(ScopeFlags::empty());
decl.scope_id.set(Some(self.current_scope_id));
for member in &decl.members {
self.visit_ts_enum_member(member);
}
self.leave_scope();
self.leave_node(kind);
}
fn visit_ts_module_declaration(&mut self, decl: &TSModuleDeclaration<'a>) {
let kind = AstKind::TSModuleDeclaration(self.alloc(decl));
self.enter_node(kind);
match &decl.id {
TSModuleDeclarationName::Identifier(ident) => self.visit_identifier_name(ident),
TSModuleDeclarationName::StringLiteral(lit) => self.visit_string_literal(lit),
}
self.enter_scope(ScopeFlags::TsModuleBlock);
decl.scope_id.set(Some(self.current_scope_id));
match &decl.body {
Some(TSModuleDeclarationBody::TSModuleDeclaration(decl)) => {
self.visit_ts_module_declaration(decl);
}
Some(TSModuleDeclarationBody::TSModuleBlock(block)) => {
self.visit_ts_module_block(block);
}
None => {}
}
self.leave_scope();
self.leave_node(kind);
}
fn visit_ts_type_parameter(&mut self, ty: &TSTypeParameter<'a>) {
let kind = AstKind::TSTypeParameter(self.alloc(ty));
self.enter_scope(ScopeFlags::empty());
ty.scope_id.set(Some(self.current_scope_id));
self.enter_node(kind);
if let Some(constraint) = &ty.constraint {
self.visit_ts_type(constraint);
}
if let Some(default) = &ty.default {
self.visit_ts_type(default);
}
self.leave_node(kind);
self.leave_scope();
}

Consensus was manually patch them, we eat our own 💩

@rzvxa
Copy link
Contributor

rzvxa commented Jul 9, 2024

It can be fine to have a slightly different visitation order for some implementations of our visits if it is necessary for them. The good thing about the generated code is that we have the correct walk order by default but it shouldn't prevent anyone from doing custom walks.
For example, a specific implementation might require visiting a statement and then visiting the decorators for it.

It is fine as long as we acknowledge the difference and know it has a reason to be that way.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 9, 2024

I'm mostly scared of scope events since they can drift apart from the source of truth and we use the source of truth version everywhere but the semantics.

Maybe we should generate the scope events as their own functions? something like scope::enter_statement(visitor) that uses the scope attribute information.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 9, 2024

or implement scope visit as a trait for our types so we can write stmt.enter_scope(visitor).

@Dunqing
Copy link
Member

Dunqing commented Jul 10, 2024

We need a visit_scope_id method. This way we can easily set the current_scope_id for different ast's scope_id in the semantic builder

@overlookmotel
Copy link
Contributor

Could we pass the AST node (&mut T) into enter_scope? That would avoid need for adding visit_scope_id method.

@rzvxa
Copy link
Contributor

rzvxa commented Jul 10, 2024

Could we pass the AST node (&mut T) into enter_scope? That would avoid need for adding visit_scope_id method.

I was thinking about this when I saw the @Dunqing's comment. But the type erasure will prevent us from doing anything useful to the T. Maybe we can implement a trait so this can be bound to something like Scoped<T>?

If we don't want to pass in T we can pass AstType + &mut node.scope_id

@overlookmotel
Copy link
Contributor

we can pass AstType + &mut node.scope_id

Maybe that's better. I think just &node.scope_id would do it. That removes need for generics (and only & is required, not &mut, since it's a Cell).

@rzvxa
Copy link
Contributor

rzvxa commented Jul 10, 2024

That removes need for generics

I used AstType instead of AstKind for the same reason, but if we don't need it, then we shouldn't pass it in. I thought we may want to do some checking depending on the type we are entering but with scopeFlags, it should be enough information already.

and only & is required, not &mut, since it's a Cell

This one I just forgot, You are absolutely right.

@Dunqing
Copy link
Member

Dunqing commented Jul 10, 2024

pass scope_id to enter_scope just enough for now

@rzvxa
Copy link
Contributor

rzvxa commented Jul 10, 2024

pass scope_id to enter_scope just enough for now

It seems we all agree on this so I'll go ahead and implement it

@overlookmotel
Copy link
Contributor

He done it! #4168

@Boshen
Copy link
Member Author

Boshen commented Jul 10, 2024

RFC is done, remains issues can be tracked in #3819

@Boshen Boshen closed this as completed Jul 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants