-
-
Notifications
You must be signed in to change notification settings - Fork 485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(ast, ast_codegen): optimize AST structs' memory layouts #4404
perf(ast, ast_codegen): optimize AST structs' memory layouts #4404
Conversation
Your org has enabled the Graphite merge queue for merging into mainAdd the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix. You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link. |
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
CodSpeed Performance ReportMerging #4404 will not alter performanceComparing Summary
|
tasks/ast_codegen/src/layout.rs
Outdated
// well known types | ||
// TODO: generate const assertions for these in the ast crate | ||
Span => { 64: Layout::of::<oxc_span::Span>(), 32: Layout::of::<oxc_span::Span>() }, | ||
Atom => { 64: Layout::of::<oxc_span::Atom>(), 32: Layout::of::<oxc_span::Atom>() }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Span
and Atom
, can we mark them #[ast]
and have codegen parse the files in oxc_span
, same as it does for oxc_ast
? We need both types to be made #[repr(C)]
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly believe we shouldn't extend the ast
marker to types outside of the AST crate. Those one-off types can always be made repr(C)
without all this infra. Span is already in the ideal order, And Vec type doesn't expose its fields so it can be reordered manually. Atom is a transparent type so no need for this stuff whatsoever.
But that's just my preference, I still haven't seen any real need to process those types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With that said, I have nothing against hard coding them as opposed to using a layout of method. I just did it for the sake of consistency if we regenerate on multiple architectures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Span
and Atom
we can mark them #[repr(C)]
and hard-code them if you prefer. But:
- AST transfer needs to know the field order for
Span
(so we'd have to hard-code that too). - We can't avoid extending outside of
oxc_ast
crate, because there are various types inoxc_syntax
crate which are part of the AST e.g.AssignmentOperator
. These types need#[repr(C)]
and explicit discriminants. So it seems to me that simplest way to do that is#[ast]
+ codegen.
What's the reason for your opposition to codegen reaching into other crates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've mulled over this, I guess you are right it would be easier if we do so.
I wanted to keep the code generation's input limited to AST types, Vec, Box, Span, and Atom obviously aren't AST types. But your argument for types defined in oxc_syntax
is valid.
I'm not familiar with all AST-related stuff in the oxc_syntax
crate, But All I can remember is unit enums. So they should be easy to make repr(C)
.
Initially, I was slightly biased against doing this but now I'm natural. It sure helps not to have to worry about those types. Even if they all are simple enums.
I'll do it in a follow-up PR.
@overlookmotel I have a problem understanding something with repr C and option types. Can you help me with it? Take a look at this struct definition: oxc/crates/oxc_ast/src/ast/ts.rs Lines 829 to 836 in b1b66e2
If you get the size of the Is it correct in the first place? If my assumptions are right; How does it work on FFI? I guess option doesn't mean nullable for that pointer types are a better fit, But packing this information in structures wouldn't break the C compatibility? Does it only happen on pointers because there is free space in the address? Right now I only fold the option if there is room at the end of type(either the last field of struct or if all enum variants have 1 bit headroom). Should I also consider all the padding in between fields as viable space for folding options? Edit:I went with my intuition but it might very well be wrong so please let me know if I'm doing it incorrectly. |
34cacf6
to
38b4fc8
Compare
671d536
to
5112832
Compare
To clarify one thing:
"Niche" means a pattern of bits which is illegal for the type to use. In the case of I think the idea for FFI is that if you're getting a pointer from C, that C pointer can be either null or non-null. That maps cleanly onto Rust's The rules around niches as far as I understand them are:
Same thing for all the Some types have multiple niches. e.g.
(By the way, these complications around niches in enums is what motivated the "squashed enums" PR #3115. It was intractable to figure out what niches and discriminants Rust would produce when you have multiple enums, each with niches, nested enum-in-enum-in-enum). If a struct has a field which has a niche, then the struct inherits that niche too (as you've found). The hard part is when a struct has more than 1 field with a niche - which niche does the struct use? I don't know the answer to that one - but we don't need an answer anyway right now - if Padding cannot be used as a niche - I don't entirely understand that one, but pretty sure it's true. NB: One oddity is that One last thing: |
Thank you, I'll rework it to consider niches correctly. |
38b4fc8
to
af18a31
Compare
0ebd4eb
to
d8867e2
Compare
7d25dc6
to
719efba
Compare
af18a31
to
fbaa37a
Compare
719efba
to
b23e99f
Compare
e822ba6
to
425f59c
Compare
07ae601
to
d84a2db
Compare
edf5eb8
to
ed8675c
Compare
31a5a1c
to
42740c6
Compare
ed8675c
to
7118f04
Compare
42740c6
to
8cdd9d9
Compare
7118f04
to
4cc58bf
Compare
8cdd9d9
to
d0297db
Compare
4cc58bf
to
9579449
Compare
d0297db
to
717cd87
Compare
717cd87
to
01338e2
Compare
9579449
to
97c92f9
Compare
d07ce36
to
0c52c0d
Compare
18f1433
to
67e0203
Compare
375b347
to
297401d
Compare
67e0203
to
80e3195
Compare
Just to say, please don't delete this branch. We will want to return to this after we have added |
No description provided.