perf(semantic): avoid counting nodes #5703
base: main
Conversation
CodSpeed Performance Report: merging #5703 will improve performance by 11.24%.
Benchmarks breakdown
TODO: Allow passing existing counts in.

This is necessary when re-running semantic on an AST which has been transformed/minified. The nodes / scopes / symbols / references counts may have increased during transformation, but the source text has not changed, so estimating based on source text length will be inaccurate. Additionally, the actual counts will be accurate, unlike estimates based on source text length.

At present all tests pass, but this is an accident: the over-estimates are so large that they obscure the problem. It would be possible to produce panics with the right input.
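The idea in the TODO above can be sketched as follows. This is a hypothetical API, not oxc's actual one: the `Counts` struct, field choices, and `counts_for_build` entry point are illustrative names showing how exact counts from a previous semantic run could be passed back in, with the source-length estimate as the fallback.

```rust
/// Hypothetical sketch (not the real oxc API): semantic statistics captured
/// from a previous run, which could be passed back in when re-running
/// semantic on a transformed AST, instead of re-estimating from source length.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct Counts {
    nodes: usize,
    scopes: usize,
    symbols: usize,
    references: usize,
}

impl Counts {
    /// Over-estimate every count from source text length (the current
    /// "estimate high" approach, simplified to 1 item per byte).
    fn estimate_from_source_len(len: usize) -> Self {
        Counts { nodes: len, scopes: len, symbols: len, references: len }
    }
}

/// Use exact counts from a previous run when available; otherwise fall back
/// to the source-length over-estimate.
fn counts_for_build(source_len: usize, previous: Option<Counts>) -> Counts {
    previous.unwrap_or_else(|| Counts::estimate_from_source_len(source_len))
}

fn main() {
    let exact = Counts { nodes: 120, scopes: 8, symbols: 30, references: 45 };
    // Re-run after transformation: exact counts win over the estimate.
    assert_eq!(counts_for_build(10_000, Some(exact)), exact);
    // First run: no previous counts, so estimate from source length.
    assert_eq!(counts_for_build(7, None).nodes, 7);
}
```

The key point is that after transformation the estimate is no longer a guaranteed upper bound, while captured exact counts always are.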
As an experiment, I hard-coded the exact counts in.

This has little effect on most of the semantic benchmarks, but it does noticeably speed things up for the small-file benchmarks. So, while over-allocating is not too bad, for small files it's not as good as accurate counts. The over-allocating version is massively over-allocating.
My best guess at the reason over-allocating slows down small files is that the much larger estimated counts push the allocations up a size class in the system allocator. I believe the system allocator serves small allocations from a thread-local pool, rather than taking from the main cross-thread pool. The former is faster as it doesn't involve any synchronisation. This would only make a difference for small files; large files probably produce "large class" allocations either way. I feel that small files are the most representative of the code Oxc will be processing in the "real world".

Conclusion: Getting accurate counts in the parser will be a better solution, if it doesn't cost much in the parser. According to @DavidHancu on Discord, the cost of compiling counts in the parser is about -0.3%. That's pretty cheap.
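The "count in the parser" idea from the conclusion can be sketched roughly like this. This is an illustrative toy, not oxc's parser: the struct and method names are made up, and the point is only that bumping a counter at each node/scope creation is nearly free, so exact counts fall out of parsing with no extra AST traversal.

```rust
/// Toy sketch (hypothetical, not oxc's actual parser): maintain running
/// counts as the AST is built, so exact counts are available for free when
/// parsing finishes.
#[derive(Default)]
struct Parser {
    node_count: usize,
    scope_count: usize,
    symbol_count: usize,
}

impl Parser {
    /// Called wherever the parser allocates an AST node.
    fn record_node(&mut self) {
        self.node_count += 1;
    }

    /// Called wherever the parser enters a new lexical scope.
    fn record_scope(&mut self) {
        self.scope_count += 1;
    }

    /// Called wherever the parser sees a binding.
    fn record_symbol(&mut self) {
        self.symbol_count += 1;
    }
}

fn main() {
    let mut p = Parser::default();
    // Pretend to parse `function f() { let x = 1; }`:
    p.record_node();   // function declaration
    p.record_symbol(); // `f`
    p.record_scope();  // function body scope
    p.record_node();   // variable declaration
    p.record_symbol(); // `x`
    p.record_node();   // numeric literal
    assert_eq!(p.node_count, 3);
    assert_eq!(p.scope_count, 1);
    assert_eq!(p.symbol_count, 2);
}
```

A few increments per node is consistent with the ~0.3% parser cost mentioned above.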
Instead of visiting the AST to count AST nodes in `SemanticBuilder`, just make high estimates of the counts for nodes, scopes, symbols and references, based on source text length. These estimates are the maximum number of items theoretically possible for a given source text length. Almost always they'll be vast over-estimates, but they should never be under-estimates.

This ensures `AstNodes`, `ScopeTree` and `SymbolTable`'s allocations will not grow during their construction (which is a major performance hit), while avoiding the complete AST traversal we were using previously to get accurate counts.

This "estimate high" approach is a viable strategy on 64-bit systems with virtual memory, where:

So use this faster "estimate high" method as standard, and the slower visitor-based counter on 32-bit systems and WASM.
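The "estimate high" mechanism described above can be sketched like this. The specific ratios here (one node per byte, one scope per two bytes) are assumptions for illustration, not oxc's actual constants; the real point is that pre-allocating to any guaranteed upper bound means the collections never regrow during construction.

```rust
/// Sketch of "estimate high" under assumed ratios (oxc's real constants may
/// differ): upper-bound the item counts from source length alone. A node
/// consumes at least one byte of source, so `nodes <= source_len`; a scope
/// needs at least two bytes (e.g. `{}`), so `scopes <= source_len / 2`.
fn estimated_counts(source_len: usize) -> (usize, usize, usize, usize) {
    let nodes = source_len;
    let scopes = source_len / 2;
    let symbols = source_len / 2;
    let references = source_len;
    (nodes, scopes, symbols, references)
}

fn main() {
    let source = "let x = 1; console.log(x);";
    let (nodes, ..) = estimated_counts(source.len());

    // Pre-allocate to the over-estimate. Pushes within capacity never
    // reallocate, so construction involves no growth.
    let mut ast_nodes: Vec<u32> = Vec::with_capacity(nodes);
    let cap_before = ast_nodes.capacity();
    for i in 0..10 {
        ast_nodes.push(i); // actual count is far below the estimate
    }
    assert_eq!(ast_nodes.capacity(), cap_before); // no reallocation occurred
    assert!(ast_nodes.len() <= nodes);
}
```

On 64-bit systems with virtual memory the untouched tail of the over-allocation mostly costs address space rather than physical memory, which is why the over-estimate is cheap there.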
Note: I tried using `shrink_to_fit` on the `Vec`s once the semantic run is complete, to free up the excess memory. But this caused the `Vec`s to all reallocate, which negated the performance gains. It was an artefact of the custom global allocator we use in our benchmarking setup; shrinking allocations does not cause reallocation.
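The trade-off in that note can be demonstrated in miniature. Whether `shrink_to_fit` moves the buffer is allocator-dependent (Rust only guarantees the capacity stays at least the length), which is exactly why a custom benchmarking allocator can behave differently from the system one:

```rust
fn main() {
    // Large over-allocation, small actual contents - the shape the
    // "estimate high" strategy produces.
    let mut v: Vec<u8> = Vec::with_capacity(1 << 20);
    v.extend_from_slice(b"small actual contents");
    assert!(v.capacity() >= 1 << 20);

    // May reallocate and copy, depending on the allocator. Rust guarantees
    // only that capacity() >= len() afterwards.
    v.shrink_to_fit();
    assert!(v.capacity() >= v.len());
    assert_eq!(&v[..5], b"small");
}
```

Under an allocator that shrinks in place, this frees the excess memory with no copy; under one that reallocates, the copy can wipe out the gains from avoiding growth during construction.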