perf(semantic): avoid counting nodes #5703
base: main
Conversation
CodSpeed Performance Report: merging #5703 will improve performance by 11.24%.
Benchmarks breakdown
TODO: Allow passing existing counts in.

This is necessary when re-running semantic on an AST which has been transformed/minified. The nodes / scopes / symbols / references counts may have increased during transformation, but the source text has not changed, so estimating based on source text length will be inaccurate. Additionally, the actual counts will be accurate, unlike estimates based on source text length.

At present all tests pass, but this is an accident: the over-estimates are so large that they obscure the problem. It would be possible to produce panics with the right input.
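The idea in the TODO above can be sketched as follows. This is a hypothetical API, not oxc's actual one: the `Counts` struct, field choices, and `counts_for_build` entry point are illustrative names showing how exact counts from a previous semantic run could be passed back in, with the source-length estimate as the fallback.

```rust
/// Hypothetical sketch (not the real oxc API): semantic statistics captured
/// from a previous run, which could be passed back in when re-running
/// semantic on a transformed AST, instead of re-estimating from source length.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct Counts {
    nodes: usize,
    scopes: usize,
    symbols: usize,
    references: usize,
}

impl Counts {
    /// Over-estimate every count from source text length (the current
    /// "estimate high" approach, simplified to 1 item per byte).
    fn estimate_from_source_len(len: usize) -> Self {
        Counts { nodes: len, scopes: len, symbols: len, references: len }
    }
}

/// Use exact counts from a previous run when available; otherwise fall back
/// to the source-length over-estimate.
fn counts_for_build(source_len: usize, previous: Option<Counts>) -> Counts {
    previous.unwrap_or_else(|| Counts::estimate_from_source_len(source_len))
}

fn main() {
    let exact = Counts { nodes: 120, scopes: 8, symbols: 30, references: 45 };
    // Re-run after transformation: exact counts win over the estimate.
    assert_eq!(counts_for_build(10_000, Some(exact)), exact);
    // First run: no previous counts, so estimate from source length.
    assert_eq!(counts_for_build(7, None).nodes, 7);
}
```

The key point is that after transformation the estimate is no longer a guaranteed upper bound, while captured exact counts always are.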
As an experiment, I hard-coded the exact counts in.

This has little effect on most of the semantic benchmarks, but it does noticeably speed things up for the small-file benchmarks. So, while over-allocating is not too bad, for small files it's not as good as accurate counts. The over-allocating version is massively over-allocating.
My best guess at the reason over-allocating slows down small files is that the much larger estimated counts push the allocations up a size class in the system allocator. I believe the system allocator serves small allocations from a thread-local pool, rather than taking from the main cross-thread pool. The former is faster as it doesn't involve any synchronisation. This would only make a difference for small files; large files probably produce "large class" allocations either way. I feel that small files are the most representative of the code Oxc will be processing in the "real world".

Conclusion: Getting accurate counts in the parser will be a better solution, if it doesn't cost much in the parser. According to @DavidHancu on Discord, the cost of compiling counts in the parser is about -0.3%. That's pretty cheap.
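The "count in the parser" idea from the conclusion can be sketched roughly like this. This is an illustrative toy, not oxc's parser: the struct and method names are made up, and the point is only that bumping a counter at each node/scope creation is nearly free, so exact counts fall out of parsing with no extra AST traversal.

```rust
/// Toy sketch (hypothetical, not oxc's actual parser): maintain running
/// counts as the AST is built, so exact counts are available for free when
/// parsing finishes.
#[derive(Default)]
struct Parser {
    node_count: usize,
    scope_count: usize,
    symbol_count: usize,
}

impl Parser {
    /// Called wherever the parser allocates an AST node.
    fn record_node(&mut self) {
        self.node_count += 1;
    }

    /// Called wherever the parser enters a new lexical scope.
    fn record_scope(&mut self) {
        self.scope_count += 1;
    }

    /// Called wherever the parser sees a binding.
    fn record_symbol(&mut self) {
        self.symbol_count += 1;
    }
}

fn main() {
    let mut p = Parser::default();
    // Pretend to parse `function f() { let x = 1; }`:
    p.record_node();   // function declaration
    p.record_symbol(); // `f`
    p.record_scope();  // function body scope
    p.record_node();   // variable declaration
    p.record_symbol(); // `x`
    p.record_node();   // numeric literal
    assert_eq!(p.node_count, 3);
    assert_eq!(p.scope_count, 1);
    assert_eq!(p.symbol_count, 2);
}
```

A few increments per node is consistent with the ~0.3% parser cost mentioned above.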
Instead of visiting the AST to count AST nodes in `SemanticBuilder`, just make high estimates of the counts for nodes, scopes, symbols and references, based on source text length. These estimates are the maximum number of items theoretically possible for a given source text length. Almost always they'll be vast over-estimates, but they should never be under-estimates.

This ensures `AstNodes`, `ScopeTree` and `SymbolTable`'s allocations will not grow during their construction (which is a major performance hit), while avoiding the complete AST traversal we were using previously to get accurate counts.

This "estimate high" approach is a viable strategy on 64-bit systems with virtual memory, where:

So use this faster "estimate high" method as standard, and the slower visitor-based counter on 32-bit systems and WASM.
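The "estimate high" mechanism described above can be sketched like this. The specific ratios here (one node per byte, one scope per two bytes) are assumptions for illustration, not oxc's actual constants; the real point is that pre-allocating to any guaranteed upper bound means the collections never regrow during construction.

```rust
/// Sketch of "estimate high" under assumed ratios (oxc's real constants may
/// differ): upper-bound the item counts from source length alone. A node
/// consumes at least one byte of source, so `nodes <= source_len`; a scope
/// needs at least two bytes (e.g. `{}`), so `scopes <= source_len / 2`.
fn estimated_counts(source_len: usize) -> (usize, usize, usize, usize) {
    let nodes = source_len;
    let scopes = source_len / 2;
    let symbols = source_len / 2;
    let references = source_len;
    (nodes, scopes, symbols, references)
}

fn main() {
    let source = "let x = 1; console.log(x);";
    let (nodes, ..) = estimated_counts(source.len());

    // Pre-allocate to the over-estimate. Pushes within capacity never
    // reallocate, so construction involves no growth.
    let mut ast_nodes: Vec<u32> = Vec::with_capacity(nodes);
    let cap_before = ast_nodes.capacity();
    for i in 0..10 {
        ast_nodes.push(i); // actual count is far below the estimate
    }
    assert_eq!(ast_nodes.capacity(), cap_before); // no reallocation occurred
    assert!(ast_nodes.len() <= nodes);
}
```

On 64-bit systems with virtual memory the untouched tail of the over-allocation mostly costs address space rather than physical memory, which is why the over-estimate is cheap there.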
Note: I tried using `shrink_to_fit` on the `Vec`s once the semantic run is complete, to free up the excess memory. But this caused the `Vec`s to all reallocate, which negated the performance gains. It was an artefact of the custom global allocator we use in our benchmarking setup; shrinking allocations does not cause reallocation.
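The trade-off in that note can be demonstrated in miniature. Whether `shrink_to_fit` moves the buffer is allocator-dependent (Rust only guarantees the capacity stays at least the length), which is exactly why a custom benchmarking allocator can behave differently from the system one:

```rust
fn main() {
    // Large over-allocation, small actual contents - the shape the
    // "estimate high" strategy produces.
    let mut v: Vec<u8> = Vec::with_capacity(1 << 20);
    v.extend_from_slice(b"small actual contents");
    assert!(v.capacity() >= 1 << 20);

    // May reallocate and copy, depending on the allocator. Rust guarantees
    // only that capacity() >= len() afterwards.
    v.shrink_to_fit();
    assert!(v.capacity() >= v.len());
    assert_eq!(&v[..5], b"small");
}
```

Under an allocator that shrinks in place, this frees the excess memory with no copy; under one that reallocates, the copy can wipe out the gains from avoiding growth during construction.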