perf: compiler performance optimization #9674

Merged
merged 56 commits into from
Nov 25, 2023

@yyx990803 yyx990803 commented Nov 25, 2023

Compiler Performance Optimization

Overall 44% Faster SFC Compilation

Based on a benchmark run against all the SFC files from the Elk repo, which should be largely representative of a real-world app:

benching with:
- 225 files
- isProd: true
- sourceMap: true
- 3 warmup runs
- 10 bench runs

old compiler: 1513ms
new compiler: 845ms
new compiler is 44.15% faster.

---

benching with:
- 225 files
- isProd: true
- sourceMap: true
- 5 warmup runs
- 20 bench runs

old compiler: 2872ms
new compiler: 1618ms
new compiler is 43.66% faster.
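
As a quick check of the numbers above, the reported percentages can be reproduced by dividing the time saved by the old compiler's time. The helper name `percentFaster` is made up for this sketch, and the formula is an assumption that happens to match both reported figures:

```typescript
// Time saved relative to the old compiler, expressed as a percentage.
// Assumption: this is how the benchmark script derives "X% faster".
function percentFaster(oldMs: number, newMs: number): number {
  return ((oldMs - newMs) / oldMs) * 100
}

const firstRun = percentFaster(1513, 845)
const secondRun = percentFaster(2872, 1618)

console.log(firstRun.toFixed(2)) // "44.15"
console.log(secondRun.toFixed(2)) // "43.66"
```

Note that this is a reduction in elapsed time. Expressed as a throughput ratio, 1513ms / 845ms is roughly 1.79x.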

Note: this benchmark only measures the time spent by @vue/compiler-sfc parsing SFC files and converting them to JavaScript and source maps. It does not include CSS processing, JavaScript bundling, and minification. So the impact on the build time of an entire project will be less significant, but should still be noticeable.

Twice as Fast Parsing

The parser has been rewritten from the ground up and is twice as fast, i.e. it parses the same template in half the time compared to the old one.

  • The old parser was a recursive descent parser that used a lot of regular expressions and inefficient lookahead searches.

  • The new parser uses a finite-state-machine tokenizer forked from htmlparser2. It iterates through the input in a linear fashion with minimal lookahead and backtracking, and largely removes reliance on regular expressions.
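
To illustrate the general idea of a state-machine tokenizer, here is a minimal sketch. It is not the htmlparser2 or Vue tokenizer: the state names, the `Token` shape, and the `tokenize` function are all hypothetical, and attributes, comments, and entities are deliberately not handled. What it does show is a single pass over character codes with no regular expressions and no backtracking:

```typescript
// Hypothetical, heavily simplified states; a real tokenizer has many more.
type State = 'text' | 'beforeTagName' | 'inTagName' | 'inClosingTagName'

type Token =
  | { type: 'text'; value: string }
  | { type: 'open'; name: string }
  | { type: 'close'; name: string }

const LT = 60 // "<"
const GT = 62 // ">"
const SLASH = 47 // "/"

function tokenize(input: string): Token[] {
  const tokens: Token[] = []
  let state: State = 'text'
  let sectionStart = 0 // start index of the token currently being read

  // Each character is visited exactly once.
  for (let i = 0; i < input.length; i++) {
    const c = input.charCodeAt(i)
    if (state === 'text') {
      if (c === LT) {
        if (i > sectionStart) {
          tokens.push({ type: 'text', value: input.slice(sectionStart, i) })
        }
        state = 'beforeTagName'
      }
    } else if (state === 'beforeTagName') {
      if (c === SLASH) {
        state = 'inClosingTagName'
        sectionStart = i + 1
      } else {
        state = 'inTagName'
        sectionStart = i
      }
    } else if (c === GT) {
      // state is 'inTagName' or 'inClosingTagName' here
      const name = input.slice(sectionStart, i)
      tokens.push(
        state === 'inTagName' ? { type: 'open', name } : { type: 'close', name }
      )
      state = 'text'
      sectionStart = i + 1
    }
  }

  // Flush any trailing text after the last tag.
  if (state === 'text' && sectionStart < input.length) {
    tokens.push({ type: 'text', value: input.slice(sectionStart) })
  }
  return tokens
}

console.log(tokenize('<p>hi</p>'))
```

The only bookkeeping is the current state and one start index, which is what keeps the work per character constant.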

40% Faster Codegen

  • Optimized line / column calculation

    Previously, every CodegenContext.push call involved iterating over the pushed string to check for newlines in order to record the correct line and column position for source map generation. Profiling reveals that this iteration, done in advancePositionWithMutation(), results in non-trivial overhead. In a95d76e, this is optimized so that the string iteration can be skipped if the presence or location of the newline is known ahead of time.

  • Optimized source map generation

    We noticed that SourceMapGenerator.addMapping is spending a ton of time just normalizing and validating input arguments. Given that we always know the exact arguments that we are providing, we can avoid the overhead by directly adding the mapping. This is done in 8928473.
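
The line / column fast path described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from a95d76e: the `advance` function, the `Position` shape, and the hint constants are names invented for this example. The caller passes a hint about what it already knows, and the per-character scan only runs when nothing is known:

```typescript
interface Position {
  offset: number
  line: number
  column: number
}

// Hypothetical hint values. Any hint >= 0 is the index of the single
// newline in the pushed string.
const NEWLINE_NONE = -2 // caller guarantees there is no newline
const NEWLINE_UNKNOWN = -3 // nothing known: fall back to scanning

function advance(pos: Position, code: string, hint: number = NEWLINE_UNKNOWN): void {
  pos.offset += code.length

  if (hint === NEWLINE_NONE) {
    // Fast path: no newline, so only the column moves.
    pos.column += code.length
    return
  }
  if (hint >= 0) {
    // Fast path: exactly one newline at a known index.
    pos.line++
    pos.column = code.length - hint
    return
  }

  // Slow path: scan every character looking for newlines.
  let lines = 0
  let lastNewline = -1
  for (let i = 0; i < code.length; i++) {
    if (code.charCodeAt(i) === 10 /* "\n" */) {
      lines++
      lastNewline = i
    }
  }
  pos.line += lines
  pos.column =
    lastNewline === -1 ? pos.column + code.length : code.length - lastNewline
}

const fast: Position = { offset: 0, line: 1, column: 1 }
advance(fast, 'const a = 1', NEWLINE_NONE)
advance(fast, ';\n  ', 1)
console.log(fast)
```

Both paths produce the same position for the same input; the hint only lets the caller skip the loop.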

Eliminate SFC Template Double Parse & Source Map Overhead

SFC parsing has some different requirements compared to normal Vue templates: the content of all root-level tags other than <template> should be treated as plain text, due to the need to support custom blocks. Full tag structure parsing is still needed for <template> because there could be nested <template> tags inside. However, the resulting AST was not reusable for template compilation due to the way the old parser's options were designed.

This means for every SFC we had to perform two parse calls: one for the SFC blocks, and one for the actual template content. In addition, because the second template parse is performed on the already extracted content, we need to re-map its source map locations to be relative to the entire SFC. This is actually very costly and should be avoided.

The new parser solves this problem by treating SFC parsing logic as a first-class concern. As a result, we can directly reuse the AST for the <template> block for subsequent transform and codegen, and also avoid the expensive source map re-mapping.

API Changes

The refactor introduced some minor changes to the AST format and @vue/compiler-core's parser options. These properties and options are mostly used internally or in custom compilers (which are really advanced use cases), so they should not affect most end users.

AST Format Changes

  • New property AttributeNode.nameLoc - SourceLocation for the attribute name. This is needed when generating props objects. Previously we were calculating this in transformElement, now it's included directly in the AST.

  • New property DirectiveNode.rawName: the raw string of the directive attribute name. This is used only during the parsing phase to check for duplicated attributes.

  • Removed but later restored properties: ElementNode.isSelfClosing and SourceLocation.source. These two properties were removed in this PR but later restored because they are needed in @vue/language-tools.

Parser Options Changes

  • New option: parseMode

    • type: 'base' | 'html' | 'sfc'
    • default: 'base'

    To maximize performance, some logic for handling HTML-specific behavior (e.g. special handling of content inside <script>) is handled directly in the tokenizer. Such behavior is disabled in the default 'base' mode.

    In 'sfc' mode, the content of all root-level tags except <template> is treated as plain text, while the content of <template> is parsed in 'html' mode.

  • New option: ns

    This new option can be used to specify the root namespace when parsing a template.

  • Removed option: getTextMode

    The equivalent logic for this option has been hard-coded into the tokenizer for better performance. This does in theory remove some flexibility in terms of defining an alternative list of tags that should be treated as plain text containers, but the use case is non-existent in practice.
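
The three parseMode values described above can be modeled as a small decision function. This is a sketch of the documented behavior only, not the tokenizer's actual code: `isPlainTextContainer`, its parameters, and the `HTML_RAW_TEXT_TAGS` set are hypothetical, and the set lists only <script> because that is the one example the description names:

```typescript
type ParseMode = 'base' | 'html' | 'sfc'

// Illustrative only: <script> is the example given for HTML-specific
// handling; the real tokenizer may special-case other tags as well.
const HTML_RAW_TEXT_TAGS = new Set(['script'])

function isPlainTextContainer(
  mode: ParseMode,
  tag: string,
  isRootLevel: boolean
): boolean {
  if (mode === 'base') {
    // Default mode: no HTML-specific behavior.
    return false
  }
  if (mode === 'sfc' && isRootLevel) {
    // Every root-level SFC block except <template> is plain text.
    return tag !== 'template'
  }
  // 'html' mode, or inside <template> in 'sfc' mode.
  return HTML_RAW_TEXT_TAGS.has(tag)
}

console.log(isPlainTextContainer('sfc', 'style', true)) // true
console.log(isPlainTextContainer('sfc', 'template', true)) // false
console.log(isPlainTextContainer('base', 'script', false)) // false
```

Because the decision depends only on the mode and the tag, it can live inside the tokenizer instead of being supplied through a callback such as the removed getTextMode option.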

Size Increase

The refactor results in a slightly larger runtime compiler. Size change for the global build including both compiler and runtime (min+brotli): 44.5kb => 46.4kb (+1.9kb). This is acceptable considering the performance improvements and the fact that the size increase does not affect projects that use a build step.

magic-string's trim method uses a regex check for aborting which turns
out to be extremely expensive - it can take up to 10% of the time in total SFC
compilation! The usage here is purely aesthetic so simply removing it
for a big perf gain is well worth it.

Previously, many CodegenContext.push() calls were unnecessarily
iterating through the entire pushed string to find newlines, when we
already know the newline positions for most calls. Providing fast
paths for these calls significantly improves codegen performance when
source map is needed.

In benchmarks, this PR improves full SFC compilation performance by ~6%.
@yyx990803 yyx990803 merged commit 6ec85ae into minor Nov 25, 2023
6 checks passed
@yyx990803 yyx990803 deleted the parser-rewrite branch November 25, 2023 08:18
@Doctor-wu
Member

🚀

@jingyuexing

🎉🎉

@oceangravity

Just wow 😮

@ziveen

ziveen commented Nov 27, 2023

👍🏻

@Leen27

Leen27 commented Jan 3, 2024

👍🏻
