@CPunisher
Member

@CPunisher CPunisher commented Nov 3, 2025

Description:

  1. Previously, the skip-space function used two nested loops: the inner loop consumed whitespace, and the outer loop checked for slash comments. The inner loop used match for spaces and newlines and fell back to a jump table for Unicode characters. Now a single loop handles everything: the match statements are removed and every byte is dispatched through the jump table.
  2. A space is very likely to be followed by more spaces. Instead of returning to the jump table for each one, the handler consumes the whole run of spaces in place.
  3. When a newline is found, the lexer's had_line_break is set to true. Because that flag only needs to be set once, the handler can then keep skipping any spaces and newlines that follow without returning to the jump table.
  4. The original code handled Unicode at the byte level, for example by checking the lead byte's prefix to determine the encoded length and manually combining the bytes into a full UTF-8 character. Now it simply reads the next character directly.
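
The four points above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the Skipper struct, the ByteHandler alias, the TABLE static, and all handler names are hypothetical, comment handling is simplified, and only a few non-ASCII characters are recognized. Only had_line_break is a name taken from the description.

```rust
/// Hypothetical stand-in for the lexer state; only `had_line_break` is named in the PR.
struct Skipper<'a> {
    src: &'a str,
    pos: usize,
    had_line_break: bool,
}

/// A handler consumes input at `pos` and returns `true` to keep skipping.
type ByteHandler = fn(&mut Skipper<'_>) -> bool;

/// Any byte that starts a real token ends the skip loop.
fn stop(_: &mut Skipper<'_>) -> bool {
    false
}

/// Point 2: after one space or tab, consume the rest of the run in place.
fn space(s: &mut Skipper<'_>) -> bool {
    let bytes = s.src.as_bytes();
    s.pos += 1;
    while matches!(bytes.get(s.pos).copied(), Some(b' ' | b'\t')) {
        s.pos += 1;
    }
    true
}

/// Point 3: set the flag once, then consume following spaces and newlines.
fn newline(s: &mut Skipper<'_>) -> bool {
    s.had_line_break = true;
    let bytes = s.src.as_bytes();
    s.pos += 1;
    while matches!(bytes.get(s.pos).copied(), Some(b' ' | b'\t' | b'\n' | b'\r')) {
        s.pos += 1;
    }
    true
}

/// Comments are handled by the `/` handler (simplified: ASCII line ends only).
fn slash(s: &mut Skipper<'_>) -> bool {
    let src = s.src;
    let rest = &src.as_bytes()[s.pos..];
    if rest.starts_with(b"//") {
        // Stop before the line terminator so `newline` records the break.
        let len = rest.iter().position(|&b| b == b'\n' || b == b'\r');
        s.pos += len.unwrap_or(rest.len());
        true
    } else if rest.starts_with(b"/*") {
        let body = &src[s.pos + 2..];
        // An unterminated comment just runs to the end of input in this sketch.
        let end = body.find("*/").unwrap_or(body.len());
        if body[..end].contains('\n') {
            s.had_line_break = true;
        }
        s.pos = (s.pos + 2 + end + 2).min(src.len());
        true
    } else {
        false // a lone `/` is a token (division or regex), not trivia
    }
}

/// Point 4: read the next character directly instead of decoding bytes by hand.
fn unicode(s: &mut Skipper<'_>) -> bool {
    let src = s.src;
    let Some(c) = src[s.pos..].chars().next() else {
        return false;
    };
    if c == '\u{2028}' || c == '\u{2029}' {
        s.had_line_break = true;
    } else if c != '\u{00a0}' && c != '\u{feff}' {
        return false; // e.g. the start of a Unicode identifier
    }
    s.pos += c.len_utf8();
    true
}

/// Point 1: one 256-entry jump table, indexed by the byte at the cursor.
static TABLE: [ByteHandler; 256] = build_table();

const fn build_table() -> [ByteHandler; 256] {
    let mut table: [ByteHandler; 256] = [stop as ByteHandler; 256];
    table[b' ' as usize] = space;
    table[b'\t' as usize] = space;
    table[b'\n' as usize] = newline;
    table[b'\r' as usize] = newline;
    table[b'/' as usize] = slash;
    let mut i = 0x80; // every non-ASCII lead byte goes to `unicode`
    while i < 256 {
        table[i] = unicode;
        i += 1;
    }
    table
}

impl<'a> Skipper<'a> {
    fn new(src: &'a str) -> Self {
        Skipper { src, pos: 0, had_line_break: false }
    }

    /// Single loop: dispatch on the current byte until a handler says stop.
    fn skip_space(&mut self) {
        while let Some(&b) = self.src.as_bytes().get(self.pos) {
            if !TABLE[b as usize](self) {
                break;
            }
        }
    }
}
```

In this sketch the loop body is a single indexed call, and the tight inner while loops in space and newline are what avoid a table lookup per whitespace byte.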

@changeset-bot

changeset-bot bot commented Nov 3, 2025

🦋 Changeset detected

Latest commit: 0a74aa1

The changes in this PR will be included in the next version bump.

@github-actions
Contributor

github-actions bot commented Nov 3, 2025

Binary Sizes

| File | Size |
| --- | --- |
| swc.linux-x64-gnu.node | 31M (31909128 bytes) |

Commit: 5776883

@codspeed-hq

codspeed-hq bot commented Nov 3, 2025

CodSpeed Performance Report

Merging #11225 will improve performance by 11.63%

Comparing CPunisher:11-03-perf/skip-space (0a74aa1) with main (af134fa) [1]

Summary

⚡ 20 improvements
✅ 118 untouched

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| es/codegen/with-parser/large | 1.2 ms | 1.2 ms | +2.15% |
| es/lexer/angular | 7.6 ms | 7 ms | +7.37% |
| es/lexer/backbone | 1,022 µs | 938.7 µs | +8.88% |
| es/lexer/colors | 28.8 µs | 27.6 µs | +4.36% |
| es/lexer/jquery | 5.5 ms | 5 ms | +9.9% |
| es/lexer/jquery mobile | 8.5 ms | 7.7 ms | +10.27% |
| es/lexer/mootools | 4.3 ms | 3.8 ms | +11.63% |
| es/lexer/three | 19.7 ms | 18.8 ms | +4.51% |
| es/lexer/underscore | 859.5 µs | 791.2 µs | +8.64% |
| es/lexer/yui | 4.6 ms | 4.3 ms | +7.7% |
| es/parser/angular | 18.7 ms | 18.2 ms | +2.8% |
| es/parser/backbone | 2.9 ms | 2.8 ms | +2.99% |
| es/parser/cal-com | 59.3 ms | 57.9 ms | +2.45% |
| es/parser/jquery | 14.9 ms | 14.4 ms | +3.39% |
| es/parser/jquery mobile | 23.3 ms | 22.4 ms | +3.76% |
| es/parser/mootools | 11.8 ms | 11.3 ms | +3.95% |
| es/parser/three | 71.8 ms | 69.2 ms | +3.88% |
| es/parser/typescript | 399.1 ms | 388.5 ms | +2.73% |
| es/parser/underscore | 2.5 ms | 2.4 ms | +2.89% |
| es/parser/yui | 11.5 ms | 11.2 ms | +2.98% |

Footnotes

  1. No successful run was found on main (2edbd40) during the generation of this report, so af134fa was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@CPunisher CPunisher marked this pull request as ready for review November 4, 2025 03:07
@CPunisher CPunisher requested a review from a team as a code owner November 4, 2025 03:07
Copilot AI review requested due to automatic review settings November 4, 2025 03:07
Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the whitespace and comment skipping logic in the ECMAScript lexer by consolidating the implementation into a table-driven approach using byte handlers. The refactoring removes the separate SkipWhitespace struct and moves all whitespace handling directly into the Lexer implementation.

  • Introduces helper functions is_irregular_whitespace and is_irregular_line_terminator for Unicode whitespace detection
  • Replaces the old skip_space method's generic parameter with a simpler implementation that uses a byte handler lookup table
  • Consolidates comment handling directly into the byte handler for / (slash) characters
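
As a rough illustration of what the two helpers named above might look like, here is a sketch that assumes "irregular" means the non-ASCII members of the ECMAScript WhiteSpace and LineTerminator productions. The function names come from the review summary; the bodies are an assumption and may differ from the merged code.

```rust
/// Assumed definition: non-ASCII characters that ECMAScript treats as WhiteSpace
/// (NBSP, ZWNBSP/BOM, and the non-ASCII members of Unicode category Zs).
fn is_irregular_whitespace(c: char) -> bool {
    matches!(
        c,
        '\u{00a0}'                  // NO-BREAK SPACE
            | '\u{1680}'            // OGHAM SPACE MARK
            | '\u{2000}'..='\u{200a}' // EN QUAD through HAIR SPACE
            | '\u{202f}'            // NARROW NO-BREAK SPACE
            | '\u{205f}'            // MEDIUM MATHEMATICAL SPACE
            | '\u{3000}'            // IDEOGRAPHIC SPACE
            | '\u{feff}'            // ZERO WIDTH NO-BREAK SPACE
    )
}

/// Assumed definition: the two non-ASCII ECMAScript line terminators.
fn is_irregular_line_terminator(c: char) -> bool {
    matches!(c, '\u{2028}' | '\u{2029}') // LINE SEPARATOR, PARAGRAPH SEPARATOR
}
```

Under this assumption, a handler for non-ASCII bytes would decode one character, test it with these two functions, and set had_line_break only when the line terminator check succeeds.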

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| crates/swc_ecma_parser/src/lexer/whitespace.rs | Major refactoring - removes SkipWhitespace struct, adds Unicode whitespace helper functions, implements new table-driven byte handlers, and moves skip_space method into Lexer impl |
| crates/swc_ecma_parser/src/lexer/table.rs | Formatting change - adds #[rustfmt::skip] attribute and adjusts comment alignment |
| crates/swc_ecma_parser/src/lexer/state.rs | Removes generic type parameter from skip_space method calls |
| crates/swc_ecma_parser/src/lexer/mod.rs | Removes generic type parameter from skip_space method calls and removes the old skip_space implementation |

@kdy1 kdy1 added this to the Planned milestone Nov 4, 2025
@kdy1 kdy1 requested a review from a team as a code owner November 4, 2025 03:29
Member

@kdy1 kdy1 left a comment

Nice improvement! Thanks!

@kdy1 kdy1 changed the title from "perf(es/parser): optimize skip_space" to "erf(es/parser): Optimize skip_space" Nov 4, 2025
@kdy1 kdy1 changed the title from "erf(es/parser): Optimize skip_space" to "perf(es/parser): Optimize skip_space" Nov 4, 2025
@kdy1 kdy1 merged commit 541d252 into swc-project:main Nov 4, 2025
184 checks passed
@CPunisher CPunisher deleted the 11-03-perf/skip-space branch November 4, 2025 08:10
@magic-akari
Member

Impressive optimization!
