
Conversation

@Boshen Boshen commented Aug 5, 2025

No description provided.

@github-actions github-actions bot added A-parser Area - Parser C-performance Category - Solution not expected to change functional behavior, only performance labels Aug 5, 2025

Boshen commented Aug 5, 2025


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.


codspeed-hq bot commented Aug 5, 2025

CodSpeed Instrumentation Performance Report

Merging #12831 will not alter performance

Comparing 08-05-perf_lexer_improve_byte_handlers_for_and_ (ae0137c) with main (5a24574)1

Summary

✅ 34 untouched benchmarks

Footnotes

  1. No successful run was found on main (ae0137c) during the generation of this report, so 5a24574 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.


graphite-app bot commented Aug 5, 2025

Merge activity

@graphite-app graphite-app bot force-pushed the 08-05-perf_lexer_improve_byte_handlers_for_and_ branch from 5819c60 to ae0137c Compare August 5, 2025 11:45
@graphite-app graphite-app bot merged commit ae0137c into main Aug 5, 2025
28 checks passed
@graphite-app graphite-app bot deleted the 08-05-perf_lexer_improve_byte_handlers_for_and_ branch August 5, 2025 11:51
This was referenced Aug 6, 2025
@overlookmotel

@Boshen Just FYI, the reason for the more complicated implementation before was that lexer.peek_2_bytes() means only 1 bounds check on the number of bytes remaining in source, whereas the simpler implementation performs 2 bounds checks when the first 2 bytes are != (which is fairly common).

I see this change was suggested by AI in #12765, which confidently asserted that the simpler version is more performant. I'm not sure whether that's true. Benchmarks don't seem to show much either way.

I don't propose reverting this change, but just to let you know there was a reason why it was how it was.
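The bounds-check trade-off described above can be sketched as follows. This is a hypothetical stand-in, not the actual oxc `Lexer` type: `peek_2_bytes` here performs one range check covering both bytes, while the simpler style of peeking bytes individually pays one check per byte.

```rust
// Hypothetical sketch of the trade-off; `Lexer` is a stand-in type,
// not the real oxc implementation.
struct Lexer<'a> {
    source: &'a [u8],
    pos: usize,
}

impl<'a> Lexer<'a> {
    // One bounds check (the range `get`) covers both bytes.
    fn peek_2_bytes(&self) -> Option<[u8; 2]> {
        self.source.get(self.pos..self.pos + 2)?.try_into().ok()
    }

    // Simpler style: each peek does its own bounds check, so inspecting
    // 2 bytes can cost 2 checks when the first byte doesn't decide.
    fn peek_byte(&self, offset: usize) -> Option<u8> {
        self.source.get(self.pos + offset).copied()
    }
}

fn main() {
    let lexer = Lexer { source: b"!==", pos: 0 };
    assert_eq!(lexer.peek_2_bytes(), Some([b'!', b'=']));
    assert_eq!(lexer.peek_byte(0), Some(b'!'));
    assert_eq!(lexer.peek_byte(1), Some(b'='));

    // Near end of source: the 2-byte peek fails as a whole,
    // while single-byte peeks still succeed for the last byte.
    let short = Lexer { source: b"!", pos: 0 };
    assert_eq!(short.peek_2_bytes(), None);
    assert_eq!(short.peek_byte(0), Some(b'!'));
}
```

Whether the compiler can merge the two single-byte checks in practice depends on how the surrounding handler is written, which is presumably why benchmarks show little difference either way.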


Boshen commented Aug 11, 2025

@Boshen Just FYI, the reason for the more complicated implementation before was that lexer.peek_2_bytes() means only 1 bounds check on the number of bytes remaining in source, whereas the simpler implementation performs 2 bounds checks when the first 2 bytes are != (which is fairly common).

I see this change was suggested by AI in #12765, which confidently asserted that the simpler version is more performant. I'm not sure whether that's true. Benchmarks don't seem to show much either way.

I don't propose reverting this change, but just to let you know there was a reason why it was how it was.

My advice is to add more context to these functions to let AI know.

Btw, AI couldn't figure out all the macros, so this is the least it could pick up.

Also, it was written like this because I saw a small dent in perf in the very original implementation.

But I believe it doesn't matter any more after all the crazy optimizations.

@overlookmotel

Ah ha, sorry, you wrote this code in the first place! I thought it was lucab.

The macros could probably be removed. When I first wrote that stuff I found a sizeable perf improvement from macros over generic functions. But I know Rust better now! Maybe we could find a way to get rid of them now without hurting perf.
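The macro-vs-generic trade-off being discussed can be illustrated with a hypothetical sketch (these names are not from the oxc codebase): a macro stamps out one concrete handler per byte, while a const-generic function gets the same per-byte monomorphisation from the compiler, which is one reason the macros may no longer be needed.

```rust
// Hypothetical illustration only; not the actual oxc byte-handler macros.

// Macro style: stamps out a distinct concrete function per byte.
macro_rules! byte_matcher {
    ($name:ident, $byte:expr) => {
        fn $name(b: u8) -> bool {
            b == $byte
        }
    };
}

byte_matcher!(is_bang, b'!');

// Generic style: monomorphised per BYTE, so in principle no runtime
// cost compared to the macro-generated version.
fn is_byte<const BYTE: u8>(b: u8) -> bool {
    b == BYTE
}

fn main() {
    assert!(is_bang(b'!'));
    assert!(is_byte::<b'!'>(b'!'));
    assert!(!is_byte::<b'='>(b'!'));
}
```

Const generics landed well after much of this lexer code was originally written, which fits the "I know Rust better now" point: the equivalence holds today but wasn't as easy to express before.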

Anyway, in my view, the byte handlers are fairly well optimized. If we want to get a significant gain, we'd probably need to look at one of:

  1. Removing all the bounds checks, by using a sentinel byte for "EOF" at end of source.
  2. SIMD.
  3. Looking at the rest of the lexer code. The byte handlers take up only ~60% of total run time. What on earth is consuming the other 40%? If we could reduce the size of Lexer::next_token, it could probably be inlined into many places in the parser, which could be a solid gain.
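Option 1 above can be sketched as follows, under the assumption that a byte which cannot start a token (here 0x00) is appended as a sentinel; all names are hypothetical. The idea is that handlers stop at the sentinel instead of comparing the position against the source length on every byte.

```rust
// Hypothetical sketch of the sentinel-byte idea; not the oxc implementation.
// Caveat: a real lexer must still handle 0x00 appearing inside the source
// (e.g. in a string literal), so this only removes checks on the hot path.
const EOF_SENTINEL: u8 = 0x00;

struct SentinelSource {
    bytes: Vec<u8>, // original source followed by one sentinel byte
}

impl SentinelSource {
    fn new(source: &str) -> Self {
        let mut bytes = source.as_bytes().to_vec();
        bytes.push(EOF_SENTINEL);
        Self { bytes }
    }

    // `pos` is always valid as long as callers stop at the sentinel.
    // (Safe indexing still has a check here; the point is that the
    // invariant would justify unchecked access in a real implementation.)
    fn byte_at(&self, pos: usize) -> u8 {
        self.bytes[pos]
    }
}

fn main() {
    let src = SentinelSource::new("ab");
    let mut pos = 0;
    let mut count = 0;
    // The loop condition tests the byte value, not the remaining length.
    while src.byte_at(pos) != EOF_SENTINEL {
        count += 1;
        pos += 1;
    }
    assert_eq!(count, 2);
}
```

The sketch uses safe indexing, so it only demonstrates the control-flow shape; the actual gain would come from using the sentinel invariant to elide the per-byte length comparison.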

So, in short, I think the cost in terms of our time is probably larger than the perf gains we're likely to get from optimizing the byte handlers. My guess is that there are bigger gains to be had for less effort elsewhere.

I brought it up here because I've noticed AI sometimes makes claims of better perf which aren't backed up by any evidence. In the absence of evidence, I think it may be better to take an "if it ain't broke, don't fix it" approach: something done for a reason by a reasonably competent human is, in my view, probably a better bet than an unpredictable AI. Obviously, as the "code owner", I'm going to look at any changes to the lexer, and the cognitive load of assessing changes is a cost.

taearls pushed a commit to taearls/oxc that referenced this pull request Aug 12, 2025
