
Conversation

@Boshen Boshen commented Aug 5, 2025

No description provided.

@github-actions github-actions bot added A-parser Area - Parser C-performance Category - Solution not expected to change functional behavior, only performance labels Aug 5, 2025

Boshen commented Aug 5, 2025


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • 0-merge - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.


codspeed-hq bot commented Aug 5, 2025

CodSpeed Instrumentation Performance Report

Merging #12831 will not alter performance

Comparing 08-05-perf_lexer_improve_byte_handlers_for_and_ (ae0137c) with main (5a24574)1

Summary

✅ 34 untouched benchmarks

Footnotes

  1. No successful run was found on main (ae0137c) during the generation of this report, so 5a24574 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.


graphite-app bot commented Aug 5, 2025

Merge activity

@graphite-app graphite-app bot force-pushed the 08-05-perf_lexer_improve_byte_handlers_for_and_ branch from 5819c60 to ae0137c Compare August 5, 2025 11:45
@graphite-app graphite-app bot merged commit ae0137c into main Aug 5, 2025
28 checks passed
@graphite-app graphite-app bot deleted the 08-05-perf_lexer_improve_byte_handlers_for_and_ branch August 5, 2025 11:51
This was referenced Aug 6, 2025
@overlookmotel

@Boshen Just FYI, the reason for the more complicated implementation before was that lexer.peek_2_bytes() means only 1 bounds check on the number of bytes remaining in source, whereas the simpler implementation performs 2 bounds checks when the first 2 bytes are != (which is fairly common).

I see this change was suggested by AI in #12765, which confidently asserted that the simpler version is more performant. I'm not sure whether that's true. Benchmarks don't seem to show much either way.

I don't propose reverting this change, but just to let you know there was a reason why it was how it was.
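The bounds-check trade-off described above can be sketched as follows. This is a hypothetical stand-in, not the actual oxc `Lexer` type: `peek_2_bytes` here performs one range check covering both bytes, while the simpler style of peeking bytes individually pays one check per byte.

```rust
// Hypothetical sketch of the trade-off; `Lexer` is a stand-in type,
// not the real oxc implementation.
struct Lexer<'a> {
    source: &'a [u8],
    pos: usize,
}

impl<'a> Lexer<'a> {
    // One bounds check (the range `get`) covers both bytes.
    fn peek_2_bytes(&self) -> Option<[u8; 2]> {
        self.source.get(self.pos..self.pos + 2)?.try_into().ok()
    }

    // Simpler style: each peek does its own bounds check, so inspecting
    // 2 bytes can cost 2 checks when the first byte doesn't decide.
    fn peek_byte(&self, offset: usize) -> Option<u8> {
        self.source.get(self.pos + offset).copied()
    }
}

fn main() {
    let lexer = Lexer { source: b"!==", pos: 0 };
    assert_eq!(lexer.peek_2_bytes(), Some([b'!', b'=']));
    assert_eq!(lexer.peek_byte(0), Some(b'!'));
    assert_eq!(lexer.peek_byte(1), Some(b'='));

    // Near end of source: the 2-byte peek fails as a whole,
    // while single-byte peeks still succeed for the last byte.
    let short = Lexer { source: b"!", pos: 0 };
    assert_eq!(short.peek_2_bytes(), None);
    assert_eq!(short.peek_byte(0), Some(b'!'));
}
```

Whether the compiler can merge the two single-byte checks in practice depends on how the surrounding handler is written, which is presumably why benchmarks show little difference either way.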


Boshen commented Aug 11, 2025

@Boshen Just FYI, the reason for the more complicated implementation before was that lexer.peek_2_bytes() means only 1 bounds check on the number of bytes remaining in source, whereas the simpler implementation performs 2 bounds checks when the first 2 bytes are != (which is fairly common).

I see this change was suggested by AI in #12765, which confidently asserted that the simpler version is more performant. I'm not sure whether that's true. Benchmarks don't seem to show much either way.

I don't propose reverting this change, but just to let you know there was a reason why it was how it was.

My advice is to add more context to these functions to let AI know.

Btw, AI couldn't figure out all the macros, so this is the least it could pick up.

Also, it was written like this because I saw a small dent in perf in the very original implementation.

But I believe it doesn't matter any more after all the crazy optimizations.

@overlookmotel

Ah ha, sorry, you wrote this code in the first place! I thought it was lucab.

The macros could probably be removed. When I first wrote that stuff I found a sizeable perf improvement from macros over generic functions. But I know Rust better now! Maybe we could find a way to get rid of them now without hurting perf.
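The macro-vs-generic trade-off being discussed can be illustrated with a hypothetical sketch (these names are not from the oxc codebase): a macro stamps out one concrete handler per byte, while a const-generic function gets the same per-byte monomorphisation from the compiler, which is one reason the macros may no longer be needed.

```rust
// Hypothetical illustration only; not the actual oxc byte-handler macros.

// Macro style: stamps out a distinct concrete function per byte.
macro_rules! byte_matcher {
    ($name:ident, $byte:expr) => {
        fn $name(b: u8) -> bool {
            b == $byte
        }
    };
}

byte_matcher!(is_bang, b'!');

// Generic style: monomorphised per BYTE, so in principle no runtime
// cost compared to the macro-generated version.
fn is_byte<const BYTE: u8>(b: u8) -> bool {
    b == BYTE
}

fn main() {
    assert!(is_bang(b'!'));
    assert!(is_byte::<b'!'>(b'!'));
    assert!(!is_byte::<b'='>(b'!'));
}
```

Const generics landed well after much of this lexer code was originally written, which fits the "I know Rust better now" point: the equivalence holds today but wasn't as easy to express before.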

Anyway, in my view, the byte handlers are fairly well optimized. If we want to get a significant gain, we'd probably need to look at one of:

  1. Removing all the bounds checks, by using a sentinel byte for "EOF" at end of source.
  2. SIMD.
  3. Looking at the rest of the lexer code. The byte handlers take up only ~60% of total run time. What on earth is consuming the other 40%? If we could reduce the size of Lexer::next_token, it could probably be inlined into many places in the parser, which could be a solid gain.
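Option 1 above can be sketched as follows, under the assumption that a byte which cannot start a token (here 0x00) is appended as a sentinel; all names are hypothetical. The idea is that handlers stop at the sentinel instead of comparing the position against the source length on every byte.

```rust
// Hypothetical sketch of the sentinel-byte idea; not the oxc implementation.
// Caveat: a real lexer must still handle 0x00 appearing inside the source
// (e.g. in a string literal), so this only removes checks on the hot path.
const EOF_SENTINEL: u8 = 0x00;

struct SentinelSource {
    bytes: Vec<u8>, // original source followed by one sentinel byte
}

impl SentinelSource {
    fn new(source: &str) -> Self {
        let mut bytes = source.as_bytes().to_vec();
        bytes.push(EOF_SENTINEL);
        Self { bytes }
    }

    // `pos` is always valid as long as callers stop at the sentinel.
    // (Safe indexing still has a check here; the point is that the
    // invariant would justify unchecked access in a real implementation.)
    fn byte_at(&self, pos: usize) -> u8 {
        self.bytes[pos]
    }
}

fn main() {
    let src = SentinelSource::new("ab");
    let mut pos = 0;
    let mut count = 0;
    // The loop condition tests the byte value, not the remaining length.
    while src.byte_at(pos) != EOF_SENTINEL {
        count += 1;
        pos += 1;
    }
    assert_eq!(count, 2);
}
```

The sketch uses safe indexing, so it only demonstrates the control-flow shape; the actual gain would come from using the sentinel invariant to elide the per-byte length comparison.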

So, in short, I think the cost in terms of our time is probably larger than the perf gains we're likely to get from optimizing the byte handlers. My guess is that there are bigger gains to be had for less effort elsewhere.

I brought it up here because I've noticed AI sometimes makes claims of better perf which aren't backed up by any evidence. In the absence of evidence, I think it may be better to take an "if it ain't broke, don't fix it" approach: something done for a reason by a reasonably competent human is, in my view, probably a better bet than an unpredictable AI. Obviously, as the "code owner", I'm going to look at any changes to the lexer, and the cognitive load of assessing changes is a cost.

taearls pushed a commit to taearls/oxc that referenced this pull request Aug 12, 2025
