-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RegExp syntax checking performance #58339
Conversation
@typescript-bot perf test |
@rbuckton, the perf run you requested failed. You can check the log here. |
@rbuckton Here they are:
tscComparison Report - baseline..pr
System info unknown
Hosts
Scenarios
tsserverComparison Report - baseline..pr
System info unknown
Hosts
Scenarios
startupComparison Report - baseline..pr
System info unknown
Hosts
Scenarios
Developer Information: |
@typescript-bot run dt |
@rbuckton Here are the results of running the user tests comparing Everything looks good! |
Hey @rbuckton, the results of running the DT tests are ready. Everything looks the same! |
@rbuckton Here are the results of running the user tests comparing Everything looks good! |
@rbuckton Here are the results of running the top 400 repos comparing Everything looks good! |
The perf seels really good. Is that all from avoiding out of bounds? |
Yeah, if |
Co-authored-by: Jake Bailey <5341706+jakebailey@users.noreply.github.com>
Might be worth mentioning in the PR description that you also |
Also other specific optimizations - like hoisting out helper functions and lazily initializing state for captures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, forgot my green check.
I've been working on expanding this throughout scanner.ts and writing up a lint rule to ensure enforcement, so I'll probably need another review shortly. |
Since I have some additional testing to do, I think I'll put those changes up as a separate PR following this one. |
This makes a few changes to the RegExp syntax checking algorithm for possible performance improvements. This also fixes a small typo in one of the RegExp diagnostic messages.
One of the changes is to ensure we perform proper bounds checks when
pos
might advance pastend
. To accomplish this, I added thecharCodeChecked
andcharCodeUnchecked
functions.charCodeChecked
performs boundary tests which both improves reliability and avoids potential deoptimizations due to reading past the end of a string.charCodeUnchecked
is simply a wrapper fortext.charCodeAt
and exists primarily to make it easier to readily identify unchecked vs checked reads fromtext
.charCodeUnchecked
should only be used when we've already performed a bounds check before reading a character, such as in awhile (pos < end)
loop. I've also done the same for code points withcodePointChecked
andcodePointUnchecked
. Each of these functions is small, so ideally will be inlined by an optimizing compiler like the one used by V8.I've primarily focused on optimizing
scanRegularExpressionWorker
for now. If this seems effective, I will investigate adoptingcharCodeChecked
in more places in the scanner in a later PR.I've also made a few other small perf improvements:
scanRegularExpressionWorker
has been moved out ofreScanSlashToken
and updated to use the existingtext
andend
values from the outer scope.var
to avoid TDZ check overhead.