-
Notifications
You must be signed in to change notification settings - Fork 12.8k
Leave switch as-is but just extract branches to function calls #50245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@typescript-bot perf test this faster |
Heya @weswigham, I've started to run the abridged perf test suite on this PR at e9d86d9. You can monitor the build here. Update: The results are in! |
@weswigham Here they are:Comparison Report - main..50245
System
Hosts
Scenarios
Developer Information: |
@DanielRosenwasser good news! The perf improvements aren't from the switch-case change, and I'm not going crazy reading the v8 source code, unable to figure out why the switch was deoptimized (when, in fact, it wasn't)! Just extracting the various cases into individual functions looks to be most if not all of the performance improvement. So it looks like it's not the switch that's an issue, but rather the length and polymorphism of the function itself. |
@typescript-bot perf test this For LS numbers for this, too~ |
Heya @weswigham, I've started to run the perf test suite on this PR at e9d86d9. You can monitor the build here. Update: The results are in! |
@weswigham Here they are:
CompilerComparison Report - main..50245
System
Hosts
Scenarios
TSServerComparison Report - main..50245
System
Hosts
Scenarios
Developer Information: |
I do like the improved safety of this one over #50225. Would the perf be identical if these functions were instead declared at the bottom of (This is probably very moot post-modules, as this whole thing could be its own file that doesn't export them.) |
I wonder if V8 is using a jump table for the
These functions don't need to be exported from |
I ultimately like @DanielRosenwasser 's PR more, simply because it is yet more performant, but it looks like the perf improvement is correlated with the instruction length of the function and not the presence or lack thereof of the switch, which is what I wanted to confirm with this. |
Based on the conditions for switch optimization I read in the v8 sources, both before this PR and with this PR, the switch should be optimized to a jump table. For an interpreted function, you get a jump table so long as you have a minimum number of branches and cover at least 33% of the range of inputs (measured as the difference between the min and max case values). On x64, for a compiled function, there's a somewhat complex space/time tradeoff heuristic where it falls back to a binary search, but it weights the time metric more, and for anything with a somewhat densely filled out list of cases, should prefer the jump table. There's links to the relevant v8 sources in our internal discussions, if you're curious about the exact heuristic. AFAIK, the 128 case limit mentioned in the recently-trended blog post about a perf problem from 2013 no longer seems to exist - on x64, the max case count for a jump table is |
Just to see if this alone has a perf diff.