Improve performance by compiling to webassembly #216
Comments
After working through more details to get the Sucrase parser working in full, I tried this again on a more realistic dataset and the results seem more promising now! It's a TypeScript/JSX codebase that's about 4000 files and about 550,000 lines, and the task was to run the Sucrase parser to count the total number of tokens in all files. I confirmed that the number of tokens was the same for js and wasm, so hopefully this means that both are indeed doing a full correct parse. Here are the numbers:
As expected, V8 runs JS better on larger datasets since it has more time to identify hot code paths and compile them with good optimizations. It looks like JS still does slightly better at the largest scale, around 50 million lines of code, but even then the difference is small. In my own use cases, the typical scale is 1000-4000 files, so at least here, I'd expect a 2-3x speedup. It's unclear if the improvements are due to the more realistic dataset or improvements in AssemblyScript, but it looks like AssemblyScript will improve perf, especially on smaller datasets. So this seems like a good enough justification to get the code fully working in AssemblyScript, including using |
FWIW, we have successfully used sucrase with nodegun to get rid of the node start time and the JIT warmup time.
Wow, that's awesome, I hadn't seen nodegun! I'll have to try it out when I get a chance.
@alangpierce I think wasm should be faster than js for large dataset with latest AssemblyScript as well. Also if you change |
Is there any link to a working prototype for using Sucrase via WASM? I'd really appreciate it.
Your parser performance is astonishing. It would be super cool to use your parser in webpack to decrease build time. This way you would also get native TypeScript support 😛
I was able to prototype a hacked-together variant of the sucrase-babylon parser and get it working in AssemblyScript. I ran it through a benchmark of compiling a ~100 line file over and over and counting the number of tokens, and it gave the correct answer both when run from js and when run from wasm. Here were the results:
As expected, the wasm running time grows linearly and the JS running time grows sub-linearly, at least at small scales, since V8 has more and more time to optimize the code while wasm optimizations all happen at compile time. The wasm code is 50x faster on a very small case, about equal at about 50,000 lines of code, and 2x slower at the largest sizes. So it looks like, at least for now, V8 can do a really good job at optimization when given long enough, enough to outperform the equivalent code compiled through AssemblyScript.
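For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the benchmark shape described above: run a token counter over the same ~100 line input a growing number of times and compare time per run. `countTokens` is a hypothetical regex-based stand-in, not Sucrase's parser or API, and `benchmark` is a made-up helper; the idea is that you would pass in the JS build or a wasm export in its place.

```typescript
// Hypothetical stand-in for a real parser's token counter (NOT Sucrase's API).
// It splits the source into identifiers, numbers, and single punctuation characters.
function countTokens(source: string): number {
  const matches = source.match(/[A-Za-z_$][\w$]*|\d+|\S/g);
  return matches === null ? 0 : matches.length;
}

interface BenchResult {
  repetitions: number;
  totalTokens: number;
  elapsedMs: number;
  msPerRun: number;
}

// Runs `tokenCounter` over the same source `repetitions` times and reports timing.
// Any function with this shape (a JS implementation or a wasm export wrapper)
// can be passed in, so both can be measured the same way.
function benchmark(
  tokenCounter: (source: string) => number,
  source: string,
  repetitions: number,
): BenchResult {
  // Date.now() has only millisecond resolution; it is used here for portability.
  const start = Date.now();
  let totalTokens = 0;
  for (let i = 0; i < repetitions; i++) {
    totalTokens += tokenCounter(source);
  }
  const elapsedMs = Date.now() - start;
  return {
    repetitions,
    totalTokens,
    elapsedMs,
    msPerRun: elapsedMs / repetitions,
  };
}

// A synthetic ~100 line input, assumed here purely for illustration.
const sample = "const x = foo(1, 2);\n".repeat(100);

// If the JIT warms up, msPerRun should drop as repetitions grow;
// ahead-of-time compiled code would be expected to stay roughly flat.
for (const repetitions of [1, 10, 100, 1000, 10000]) {
  const result = benchmark(countTokens, sample, repetitions);
  console.log(
    `${result.repetitions} runs: ${result.elapsedMs} ms total, ` +
      `${result.msPerRun.toFixed(4)} ms/run, ${result.totalTokens} tokens`,
  );
}
```

Checking that `totalTokens` matches between implementations is the same sanity check used above to confirm that both are doing a full parse. Since every run sees identical input, this setup probably flatters the JIT, so a varied dataset would be a fairer comparison.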
Sadly, this means webassembly isn't a slam dunk like I was hoping, but it still seems promising. There's probably room for improvement in the use of webassembly (tricks that would be impossible to implement in plain JS and unreasonable for V8 to infer), and AssemblyScript and WebAssembly may both improve over time, so there's reason to believe webassembly will eventually win out for long-running programs after all. This test case may also favor V8, since it's processing exactly the same dataset, so the branches and call statistics will be the same every time. It also only covers about half of the sucrase codebase, and the other half (determining and performing the string replacements) may run better in wasm compared with JS.
My plan now is to get Sucrase into a state where it can both run in JS and be compiled by AssemblyScript to webassembly. That should at least make it a good test bed for this stuff.
Another good thing to prototype would be porting the parser to Rust and/or C++ to see how it performs as native code vs wasm, and how that wasm compares with the wasm produced by AssemblyScript.