Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore support for using V8 as the Wasm JIT interpreter #6478

Merged
merged 17 commits into from
Dec 14, 2021

Conversation

steven-johnson
Copy link
Contributor

@steven-johnson steven-johnson commented Dec 7, 2021

I've taken the liberty to branch #6429 (with permission from @ngzhian) and finish up the work needed to land it.

This passes locally. The V8-specific code won't (yet) get tested on the buildbots (the script and bots need attention to do so) but it can land without that.

ngzhian and others added 13 commits November 18, 2021 16:18
This is a partial revert of #5097.
It brings back a bunch of the code in WasmExecutor to set up and use V8
to run Wasm code. All of the code is copy-pasted. There are some small
cleanups to move common code (like BDMalloc, structs, asserts) to a
common area guarded by `if WITH_WABT || WITH_V8`.

Enabling V8 requires setting 2 CMake options:
- V8_INCLUDE_PATH
- V8_LIB_PATH

The first is a path to v8 include folder, to find headers, the second is
the monolithic v8 library. This is because it's pretty difficult to
build v8, and there are various flags you can set. Comments around those
options provide some instructions for building v8.

By default, we still use the wabt for running Wasm code, but we can use
V8 by setting WITH_WABT=OFF WITH_V8=ON. Maybe in the future, with more
testing, we can flip this.

Right now this requires a locally patched build of V8 due to
https://crbug.com/v8/10461, but once that is resolved, the version of V8
that includes the fix will be fine.

Also enable a single test, block_transpose, to run on V8, with these
results:

$ HL_JIT_TARGET=wasm-32-wasmrt-wasm_simd128 \
  ./test/performance/performance_block_transpose
Dummy Func version: Scalar transpose bandwidth 3.45061e+08 byte/s.
Wrapper version: Scalar transpose bandwidth 3.38931e+08 byte/s.
Dummy Func version: Transpose vectorized in y bandwidth 6.74143e+08
byte/s.
Wrapper version: Transpose vectorized in y bandwidth 3.54331e+08 byte/s.
Dummy Func version: Transpose vectorized in x bandwidth 3.50053e+08
byte/s.
Wrapper version: Transpose vectorized in x bandwidth 6.73421e+08 byte/s.
Success!

For comparison, when targeting host:

$ ./test/performance/performance_block_transpose
Dummy Func version: Scalar transpose bandwidth 1.33689e+09 byte/s.
Wrapper version: Scalar transpose bandwidth 1.33583e+09 byte/s.
Dummy Func version: Transpose vectorized in y bandwidth 2.20278e+09
byte/s.
Wrapper version: Transpose vectorized in y bandwidth 1.45959e+09 byte/s.
Dummy Func version: Transpose vectorized in x bandwidth 1.45921e+09
byte/s.
Wrapper version: Transpose vectorized in x bandwidth 2.21746e+09 byte/s.
Success!

For comparison, running with wabt:

Dummy Func version: Scalar transpose bandwidth 828715 byte/s.
Wrapper version: Scalar transpose bandwidth 826204 byte/s.
Dummy Func version: Transpose vectorized in y bandwidth 1.12008e+06
byte/s.
Wrapper version: Transpose vectorized in y bandwidth 874958 byte/s.
Dummy Func version: Transpose vectorized in x bandwidth 879031 byte/s.
Wrapper version: Transpose vectorized in x bandwidth 1.10525e+06 byte/s.
Success!
@steven-johnson
Copy link
Contributor Author

OSX-x86 bot is down for repairs, probably OK to ignore for this

@alexreinking
Copy link
Member

Okay, I think I'd like to see the target_sources / target_link_libraries comments addressed. If we don't care about packaging this, then none of the other stuff matters.

@steven-johnson
Copy link
Contributor Author

PTAL

@alexreinking
Copy link
Member

Is this good to merge? Have you looked at the win64 failure?

@steven-johnson
Copy link
Contributor Author

There's no way thhe windows failure is related, but I will sync to head and retry.

@steven-johnson steven-johnson merged commit c3ff4d2 into master Dec 14, 2021
@steven-johnson steven-johnson deleted the srj/pr_6429 branch December 14, 2021 00:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants