Don't check for null in `call_indirect` and `call_ref` #8159

alexcrichton · 2024-03-17T18:28:16Z

This PR is an implementation of #5291 to slightly optimize the lowering of `call_indirect` and `call_ref` in Wasmtime. Explicit checks for null function pointers are no longer emitted; instead we let a segfault happen when loading from a null function pointer. This segfault is caught and the relevant instruction is annotated with the appropriate trap code.
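
For readers unfamiliar with the approach, catching a fault and turning it into a trap can be pictured as a table lookup keyed by the faulting instruction's offset. The following is only a minimal sketch under that assumption, not Wasmtime's actual implementation: `TrapCode`, `TrapSite`, and `lookup_trap` are hypothetical names introduced for illustration.

```rust
/// Hypothetical trap codes; the real set lives in Cranelift and is larger.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TrapCode {
    IndirectCallToNull,
    HeapOutOfBounds,
}

/// One entry per instruction that is allowed to fault (illustrative layout).
#[derive(Clone, Copy, Debug)]
struct TrapSite {
    /// Offset of the faulting instruction from the start of the compiled code.
    code_offset: u32,
    /// Trap to report if this instruction faults.
    code: TrapCode,
}

/// Find the trap code recorded for a faulting offset, if any.
/// `sites` must be sorted by `code_offset`.
fn lookup_trap(sites: &[TrapSite], faulting_offset: u32) -> Option<TrapCode> {
    sites
        .binary_search_by_key(&faulting_offset, |site| site.code_offset)
        .ok()
        .map(|index| sites[index].code)
}

fn main() {
    let sites = [
        TrapSite { code_offset: 0x10, code: TrapCode::HeapOutOfBounds },
        TrapSite { code_offset: 0x24, code: TrapCode::IndirectCallToNull },
    ];

    // A fault at an annotated load is reported as the matching Wasm trap.
    assert_eq!(lookup_trap(&sites, 0x24), Some(TrapCode::IndirectCallToNull));

    // A fault anywhere else has no annotation, so it is not a Wasm trap.
    assert_eq!(lookup_trap(&sites, 0x28), None);
}
```

In this picture, the load from the null function pointer is the annotated instruction, so no compare-and-branch is needed on the non-null path.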

This support starts by refactoring the `MemFlags` API so that it is no longer purely flags but a mixture of flags and "flag regions". The vmctx/heap/table alias regions are now bundled into two bits, and the various trap-related bits are bundled into four bits. This enables putting arbitrary trap codes in a `MemFlags` so long as they aren't `TrapCode::User(_)`.
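
To illustrate what "flag regions" means, here is a minimal sketch of a flags word that packs an alias region into two bits and a trap-code index into four bits. It is not Cranelift's real `MemFlags` layout: `PackedFlags`, `AliasRegion`, the bit positions, and the method names are all assumptions made for this example.

```rust
/// Illustrative bit-packed flags word; NOT Cranelift's actual `MemFlags`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PackedFlags(u16);

/// Three alias regions plus "none" fit exactly in two bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AliasRegion {
    None = 0,
    Heap = 1,
    Table = 2,
    Vmctx = 3,
}

// Hypothetical bit positions chosen for this sketch.
const ALIAS_SHIFT: u16 = 0;
const ALIAS_MASK: u16 = 0b11 << ALIAS_SHIFT;
const TRAP_SHIFT: u16 = 2;
const TRAP_MASK: u16 = 0b1111 << TRAP_SHIFT;

impl PackedFlags {
    fn new() -> Self {
        PackedFlags(0)
    }

    /// Overwrite the two-bit alias region, leaving other bits untouched.
    fn with_alias_region(self, region: AliasRegion) -> Self {
        PackedFlags((self.0 & !ALIAS_MASK) | ((region as u16) << ALIAS_SHIFT))
    }

    fn alias_region(self) -> AliasRegion {
        match (self.0 & ALIAS_MASK) >> ALIAS_SHIFT {
            0 => AliasRegion::None,
            1 => AliasRegion::Heap,
            2 => AliasRegion::Table,
            _ => AliasRegion::Vmctx,
        }
    }

    /// Store a trap-code index; anything wider than four bits is rejected.
    fn with_trap_index(self, index: u8) -> Option<Self> {
        if index > 0b1111 {
            return None;
        }
        Some(PackedFlags((self.0 & !TRAP_MASK) | ((index as u16) << TRAP_SHIFT)))
    }

    fn trap_index(self) -> u8 {
        ((self.0 & TRAP_MASK) >> TRAP_SHIFT) as u8
    }
}

fn main() {
    let flags = PackedFlags::new()
        .with_alias_region(AliasRegion::Table)
        .with_trap_index(9)
        .expect("9 fits in four bits");

    assert_eq!(flags.alias_region(), AliasRegion::Table);
    assert_eq!(flags.trap_index(), 9);

    // A value that needs more than four bits cannot be encoded.
    assert!(PackedFlags::new().with_trap_index(16).is_none());
}
```

A fixed four-bit field can only enumerate a small, closed set of codes, which is presumably why a payload-carrying variant such as `TrapCode::User(_)` is excluded.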

Closes #5291

github-actions · 2024-03-17T18:44:30Z

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "cranelift", "cranelift:wasm", "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


…e table.

This is based on discussion in bytecodealliance#8158:

- We can use `call_indirect` rather than `table.get` + `call_ref`, even on typed funcrefs. TIL; updated the test!
- As noted in bytecodealliance#8160, if we use a nullable typed funcref table instead (and given that we know we'll initialize a particular slot before use on the application side, so we won't actually call a null ref), and if we have a null-ref default value, we should be able to avoid the lazy table-init mechanism entirely. (Ignore the part where this module doesn't actually have any update logic that would set non-null refs anywhere; it's a compile-test, not a runtest!)

Once bytecodealliance#8159 is merged and bytecodealliance#8160 is implemented, we should see zero branches in this test.

This is based on discussion in bytecodealliance#8158: as noted in bytecodealliance#8160, if we use a nullable typed funcref table instead (and given that we know we'll initialize a particular slot before use on the application side, so we won't actually call a null ref), and if we have a null-ref default value, we should be able to avoid the lazy table-init mechanism entirely. (Ignore the part where this module doesn't actually have any update logic that would set non-null refs anywhere; it's a compile-test, not a runtest!) Once bytecodealliance#8159 is merged and bytecodealliance#8160 is implemented, we should see zero branches in this test.

cfallin

LGTM, nice!

cranelift/codegen/src/ir/memflags.rs

jameysharp

I would kind of prefer to land the MemFlags changes separately from the "don't explicitly check for null function pointers" changes, if that's not too much trouble to split up. The commit history here is almost right for doing that with a quick rebase but I think only part of the "Trim the MemFlags API" commit applies.

I'm interested in whether you or Chris or anyone else have opinions on the other comments I've left below too.

crates/cranelift/src/func_environ.rs

cranelift/codegen/src/ir/memflags.rs

alexcrichton · 2024-03-18T04:13:47Z

> prefer to land the MemFlags changes separately

Certainly!

crates/wasmtime/src/runtime/store.rs

This commit uses the support from bytecodealliance#8162 to skip null function pointer checks when performing an indirect call. Instead of an explicit check, the segfault from accessing the null function pointer is caught and annotated with the appropriate trap. Closes bytecodealliance#5291

alexcrichton · 2024-03-18T16:52:24Z

Updated and rebased!

jameysharp

This feels very satisfying! Also TIL that you can do static asserts by putting assert! in a const-eval block.
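
For anyone else who hasn't seen the trick, a static assert via `assert!` in a const-evaluated context looks roughly like the sketch below. The `Header` struct and the specific conditions are made up for illustration and are not taken from this PR.

```rust
use std::mem::size_of;

/// Hypothetical struct whose layout we want to pin down at compile time.
#[repr(C)]
struct Header {
    tag: u32,
    len: u32,
}

// Evaluated during compilation: if the condition is false, the build fails
// instead of panicking at runtime.
const _: () = assert!(size_of::<Header>() == 8);

// The same idea using a const block, which can hold several assertions.
const _: () = {
    assert!(u8::MAX as u32 == 255);
    assert!(size_of::<u32>() == 4);
};

fn main() {
    let header = Header { tag: 1, len: 2 };
    assert_eq!(header.tag + header.len, 3);
}
```

Because the assertion lives in an anonymous `const` item, it costs nothing at runtime and needs no test harness to run.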

In the original development of this feature, guided by JS AOT compilation to Wasm of a microbenchmark heavily focused on IC sites, I was seeing a ~20% speedup. However, in more recent measurements, on full programs (e.g., the Octane benchmark suite), the benefit is more like 5%.

Moreover, in bytecodealliance#8870, I attempted to switch over to a direct-mapped cache, to address a current shortcoming of the design, namely that it has a hard-capped number of callsites it can apply to (50k) to limit impact on VMContext struct size. With all of the needed checks for correctness, though, that change results in a 2.5% slowdown relative to no caching at all, so it was dropped.

In the process of thinking through that, I discovered the current design on `main` incorrectly handles null funcrefs: it invokes a null code pointer, rather than loading a field from a null struct pointer. The latter was specifically designed to cause the necessary Wasm trap in bytecodealliance#8159, but I had missed that the call to a null code pointer would not have the same effect. As a result, we actually can crash the VM (safely at least, but still no good vs. a proper Wasm trap!) with the feature enabled. (It's off by default still.)

That could be fixed too, but at this point with the small benefit on real programs, together with the limitation on module size for full benefit, I think I'd rather opt for simplicity and remove the cache entirely.

Thus, this PR removes call-indirect caching. It's not a direct revert because the original PR refactored the call-indirect generation into smaller helpers and IMHO it's a bit nicer to keep that. But otherwise all traces of the setting, code pre-scan during compilation and special conditions tracked on tables, and codegen changes are gone.

alexcrichton requested review from a team as code owners March 17, 2024 18:28

alexcrichton requested review from cfallin and fitzgen and removed request for a team March 17, 2024 18:28

alexcrichton force-pushed the no-null-chekc-functions branch from 3e5e290 to f6384ae Compare March 17, 2024 18:30

alexcrichton mentioned this pull request Mar 17, 2024

Wasm tests: add typed-funcref test showing example of desirable optimizations. #8158

Merged

github-actions bot added the cranelift, cranelift:wasm, and wasmtime:api labels Mar 17, 2024

cfallin mentioned this pull request Mar 17, 2024

Wasmtime: avoid table lazy-init runtime checks on nullable funcref tables with null default #8160

Open

cfallin mentioned this pull request Mar 17, 2024

Wasm ICs / typed-funcrefs test: switch to nullable table. #8161

Merged

cfallin approved these changes Mar 17, 2024

jameysharp reviewed Mar 18, 2024

fitzgen reviewed Mar 18, 2024

alexcrichton force-pushed the no-null-chekc-functions branch from f6384ae to 8799f01 Compare March 18, 2024 16:51

jameysharp approved these changes Mar 18, 2024

alexcrichton added this pull request to the merge queue Mar 18, 2024

Merged via the queue into bytecodealliance:main with commit fbbeaf7 Mar 18, 2024
19 checks passed

alexcrichton deleted the no-null-chekc-functions branch March 18, 2024 19:55

cfallin mentioned this pull request Jun 27, 2024

Wasmtime: remove indirect-call caching. #8881

Merged
