Refine AOT/JIT code call wasm-c-api import process #2982
Conversation
```diff
@@ -417,6 +417,7 @@ struct wasm_ref_t;
 
 typedef struct wasm_val_t {
     wasm_valkind_t kind;
+    uint8_t __paddings[7];
```
I guess this manual padding warrants a comment. Is it for platforms with a loose alignment for 64-bit types?
Yes, the 32-bit layout may differ from the 64-bit layout, while the AOT compiler uses a fixed layout and expects them to be the same.
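For illustration, here is a minimal sketch of the padded layout, assuming `wasm_valkind_t` is a one-byte type and the value union contains 64-bit members as in the standard wasm-c-api header; the static asserts are illustrative, not part of the actual header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t wasm_valkind_t;
typedef struct wasm_ref_t wasm_ref_t;

typedef struct wasm_val_t {
    wasm_valkind_t kind;
    uint8_t __paddings[7]; /* keep `of` at offset 8 regardless of pointer size */
    union {
        int32_t i32;
        int64_t i64;
        float f32;
        double f64;
        wasm_ref_t *ref;
    } of;
} wasm_val_t;

/* Without the padding, a 32-bit ABI that aligns 64-bit types to 4 bytes would
 * place `of` at offset 4, diverging from the fixed layout the AOT compiler
 * assumes; the explicit padding pins the layout on both 32-bit and 64-bit. */
static_assert(offsetof(wasm_val_t, of) == 8, "fixed offset expected by AOT code");
static_assert(sizeof(wasm_val_t) == 16, "fixed size expected by AOT code");
```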
I feel we already have too many function calling mechanisms.

Yes, there are several calling mechanisms now, such as AOT calls AOT, AOT calls host, host calls AOT and so on. For the calls between AOT and host, there are mainly three calling conventions:

(1) host calls AOT;
(2) AOT calls host native APIs, whose convention is the same as an AOT function's and which can be registered with wasm_runtime_register_natives;
(3) AOT calls host wasm-c-api APIs, whose convention is defined by wasm-c-api and can be registered by …

Since there are scenarios with frequent (many) calls between host and AOT/JIT, e.g. Envoy, refining these calling processes becomes important, as it significantly impacts performance in those scenarios. Currently for (2), a developer can use …

I tested the calls for four empty c-api import functions with 1/2/3/4 arguments respectively from AOT code, so the execution time is mostly the time of the calling process itself. The sample is uploaded: …

We can see that the execution time after optimization is about 23% to 24% of that without optimization, which is a large improvement. My suggestion is that we disable it by default, and add a new document to describe these optimization opportunities (register quick AOT entries, …).
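As a rough sketch of convention (2), the following shows how a host API that follows the AOT function calling convention might be registered through wasm_runtime_register_natives; the module name "env", the function name host_add, and the signature string are placeholders chosen for illustration, not part of this PR.

```c
#include "wasm_export.h"

/* Native implementation of an import: the first parameter is always the
 * execution environment, followed by the wasm function's own parameters. */
static int32_t
host_add(wasm_exec_env_t exec_env, int32_t a, int32_t b)
{
    (void)exec_env;
    return a + b;
}

/* symbol name, native function pointer, signature "(ii)i" = (i32, i32) -> i32 */
static NativeSymbol native_symbols[] = {
    { "host_add", host_add, "(ii)i", NULL },
};

static bool
register_host_apis(void)
{
    /* "env" is assumed to be the import module name used by the wasm module */
    return wasm_runtime_register_natives("env", native_symbols,
                                         sizeof(native_symbols)
                                             / sizeof(NativeSymbol));
}
```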
Merge bytecodealliance:main into wenyongh:dev/quick_invoke_c_api_import
Merge this PR as it may improve the calling process a lot and may benefit some scenarios, e.g. Envoy.
Refine AOT/JIT code call wasm-c-api import process (#2982)

Allow invoking the quick call entry wasm_runtime_quick_invoke_c_api_import to call the wasm-c-api import functions to speed up the calling process, which reduces the data copying. Use `wamrc --invoke-c-api-import` to generate the optimized AOT code, and set `jit_options->quick_invoke_c_api_import` to true in wasm_engine_new when LLVM JIT is enabled.
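For context, below is a sketch of the wasm-c-api import path that this quick entry accelerates, loosely following the upstream wasm-c-api samples. The callback name, the instantiate helper, and the assumption that the module declares a single (i32, i32) -> i32 import are illustrative; module loading, error handling, and cleanup are elided. The module itself is assumed to be an AOT file built with `wamrc --invoke-c-api-import`, or to run under LLVM JIT with the quick c-api invoke option enabled.

```c
#include "wasm_c_api.h"

/* wasm-c-api host callback: arguments and results are passed as val vectors,
 * which is the data copying the quick invoke entry tries to reduce. */
static wasm_trap_t *
host_add_callback(const wasm_val_vec_t *args, wasm_val_vec_t *results)
{
    results->data[0].kind = WASM_I32;
    results->data[0].of.i32 = args->data[0].of.i32 + args->data[1].of.i32;
    return NULL;
}

/* Hypothetical helper: wire the callback up as the module's only import. */
static wasm_instance_t *
instantiate_with_host_add(wasm_store_t *store, wasm_module_t *module)
{
    /* (i32, i32) -> i32 */
    wasm_functype_t *add_type = wasm_functype_new_2_1(
        wasm_valtype_new_i32(), wasm_valtype_new_i32(), wasm_valtype_new_i32());
    wasm_func_t *add_func = wasm_func_new(store, add_type, host_add_callback);
    wasm_functype_delete(add_type);

    wasm_extern_t *externs[] = { wasm_func_as_extern(add_func) };
    wasm_extern_vec_t imports = WASM_ARRAY_VEC(externs);

    /* Error handling and cleanup of add_func are elided for brevity. */
    return wasm_instance_new(store, module, &imports, NULL);
}
```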