-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate modules while translating #2059
Validate modules while translating #2059
Conversation
I've marked this as a draft while the wasmparser changes have yet to be released. |
4714173
to
1fafdfc
Compare
Subscribe to Label Actioncc @bnjbvr, @peterhuene
This issue or pull request has been labeled: "cranelift", "cranelift:wasm", "lightbeam", "wasmtime:api"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\o/ I'll be quite happy to try this PR in Spidermonkey, before it lands, to give a confirmation this integrates nicely and it passes our extensive test suite :-)
55489f0
to
a729479
Compare
a729479
to
66b956e
Compare
Ok the wasmparser bits are published and this should be good to go! |
Subscribe to Label Actioncc @fitzgen
This issue or pull request has been labeled: "fuzzing"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Looks good on API and wasmparser integration side.
@@ -53,14 +53,14 @@ use wasmparser::{MemoryImmediate, Operator}; | |||
/// Translates wasm operators into Cranelift IR instructions. Returns `true` if it inserted | |||
/// a return. | |||
pub fn translate_operator<FE: FuncEnvironment + ?Sized>( | |||
module_translation_state: &ModuleTranslationState, | |||
validator: &mut FuncValidator<impl WasmModuleResources>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One early comment (before i try this tomorrow): can you please expose wasmparser::FuncValidator
publicly, so that embedders aren't required to add wasmparser
to their own dependencies? (and if embedders are to implement WasmModuleResources
, this one as well maybe?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing! I've pushed up a comment which reexports wasmparser
as a crate since there's a number of types/etc you'll likely want to pull from there.
6ed53c0
to
87cb8bf
Compare
The bits to implement don't seem terribly hard, we already required most of this information anyways, so it was around. It's nice to have a proper API to make sense of it! One request, though: in wasmparser, most of the trait functions return a Because of the above issue, I haven't gone through implementing it entirely in Spidermonkey. Since I wouldn't want to block this from landing, would it make sense to allow both APIs right now, mark the former one as deprecated, and allow for implementing the new one for clients who want it? |
(I'm being told Rust nightly has generic associated types, which would allow putting a lifetime type parameter in the |
After some discussion on Zulip, I think bytecodealliance/wasm-tools#68 will help address the issues here, so I'll get that landed/published and update it here before we merge this. |
87cb8bf
to
5f5abc6
Compare
Ok I think |
As agreed on Zulip, since i won't be around for some time, this should land later to give Julian/Chris an opportunity to look into Spidermonkey changes this coming week. Thanks Alex! |
c34ea71
to
5570c95
Compare
Getting closer on my end too; FWIW I had to make one small change to fix some tests, exposing a junk-bytes-at-end condition as an error return rather than panic: commit (feel free to incorporate). I just have to tackle some table/reftypes-related test failures and then SpiderMonkey will be ready for this! |
This way we can let the validator take care of any issues with mismatched `end` instructions and/or trailing operators/bytes.
Oh oops, good catch! The intention was to let wasmparser's validator catch that naturally, so I pushed up a slightly tweaked commit which should remove the panic as well. |
I've gotten all tests to pass in SpiderMonkey using this validator, now! In order to do so, I locally updated the Given that, I think there isn't any further reason to hold this back -- thanks for waiting on SM to catch up! (A tangential note, as hinted above -- the SpiderMonkey test suite has a whole bunch of "this wasm module should fail to compile in a way that matches this regex" tests, so we should be careful updating any error messages going forward, or at least consider the cost; I wish there were a better way (standardized error messages as per a spec?) but this is what we have for now...) |
The spec tests do have expected error messages and, except for a few exceptions, we match them in |
Ah, I hadn't realized there were standard error messages. I can dig further into why SpiderMonkey doesn't hit issues with this (perhaps the harness is tweaked?) but actually all of the issues I hit were in the SpiderMonkey-specific test suite. An example here matches on a specific "type mismatch: expression has type A but expected B" message; |
"standard" is a bit of a stretch in that they're not super descriptitive, it's typically "type mismatch" for everything related to a type error and the implementation typically provides more descriptive stuff afterwards. That being said of It's great to hear though that SpiderMonkey is coming along well so I'm going to carry over the approval from before and go ahead and merge this. Thanks again for the heads up @cfallin! |
…c. r=jseward This patch pulls in an updated Cranelift with a new validation strategy, introduced by bytecodealliance/wasmtime#2059. This new design validates the Wasm module as it parses the function bodies. A subsequent patch will adapt Baldrdash to work with this. Differential Revision: https://phabricator.services.mozilla.com/D92503
…c. r=jseward This patch pulls in an updated Cranelift with a new validation strategy, introduced by bytecodealliance/wasmtime#2059. This new design validates the Wasm module as it parses the function bodies. A subsequent patch will adapt Baldrdash to work with this. Differential Revision: https://phabricator.services.mozilla.com/D92503
This patch is an update of Benjamin Bouvier's WIP patch that supports Cranelift's new Wasm validator functionality, as introduced in bytecodealliance/wasmtime#2059. Previously, Baldrdash allowed SpiderMonkey to validate Wasm function bodies before invoking Cranelift. The Cranelift PR now causes validation to happen in the function-body compiler. Because the Cranelift backend now takes on this responsibility, we no longer need to invoke the SpiderMonkey function-body validator. This requires some significant new glue in the Baldrdash bindings to allow Cranelift's Wasm frontend to access module type information. Co-authored-by: Benjamin Bouvier <benj@benj.me> Differential Revision: https://phabricator.services.mozilla.com/D92504
…c. r=jseward This patch pulls in an updated Cranelift with a new validation strategy, introduced by bytecodealliance/wasmtime#2059. This new design validates the Wasm module as it parses the function bodies. A subsequent patch will adapt Baldrdash to work with this. Differential Revision: https://phabricator.services.mozilla.com/D92503
This patch is an update of Benjamin Bouvier's WIP patch that supports Cranelift's new Wasm validator functionality, as introduced in bytecodealliance/wasmtime#2059. Previously, Baldrdash allowed SpiderMonkey to validate Wasm function bodies before invoking Cranelift. The Cranelift PR now causes validation to happen in the function-body compiler. Because the Cranelift backend now takes on this responsibility, we no longer need to invoke the SpiderMonkey function-body validator. This requires some significant new glue in the Baldrdash bindings to allow Cranelift's Wasm frontend to access module type information. Co-authored-by: Benjamin Bouvier <benj@benj.me> Differential Revision: https://phabricator.services.mozilla.com/D92504
…c. r=jseward This patch pulls in an updated Cranelift with a new validation strategy, introduced by bytecodealliance/wasmtime#2059. This new design validates the Wasm module as it parses the function bodies. A subsequent patch will adapt Baldrdash to work with this. Differential Revision: https://phabricator.services.mozilla.com/D92503
…c. r=jseward This patch pulls in an updated Cranelift with a new validation strategy, introduced by bytecodealliance/wasmtime#2059. This new design validates the Wasm module as it parses the function bodies. A subsequent patch will adapt Baldrdash to work with this. Differential Revision: https://phabricator.services.mozilla.com/D92503
This patch is an update of Benjamin Bouvier's WIP patch that supports Cranelift's new Wasm validator functionality, as introduced in bytecodealliance/wasmtime#2059. Previously, Baldrdash allowed SpiderMonkey to validate Wasm function bodies before invoking Cranelift. The Cranelift PR now causes validation to happen in the function-body compiler. Because the Cranelift backend now takes on this responsibility, we no longer need to invoke the SpiderMonkey function-body validator. This requires some significant new glue in the Baldrdash bindings to allow Cranelift's Wasm frontend to access module type information. Co-authored-by: Benjamin Bouvier <benj@benj.me> Differential Revision: https://phabricator.services.mozilla.com/D92504
This commit is intended to be the first of many in implementing the module linking proposal. At this time this builds on bytecodealliance#2059 so it shouldn't land yet. The goal of this commit is to compile bare-bones modules which use module linking, e.g. those with nested modules. My hope with module linking is that almost everything in wasmtime only needs mild refactorings to handle it. The goal is that all per-module structures are still per-module and at the top level there's just a `Vec` containing a bunch of modules. That's implemented currently where `wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of which one it's pointing to. This should enable serialization/deserialization of any module in a nested modules scenario, no matter how you got it. Tons of features of the module linking proposal are missing from this commit. For example instantiation flat out doesn't work, nor does import/export of modules or instances. That'll be coming as future commits, but the purpose here is to start laying groundwork in Wasmtime for handling lots of modules in lots of places.
This commit is intended to be the first of many in implementing the module linking proposal. At this time this builds on bytecodealliance#2059 so it shouldn't land yet. The goal of this commit is to compile bare-bones modules which use module linking, e.g. those with nested modules. My hope with module linking is that almost everything in wasmtime only needs mild refactorings to handle it. The goal is that all per-module structures are still per-module and at the top level there's just a `Vec` containing a bunch of modules. That's implemented currently where `wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of which one it's pointing to. This should enable serialization/deserialization of any module in a nested modules scenario, no matter how you got it. Tons of features of the module linking proposal are missing from this commit. For example instantiation flat out doesn't work, nor does import/export of modules or instances. That'll be coming as future commits, but the purpose here is to start laying groundwork in Wasmtime for handling lots of modules in lots of places.
…c. r=jseward This patch pulls in an updated Cranelift with a new validation strategy, introduced by bytecodealliance/wasmtime#2059. This new design validates the Wasm module as it parses the function bodies. A subsequent patch will adapt Baldrdash to work with this. Differential Revision: https://phabricator.services.mozilla.com/D92503
This patch is an update of Benjamin Bouvier's WIP patch that supports Cranelift's new Wasm validator functionality, as introduced in bytecodealliance/wasmtime#2059. Previously, Baldrdash allowed SpiderMonkey to validate Wasm function bodies before invoking Cranelift. The Cranelift PR now causes validation to happen in the function-body compiler. Because the Cranelift backend now takes on this responsibility, we no longer need to invoke the SpiderMonkey function-body validator. This requires some significant new glue in the Baldrdash bindings to allow Cranelift's Wasm frontend to access module type information. Co-authored-by: Benjamin Bouvier <benj@benj.me> Differential Revision: https://phabricator.services.mozilla.com/D92504
This commit is intended to be the first of many in implementing the module linking proposal. At this time this builds on bytecodealliance#2059 so it shouldn't land yet. The goal of this commit is to compile bare-bones modules which use module linking, e.g. those with nested modules. My hope with module linking is that almost everything in wasmtime only needs mild refactorings to handle it. The goal is that all per-module structures are still per-module and at the top level there's just a `Vec` containing a bunch of modules. That's implemented currently where `wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of which one it's pointing to. This should enable serialization/deserialization of any module in a nested modules scenario, no matter how you got it. Tons of features of the module linking proposal are missing from this commit. For example instantiation flat out doesn't work, nor does import/export of modules or instances. That'll be coming as future commits, but the purpose here is to start laying groundwork in Wasmtime for handling lots of modules in lots of places.
This commit is intended to be the first of many in implementing the module linking proposal. At this time this builds on bytecodealliance#2059 so it shouldn't land yet. The goal of this commit is to compile bare-bones modules which use module linking, e.g. those with nested modules. My hope with module linking is that almost everything in wasmtime only needs mild refactorings to handle it. The goal is that all per-module structures are still per-module and at the top level there's just a `Vec` containing a bunch of modules. That's implemented currently where `wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of which one it's pointing to. This should enable serialization/deserialization of any module in a nested modules scenario, no matter how you got it. Tons of features of the module linking proposal are missing from this commit. For example instantiation flat out doesn't work, nor does import/export of modules or instances. That'll be coming as future commits, but the purpose here is to start laying groundwork in Wasmtime for handling lots of modules in lots of places.
This commit is intended to be the first of many in implementing the module linking proposal. At this time this builds on #2059 so it shouldn't land yet. The goal of this commit is to compile bare-bones modules which use module linking, e.g. those with nested modules. My hope with module linking is that almost everything in wasmtime only needs mild refactorings to handle it. The goal is that all per-module structures are still per-module and at the top level there's just a `Vec` containing a bunch of modules. That's implemented currently where `wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of which one it's pointing to. This should enable serialization/deserialization of any module in a nested modules scenario, no matter how you got it. Tons of features of the module linking proposal are missing from this commit. For example instantiation flat out doesn't work, nor does import/export of modules or instances. That'll be coming as future commits, but the purpose here is to start laying groundwork in Wasmtime for handling lots of modules in lots of places.
* Validate modules while translating This commit is a change to cranelift-wasm to validate each function body as it is translated. Additionally top-level module translation functions will perform module validation. This commit builds on changes in wasmparser to perform module validation interwtwined with parsing and translation. This will be necessary for future wasm features such as module linking where the type behind a function index, for example, can be far away in another module. Additionally this also brings a nice benefit where parsing the binary only happens once (instead of having an up-front serial validation step) and validation can happen in parallel for each function. Most of the changes in this commit are plumbing to make sure everything lines up right. The major functional change here is that module compilation should be faster by validating in parallel (or skipping function validation entirely in the case of a cache hit). Otherwise from a user-facing perspective nothing should be that different. This commit does mean that cranelift's translation now inherently validates the input wasm module. This means that the Spidermonkey integration of cranelift-wasm will also be validating the function as it's being translated with cranelift. The associated PR for wasmparser (bytecodealliance/wasmparser#62) provides the necessary tools to create a `FuncValidator` for Gecko, but this is something I'll want careful review for before landing! * Read function operators until EOF This way we can let the validator take care of any issues with mismatched `end` instructions and/or trailing operators/bytes.
This commit is intended to be the first of many in implementing the module linking proposal. At this time this builds on #2059 so it shouldn't land yet. The goal of this commit is to compile bare-bones modules which use module linking, e.g. those with nested modules. My hope with module linking is that almost everything in wasmtime only needs mild refactorings to handle it. The goal is that all per-module structures are still per-module and at the top level there's just a `Vec` containing a bunch of modules. That's implemented currently where `wasmtime::Module` contains `Arc<[CompiledModule]>` and an index of which one it's pointing to. This should enable serialization/deserialization of any module in a nested modules scenario, no matter how you got it. Tons of features of the module linking proposal are missing from this commit. For example instantiation flat out doesn't work, nor does import/export of modules or instances. That'll be coming as future commits, but the purpose here is to start laying groundwork in Wasmtime for handling lots of modules in lots of places.
As far as I can tell it's been unused since bytecodealliance#2059 in 2020.
As far as I can tell it's been unused since bytecodealliance#2059 in 2020.
This commit is a change to cranelift-wasm to validate each function body
as it is translated. Additionally top-level module translation functions
will perform module validation. This commit builds on changes in
wasmparser to perform module validation interwtwined with parsing and
translation. This will be necessary for future wasm features such as
module linking where the type behind a function index, for example, can
be far away in another module. Additionally this also brings a nice
benefit where parsing the binary only happens once (instead of having an
up-front serial validation step) and validation can happen in parallel
for each function.
Most of the changes in this commit are plumbing to make sure everything
lines up right. The major functional change here is that module
compilation should be faster by validating in parallel (or skipping
function validation entirely in the case of a cache hit). Otherwise from
a user-facing perspective nothing should be that different.
This commit does mean that cranelift's translation now inherently
validates the input wasm module. This means that the Spidermonkey
integration of cranelift-wasm will also be validating the function as
it's being translated with cranelift. The associated PR for wasmparser
(bytecodealliance/wasm-tools#62) provides the necessary tools to create
a
FuncValidator
for Gecko, but this is something I'll want carefulreview for before landing!