Add a new "abi" which supports the full type grammar #422
This commit implements support for a new ABI which is an evolution of the current ABI specific to WASI. The main purpose of this ABI is to support all possible types in all places (e.g. multiple results, multiple params, lists of records of variants of structs of lists, etc.).
This is necessary to implement lists-of-lists properly with translation/validation.
Born out of recent discussions and realizations that we'll need an owned/borrowed distinction for arguments where possible.
Even if they aren't declared as such.
Mostly just updating read/write to load/store the appropriate size of the bitflags instead of decomposing into structs-of-bools.
* Automatically size enums based on how many cases they have
* Read/write the appropriate tag size
Allows code generators to use this in their own calculations if necessary.
This looks great!
// bitcasts. This will go through and cast everything
// to the right type to ensure all blocks produce the
// same set of results.
casts.truncate(0);
Here and elsewhere, is there a reason for using `truncate(0)` instead of `clear()`?
Nah that's just my age showing, I'm not sure we had `clear()` at Rust 1.0...

I'll switch!
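For reference, a minimal sketch of the two ways to reset a scratch vector between iterations. The helper names and the `u32` element type are hypothetical and not taken from this PR; both calls empty the vector while leaving its allocated capacity alone.

```rust
/// Reset a scratch buffer with `clear()`, the more idiomatic spelling.
pub fn reset_with_clear(casts: &mut Vec<u32>) {
    casts.clear();
}

/// Reset a scratch buffer with `truncate(0)`, as in the diff above.
pub fn reset_with_truncate(casts: &mut Vec<u32>) {
    casts.truncate(0);
}
```

Since the observable effect is the same, the switch is purely about readability.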
@@ -438,6 +519,39 @@ pub struct Variant {
}

impl Variant {
    pub fn infer_repr(cases: usize) -> IntRepr {
        match cases {
            n if n < u8::max_value() as usize => IntRepr::U8,
When `n` is 255 or 256, it seems like we could still use a `u8` variant, right? This is just an optimization, but I also wanted to make sure I'm not missing something subtle. So this could be written as `n if n <= 0x100` and similar for the other types?
Oh dear thanks for catching this!
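A minimal sketch of what the suggested bounds could look like. `IntRepr` here is a local stand-in for the crate's type, and the exact shape of the final fix is an assumption. The reasoning is that `n` cases need tags `0..n`, so 256 cases still fit in a `u8`.

```rust
/// Local stand-in for the crate's `IntRepr` (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IntRepr {
    U8,
    U16,
    U32,
    U64,
}

/// Pick the smallest tag type able to hold the discriminants `0..cases`.
pub fn infer_repr(cases: usize) -> IntRepr {
    // Compare in u64 so the 2^32 bound does not overflow on 32-bit hosts.
    match cases as u64 {
        n if n <= 0x100 => IntRepr::U8,
        n if n <= 0x1_0000 => IntRepr::U16,
        n if n <= 0x1_0000_0000 => IntRepr::U32,
        _ => IntRepr::U64,
    }
}
```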
When using the new ABI here:
    I64ToF32,

    None,
}
Bitcasts between types with different sizes are sensitive to endianness. For example, in `F32ToF64`, does the `F32` go in the most significant half of the `F64` or the least significant half? Cross-endian configurations are a theoretical concern at this point, but I think we could at least document what should happen. Since wasm itself is little-endian, I propose this say "bitcasts between types with different sizes use little-endian byte ordering".

Also, it'd be good to mention here that widening bitcasts zero-extend.
Oh I've figured that like wasm everything is little-endian here. For conversions like f32 to f64 I'm imagining it's the same as `f64::from(1.0f32)` in Rust where it's not really about moving bits but the f64 value losslessly matches the f32 value.

But yeah I'll definitely clarify this and indicate that everything is zero-extended. I haven't thought too too hard about the semantics here, but I think this'll all be ok.
Ah. `f64::from` is lossless except it converts signaling NaN to quiet NaN, which is obscure, but surprising given that the rest of the ABI preserves NaN bit patterns. But this isn't urgent to sort out now.
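To make the two readings in this thread concrete, here is a small sketch. The function names are hypothetical and neither is claimed to be what the crate does. The first moves bits (the `f32` pattern lands in the low 32 bits and the upper bits are zero), the second converts the value as `f64::from` does.

```rust
/// Bit-moving reading: the f32 bits occupy the least significant half
/// of the f64, and the upper 32 bits are zero (zero-extension).
pub fn f32_to_f64_bitcast(x: f32) -> f64 {
    f64::from_bits(u64::from(x.to_bits()))
}

/// Inverse of the bit-moving reading: keep only the low 32 bits.
pub fn f64_to_f32_bitcast(x: f64) -> f32 {
    f32::from_bits(x.to_bits() as u32)
}

/// Value-converting reading: the f64 has the same numeric value as the f32.
pub fn f32_to_f64_value(x: f32) -> f64 {
    f64::from(x)
}
```

For `1.0f32` the first reading yields an `f64` whose bits are `0x3f80_0000` (a tiny subnormal), while the second yields `1.0f64`, so the two are not interchangeable.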
@jedisct1 ah yeah that's expected, the record-by-value-parameter is "splatted" into its flattened form, so each of the two arguments takes up 3 literal parameter values each in the wasm signature. The return value is also represented as multiple values, however, and because C/Rust aren't super great about multi-value returns today that's represented as a return pointer. This means that in all it turns out as 7 parameters (3 for first arg, 3 for second, 1 for ret pointer)
Thanks for clarifying, and for your amazing work on this, Alex! The ret pointer totally makes sense. Multi-return values also require tuples in Zig, Swift and AssemblyScript, and having a single pointer is more convenient than the previous ABI. However, the flattening of structures in function parameters is quite a massive change. Does that mean that we will always need two completely different representations for the same type, even if properly padded structures are already used internally? That seems to make everything more complicated, including language support for using imported functions. That flattening is certainly necessary. But if there is a way to avoid such a departure from the previous ABI and still use pointers instead, that would be immensely useful to ease the work of people maintaining languages, code generators and runtimes.
It's definitely not intended to have two representations of the same type. The intention is that code generators would be based on this crate which abstracts away all the details of the ABI. This helps ensure that code generators all agree with one another and there's only one place that actually defines the ABI (this crate). In that sense the intention is to solve the problems you're mentioning, not create new problems. This will require code generators to migrate to using this crate and the ABI definitions, which is a large change, but it's expected that the general amount of maintenance afterwards is no different from today.
Thanks Alex, Only code generators in Rust can use this crate :( The two representations I was referring to is the fact that when using this crate, offsets returned by Should these offsets be ignored and the splatted representation is always the correct one to store data as?
Yes that's true, if you don't want to write Rust code you won't be able to use the crate. The intention is that this ABI is documented/canonicalized in documentation (like the wasm spec) so if other implementations would like to rebuild everything there's still a shared specification of what to do. For the two representations you're talking about, I'm not sure what you mean. (sorry I haven't been sure for the past few comments but haven't addressed this point specifically). When a struct is passed as a parameter each of its fields recursively gets expanded into individual function arguments. This has nothing to do with memory/layout/etc since nothing is stored in memory, everything is passed as function parameters and such. Does that answer your question? Sorry I'm not entirely sure because the "splat to arguments" is not intended to be a representation, it's just an implementation detail of how you call a function with a struct argument
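The "splat to arguments" step described in this thread can be sketched roughly as follows. `Type`, `WasmType`, `flatten`, and `wasm_params` are hypothetical names invented for illustration and are not the crate's API. The rule that any result flattening to more than one value goes through a return pointer is an assumption drawn from the explanation above.

```rust
/// Hypothetical interface-level types (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    S32,
    S64,
    F32,
    F64,
    Record(Vec<Type>),
}

/// Core wasm value types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WasmType {
    I32,
    I64,
    F32,
    F64,
}

/// Recursively expand a type into the wasm values that carry it.
pub fn flatten(ty: &Type, out: &mut Vec<WasmType>) {
    match ty {
        Type::S32 => out.push(WasmType::I32),
        Type::S64 => out.push(WasmType::I64),
        Type::F32 => out.push(WasmType::F32),
        Type::F64 => out.push(WasmType::F64),
        Type::Record(fields) => {
            for field in fields {
                flatten(field, out);
            }
        }
    }
}

/// Compute the wasm-level parameter list for a function signature.
pub fn wasm_params(params: &[Type], results: &[Type]) -> Vec<WasmType> {
    let mut out = Vec::new();
    for param in params {
        flatten(param, &mut out);
    }
    let mut flat_results = Vec::new();
    for result in results {
        flatten(result, &mut flat_results);
    }
    // Assumption: multi-value results are written through a return pointer.
    if flat_results.len() > 1 {
        out.push(WasmType::I32);
    }
    out
}
```

With a three-field record taken twice by value and returned once, this gives 3 + 3 + 1 = 7 parameters, matching the count mentioned above. Nothing here touches memory layout; the record's in-memory offsets are a separate concern.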
Use the parsed module name for inline modules, and validate that it matches the filename for file modules.
This is now more formally specified at WebAssembly/interface-types#132 in the context of interface types. The intention is that once that's settled this will be updated to match the specification there!
And in witxt files, use the module name instead of giving `(witx ...)` its own name.
Move typename, resource, and const declarations inside of module syntax.
This work has since been subsumed by the Canonical ABI and associated tooling.
This commit adds a new `Abi` variant called `Abi::Next`. The purpose of this new ABI is to support the full expressivity of the type grammar, making validation much simpler since there generally doesn't need to be a ton of validation. The goal of this ABI is to have the next WASI snapshot move to it. Old WASI snapshots will never use this ABI because it's a breaking change from the existing WASI ABI.

At a syntactical level functions are specified with the old ABI as:
whereas the new ABI is recognized as:
This commit also prepares the next wasi snapshot to rely far less on `@witx` and custom types which are unlikely to be in interface types. To that end a few changes are made:

* `in-buffer T` and `out-buffer T` types are added. These represent input/output from the callee's perspective (e.g. a `read` function takes an `out-buffer` and a `write` function takes an `in-buffer`) and represent a slice of `T` in memory that the callee may consume but also may not. Having this be a first-class type instead of raw pointers allows WASI to get virtualized by wasm modules themselves given suitable host environments.

The new ABI doesn't have a ton of documentation yet but I hope to write that up in the future as necessary. For now the details can be mostly glossed over since code generators are just receiving primitive instructions to implement and the details of the ABI are all handled by this crate. At a high level though the ABI is:
* … `(list T)` to work so the callee knows what representation the caller has.
* `memory` must be exported currently. Similarly for some types in the ABI you'd also need to export a `witx_malloc` and `witx_free` function. The exact details of how to wire all this up I hope can be more flexible in the future, but I figure this is probably at least a good starting point.

The intention of this ABI is to be a sort of "canonical ABI" for interface types. This enables WASI (and everything else using witx) to use the full type grammar as intended by interface types while assigning meaning to what the host/wasm need to do to communicate with each other. In the limit interface types will allow each module to customize its precise ABI, but for now this gives the ability to today have modules start communicating while we wait for the customization pieces to all fall in place.
Currently I have not changed the ephemeral snapshot to use the new ABI, but depending on feedback on this that's the next thing I'd like to do. That should enable the ephemeral snapshot to have access to a much more rich type grammar than what it has access to today. Furthermore I'd also like to eventually "backport" the {in,out}-buffer types to the previous ABI in a simple fashion (just a pointer/length) to ideally remove the need for `@witx` if possible. I haven't attempted to do this yet.