| 1 | +# Component Model Binary Format Explainer |
| 2 | + |
| 3 | +This document defines the binary format for the AST defined in the |
| 4 | +[explainer](Explainer.md). The top-level production is `component` and the |
| 5 | +convention is that a file with the `.wasm` suffix may contain either a |
| 6 | +[`core:module`] *or* a `component`, using the `kind` field to discriminate |
| 7 | +between the two in the first 8 bytes (see [below](#component-definitions) for |
| 8 | +more details). |
| 9 | + |
| 10 | +Note: this document is not meant to completely define the decoding or validation |
| 11 | +rules, but rather merge the minimal need-to-know elements of both, with just |
| 12 | +enough detail to create a prototype. A complete definition of the binary format |
| 13 | +and validation will be present in the [formal specification](../../spec/). |
| 14 | + |
| 15 | + |
| 16 | +## Component Definitions |
| 17 | + |
| 18 | +(See [Component Definitions](Explainer.md#component-definitions) in the explainer.) |
| 19 | +``` |
| 20 | +component ::= <preamble> s*:<section>* => (component flatten(s*)) |
| 21 | +preamble ::= <magic> <version> <kind> |
| 22 | +magic ::= 0x00 0x61 0x73 0x6D |
| 23 | +version ::= 0x0a 0x00 |
| 24 | +kind ::= 0x01 0x00 |
| 25 | +section ::= section_0(<core:custom>) => ϵ |
| 26 | + | t*:section_1(vec(<type>)) => t* |
| 27 | + | i*:section_2(vec(<import>)) => i* |
| 28 | + | f*:section_3(vec(<func>)) => f* |
| 29 | + | m: section_4(<core:module>) => m |
| 30 | + | c: section_5(<component>) => c |
| 31 | + | i*:section_6(vec(<instance>)) => i* |
| 32 | + | e*:section_7(vec(<export>)) => e* |
| 33 | + | s: section_8(<start>) => s |
| 34 | + | a*:section_9(vec(<alias>)) => a* |
| 35 | +``` |
| 36 | +Notes: |
| 37 | +* Reused Core binary rules: [`core:section`], [`core:custom`], [`core:module`] |
| 38 | +* The `version` given above is pre-standard. As the proposal changes before |
| 39 | + final standardization, `version` will be bumped from `0xa` upwards to |
| 40 | + coordinate prototypes. When the standard is finalized, `version` will be |
| 41 | + changed one last time to `0x1`. (This mirrors the path taken for the Core |
| 42 | + WebAssembly 1.0 spec.) |
| 43 | +* The `kind` field is meant to distinguish modules from components early in the |
| 44 | + binary format. (Core WebAssembly modules already implicitly have a `kind` |
| 45 | + field of `0x0` in their 4 byte [`core:version`] field.) |
| 46 | + |
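The preamble rules above can be sketched as a small classifier over the first 8 bytes of a `.wasm` file. This is only an illustration, not a normative decoder: the function name `classify_preamble` is hypothetical, and reading `version` and `kind` as two little-endian 16-bit fields is an assumption that happens to agree with the byte sequences given in the grammar. The accepted component `version` is the pre-standard `0x0a` stated above and would need to change as the proposal evolves.

```python
import struct

MAGIC = b"\x00asm"  # 0x00 0x61 0x73 0x6D


def classify_preamble(data: bytes) -> str:
    """Return 'module' or 'component' based on the first 8 bytes (sketch)."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes for the preamble")
    if data[:4] != MAGIC:
        raise ValueError("bad magic")
    # Assumption: treat <version> and <kind> as two little-endian u16 fields.
    version, kind = struct.unpack_from("<HH", data, 4)
    if kind == 0x00 and version == 0x01:
        # A core module's 4-byte version field 01 00 00 00 reads as
        # version 1 with an implicit kind of 0.
        return "module"
    if kind == 0x01 and version == 0x0A:
        # Pre-standard component version from the grammar above.
        return "component"
    raise ValueError(f"unsupported version/kind: {version:#x}/{kind:#x}")
```

For example, the bytes `00 61 73 6D 0A 00 01 00` classify as a component, while `00 61 73 6D 01 00 00 00` classify as a core module.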
| 47 | + |
| 48 | +## Instance Definitions |
| 49 | + |
| 50 | +(See [Instance Definitions](Explainer.md#instance-definitions) in the explainer.) |
| 51 | +``` |
| 52 | +instance ::= ie:<instanceexpr> => (instance ie) |
| 53 | +instanceexpr ::= 0x00 0x00 m:<moduleidx> a*:vec(<modulearg>) => (instantiate (module m) (with a)*) |
| 54 | + | 0x00 0x01 c:<componentidx> a*:vec(<componentarg>) => (instantiate (component c) (with a)*) |
| 55 | + | 0x01 e*:vec(<export>) => e* |
| 56 | + | 0x02 e*:vec(<core:export>) => e* |
| 57 | +modulearg ::= n:<name> 0x02 i:<instanceidx> => n (instance i) |
| 58 | +componentarg ::= n:<name> 0x00 m:<moduleidx> => n (module m) |
| 59 | + | n:<name> 0x01 c:<componentidx> => n (component c) |
| 60 | + | n:<name> 0x02 i:<instanceidx> => n (instance i) |
| 61 | + | n:<name> 0x03 f:<funcidx> => n (func f) |
| 62 | + | n:<name> 0x04 v:<valueidx> => n (value v) |
| 63 | + | n:<name> 0x05 t:<typeidx> => n (type t) |
| 64 | +export ::= a:<componentarg> => (export a) |
| 65 | +name ::= n:<core:name> => n |
| 66 | +``` |
| 67 | +Notes: |
| 68 | +* Reused Core binary rules: [`core:export`], [`core:name`] |
| 69 | +* The indices in `modulearg`/`componentarg` are validated according to their |
| 70 | + respective index spaces, which are built incrementally as each definition is |
| 71 | + validated. In general, unlike core modules, which support cyclic references |
| 72 | + between (function) definitions, component definitions are strictly acyclic |
| 73 | + and validated in a linear incremental manner, like core wasm instructions. |
| 74 | +* The arguments supplied by `instantiate` are validated against the consuming |
| 75 | + module/component according to the [subtyping](Subtyping.md) rules. |
| 76 | + |
| 77 | + |
| 78 | +## Alias Definitions |
| 79 | + |
| 80 | +(See [Alias Definitions](Explainer.md#alias-definitions) in the explainer.) |
| 81 | +``` |
| 82 | +alias ::= 0x00 0x00 i:<instanceidx> n:<name> => (alias export i n (module)) |
| 83 | + | 0x00 0x01 i:<instanceidx> n:<name> => (alias export i n (component)) |
| 84 | + | 0x00 0x02 i:<instanceidx> n:<name> => (alias export i n (instance)) |
| 85 | + | 0x00 0x03 i:<instanceidx> n:<name> => (alias export i n (func)) |
| 86 | + | 0x00 0x04 i:<instanceidx> n:<name> => (alias export i n (value)) |
| 87 | + | 0x01 0x00 i:<instanceidx> n:<name> => (alias export i n (func)) |
| 88 | + | 0x01 0x01 i:<instanceidx> n:<name> => (alias export i n (table)) |
| 89 | + | 0x01 0x02 i:<instanceidx> n:<name> => (alias export i n (memory)) |
| 90 | + | 0x01 0x03 i:<instanceidx> n:<name> => (alias export i n (global)) |
| 91 | + | ... other Post-MVP Core definition kinds |
| 92 | + | 0x02 0x00 ct:<varu32> i:<moduleidx> => (alias outer ct i (module)) |
| 93 | + | 0x02 0x01 ct:<varu32> i:<componentidx> => (alias outer ct i (component)) |
| 94 | + | 0x02 0x05 ct:<varu32> i:<typeidx> => (alias outer ct i (type)) |
| 95 | +``` |
| 96 | +Notes: |
| 97 | +* For instance-export aliases (opcodes `0x00` and `0x01`), `i` is validated to |
| 98 | + refer to an instance in the instance index space that exports `n` with the |
| 99 | + specified definition kind. |
| 100 | +* For outer aliases (opcode `0x02`), `ct` is validated to be *less than or equal to* |
| 101 | + the number of enclosing components and `i` is validated to be a valid |
| 102 | + index in the specified definition's index space of the enclosing component |
| 103 | + indicated by `ct` (counting outward, starting with `0` referring to the |
| 104 | + current component). |
| 105 | + |
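The outer-alias check described in the notes above can be sketched as follows. The representation is an assumption made for illustration: a validator is imagined to keep a stack of components (outermost first, current component last), each modeled as a dictionary mapping a definition kind to its index space. The name `resolve_outer_alias` and this data layout are hypothetical and do not come from the specification.

```python
def resolve_outer_alias(component_stack, ct, kind, i):
    """Resolve (alias outer ct i (kind)) against a stack of components (sketch).

    component_stack: list of dicts, outermost component first, current last.
    Each dict maps a definition kind ('module', 'component', 'type') to a
    list that stands in for that kind's index space.
    """
    enclosing = len(component_stack) - 1  # components around the current one
    if ct > enclosing:
        raise ValueError(f"outer count {ct} exceeds {enclosing} enclosing components")
    # ct counts outward, with 0 referring to the current component.
    target = component_stack[-1 - ct]
    space = target.get(kind, [])
    if i >= len(space):
        raise ValueError(f"index {i} out of bounds for {kind} index space")
    return space[i]
```

With a root component that defines two types and a nested component that defines none, `ct = 1` reaches the root's type index space and `ct = 2` is rejected.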
| 106 | + |
| 107 | +## Type Definitions |
| 108 | + |
| 109 | +(See [Type Definitions](Explainer.md#type-definitions) in the explainer.) |
| 110 | +``` |
| 111 | +type ::= dt:<deftype> => dt |
| 112 | + | it:<intertype> => it |
| 113 | +deftype ::= mt:<moduletype> => mt |
| 114 | + | ct:<componenttype> => ct |
| 115 | + | it:<instancetype> => it |
| 116 | + | ft:<functype> => ft |
| 117 | + | vt:<valuetype> => vt |
| 118 | +moduletype ::= 0x4f mtd*:vec(<moduletype-def>) => (module mtd*) |
| 119 | +moduletype-def ::= 0x01 dt:<core:deftype> => dt |
| 120 | + | 0x02 i:<core:import> => i |
| 121 | + | 0x07 n:<name> d:<core:importdesc> => (export n d) |
| 122 | +core:deftype ::= ft:<core:functype> => ft |
| 123 | + | ... Post-MVP additions => ... |
| 124 | +componenttype ::= 0x4e ctd*:vec(<componenttype-def>) => (component ctd*) |
| 125 | +instancetype ::= 0x4d itd*:vec(<instancetype-def>) => (instance itd*) |
| 126 | +componenttype-def ::= itd:<instancetype-def> => itd |
| 127 | + | 0x02 i:<import> => i |
| 128 | +instancetype-def ::= 0x01 t:<type> => t |
| 129 | + | 0x07 n:<name> dt:<deftypeuse> => (export n dt) |
| 130 | + | 0x09 a:<alias> => a |
| 131 | +import ::= n:<name> dt:<deftypeuse> => (import n dt) |
| 132 | +deftypeuse ::= i:<typeidx> => type-index-space[i] (must be <deftype>) |
| 133 | +functype ::= 0x4c param*:vec(<param>) t:<intertypeuse> => (func param* (result t)) |
| 134 | +param ::= 0x00 t:<intertypeuse> => (param t) |
| 135 | + | 0x01 n:<name> t:<intertypeuse> => (param n t) |
| 136 | +valuetype ::= 0x4b t:<intertypeuse> => (value t) |
| 137 | +intertypeuse ::= i:<typeidx> => type-index-space[i] (must be <intertype>) |
| 138 | + | pit:<primintertype> => pit |
| 139 | +primintertype ::= 0x7f => unit |
| 140 | + | 0x7e => bool |
| 141 | + | 0x7d => s8 |
| 142 | + | 0x7c => u8 |
| 143 | + | 0x7b => s16 |
| 144 | + | 0x7a => u16 |
| 145 | + | 0x79 => s32 |
| 146 | + | 0x78 => u32 |
| 147 | + | 0x77 => s64 |
| 148 | + | 0x76 => u64 |
| 149 | + | 0x75 => float32 |
| 150 | + | 0x74 => float64 |
| 151 | + | 0x73 => char |
| 152 | + | 0x72 => string |
| 153 | +intertype ::= pit:<primintertype> => pit |
| 154 | + | 0x71 field*:vec(<field>) => (record field*) |
| 155 | + | 0x70 case*:vec(<case>) => (variant case*) |
| 156 | + | 0x6f t:<intertypeuse> => (list t) |
| 157 | + | 0x6e t*:vec(<intertypeuse>) => (tuple t*) |
| 158 | + | 0x6d n*:vec(<name>) => (flags n*) |
| 159 | + | 0x6c n*:vec(<name>) => (enum n*) |
| 160 | + | 0x6b t*:vec(<intertypeuse>) => (union t*) |
| 161 | + | 0x6a t:<intertypeuse> => (option t) |
| 162 | + | 0x69 t:<intertypeuse> u:<intertypeuse> => (expected t u) |
| 163 | +field ::= n:<name> t:<intertypeuse> => (field n t) |
| 164 | +case ::= n:<name> t:<intertypeuse> 0x0 => (case n t) |
| 165 | + | n:<name> t:<intertypeuse> 0x1 i:<varu32> => (case n t (defaults-to case-label[i])) |
| 166 | +``` |
| 167 | +Notes: |
| 168 | +* Reused Core binary rules: [`core:import`], [`core:importdesc`], [`core:functype`] |
| 169 | +* The type opcodes follow the same negative-SLEB128 scheme as Core WebAssembly, |
| 170 | + with type opcodes starting at SLEB128(-1) (`0x7f`) and going down, |
| 171 | + reserving the nonnegative SLEB128s for type indices. |
| 172 | +* The (`module`|`component`|`instance`)`type-def` opcodes match the corresponding |
| 173 | + section numbers. |
| 174 | +* Module, component and instance types create fresh type index spaces that are |
| 175 | + populated and referenced by their contained definitions. E.g., for a module |
| 176 | + type that imports a function, the `import` `moduletype-def` must be preceded |
| 177 | + by either a `type` or `alias` `moduletype-def` that adds the function type to |
| 178 | + the type index space. |
| 179 | +* Currently, the only allowed form of `alias` in instance and module types |
| 180 | + is `(alias outer ct i (type))`. In the future, other kinds of aliases |
| 181 | + will be needed and this restriction will be relaxed. |
| 182 | + |
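The negative-SLEB128 scheme mentioned in the notes above can be illustrated with a small decoder for `intertypeuse`. This is a sketch under stated assumptions: it assumes a type index in `intertypeuse` is encoded as a signed LEB128 integer (as the note about reserving nonnegative SLEB128s suggests), and the helper names `read_sleb128` and `decode_intertypeuse` are hypothetical. Only the primitive opcodes listed in the grammar are accepted.

```python
PRIM_INTERTYPES = {
    0x7F: "unit", 0x7E: "bool", 0x7D: "s8", 0x7C: "u8",
    0x7B: "s16", 0x7A: "u16", 0x79: "s32", 0x78: "u32",
    0x77: "s64", 0x76: "u64", 0x75: "float32", 0x74: "float64",
    0x73: "char", 0x72: "string",
}


def read_sleb128(data: bytes, pos: int = 0):
    """Decode one signed LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            if byte & 0x40:  # sign bit of the final group
                result -= 1 << shift
            return result, pos


def decode_intertypeuse(data: bytes, pos: int = 0):
    """Return (('typeidx', i) or ('prim', name), next position)."""
    value, new_pos = read_sleb128(data, pos)
    if value >= 0:
        return ("typeidx", value), new_pos
    if value < -64:
        raise ValueError("negative value is not a single-byte type opcode")
    opcode = value & 0x7F  # recover the single opcode byte, e.g. -1 -> 0x7f
    if opcode not in PRIM_INTERTYPES:
        raise ValueError(f"opcode {opcode:#x} is not a primitive intertype")
    return ("prim", PRIM_INTERTYPES[opcode]), new_pos
```

Under these assumptions the byte `0x7f` decodes to `unit`, `0x72` to `string`, and `0x05` to type index 5, while a compound opcode such as `0x71` (`record`) is rejected because `intertypeuse` only allows a primitive or an index.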
| 183 | + |
| 184 | +## Function Definitions |
| 185 | + |
| 186 | +(See [Function Definitions](Explainer.md#function-definitions) in the explainer.) |
| 187 | +``` |
| 188 | +func ::= body:<funcbody> => (func body) |
| 189 | +funcbody ::= 0x00 ft:<typeidx> opt*:vec(<canonopt>) f:<funcidx> => (canon.lift ft opt* f) |
| 190 | + | 0x01 opt*:vec(<canonopt>) f:<funcidx> => (canon.lower opt* f) |
| 191 | +canonopt ::= 0x00 => string=utf8 |
| 192 | + | 0x01 => string=utf16 |
| 193 | + | 0x02 => string=latin1+utf16 |
| 194 | + | 0x03 i:<instanceidx> => (into i) |
| 195 | +``` |
| 196 | +Notes: |
| 197 | +* Validation prevents duplicate or conflicting options. |
| 198 | +* Validation of `canon.lift` requires `f` to have a `core:functype` that matches |
| 199 | + the canonical-ABI-defined lowering of `ft`. The function defined by |
| 200 | + `canon.lift` has type `ft`. |
| 201 | +* Validation of `canon.lower` requires `f` to have a `functype`. The function |
| 202 | + defined by `canon.lower` has a `core:functype` defined by the canonical ABI |
| 203 | + lowering of `f`'s type. |
| 204 | +* If the lifting/lowering operations implied by `canon.lift` or `canon.lower` |
| 205 | + require access to `memory`, `realloc` or `free`, then validation will require |
| 206 | + the `(into i)` `canonopt` be present and the corresponding export be present |
| 207 | + in `i`'s `instancetype`. |
| 208 | + |
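One way to picture the "duplicate or conflicting options" rule is a decoder that rejects a second string encoding or a second `(into i)`. This is a hedged sketch: the text above does not spell out which options conflict, so treating the three string encodings as mutually exclusive is an assumption, as are the helper names `read_varu32` and `decode_canonopts` and the use of unsigned LEB128 for the vector length and the instance index.

```python
STRING_ENCODINGS = {0x00: "utf8", 0x01: "utf16", 0x02: "latin1+utf16"}


def read_varu32(data: bytes, pos: int):
    """Decode one unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def decode_canonopts(data: bytes, pos: int = 0):
    """Decode vec(<canonopt>) and reject duplicates or conflicts (sketch)."""
    count, pos = read_varu32(data, pos)
    encoding = None
    into = None
    for _ in range(count):
        opcode = data[pos]
        pos += 1
        if opcode in STRING_ENCODINGS:
            # Assumption: at most one string encoding may be given.
            if encoding is not None:
                raise ValueError("duplicate or conflicting string encoding")
            encoding = STRING_ENCODINGS[opcode]
        elif opcode == 0x03:
            if into is not None:
                raise ValueError("duplicate (into i) option")
            into, pos = read_varu32(data, pos)
        else:
            raise ValueError(f"unknown canonopt opcode {opcode:#x}")
    return {"string": encoding, "into": into}, pos
```

For instance, the bytes `02 00 03 05` decode to `string=utf8` together with `(into 5)`, whereas `02 00 01` is rejected because it names two string encodings.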
| 209 | + |
| 210 | +## Start Definitions |
| 211 | + |
| 212 | +(See [Start Definitions](Explainer.md#start-definitions) in the explainer.) |
| 213 | +``` |
| 214 | +start ::= f:<funcidx> arg*:vec(<valueidx>) => (start f (value arg)*) |
| 215 | +``` |
| 216 | +Notes: |
| 217 | +* Validation requires `f` to have a `functype` whose `param` arity and types match `arg*`. |
| 218 | +* Validation appends the `result` types of `f` to the value index space (making |
| 219 | + them available for reference by subsequent definitions). |
| 220 | + |
| 221 | +In addition to the type-compatibility checks mentioned above, the validation |
| 222 | +rules for value definitions require that each value is consumed |
| 223 | +exactly once. Thus, during validation, each value has an associated "consumed" |
| 224 | +boolean flag. When a value is first added to the value index space (via |
| 225 | +`import`, `instance`, `alias` or `start`), the flag is clear. When a value is |
| 226 | +used (via `export`, `instantiate` or `start`), the flag is set. After |
| 227 | +validating the last definition of a component, validation requires all values' |
| 228 | +flags are set. |
| 229 | + |
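The "consumed" flag bookkeeping described above can be sketched as a small class. The class and method names (`ValueIndexSpace`, `define`, `consume`, `finish`) are hypothetical and chosen only for illustration; rejecting a second use follows from the "exactly once" requirement in the text.

```python
class ValueIndexSpace:
    """Tracks a 'consumed' flag per value during validation (sketch)."""

    def __init__(self):
        self._consumed = []

    def define(self) -> int:
        """Add a value (via import, instance, alias or start); flag is clear."""
        self._consumed.append(False)
        return len(self._consumed) - 1

    def consume(self, idx: int) -> None:
        """Use a value (via export, instantiate or start); flag is set."""
        if not 0 <= idx < len(self._consumed):
            raise ValueError(f"value index {idx} out of bounds")
        if self._consumed[idx]:
            raise ValueError(f"value {idx} consumed more than once")
        self._consumed[idx] = True

    def finish(self) -> None:
        """After the last definition, every value must have been consumed."""
        unconsumed = [i for i, used in enumerate(self._consumed) if not used]
        if unconsumed:
            raise ValueError(f"values never consumed: {unconsumed}")
```

A validator built this way would call `define` whenever a definition adds a value, `consume` at each use, and `finish` once at the end of the component.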
| 230 | + |
| 231 | +## Import and Export Definitions |
| 232 | + |
| 233 | +(See [Import and Export Definitions](Explainer.md#import-and-export-definitions) in the explainer.) |
| 234 | + |
| 235 | +As described in the explainer, the binary decode rules of `import` and `export` |
| 236 | +have already been defined above. |
| 237 | + |
| 238 | +Notes: |
| 239 | +* Validation requires that all import and export `name`s be unique. |
| 240 | + |
| 241 | + |
| 242 | + |
| 243 | +[`core:version`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-version |
| 244 | +[`core:section`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-section |
| 245 | +[`core:custom`]: https://webassembly.github.io/spec/core/binary/modules.html#custom-section |
| 246 | +[`core:module`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-module |
| 247 | +[`core:export`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-export |
| 248 | +[`core:name`]: https://webassembly.github.io/spec/core/binary/values.html#binary-name |
| 249 | +[`core:import`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-import |
| 250 | +[`core:importdesc`]: https://webassembly.github.io/spec/core/binary/modules.html#binary-importdesc |
| 251 | +[`core:functype`]: https://webassembly.github.io/spec/core/binary/types.html#binary-functype |
| 252 | + |
| 253 | +[Future Core Type]: https://github.com/WebAssembly/gc/blob/master/proposals/gc/MVP.md#type-definitions |