From 466994b9a55e7b7168caa19495d1bd0a5caa4d68 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Wed, 3 Aug 2022 09:55:23 -0400 Subject: [PATCH 1/9] Add RFC: Baseline compilation in Wasmtime MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This RFC proposes the addition of a new WebAssembly (Wasm) compiler to Wasmtime: a single-pass or “baseline” compiler. A baseline compiler works to improve overall compilation performance, yielding faster startup times, at the cost of less optimized code. This RFC describes: * The motivation for a baseline compiler * A high-level overview and description of the structure of the baseline compiler * A high-level overview of its integration with Wasmtime --- accepted/wasmtime-baseline-compilation.md | 260 ++++++++++++++++++++++ 1 file changed, 260 insertions(+) create mode 100644 accepted/wasmtime-baseline-compilation.md diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md new file mode 100644 index 0000000..fda7632 --- /dev/null +++ b/accepted/wasmtime-baseline-compilation.md @@ -0,0 +1,260 @@ +# Baseline Compilation in Wasmtime + +Authors: Saúl Cabrera (@saulecabrera); Chris Fallin (@cfallin) + +## Summary + +This RFC proposes the addition of a new WebAssembly (Wasm) compiler to Wasmtime: +a single-pass or “baseline” compiler. A baseline compiler works to improve +overall compilation performance, yielding faster startup times, at the cost of +less optimized code. + + +## Motivation + +Wasmtime currently uses Cranelift by default, which is an optimizing compiler. +Cranelift performs code optimizations at the expense of slower compilation +times. This makes Just-In-Time (JIT) compilation of Wasm unsuitable for cases +where higher compilation performance is desired (e.g. short-lived trivial +programs, cases in which startup time is more critical than runtime +performance). 
+ +The introduction of a baseline compiler is a first step towards: (i) faster +compilation and startup times (ii) enabling a tiered compilation model in +Wasmtime, similar to what is present in Wasm engines in web browsers. This RFC +**does not** account for tiered compilation, it only accounts for the +introduction of a baseline compiler. + +**Approximate** [measurements](https://github.com/Shopify/wasm-bench) taken on +top of a subset of the [Sightglass benchmark +suite](https://github.com/bytecodealliance/sightglass/tree/main/benchmarks) +– using different optimizing and baseline compilers (Cranelift from Wasmtime, +Liftoff from V8 and RabaldrMonkey from SpiderMonkey) – show that a baseline +compiler on average yields 15x to 20x faster compilation while producing code +that is on average 1.1x to 1.5x slower than the one produced by an optimizing +compiler. These measurements align on average with other measurements observed +when comparing interpretation and compilation for WebAssembly[^1]. + + +[^1]: Ben L. Titzer. 
[A fast in-place interpreter for + WebAssembly](https://arxiv.org/pdf/2205.01183.pdf) + +## Proposal: Winch, a baseline compiler for Wasmtime + +Winch: WebAssembly Intentionally-Non-Optimizing Compiler and Host + + +### Design Principles + +* Single pass over Wasm bytecode +* Function as the unit of compilation +* Machine code generation directly from Wasm bytecode – no intermediate + representation +* Avoid reinventing machine-code emission – use Cranelift's instruction emitter + code to create an assembler library ("MacroAssembler") +* Prioritize compilation performance over runtime performance + +### High-level overview + +```mermaid + graph TD; + A(wasmparser)-->B(cranelift-wasm); + A-->C(winch); + C-->D(MacroAssembler); + D-->X(cranelift-asm); + X-->E(MachInst); + X-->F(MachBuffer); + B-->G(cranelift); + G-->X; +``` + +### MacroAssembler and Borrowing from Cranelift + +We plan to factor out the lower layers of Cranelift that produce and operate on +machine code in order to reuse them as a generic assembler library +(“MacroAssembler”). + +The two key abstractions that will be useful to reuse are the `MachInst` +(“machine instruction”) trait and its implementations for each architecture; and +the `MachBuffer`, which is a machine-code emission buffer with some knowledge of +branches and ability to do peephole optimizations on them. The former lets us +reuse all the logic to encode instructions for an ISA; the latter lets us emit +code with “labels” and references to labels, and have the fixups done for us. + +The `MachInst` trait and its implementations, and the `MachBuffer`, can be +mostly factored out into a separate crate `cranelift_asm`. This will require +some care with respect to layering: in particular, definitions of +machine-instruction types are currently done in the ISLE backends for each ISA +within Cranelift. We can continue to use ISLE for these, but they will need to +be moved to the separate crate. 
+ + +As a result of this initial layering, one will be able to build a `MachInst` as +a Rust data structure and emit it manually, for example: + +```rust +let add = cranelift_asm::x64::AluRmiR { op: AluRmiR::Add, … }; let mut +buf = MachBuffer::new(); add.emit(&mut buf, …); +``` + +However this is still quite cumbersome. As a next step, we will develop an API +over this that provides for procedural generation of instructions: i.e., one +method call for each instruction. Something like: + +```rust +let mut masm = cranelift_asm::x64::MacroAssembler::new(); +masm.add(rd, rm); +masm.store(rd, MemArg::base_offset(ra, 64)); +``` +This would allow for +fairly natural single-pass code emission. In essence, this is an implementation +of the [MacroAssembler +idea](https://searchfox.org/mozilla-central/rev/fa71140041c5401b80a11f099cc0cd0653295e2c/js/src/jit/MacroAssembler.h) +from SpiderMonkey. Each architecture will have an implementation of the +MacroAssembler API; perhaps there can be a trait that abstracts commonalities, +but enough will be different (e.g., instruction set quirks beyond the usual +“add/sub/and/or/not” suspects, x64 two-operand form vs aarch64 three-operand +form, and more) that we expect there to be different `MacroAssembler` types for +each ISA. This in turn implies different lowering code that invokes the +`MacroAssembler` per ISA in the baseline compiler. The lowering code can perhaps +share many helpers that are monomorphized on the “common ISA core” trait. + +In the above examples, we bypass the register-allocation support, i.e. the +ability to hold virtual register operands rather than real registers, in the +`MachInst`s. This is supported today by passing through `RealReg`s (“real +registers”) instead. In the baseline compiler we expect register allocation to +occur before invoking the `MacroAssembler`; i.e., when generating the +instructions we already know which register we are using for each operand. 
Doing +otherwise (emitting with vregs first and editing later) requires actually +buffering the MachInst structs in memory, which we do not wish to do. + +We don’t expect to make any changes to Cranelift itself beyond the layering +refactor to borrow its `MachInst` and `MachBuffer` implementations. In +particular we don’t expect to use the `MacroAssembler` wrapper in Cranelift, at +least at first, because it will be built around constructing and emitting +instructions to machine code right away, without buffering (as in Cranelift’s +VCode). It’s possible in the future that we may find other ways to make +`MacroAssembler` generic and leverage it in Cranelift too, but that is beyond +the scope of this RFC. + +### Register Allocation + +We plan to implement register allocation in a linear[^2] fashion. + +[^2]: Known as [Linear Scan Register +Allocation](http://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf) + +The baseline compiler will hold a reference to a register allocator abstraction, +which will keep a list of registers, represented by Cranelift's `Reg` +abstraction, per ISA, along with their availability. It will also hold +a reference to a value stack abstraction, to keep track of operands and results +and their location as it performs compilation. These are the two key +abstractions for register allocation: + +```rust +pub struct Compiler { + //... + allocator: RegisterAllocator, + value_stack: ValueStack, + //... +} +``` + +The value stack is expected to keep track of the location of its values. +A particular value can be tagged as either a: + +* Local: representing a function local slot (index and type). The address of the + local will be resolved lazily to reduce register pressure. +* Register +* Constant: representing an immediate value. +* Memory Offset: the location of the value at a given memory offset + +Registers will be requested from the register allocator every time an operation +requires it. 
If no registers are available, the baseline compiler will move +locals and registers to memory, changing their tag to a memory offset, +performing what's known as spilling, effectively freeing up registers. Spilling +will also be performed at control flow points. To reduce the number of spills, +the baseline compiler will also perform limited constant rematerialization. + +Assuming that we have an immediate at the top of the stack, emitting an add +instruction with an immediate operand would look something like this: + +```rust +let mut masm = cranelift_asm::x64::MacroAssembler::new(); +// request a general purpose register; +// spill if +let imm = self.value_stack.pop(); +none available let rd = self.gpr(); +masm.add(rd, imm); +``` + +### Integration with Wasmtime + +We plan to integrate the baseline compiler incrementally into Wasmtime, as an +in-tree crate, `winch`. It will be introduced as a runtime feature, off by +default[^3]. Taking as a guideline [Wasmtime's tiers of +support](https://github.com/bytecodealliance/wasmtime/pull/4479), this means +that the baseline compiler will be introduced as a Tier 3 feature. + +[^3]: Adding a compile time feature for configuration from the start might add +unnecessary operational complexity i.e. making testing/fuzzing harder. 
+ +In general, the development of the baseline compiler will be done in phases, +each phase covering a specific set of features: + +| Phase | Feature | Feature Type | +|-------|--------------------------|---------------------| +| 1 | cranelift_asm crate | Refactoring | +| 1 | x64 support | Target architecture | +| 1 | wasi_snapshot_preview1 | WASI proposal | +| 1 | wasi_unstable | WASI proposal | +| 1 | Multi-Memory | Wasm proposal | +| 1 | Reference Types | Wasm proposal | +| 1 | Epoch-based interruption | Wasmtime feature | +| 1 | Parallel compilation | Wasmtime feature | +| 1 | Fuzzing integration | Test coverage | +| 2 | Fuel | Wasmtime feature | +| 2 | SIMD | Wasm proposal | +| 2 | Memory 64 | Wasm proposal | +| 2 | ARM support | Target architecture | +| 3 | s390x | Target architecture | +| 3 | Debugging integration | Debugging | + +#### Configuring compilation + +We plan to extend `wasmtime::Strategy` to include a baseline compiler entry: + +```rust +pub enum Strategy { + Auto, + Cranelift, + Baseline, // or Winch +} +``` + +Which will be configurable via the strategy method in the `wasmtime::Config` +struct: + +```rust +config.strategy(Strategy::Baseline); +``` + +We also plan to extend Wasmtime's `run` and `compile` subcommands to support +a compiler argument: + +```sh +wasmtime compile --compiler= file.wasm +wasmtime run --compiler= file.wasm ``` +``` + +#### Performing compilation + +The baseline compiler will implement the `wasmtime_environ::Compiler` trait, +serving as the separation layer between Wasmtime and the compiler. We plan to +modify the `wasmtime::Engine::compiler` method to account for the compilation +strategy and choose the compiler accordingly. + +#### Development and long term maintenance + +Saúl Cabrera (@saulecabrera) will be the main maintainer of the baseline +compiler with support from Chris Fallin (@cfallin). 
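A minimal sketch of the value stack and spilling scheme described in the Register Allocation section of this patch, for illustration only. The variant payloads, the `free` pool, the 8-byte slot size and the `spills` counter are assumptions that do not come from the RFC or from Cranelift; a real compiler would emit store instructions where this sketch only counts them.

```rust
/// Location tag for a value on the compile-time value stack.
/// Variant names follow the RFC's list; the payloads are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Val {
    Local(u32),  // index of a function local slot
    Reg(u8),     // hypothetical register number
    Const(i64),  // immediate value
    Memory(u32), // byte offset of a spill slot
}

/// Hypothetical allocator: just a pool of free register numbers.
pub struct RegisterAllocator {
    pub free: Vec<u8>,
}

pub struct Compiler {
    pub allocator: RegisterAllocator,
    pub value_stack: Vec<Val>,
    pub next_slot: u32,
    pub spills: usize, // stands in for the stores a real compiler would emit
}

impl Compiler {
    pub fn new(regs: &[u8]) -> Self {
        Compiler {
            allocator: RegisterAllocator { free: regs.to_vec() },
            value_stack: Vec::new(),
            next_slot: 0,
            spills: 0,
        }
    }

    /// Retag locals and registers on the value stack as memory offsets,
    /// returning their registers to the free pool.
    pub fn spill(&mut self) {
        for val in self.value_stack.iter_mut() {
            match *val {
                Val::Reg(r) => self.allocator.free.push(r),
                Val::Local(_) => {}
                // Constants can be re-emitted as immediates;
                // memory values are already spilled.
                Val::Const(_) | Val::Memory(_) => continue,
            }
            *val = Val::Memory(self.next_slot);
            self.next_slot += 8; // assumed 8-byte slots
            self.spills += 1;
        }
    }

    /// Request a general purpose register; spill if none is available.
    pub fn gpr(&mut self) -> u8 {
        if self.allocator.free.is_empty() {
            self.spill();
        }
        self.allocator.free.pop().expect("no register could be freed")
    }
}
```

In this sketch a register request only triggers a spill when the free pool is empty, and constants keep their tag across a spill, which is one plausible reading of the limited constant rematerialization the RFC mentions.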
From d8d934ec8a16c6d0ae823c9d5ba499f472bfab99 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Wed, 3 Aug 2022 15:28:38 -0400 Subject: [PATCH 2/9] Typo corrections --- accepted/wasmtime-baseline-compilation.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index fda7632..4df6428 100644 --- a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -93,8 +93,9 @@ As a result of this initial layering, one will be able to build a `MachInst` as a Rust data structure and emit it manually, for example: ```rust -let add = cranelift_asm::x64::AluRmiR { op: AluRmiR::Add, … }; let mut -buf = MachBuffer::new(); add.emit(&mut buf, …); +let add = cranelift_asm::x64::AluRmiR { op: AluRmiR::Add, … }; +let mut buf = MachBuffer::new(); +add.emit(&mut buf, …); ``` However this is still quite cumbersome. As a next step, we will develop an API @@ -182,7 +183,7 @@ instruction with an immediate operand would look something like this: ```rust let mut masm = cranelift_asm::x64::MacroAssembler::new(); // request a general purpose register; -// spill if +// spill if none available let imm = self.value_stack.pop(); none available let rd = self.gpr(); masm.add(rd, imm); @@ -244,7 +245,7 @@ a compiler argument: ```sh wasmtime compile --compiler= file.wasm -wasmtime run --compiler= file.wasm ``` +wasmtime run --compiler= file.wasm ``` #### Performing compilation From 00aacd0e488c0d731adf0f07d77740a3e4ce6acc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Wed, 3 Aug 2022 15:38:44 -0400 Subject: [PATCH 3/9] Specify that changes will be introduced with a compile-time feature --- accepted/wasmtime-baseline-compilation.md | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index 4df6428..7fa7986 100644 --- 
a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -192,14 +192,11 @@ masm.add(rd, imm); ### Integration with Wasmtime We plan to integrate the baseline compiler incrementally into Wasmtime, as an -in-tree crate, `winch`. It will be introduced as a runtime feature, off by -default[^3]. Taking as a guideline [Wasmtime's tiers of +in-tree crate, `winch`. It will be introduced as a compile-time feature, off by +default. Taking as a guideline [Wasmtime's tiers of support](https://github.com/bytecodealliance/wasmtime/pull/4479), this means that the baseline compiler will be introduced as a Tier 3 feature. -[^3]: Adding a compile time feature for configuration from the start might add -unnecessary operational complexity i.e. making testing/fuzzing harder. - In general, the development of the baseline compiler will be done in phases, each phase covering a specific set of features: From 47ad6a14db90e5b48705f4b65fa3c9380215e535 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Thu, 4 Aug 2022 13:40:47 -0400 Subject: [PATCH 4/9] Fix invalid rust snippet --- accepted/wasmtime-baseline-compilation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index 7fa7986..58bfc5c 100644 --- a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -185,7 +185,7 @@ let mut masm = cranelift_asm::x64::MacroAssembler::new(); // request a general purpose register; // spill if none available let imm = self.value_stack.pop(); -none available let rd = self.gpr(); +let rd = self.gpr(); masm.add(rd, imm); ``` From d1dffc13d44122078861cd613bab3b99d2c23e5f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Thu, 4 Aug 2022 13:46:06 -0400 Subject: [PATCH 5/9] Modify phases Move initial aarch64 support to phase 1 and move reference types to phase 2 --- 
accepted/wasmtime-baseline-compilation.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index 58bfc5c..376ee1d 100644 --- a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -204,17 +204,18 @@ each phase covering a specific set of features: |-------|--------------------------|---------------------| | 1 | cranelift_asm crate | Refactoring | | 1 | x64 support | Target architecture | +| 1 | Initial aarch64 support | Target architecture | | 1 | wasi_snapshot_preview1 | WASI proposal | | 1 | wasi_unstable | WASI proposal | | 1 | Multi-Memory | Wasm proposal | -| 1 | Reference Types | Wasm proposal | | 1 | Epoch-based interruption | Wasmtime feature | | 1 | Parallel compilation | Wasmtime feature | | 1 | Fuzzing integration | Test coverage | +| 2 | Reference Types | Wasm proposal | | 2 | Fuel | Wasmtime feature | | 2 | SIMD | Wasm proposal | | 2 | Memory 64 | Wasm proposal | -| 2 | ARM support | Target architecture | +| 2 | Finalize aarch64 support | Target architecture | | 3 | s390x | Target architecture | | 3 | Debugging integration | Debugging | From 141a3c6e3df27b1a9dc2d70580e021ab2ec58d00 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Fri, 5 Aug 2022 08:40:23 -0400 Subject: [PATCH 6/9] Explicitly call out `Strategy::Winch` --- accepted/wasmtime-baseline-compilation.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index 376ee1d..e07c016 100644 --- a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -227,7 +227,7 @@ We plan to extend `wasmtime::Strategy` to include a baseline compiler entry: pub enum Strategy { Auto, Cranelift, - Baseline, // or Winch + Winch } ``` @@ -235,7 +235,7 @@ Which will be configurable via the strategy method in the 
`wasmtime::Config` struct: ```rust -config.strategy(Strategy::Baseline); +config.strategy(Strategy::Winch); ``` We also plan to extend Wasmtime's `run` and `compile` subcommands to support From e04ccae6dbc401a1936c921c06e18c8c658a61d2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Mon, 8 Aug 2022 15:39:12 -0400 Subject: [PATCH 7/9] General fixes to improve clarity --- accepted/wasmtime-baseline-compilation.md | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index e07c016..1e78c8a 100644 --- a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -21,7 +21,7 @@ performance). The introduction of a baseline compiler is a first step towards: (i) faster compilation and startup times (ii) enabling a tiered compilation model in -Wasmtime, similar to what is present in Wasm engines in web browsers. This RFC +Wasmtime, similar to what is present in Wasm engines in Web browsers. This RFC **does not** account for tiered compilation, it only accounts for the introduction of a baseline compiler. @@ -127,7 +127,7 @@ registers”) instead. In the baseline compiler we expect register allocation to occur before invoking the `MacroAssembler`; i.e., when generating the instructions we already know which register we are using for each operand. Doing otherwise (emitting with vregs first and editing later) requires actually -buffering the MachInst structs in memory, which we do not wish to do. +buffering the `MachInst` structs in memory, which we do not wish to do. We don’t expect to make any changes to Cranelift itself beyond the layering refactor to borrow its `MachInst` and `MachBuffer` implementations. In @@ -140,14 +140,11 @@ the scope of this RFC. ### Register Allocation -We plan to implement register allocation in a linear[^2] fashion. 
- -[^2]: Known as [Linear Scan Register -Allocation](http://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf) +We plan to implement register allocation in a single-pass fashion. The baseline compiler will hold a reference to a register allocator abstraction, which will keep a list of registers, represented by Cranelift's `Reg` -abstraction, per ISA, along with their availability. It will also hold +abstraction, per ISA, along with their availability. It will also hold a reference to a value stack abstraction, to keep track of operands and results and their location as it performs compilation. These are the two key abstractions for register allocation: @@ -172,7 +169,7 @@ A particular value can be tagged as either a: Registers will be requested from the register allocator every time an operation requires it. If no registers are available, the baseline compiler will move -locals and registers to memory, changing their tag to a memory offset, +all locals and all registers to memory, changing their tag to a memory offset, performing what's known as spilling, effectively freeing up registers. Spilling will also be performed at control flow points. To reduce the number of spills, the baseline compiler will also perform limited constant rematerialization. 
@@ -182,9 +179,9 @@ instruction with an immediate operand would look something like this: ```rust let mut masm = cranelift_asm::x64::MacroAssembler::new(); +let imm = self.value_stack.pop(); // request a general purpose register; // spill if none available -let imm = self.value_stack.pop(); let rd = self.gpr(); masm.add(rd, imm); ``` @@ -242,8 +239,8 @@ We also plan to extend Wasmtime's `run` and `compile` subcommands to support a compiler argument: ```sh -wasmtime compile --compiler= file.wasm -wasmtime run --compiler= file.wasm +wasmtime compile --compiler= file.wasm +wasmtime run --compiler= file.wasm ``` #### Performing compilation From 566ba6279b6d32b8f5e4a2395490a6d379168407 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Tue, 9 Aug 2022 07:21:59 -0400 Subject: [PATCH 8/9] Explicitly call out `Assembler` --- accepted/wasmtime-baseline-compilation.md | 26 +++++++++++------------ 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index 1e78c8a..459bbfa 100644 --- a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -51,7 +51,7 @@ Winch: WebAssembly Intentionally-Non-Optimizing Compiler and Host * Machine code generation directly from Wasm bytecode – no intermediate representation * Avoid reinventing machine-code emission – use Cranelift's instruction emitter - code to create an assembler library ("MacroAssembler") + code to create an assembler library * Prioritize compilation performance over runtime performance ### High-level overview @@ -60,7 +60,7 @@ Winch: WebAssembly Intentionally-Non-Optimizing Compiler and Host graph TD; A(wasmparser)-->B(cranelift-wasm); A-->C(winch); - C-->D(MacroAssembler); + C-->D(Assembler); D-->X(cranelift-asm); X-->E(MachInst); X-->F(MachBuffer); @@ -68,11 +68,11 @@ Winch: WebAssembly Intentionally-Non-Optimizing Compiler and Host G-->X; ``` -### MacroAssembler and 
Borrowing from Cranelift +### Assembler and Borrowing from Cranelift We plan to factor out the lower layers of Cranelift that produce and operate on machine code in order to reuse them as a generic assembler library -(“MacroAssembler”). +(“Assembler”). The two key abstractions that will be useful to reuse are the `MachInst` (“machine instruction”) trait and its implementations for each architecture; and @@ -103,39 +103,39 @@ over this that provides for procedural generation of instructions: i.e., one method call for each instruction. Something like: ```rust -let mut masm = cranelift_asm::x64::MacroAssembler::new(); +let mut masm = cranelift_asm::x64::Assembler::new(); masm.add(rd, rm); masm.store(rd, MemArg::base_offset(ra, 64)); ``` This would allow for -fairly natural single-pass code emission. In essence, this is an implementation +fairly natural single-pass code emission. In essence, this is a lower-level approximation of the [MacroAssembler idea](https://searchfox.org/mozilla-central/rev/fa71140041c5401b80a11f099cc0cd0653295e2c/js/src/jit/MacroAssembler.h) from SpiderMonkey. Each architecture will have an implementation of the -MacroAssembler API; perhaps there can be a trait that abstracts commonalities, +Assembler API; perhaps there can be a trait that abstracts commonalities, but enough will be different (e.g., instruction set quirks beyond the usual “add/sub/and/or/not” suspects, x64 two-operand form vs aarch64 three-operand -form, and more) that we expect there to be different `MacroAssembler` types for +form, and more) that we expect there to be different `Assembler` types for each ISA. This in turn implies different lowering code that invokes the -`MacroAssembler` per ISA in the baseline compiler. The lowering code can perhaps +`Assembler` per ISA in the baseline compiler. The lowering code can perhaps share many helpers that are monomorphized on the “common ISA core” trait. In the above examples, we bypass the register-allocation support, i.e. 
the ability to hold virtual register operands rather than real registers, in the `MachInst`s. This is supported today by passing through `RealReg`s (“real registers”) instead. In the baseline compiler we expect register allocation to -occur before invoking the `MacroAssembler`; i.e., when generating the +occur before invoking the `Assembler`; i.e., when generating the instructions we already know which register we are using for each operand. Doing otherwise (emitting with vregs first and editing later) requires actually buffering the `MachInst` structs in memory, which we do not wish to do. We don’t expect to make any changes to Cranelift itself beyond the layering refactor to borrow its `MachInst` and `MachBuffer` implementations. In -particular we don’t expect to use the `MacroAssembler` wrapper in Cranelift, at +particular we don’t expect to use the `Assembler` wrapper in Cranelift, at least at first, because it will be built around constructing and emitting instructions to machine code right away, without buffering (as in Cranelift’s VCode). It’s possible in the future that we may find other ways to make -`MacroAssembler` generic and leverage it in Cranelift too, but that is beyond +`Assembler` generic and leverage it in Cranelift too, but that is beyond the scope of this RFC. 
### Register Allocation @@ -178,7 +178,7 @@ Assuming that we have an immediate at the top of the stack, emitting an add instruction with an immediate operand would look something like this: ```rust -let mut masm = cranelift_asm::x64::MacroAssembler::new(); +let mut masm = cranelift_asm::x64::Assembler::new(); let imm = self.value_stack.pop(); // request a general purpose register; // spill if none available From e51a539c16c15a106c49c461da877853220a8563 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sa=C3=BAl=20Cabrera?= Date: Fri, 2 Sep 2022 10:45:29 -0400 Subject: [PATCH 9/9] Add new points to the _Design Principles_ section --- accepted/wasmtime-baseline-compilation.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/accepted/wasmtime-baseline-compilation.md b/accepted/wasmtime-baseline-compilation.md index 459bbfa..45a8814 100644 --- a/accepted/wasmtime-baseline-compilation.md +++ b/accepted/wasmtime-baseline-compilation.md @@ -53,6 +53,11 @@ Winch: WebAssembly Intentionally-Non-Optimizing Compiler and Host * Avoid reinventing machine-code emission – use Cranelift's instruction emitter code to create an assembler library * Prioritize compilation performance over runtime performance +* Simple to verify by looking. It should be evident which machine instructions + are emitted per WebAssembly Opcode +* Adding and iterating on new (WebAssembly and developer-facing) features should be simpler + than doing it in an optimizing tier (Cranelift) + ### High-level overview
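A rough sketch of the per-ISA `Assembler` idea from the RFC: a small shared trait, one type per architecture, and a lowering helper that is monomorphized over the trait. The trait, its methods, the `X64` and `Aarch64` types and the textual register names are all hypothetical, and the "instructions" are plain strings rather than encoded machine code, so this only illustrates the x64 two-operand versus aarch64 three-operand difference the RFC calls out.

```rust
/// Hypothetical "common ISA core" trait; none of these names come from Cranelift.
pub trait Assembler {
    /// Emit `dst = lhs + rhs` for registers identified by number.
    fn add(&mut self, dst: u8, lhs: u8, rhs: u8);
    /// Return the emitted instructions (as text, for illustration).
    fn finish(self) -> Vec<String>;
}

#[derive(Default)]
pub struct X64 {
    out: Vec<String>,
}

#[derive(Default)]
pub struct Aarch64 {
    out: Vec<String>,
}

impl Assembler for X64 {
    fn add(&mut self, dst: u8, lhs: u8, rhs: u8) {
        // Two-operand form: the destination doubles as the first source.
        if dst == rhs {
            // Addition is commutative, so avoid clobbering `rhs` with a move.
            self.out.push(format!("add r{dst}, r{lhs}"));
        } else {
            if dst != lhs {
                self.out.push(format!("mov r{dst}, r{lhs}"));
            }
            self.out.push(format!("add r{dst}, r{rhs}"));
        }
    }

    fn finish(self) -> Vec<String> {
        self.out
    }
}

impl Assembler for Aarch64 {
    fn add(&mut self, dst: u8, lhs: u8, rhs: u8) {
        // Three-operand form: no extra move is needed.
        self.out.push(format!("add x{dst}, x{lhs}, x{rhs}"));
    }

    fn finish(self) -> Vec<String> {
        self.out
    }
}

/// Shared lowering helper, monomorphized per ISA.
pub fn lower_i32_add<A: Assembler>(masm: &mut A, dst: u8, lhs: u8, rhs: u8) {
    masm.add(dst, lhs, rhs);
}
```

Keeping the trait this small leaves ISA-specific quirks inside each implementation, which matches the RFC's expectation that the `Assembler` types differ per ISA while lowering code shares helpers where it can.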