Closed
Description
Booleans, u8, i8, u16 and i16 seem to be optimized poorly. The compiler currently emits a lot of masking that is not necessary.
Consider the following two functions:
export function add16(x: i16, y: i16): i16 {
return x + y;
}
export function add32(x: i32, y: i32): i32 {
return x + y;
}
They get compiled down to:
(func $main/add16 (export "add16") (type $t0) (param $p0 i32) (param $p1 i32) (result i32)
get_local $p0
get_local $p1
i32.add
i32.const 16
i32.shl
i32.const 16
i32.shr_s
return)
(func $main/add32 (export "add32") (type $t0) (param $p0 i32) (param $p1 i32) (result i32)
get_local $p0
get_local $p1
i32.add
return)
As you can see, the add16 function follows the addition with a shift-left / arithmetic-shift-right pair that sign-extends the result back to 16 bits, while add32 is just a pure addition. Compare this to the following Rust code:
#[no_mangle]
pub extern fn add16(x: i16, y: i16) -> i16 {
x + y
}
#[no_mangle]
pub extern fn add32(x: i32, y: i32) -> i32 {
x + y
}
Which gets compiled down to the following code:
(func $add16 (export "add16") (type $t0) (param $p0 i32) (param $p1 i32) (result i32)
get_local $p1
get_local $p0
i32.add)
(func $add32 (export "add32") (type $t0) (param $p0 i32) (param $p1 i32) (result i32)
get_local $p1
get_local $p0
i32.add)
As you can see, they are the same: adding two i16 values is the same as adding two i32 values, because the upper 16 bits of the 32-bit wide "register" are discarded anyway when the 16-bit wide value is stored in memory with an i32.store16 instruction. So there are only very few instructions where actual masking is necessary (like division).
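To make the argument concrete, here is a minimal sketch in plain TypeScript (not AssemblyScript) that emulates the relevant WebAssembly behaviour with JavaScript's 32-bit integer operators. The helper names signExtend16, wrappingAdd32 and store16 are made up for this illustration, and Int16Array is only standing in for a 16-bit store to linear memory. It shows that the stored result of an addition is the same with or without the sign extension, while division does depend on the upper bits.

```typescript
// Sign-extend the low 16 bits of a 32-bit value. This mirrors the
// "i32.const 16 / i32.shl / i32.const 16 / i32.shr_s" sequence in add16.
function signExtend16(v: number): number {
  return (v << 16) >> 16;
}

// 32-bit wrapping addition, the equivalent of a bare i32.add.
function wrappingAdd32(a: number, b: number): number {
  return (a + b) | 0;
}

// Emulate a 16-bit store followed by a signed 16-bit load.
// Assumption: Int16Array is used here as a stand-in for linear memory;
// like a 16-bit store, it keeps only the low 16 bits of the value.
function store16(v: number): number {
  const mem = new Int16Array(1);
  mem[0] = v;
  return mem[0];
}

const lhs = 0x7fff; // largest i16 value
const rhs = 1;

// Addition without the extra shifts leaves "dirty" upper bits in the register.
const sumUnmasked = wrappingAdd32(lhs, rhs); // 32768
const sumMasked = signExtend16(sumUnmasked); // -32768

// Once stored as 16 bits, both variants are indistinguishable.
const storedUnmasked = store16(sumUnmasked); // -32768
const storedMasked = store16(sumMasked);     // -32768

// Division is different: the upper bits influence the low 16 bits of the
// result, so the operand has to be sign-extended first.
const divDirty = store16((sumUnmasked / 2) | 0); // 16384
const divClean = store16((sumMasked / 2) | 0);   // -16384

console.log(storedUnmasked === storedMasked); // true
console.log(divDirty === divClean);           // false
```

The point of the sketch is only that the sign extension can be deferred until an operation actually observes the upper bits. Which operations those are in the compiler's output is for the implementation to decide.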