-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass small structs in parameters instead of memory #88
Comments
I think it's a good idea in general, but perhaps we can simplify things by punting on it until we specify an ABI with full multivalue support. |
This tangentially came up here when @tlively landed an upgrade for the Rust emscripten target to use the LLVM wasm backend instead of the old fastcomp backend. Rust's naive encoding of ABIs on the non-emscripten wasm32-unknown-unknown target did not attempt to force aggregate structures to be passed-by-pointer which meant that prior to rust-lang/rust#63649 passing something like We ended up avoiding changing the ABI for the Rust I wanted to comment here to basically say that I personally think this would be the right way forward. This matches the behavior of ABIs on other systems (C/Linux example for reference) and generally feels like it jives much better with wasm as a target since there's not really any limit to params/results and by passing aggregates through a return pointer it feels like it's making a decision best left up to the actual compiler consuming the wasm. I'd be remiss if I didn't also mention factor that updating In terms of specification, would something like this be that complicated with respect to existing ABIs? I would imagine that the specification would be something like "if an aggregate is passed by value then each of its components is passed by value as a separate argument, recursively" or something like that. I would suspect that native ABIs have to do all sorts of weirdness for limits of registers and such, but wasm has "infinite registers" so it seems like it would be less of an issue. |
I think also need to figure out how we do ABI versioning in the object format before we can do something like this. I guess we could bump the version of linking metadata section which would make old and new object incompatible. Or we could do it for real and fix #54. |
I am currently working on toolchain multivalue support, and I am using Rust's test suite to measure my progress since clang does not yet produce IR that returns structs by value. Once multivalue is stable in the tools, I plan to conduct science on a new C ABI. Not only is multivalue return on the table, but also other arbitrary ABI changes like passing structs in registers. Since I expect a new ABI to depend on multivalue, it would make sense to use the new ABI as the default C ABI if and only if multivalue is supported. When multivalue is supported, the old C API should be available in an opt-in basis for compatibility with MVP library binaries. In the LLVM ecosystem both target features and ABIs are properties of individual functions, so a single object file could use both the new and old ABIs. An object-wide ABI version or target feature policy controlling the ABI therefore doesn't make much sense, but per-object ABI metadata might be useful for producing more helpful linker errors. Alternatively, we could construct the new ABI such that ABI mismatches always produce type mismatches that the linker already knows how to report. According to that plan, we would keep the current ABI around as long as we cared about MVP compatibility, which is potentially forever. @alexcrichton does this sound reasonable to you? |
I think it definitely makes sense to reconsider the default ABI with the multi-value feature enabled, yeah. I would indeed want to make sure that this change (passing small structs via "registers") would make it in as part of the ABI for that. As to what to do about the current ABI, I think it sort of depends on how quickly multi-value is adopted in engines. If it shows up quickly I think most wasm producers could reasonably default to multi-value, but we've still got a potentially year-long transitionary period in the meantime. Basically the crux of the issue was that the ABI that wasm-bindgen wants disagrees with the already-there ABI in Clang, and that's something we want to fix. Long-term with multi-value it's highly likely to be fixed, but we still have a potentially long transitionary period where the ABIs would disagree. I don't want the difficulty of work in wasm-bindgen to hold anything back, but I also don't want "whatever clang happened to implement" to cause a bunch of work for everyone else. In the meantime the use case for mixing raw clang code (not emscripten, just bare clang) and Rust is relatively limited and doesn't come up all that often. The major target for that, wasi, is probably one where we should update rustc to match clang's ABI today since there's not a preexisting tool like wasm-bindgen which targets that. |
I don't know if calling the current ABI "whatever clang happened to implement" is fair. IIUC the current ABI was indeed developed alongside the clang/llvm work but the decision making process there was far from arbitrary or ad-hoc. I'd like to see this change happen one day, but it is a breaking change and I imagine non-trivial to change in llvm(?). I'm curious what @sunfishcode thinks? Is this worth doing now or delaying until other changes such as mutli-value require an ABI change anyway? Our of interest, why do clang and wasm-bindgen need to agree here? Or rather, why to C and rust need to agree? Whats is the benefit we get? Isn't there already rust<->C ABI FFI layer in rust that would allow for this? Or would this change just allow for such as layer to be more efficient? |
Ah sorry, I didn't mean to mischaracterize! I wasn't aware of the process of creating the current ABI. The need for interop is from when you're liking Rust/C into the same wasm blob. The Rust signature will declare what C does (and vice versa) and for it to work right we'd need those two encoded the same way. This basically is the FFI layer of how Rust/C communicate which is why the problem comes up. The nitty-gritty is that @tlively's recently added The differences there mean that when mixing C/Rust inside of wasm-bindgen a C file that does: struct Foo { int a; int b; };
void foo(Foo foo) { /* ... */ } is incompatible with the analagous Rust import: #[repr(C)]
struct Foo { a: i32, b: i32 }
extern "C" {
fn foo(foo: Foo);
} (because Rust passes in registers but C expects a pointer) |
So, I do care about the ability to mix C and Rust in wasm, using "bare" wasm rather than wasi. If this is an ABI incompatibility, would it potentially make sense to incorporate ABI versioning into the target name? We could have a (The downside would be teaching other tools about those ABI versions.) |
Currently the only wasm target using the correct C ABI is wasm32-unknown-emscripten. WASI uses the wasm-bindgen-compatible C ABI. I haven't experimented with this or had enough design discussions about it yet, but here's my current thinking about how we should introduce a new ABI. I don't want end users to have to think about ABIs in the common case, so I don't want to introduce mandatory ABI specifiers in the target strings. Optional ABI specifiers would be fine. A key feature of the new ABI will be that it requires multivalue, so I was thinking a reasonable default would be to use the new ABI when multivalue is enabled and the old ABI when multivalue is not enabled. We can also play games in the target features section in object files to have the linker emit a helpful error if you try to link objects with incompatible ABIs accidentally. This is actually going to be much harder in C/C++ than in Rust because Rust can store ABI in its unstable interface metadata, allowing seamless calls between old- and new-ABI functions. But C/C++ will need a way to inject that same ABI information into header files, possibly with a pragma or something. |
I'd also want to clarify that I would hate for This issue is the only concrete reason why we haven't switched Rust's wasm32-unknown-unknown ABI (AFAIK). The Another possible solution might be to add a |
Agree. Thankfully we were thinking about how to introduce a new ABI independently from the wasm-bindgen issue. The wasm-bindgen ABI mismatch certainly makes the ABI work more important and useful, but doesn't change the design goals AFAICT.
I appreciate the convenience of this solution, but I don't think this is something we would want to upstream. I would like to avoid having more C ABIs than absolutely necessary, and I want the ABIs we do have to be deliberately designed for performance and code size rather than added for convenience. |
Currently clang will pass any structure with more than a single element indirectly (as LLVM byval). Many other ABIs pass small structures in registers, potentially improving performance. We could do the same thing, but choosing the upper limit on the size (and therefore the number of registers to use) is a little more interesting, since we don't know how many physical registers the underlying machine will have (let alone have available for passing arguments). But it's also true that linear memory references may be more expensive if there is bounds checking.
It seems worthwhile to at least use 2 params since most calling conventions are likely to be able to handle that in most situations, and pairs are fairly common.
If we do this it will make the ABI a lot more complex to specify, since there will be classification rules about when this would apply (both AMD64 SysV and AAPCS are fairly complex in this regard). But it seems worth it. We would probably need slightly different rules for promotion than those architectures since we have both 32-bit and 64-bit argument types.
I'll think a little more about the details, but does anyone have strong opinions about whether this is likely to be a good idea?
The text was updated successfully, but these errors were encountered: