Post-MVP host mappings for GC objects #496
Comments
You won't usually be able to map Wasm arrays to language arrays directly, as they are more low-level. Most languages have (different) extra features that Wasm arrays do not have, such as built-in hash ids, possibly extra fields or methods, other typing rules, or compatibility with other language-specific types. Some also have fewer, like C or Rust. There is no magic high-level interop between an assembly-level language and a higher-level host language.
Thanks for the feedback, and I agree with your comment that
My thinking regarding zero-copy is that the actual data pointed to by wasm arrays should not be different from the data pointed to by host language arrays, i.e. a contiguous array of bytes of size greater than or equal to n_elements * sizeof<array_element_type>. Is that incorrect? Moreover, since the host is responsible for both the wasm array implementation and the host array implementation, it doesn't seem unfair to expect the host to deal with the mapping between them, and to directly access the underlying data instead of copying it. Unlike with strings, we're not looking for cross-language interoperability (or should I say, wasm array interoperability is already specified by the GC spec). As a concrete example, I wouldn't be shocked if a JS engine, when asked to create a wasm i32 array, actually instantiated an Int32Array and wrapped it into a [...]. So there wouldn't be a direct mapping of Wasm arrays to host language arrays, but the cost of calling ToJSValue/FromJSValue for arrays would be minimal and its complexity would be O(1) rather than O(n).
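To make the O(1) versus O(n) distinction concrete, here is a minimal sketch using plain JS typed arrays only. It says nothing about how any engine lays out Wasm arrays; the names `storage`, `view` and `copy` are illustrative.

```typescript
// Four i32 elements need at least 4 * 4 = 16 bytes of contiguous storage.
const elementCount = 4;
const storage = new ArrayBuffer(elementCount * Int32Array.BYTES_PER_ELEMENT);

// O(1): a view wraps the existing bytes without copying them.
const view = new Int32Array(storage, 0, elementCount);
view[0] = 7;

// O(n): slice() allocates new storage and copies every element.
const copy = view.slice();

view[0] = 99;
console.log(view.buffer === storage); // true: the view shares the storage
console.log(copy.buffer === storage); // false: the copy owns new storage
console.log(copy[0]);                 // 7: unaffected by the later write
```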
The cost of ToJSValue and back is always O(1). But you're typically not getting a value within the host language's ordinary set of types. I vaguely remember previous discussions about mapping Wasm arrays to typed arrays in JS. IIRC it wasn't obvious how to do that without performance penalties and semantic issues, for example, because multiple typed arrays can share the same array buffer in JS, which makes no sense from the Wasm perspective and breaks Wasm semantics. At best, Wasm arrays correspond to JS array buffers, but then there is the whole mess with detached buffers etc.
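The sharing behaviour mentioned above can be shown in a few lines of plain JS typed-array code (the variable names are illustrative): two typed arrays with distinct identities observe each other's writes because they sit on the same buffer.

```typescript
const shared = new ArrayBuffer(8);
const first = new Int32Array(shared);
const second = new Int32Array(shared);

first[0] = 42;

console.log(Object.is(first, second)); // false: two distinct objects
console.log(second[0]);                // 42: the write is visible through both
```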
Let me rephrase this:
I'm not sure that 'Wasm semantics would be broken by JS typed arrays' (which are backed by array buffers). Rather, since JS allows the developer to treat the underlying data the way they wish, that potentially breaks the data, not the semantics, imho. For example, if a developer creates an Int32Array from an ArrayBuffer that contains Uint8s, then writing i32s[0] = -1 will overwrite the ui8s at 0, 1, 2, 3. But it doesn't break the ability of the Uint8Array to know its length and provide read/write access to elements, which are still valid uint8s, all equal to 255. Do you have a specific example in mind where semantics would be broken? Forbidding detached buffers on JS arrays shared with WebAssembly sounds like a very acceptable limitation imho, especially given the benefits, i.e. performance, and being able to read the array data without copying it, and if mutable, resize it and write to it using host language syntax.
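For reference, the Int32Array/Uint8Array scenario described above looks like this in plain JS typed-array code (the initial byte values are arbitrary):

```typescript
const raw = new ArrayBuffer(4);
const ui8s = new Uint8Array(raw);
ui8s.set([1, 2, 3, 4]);

const i32s = new Int32Array(raw);
i32s[0] = -1; // 0xFFFFFFFF overwrites bytes 0..3, whatever the endianness

console.log(Array.from(ui8s)); // [255, 255, 255, 255]
console.log(ui8s.length);      // 4: the byte view still knows its length
```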
Implementing Wasm i32 arrays as JS Int32Arrays under the hood would be possible, I think, but it would be quite a bit slower than our (V8's) current strategy of implementing Wasm arrays differently (with much less overhead). JS TypedArrays are surprisingly heavyweight in terms of both memory overhead and access performance. (Yes they're fast, but WasmGC arrays are faster.) As long as Int32Arrays and Wasm i32 arrays are implemented as different in-memory object layouts (for the benefit of the latter!), there isn't going to be any zero-copy conversion between them. Making them interchangeable would (at best!) mean settling for the slower of the two designs -- and at worst make that even slower, if it then needs to distinguish more internal cases. That said, for single-element access, the status quo is totally fine: when you export an array getter function [...]. For convenience, it would be nice to support a richer syntax for JS/Wasm interaction, but for performance, it isn't necessary.
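A minimal sketch of what single-element access through exported functions could look like on the JS side. Everything here is an assumption for illustration: the `I32ArrayAccessors` interface and its `len`/`get` members are hypothetical names, and a JS stand-in replaces real module exports so the snippet runs without a Wasm binary.

```typescript
// Hypothetical shape of the accessors a module might export for one array type.
interface I32ArrayAccessors {
  len(arr: object): number;
  get(arr: object, index: number): number;
}

// Copies an opaque array into a JS array, one accessor call per element.
function toJsArray(accessors: I32ArrayAccessors, arr: object): number[] {
  const length = accessors.len(arr);
  const out = new Array<number>(length);
  for (let i = 0; i < length; i++) {
    out[i] = accessors.get(arr, i);
  }
  return out;
}

// Stand-in for real exports: opaque handles mapped to backing data.
const backing = new WeakMap<object, Int32Array>();
const fakeExports: I32ArrayAccessors = {
  len: (arr) => backing.get(arr)!.length,
  get: (arr, index) => backing.get(arr)![index],
};

const handle = {};
backing.set(handle, Int32Array.of(10, 20, 30));
console.log(toJsArray(fakeExports, handle)); // [10, 20, 30]
```

The conversion is still O(n) overall; the point is only that each individual call is cheap.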
It's the fact that two typed arrays with different identity can still alias each other, which is observable through mutation, but not allowed by the Wasm array semantics.
I'm not sure what you mean by that; as far as Wasm is concerned, it's just bits being written, and every bit pattern is legal under either type.
@jakobkummerow thanks for the insights, and great to hear that (at least in JavaScript/V8) there would be less performance impact using single-element access than there would be in converting wasm arrays to host language ones. That said, I also tend to value convenience. As far as I understand, in the current state of the spec, every provider of a wasm module that wants to give access to a wasm array would have to create and export those functions for the consumers of that module to call. These consumers would then have to know which specific methods to call in each specific module. Whereas if these were made available as part of the GC spec (and if the host language implements them!), then that problem would go away. Consider for example the following TypeScript code:
class WasmArrayHandler implements ProxyHandler<object> {
  // True only for canonical non-negative integer keys such as "0" or "12".
  // Symbol keys and strings such as "1abc" or "-1" are rejected.
  static isValidArrayIndex(key: string | symbol): boolean {
    return typeof key === "string" && /^(0|[1-9]\d*)$/.test(key);
  }
  static isValidArrayKey(key: string | symbol): boolean {
    return key === "length" || WasmArrayHandler.isValidArrayIndex(key);
  }
  get(target: object, key: string | symbol): any {
    // WebAssembly.get is hypothetical: it stands for the standardized accessor proposed here.
    return WasmArrayHandler.isValidArrayKey(key) ? WebAssembly.get(target, key) : undefined;
  }
}
abstract class ArrayProxy {
  static of<T>(target: object): T[] {
    return new Proxy(target, new WasmArrayHandler()) as T[];
  }
}
const wasm_array = someFunctionReturningAWasmI32Array();
const proxied = ArrayProxy.of<number>(wasm_array);
const value = proxied.length;
const item = proxied[2];
// A write also needs a `set` trap backed by a matching (equally hypothetical) setter; it is omitted here.
proxied[0] = 27;
For that code to work, it could indeed rely on the specific wasm to export an array.getter function. This may smell like syntactic sugar. It's not. It's about standardizing access from the host language to wasm array elements and properties, which is a much smaller problem than doing so for structs, and as such might deserve a dedicated discussion.
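As a sketch of the module-specific route, the proxy can be written against accessor functions that a particular module would export. The `ElementAccessors` interface, `proxyArray`, and the JS stand-in below are all hypothetical and exist only so the example runs today without a Wasm binary.

```typescript
// Hypothetical accessors, e.g. functions exported by one specific module.
interface ElementAccessors {
  len(arr: object): number;
  get(arr: object, index: number): number;
  set(arr: object, index: number, value: number): void;
}

// Matches canonical non-negative integer keys such as "0" or "12".
const INDEX_PATTERN = /^(0|[1-9]\d*)$/;

function proxyArray(target: object, acc: ElementAccessors): number[] {
  const handler: ProxyHandler<object> = {
    get(t, key) {
      if (key === "length") return acc.len(t);
      if (typeof key === "string" && INDEX_PATTERN.test(key)) {
        const i = Number(key);
        return i < acc.len(t) ? acc.get(t, i) : undefined;
      }
      return undefined;
    },
    set(t, key, value) {
      if (typeof key !== "string" || !INDEX_PATTERN.test(key)) return false;
      const i = Number(key);
      if (i >= acc.len(t)) return false; // fixed size: writes cannot grow the array
      acc.set(t, i, value);
      return true;
    },
  };
  return new Proxy(target, handler) as number[];
}

// Stand-in for real exports: opaque handles mapped to backing data.
const store = new WeakMap<object, Int32Array>();
const standIn: ElementAccessors = {
  len: (arr) => store.get(arr)!.length,
  get: (arr, i) => store.get(arr)![i],
  set: (arr, i, v) => { store.get(arr)![i] = v; },
};

const opaque = {};
store.set(opaque, Int32Array.of(1, 2, 3));
const proxiedArray = proxyArray(opaque, standIn);
proxiedArray[0] = 27;
console.log(proxiedArray.length, proxiedArray[0], proxiedArray[2]); // 3 27 3
```

Each consumer still has to know which accessors a given module exports, which is exactly the coupling a standardized API would remove.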
Just to clarify, the intent is that the Wasm JS API will eventually be extended with classes and functions that give full direct access to Wasm GC objects. That is possible without turning Wasm arrays into JS typed arrays. The main reason we deferred the API was the many open questions: things like handling JS prototypes are notoriously sensitive, and getting them wrong might harm coherence, JS usability, or Wasm performance.
I think that "philosophically" you only want to exchange data, not behavior. Java Records rather than Class instances (I appreciate this distinction is not available in many languages).
Reading the discussions re mappings between GC objects and host objects, I see that they focus mostly on structs.
I'm wondering if it would make sense to treat arrays separately. The reasons for that are:
As an example, an i32 unpacked array would map to:
...
similarly, an i32 array packed using Int8 would map to:
...
Would it make sense to submit a draft Post-MVP PR focused on array mapping?