[wasm] Webcil-in-WebAssembly #85932
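Tagging subscribers to 'arch-wasm': @lewing
Issue Details
Define a WebAssembly module wrapper for Webcil assemblies.
Why
In some settings serving application/octet-stream data, or files with weird extensions, will trigger firewalls or AV tools. But let's assume that if you're interested in deploying a .NET WebAssembly app, you're in an environment that can at least serve WebAssembly modules.
How
Essentially we serve this WebAssembly module:
(module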
  (data "\0f\00\00\00") ;; data segment 0: payload size
  (data "webcil Payload\cc") ;; data segment 1: webcil payload
  (memory (import "webcil" "memory") 1)
  (global (export "webcilVersion") i32 (i32.const 0))
  (func (export "getWebcilSize") (param $destPtr i32) (result)
    local.get $destPtr
    i32.const 0
    i32.const 4
    memory.init 0)
  (func (export "getWebcilPayload") (param $d i32) (param $n i32) (result)
    local.get $d
    i32.const 0
    local.get $n
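    memory.init 1))
The module exports two WebAssembly functions getWebcilSize and getWebcilPayload that write some bytes (being the size or payload of the webcil assembly) to the linear memory at a given offset. The module also exports the constant webcilVersion to version the wrapper format.
So a runtime or tool that wants to consume the webcil module can do something like:
const wasmModule = new WebAssembly.Module (...);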
const wasmMemory = new WebAssembly.Memory ({initial: 1});
const wasmInstance =
new WebAssembly.Instance(wasmModule, {webcil: {memory: wasmMemory}});
const { getWebcilPayload, webcilVersion, getWebcilSize } = wasmInstance.exports;
console.log (`Version ${webcilVersion.value}`);
getWebcilSize(0);
const size = new Int32Array (wasmMemory.buffer)[0]
console.log (`Size ${size}`);
console.log (new Uint8Array(wasmMemory.buffer).subarray(0, 20));
getWebcilPayload(4, size);
console.log (new Uint8Array(wasmMemory.buffer).subarray(0, 20));
How (Part 2)
But actually, we will define the wrapper to consist of exactly 2 data segments in the WebAssembly data section: segment 0 is 4 bytes and encodes the webcil payload size; and segment 1 is of variable size and contains the webcil payload.
So to load a webcil-in-wasm module, the runtime gets the raw bytes of the WebAssembly module (ie: without instantiating it), and parses it to find the data section, assert that there are 2 segments, ensure they're both passive, and get the data directly from segment 1.
Remaining work
src/libraries/Microsoft.NET.WebAssembly.Webcil/src/Webcil/WebcilConverter.cs
// length of the webcil payload. segment 1 is of a variable size and contains the webcil payload.
//
// the unchanging parts are stored as a "prefix" and "suffix" which contain the bytes for the following
// WAT program, split into the parts that come before the data section, and the bytes that come after:
nit: i think in this case it's a WAT module; it's not a program since it isn't actually executable in a meaningful sense
src/libraries/Microsoft.NET.WebAssembly.Webcil/src/Webcil/WebcilWasmWrapper.cs
798112c to 2aff523
gboolean stop = FALSE;
while (success && !stop && ptr < boundp) {
success = visit_section (ptr, boundp, &ptr, visitor, user_data, &stop);
this seems to implicitly rely on visit_section to advance to the end of the section, and visit_section seems to implicitly rely on section_visitor to do that. am I misunderstanding it? I think it would be ideal if visit_section always ensured that endp ended up in the right place
oooh. good catch. actually I explicitly wanted visit_section to advance ptr and not to rely on the visitor (which is "user code" - and I don't actually want to trust it to do the traversal). There's a missing line in visit_section to bump ptr before returning.
This was a copy/paste mistake from my toy standalone prototype
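To make the point of this thread concrete, here is a small JavaScript sketch of a section walker in which the walker itself computes each section's end and bumps the pointer, so the visitor cannot derail the traversal. It is only an illustration under assumptions: the real code is the C function visit_section in the runtime, and the name visitSections and the callback shape here are hypothetical.

```javascript
// Hypothetical sketch: walk the sections of a wasm binary.
// The walker computes each section's end before calling the visitor and
// always advances to that end itself, regardless of what the visitor does.
function visitSections(bytes, visitor) {
  let ptr = 8; // skip the 8-byte magic + version header (assumed already checked)
  while (ptr < bytes.length) {
    const id = bytes[ptr++];
    // Decode the section size (unsigned LEB128).
    let size = 0, shift = 0, b;
    do {
      if (ptr >= bytes.length) return false; // truncated size field
      b = bytes[ptr++];
      size += (b & 0x7f) * 2 ** shift;
      shift += 7;
    } while (b & 0x80);
    const end = ptr + size;
    if (end > bytes.length) return false; // section runs past the buffer
    // The visitor only sees a view of this section's contents.
    const stop = visitor(id, bytes.subarray(ptr, end));
    ptr = end; // the walker, not the visitor, moves past the section
    if (stop) break;
  }
  return true;
}
```

A visitor that ignores its input entirely still sees every section id in order, which is the property the review comment asks for.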
WBT tests should work now...
Also works with Blazor (at least as far as network loading is concerned. actually running tests is blocked on dotnet/installer#16318 - which includes dotnet/aspnetcore#48067)
f9936b6 to f879fcd
Ok this is ready for review
debugger tests are working in manual testing. next steps:
I think I somehow managed to make the runtime load both "System.Private.CoreLib.dll" and "System.Private.CoreLib.wasm"... otherwise I don't understand how despite a
cfd08f5 to 8a42b70
I wonder if this is related to ICU data files or to some satellite assemblies?
ICU data files are loaded fine in the test. We are loading a custom ICU file that does not contain fr-FR. We are also setting
As far as I can tell it's the same exception as before. I think what is happening is that the webcil packaging is somehow allowing the runtime to load the same assembly (CoreLib) twice. So somehow the
I think I see how it is happening. I'm going to try to repro locally
I can repro locally, but I think I was wrong about what is going wrong - or at least I'm getting a different-seeming failure locally. I think getting resource streams from assemblies isn't working right with webcil-in-wasm (it does appear to work right with standalone .webcil). So it's something about offsets into a
I think I understand what the issue was. It wasn't resources (although there was an issue with resources) and it wasn't somehow loading the same assembly multiple times. The issue is alignment. When we load webcil-in-wasm by reading the wasm module directly, the webcil payload wasn't aligned on a good boundary. The binary WASM format doesn't really take care to align data segments on multi-byte boundaries - and the uleb128 encoding of all the internal data sizes would make that hard to achieve, anyway. On the other hand, ECMA-335 does care about the internal boundaries:
As a result, the code for parsing the exception clause information for an ECMA-335 method header was doing a call to align the input pointer to a 4-byte boundary and ending up at an unexpected offset, which resulted in mis-parsing the header and not setting up a correct catch clause for the
runtime/src/mono/mono/metadata/metadata.c Lines 4157 to 4158 in 63e6777
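A tiny JavaScript sketch of why a misaligned payload breaks this kind of parsing. It assumes (as the comment above describes) that the parser rounds the absolute pointer up to a 4-byte boundary; the function names and the example addresses are hypothetical, not taken from metadata.c.

```javascript
// Round addr up to the next multiple of alignment (alignment is a power of two).
function alignUp(addr, alignment) {
  return (addr + alignment - 1) & ~(alignment - 1);
}

// Hypothetical: the image-relative offset a parser lands on when it aligns
// the absolute address (base + offset) rather than the offset itself.
function alignedOffsetViaAddress(base, offset, alignment) {
  return alignUp(base + offset, alignment) - base;
}

// The format intends image-relative offset 10 to round up to 12.
const intended = alignUp(10, 4);
// Payload starting on a 4-byte boundary: the parser lands on 12, as intended.
const alignedBase = alignedOffsetViaAddress(0x1000, 10, 4);
// Payload starting 3 bytes past a boundary: the parser lands on 13 instead.
const misalignedBase = alignedOffsetViaAddress(0x1003, 10, 4);
console.log(intended, alignedBase, misalignedBase);
```

The two results only agree when the payload's base address is itself a multiple of the alignment, which is the motivation for padding described in the comment.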
The solution is to change the spec slightly and use data segment 0 not only to carry the size of the webcil payload, but also to add extra padding bytes so that the content of data segment 1 (not segment 1's header!) ends up on a 16-byte boundary (smaller is probably ok, but 16 is probably a safer choice). (We don't want to somehow pad out segment 1's data because in the case where we do use module instantiation then we wouldn't want the segment 1 data to be extracted by
Adding padding to segment 0 is ok because we only want the first 4 bytes of it anyway.
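One way to picture the padding computation is the sketch below. It is a guess at the arithmetic, not the converter's actual code: it assumes the data section layout is section id, uleb128 section size, uleb128 segment count (2), then for each segment a 1-byte passive flag, a uleb128 length and the bytes. prefixLen (the number of bytes before the data section) and all function names are hypothetical. Because the uleb128 length fields can change width as padding grows, the sketch simply searches for the smallest padding that works.

```javascript
// Number of bytes needed to encode n as unsigned LEB128.
function ulebLength(n) {
  let len = 1;
  while (n >= 0x80) { n = Math.floor(n / 128); len++; }
  return len;
}

// File offset at which segment 1's contents would start, for a given padding
// appended to the 4-byte size in segment 0 (assumed layout, see lead-in).
function payloadOffset(prefixLen, payloadLen, padding) {
  const seg0Len = 4 + padding;
  const seg0 = 1 + ulebLength(seg0Len) + seg0Len;   // flag + length + bytes
  const seg1Header = 1 + ulebLength(payloadLen);    // flag + length
  const content = ulebLength(2) + seg0 + seg1Header + payloadLen;
  const sectionHeader = 1 + ulebLength(content);    // section id + size
  return prefixLen + sectionHeader + ulebLength(2) + seg0 + seg1Header;
}

// Smallest padding that puts segment 1's contents on the requested boundary.
function findPadding(prefixLen, payloadLen, alignment) {
  for (let padding = 0; padding < 2 * alignment; padding++) {
    if (payloadOffset(prefixLen, payloadLen, padding) % alignment === 0) {
      return padding;
    }
  }
  throw new Error("no suitable padding found");
}
```

For example, with a hypothetical 8-byte prefix, a 15-byte payload and 16-byte alignment, the unpadded payload would start at offset 19, so 13 bytes of padding move it to offset 32.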
Add padding to data segment 0 to ensure that data segment 1's payload (ie the webcil content itself) is 4-byte aligned
pushed a commit to fix the alignment problem. also pushed some debugging junk, so that will need to be backed out before we merge. but I want to see how CI does. Also there's a fix here to ensure the debugger sees the debugging sections of the webcil image. The upshot is that for in-memory MonoImageStorage, the webcil loader will bump
d1b27c0 to 15f6666
instead just keep track of the webcil offset in the MonoImageStorage. This introduces a situation where MonoImage:raw_data is different from MonoImageStorage:raw_data. The one to use for accessing IL and metadata is MonoImage:raw_data. The storage pointer is just used by the image loading machinery
b1a9934 to e187b00
WBT is passing 🎉 but over in the other PR #86330 some debugger tests are failing. I want to get those fixes in here before merging this PR. (I couldn't figure out how to add a new variant of the debugger runs yet - so I'm testing webcil-in-wasm debugger support by changing the default packaging and running the "normal" debugger lane).
Stepping tests were failing with plain .webcil also. it's a problem in the debugger's webcil parser where it is calling
This one is
Define a WebAssembly module wrapper for Webcil assemblies.
Contributes to #80807
Why
In some settings serving application/octet-stream data, or files with weird extensions, will trigger firewalls or AV tools. But let's assume that if you're interested in deploying a .NET WebAssembly app, you're in an environment that can at least serve WebAssembly modules.
How
Essentially we serve this WebAssembly module:
The module exports two WebAssembly functions getWebcilSize and getWebcilPayload that write some bytes (being the size or payload of the webcil assembly) to the linear memory at a given offset. The module also exports the constant webcilVersion to version the wrapper format.
So a runtime or tool that wants to consume the webcil module can do something like:
How (Part 2)
But actually, we will define the wrapper to consist of exactly 2 data segments in the WebAssembly data section: segment 0 is 4 bytes and encodes the webcil payload size; and segment 1 is of variable size and contains the webcil payload.
So to load a webcil-in-wasm module, the runtime gets the raw bytes of the WebAssembly module (ie: without instantiating it), and parses it to find the data section, assert that there are 2 segments, ensure they're both passive, and get the data directly from segment 1.
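As a rough illustration of the raw-bytes approach just described, here is a minimal JavaScript sketch. It is not the runtime's loader (that lives in the C code under review); readUleb128 and extractWebcilPayload are hypothetical names, and the check that the first 4 bytes of segment 0 equal the length of segment 1 is an assumption based on the description of segment 0 above.

```javascript
// Decode an unsigned LEB128 integer starting at pos.
function readUleb128(bytes, pos) {
  let value = 0, shift = 0;
  for (;;) {
    if (pos >= bytes.length) throw new Error("truncated uleb128");
    const b = bytes[pos++];
    value += (b & 0x7f) * 2 ** shift;
    shift += 7;
    if ((b & 0x80) === 0) return { value, next: pos };
  }
}

// Read exactly two passive segments from the data section body [pos, end).
function readSegments(bytes, pos, end) {
  const count = readUleb128(bytes, pos);
  if (count.value !== 2) throw new Error("expected exactly 2 data segments");
  pos = count.next;
  const segs = [];
  for (let i = 0; i < 2; i++) {
    if (bytes[pos++] !== 1) throw new Error("data segment is not passive");
    const len = readUleb128(bytes, pos);
    const stop = len.next + len.value;
    if (stop > end) throw new Error("segment runs past the data section");
    segs.push(bytes.subarray(len.next, stop));
    pos = stop;
  }
  if (segs[0].length < 4) throw new Error("segment 0 is too short");
  // Segment 0 starts with the payload size as a little-endian 32-bit integer.
  const size = new DataView(segs[0].buffer, segs[0].byteOffset, 4).getUint32(0, true);
  if (size !== segs[1].length) throw new Error("payload size mismatch");
  return segs[1];
}

// Find the data section (id 11) without instantiating the module.
function extractWebcilPayload(bytes) {
  const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
  if (bytes.length < 8 || !header.every((b, i) => bytes[i] === b)) {
    throw new Error("not a WebAssembly module");
  }
  let pos = 8;
  while (pos < bytes.length) {
    const id = bytes[pos++];
    const size = readUleb128(bytes, pos);
    const end = size.next + size.value;
    if (end > bytes.length) throw new Error("section runs past the module");
    if (id === 11) return readSegments(bytes, size.next, end);
    pos = end; // skip every other section
  }
  throw new Error("no data section found");
}
```

Given the bytes of a wrapper module as a Uint8Array, extractWebcilPayload(bytes) returns a view of segment 1 without copying it.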
Remaining work