[wasm] Webcil-in-WebAssembly #85932

Merged
merged 31 commits into from
May 16, 2023
Commits
310aa3e
Add option to emit webcil inside a wasm module wrapper
lambdageek May 4, 2023
2f2459d
[mono][loader] implement a webcil-in-wasm reader
lambdageek May 8, 2023
7be51f8
reword WebcilWasmWrapper summary comment
lambdageek May 8, 2023
2e58062
fix whitespace
lambdageek May 8, 2023
056b9ac
fix typos
lambdageek May 8, 2023
897df89
visit_section should bump the ptr after traversal
lambdageek May 8, 2023
15a1894
remove extra bytes from wasm webcil prefix
lambdageek May 9, 2023
1688dec
don't forget to include number of segments in the data section
lambdageek May 9, 2023
d8638f7
update the Webcil spec to include the WebAssembly wrapper module
lambdageek May 9, 2023
e8f52bb
fix typos and whitespace
lambdageek May 9, 2023
f06938f
advance endp past the data segment payload
lambdageek May 9, 2023
0f5e02f
Adjust RVA map offsets to account for wasm prefix
lambdageek May 9, 2023
4271d8f
Add a note about the rva mapping to the spec
lambdageek May 9, 2023
2c90fb5
Serve webcil-in-wasm as .wasm
lambdageek May 9, 2023
307ae9a
fix wbt
lambdageek May 11, 2023
930cfeb
remove old .webcil support from Sdk Pack Tasks
lambdageek May 11, 2023
7670b3c
Set SelfContained=true for browser-wasm runtimes (#86102)
lewing May 11, 2023
d17b8dd
Implement support for webcil in wasm in the managed WebcilReader
lambdageek May 11, 2023
61afae7
why fail?
lambdageek May 12, 2023
8a42b70
did we load the same asm twice??
lambdageek May 12, 2023
ccd074e
Merge remote-tracking branch 'origin/main' into webcil-wasm-wrapper
lambdageek May 15, 2023
6166a67
checkpoint. things are broken. but I adjusted MonoImage:raw_data
lambdageek May 12, 2023
eedf447
align webcil payload to a 4-byte boundary within the wasm module
lambdageek May 15, 2023
d9b396e
remove WIP tracing
lambdageek May 16, 2023
89ef589
assert that webcil raw data is 4-byte aligned
lambdageek May 16, 2023
d9a3fea
revert unrelated build change
lambdageek May 16, 2023
6de7dee
revert unrelated change
lambdageek May 16, 2023
88f956e
revert whitespace
lambdageek May 16, 2023
0542c51
revert WBT debugging output changes
lambdageek May 16, 2023
7c0643d
add 4-byte alignment requirement to the webcil spec
lambdageek May 16, 2023
e187b00
Don't modify MonoImageStorage:raw_data
lambdageek May 16, 2023
84 changes: 74 additions & 10 deletions docs/design/mono/webcil.md
@@ -2,21 +2,83 @@

## Version

This is version 0.0 of the Webcil format.
This is version 0.0 of the Webcil payload format.
This is version 0 of the WebAssembly module Webcil wrapper.

## Motivation

When deploying the .NET runtime to the browser using WebAssembly, we have received some reports from
customers that certain users are unable to use their apps because firewalls and anti-virus software
may prevent browsers from downloading or caching assemblies with a .DLL extension and PE contents.

This document defines a new container format for ECMA-335 assemblies
that uses the `.webcil` extension and uses a new WebCIL container
format.
This document defines a new container format for ECMA-335 assemblies that uses the `.wasm` extension
and uses a new WebCIL metadata payload format wrapped in a WebAssembly module.


## Specification

### Webcil WebAssembly module

Webcil consists of a standard [binary WebAssembly version 1 module](https://webassembly.github.io/spec/core/binary/index.html) containing the following WAT module:

``` wat
(module
  (data "\0f\00\00\00") ;; data segment 0: payload size as a 4 byte LE uint32
  (data "webcil Payload\cc") ;; data segment 1: webcil payload
  (memory (import "webcil" "memory") 1)
  (global (export "webcilVersion") i32 (i32.const 0))
  (func (export "getWebcilSize") (param $destPtr i32) (result)
    local.get $destPtr
    i32.const 0
    i32.const 4
    memory.init 0)
  (func (export "getWebcilPayload") (param $d i32) (param $n i32) (result)
    local.get $d
    i32.const 0
    local.get $n
    memory.init 1))
```

That is, the module imports linear memory 0 and exports:
* a global `i32` `webcilVersion` encoding the version of the WebAssembly wrapper (currently 0),
* a function `getWebcilSize : i32 -> ()` that writes the size of the Webcil payload to the specified
address in linear memory as a `u32` (that is: 4 LE bytes),
* a function `getWebcilPayload : i32 i32 -> ()` that writes `$n` bytes of the content of the Webcil
payload to the specified address `$d` in linear memory.

The Webcil payload size and payload content are stored in the data section of the WebAssembly module
as passive data segments 0 and 1, respectively. The module must not contain additional data
segments. The module must store the payload size in data segment 0, and the payload content in data
segment 1.

The payload content in data segment 1 must be aligned on a 4-byte boundary within the WebAssembly
module. Additional trailing padding may be added to the data segment 0 content to correctly align
data segment 1's content.
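
The padding computation can be sketched as follows. This is a minimal illustration, not the converter's actual code: the helper name `webcil_align_padding` is hypothetical, and it assumes the writer already knows the byte offset within the module at which segment 1's content would begin if segment 0 carried no padding. It also assumes that adding padding does not change the encoded length of any size field that precedes the payload.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the spec): number of padding bytes to
 * append to data segment 0 so that data segment 1's content starts on a
 * 4-byte boundary. unpadded_offset is the module offset at which the
 * content would start if segment 0 had no padding. */
static uint32_t webcil_align_padding(uint32_t unpadded_offset)
{
    uint32_t padding = (4u - (unpadded_offset % 4u)) % 4u;
    /* The padded offset is always a multiple of 4. */
    assert((unpadded_offset + padding) % 4u == 0);
    return padding;
}
```

For example, content that would otherwise begin at offset 61 needs 3 bytes of padding, while content that would begin at offset 64 needs none.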

(**Rationale**: With this wrapper it is possible to split the WebAssembly module into a *prefix*
consisting of everything before the data section, the data section, and a *suffix* that consists of
everything after the data section. The prefix and suffix do not depend on the contents of the
Webcil payload and a tool that generates Webcil files could simply emit the prefix and suffix from
constant data. The data section is the only variable content between different Webcil-encoded .NET
assemblies.)

(**Rationale**: Encoding the payload in the data section in passive data segments with known indices
allows a runtime that does not include a WebAssembly host or a runtime that does not wish to
instantiate the WebAssembly module to extract the payload by traversing the WebAssembly module and
locating the Webcil payload in the data section at segment 1.)
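
A rough sketch of such a traversal, written in C for illustration. The function names `read_uleb128` and `find_webcil_payload` are hypothetical and this is not the Mono loader's code; it only assumes the standard WebAssembly binary layout (8-byte header, then sections of the form id, ULEB128 size, content, with the data section using id 11 and passive segments using flag 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode an unsigned LEB128 value of at most 32 bits.
 * Advances *p past the encoding. Returns 1 on success, 0 on failure. */
static int read_uleb128(const uint8_t **p, const uint8_t *end, uint32_t *out)
{
    uint32_t val = 0;
    for (unsigned shift = 0; shift < 35; shift += 7) {
        if (*p >= end)
            return 0;
        uint8_t b = *(*p)++;
        val |= (uint32_t)(b & 0x7f) << shift;
        if ((b & 0x80) == 0) {
            *out = val;
            return 1;
        }
    }
    return 0; /* more than 5 bytes: not a valid 32-bit value */
}

/* Locate the content of passive data segment 1 without instantiating
 * the module. Returns 1 and sets *payload / *payload_len on success. */
static int find_webcil_payload(const uint8_t *buf, size_t len,
                               const uint8_t **payload, uint32_t *payload_len)
{
    /* "\0asm" magic followed by binary format version 1 */
    static const uint8_t header[8] = { 0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00 };
    if (len < sizeof header || memcmp(buf, header, sizeof header) != 0)
        return 0;
    const uint8_t *p = buf + sizeof header;
    const uint8_t *end = buf + len;
    while (p < end) {
        uint8_t id = *p++;
        uint32_t size;
        if (!read_uleb128(&p, end, &size) || size > (size_t)(end - p))
            return 0;
        if (id != 11) { /* not the data section: skip its content */
            p += size;
            continue;
        }
        const uint8_t *sec_end = p + size;
        uint32_t count;
        if (!read_uleb128(&p, sec_end, &count) || count != 2)
            return 0; /* the wrapper has exactly two data segments */
        for (uint32_t i = 0; i < 2; i++) {
            uint32_t seg_len;
            if (p >= sec_end || *p++ != 1) /* flag 1 marks a passive segment */
                return 0;
            if (!read_uleb128(&p, sec_end, &seg_len) || seg_len > (size_t)(sec_end - p))
                return 0;
            if (i == 1) {
                *payload = p;
                *payload_len = seg_len;
                return 1;
            }
            p += seg_len; /* skip segment 0 (payload size plus any padding) */
        }
    }
    return 0;
}
```

The sketch rejects modules whose data section does not hold exactly two passive segments, which mirrors the requirement that the wrapper contain no additional data segments.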

(**Rationale**: The alignment requirement is due to ECMA-335 metadata requiring certain portions of
the physical layout to be 4-byte aligned, for example ECMA-335 Sections II.25.4 and II.25.4.5.
Aligning the Webcil content within the wasm module allows tools that directly examine the wasm
module without instantiating it to properly parse the ECMA-335 metadata in the Webcil payload.)

(**Note**: the wrapper may be versioned independently of the payload.)


### Webcil payload

The Webcil payload contains the ECMA-335 metadata, IL and resources comprising a .NET assembly.

As our starting point we take section II.25.1 "Structure of the
runtime file format" from ECMA-335 6th Edition.

@@ -40,12 +102,12 @@ A Webcil file follows a similar structure
| CLI Data |
| |

## Webcil Headers
### Webcil Headers

The Webcil headers consist of a Webcil header followed by a sequence of section headers.
(All multi-byte integers are in little endian format).
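
As a small illustration of the byte order rule, a reader on any host can decode a header field byte by byte rather than casting the buffer to an integer pointer. The helper name `read_le32` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit little-endian integer from 4 bytes, independent of the
 * host's byte order and of the pointer's alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    assert(p != 0);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

The same decoding applies to the 4-byte payload size stored in data segment 0 of the wrapper.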

### Webcil Header
#### Webcil Header

``` c
struct WebcilHeader {
@@ -75,11 +137,11 @@ The next pairs of integers are a subset of the PE Header data directory specifyi
of the CLI header, as well as the directory entry for the PE debug directory.


### Section header table
#### Section header table

Immediately following the Webcil header is a sequence (whose length is given by `coff_sections`
above) of section headers giving their virtual address and virtual size, as well as the offset in
the Webcil file and the size in the file. This is a subset of the PE section header that includes
the Webcil payload and the size in the file. This is a subset of the PE section header that includes
enough information to correctly interpret the RVAs from the webcil header and from the .NET
metadata. Other information (such as the section names) is not included.

@@ -92,11 +154,13 @@ struct SectionHeader {
};
```

### Sections
(**Note**: the `st_raw_data_ptr` member is an offset from the beginning of the Webcil payload, not from the beginning of the WebAssembly wrapper module.)
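
A sketch of how a reader might resolve an RVA against the section table. Only `st_raw_data_ptr` is named in the text above; the struct name, the other field names, and the function name `rva_to_payload_offset` are assumptions made for this illustration, and the bounds check against the raw data size is one reasonable choice rather than a statement of what the runtime does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative section header. Field names other than st_raw_data_ptr
 * are assumed for this sketch. */
struct sketch_section_header {
    uint32_t virtual_size;    /* assumed name */
    uint32_t virtual_address; /* assumed name */
    uint32_t raw_data_size;   /* assumed name */
    uint32_t st_raw_data_ptr; /* offset from the start of the Webcil payload */
};

/* Translate an RVA to an offset from the start of the Webcil payload.
 * Returns 1 and stores the offset on success, 0 if no section contains rva. */
static int rva_to_payload_offset(const struct sketch_section_header *sections,
                                 size_t count, uint32_t rva, uint32_t *offset)
{
    assert(offset != NULL);
    for (size_t i = 0; i < count; i++) {
        const struct sketch_section_header *s = &sections[i];
        if (rva >= s->virtual_address &&
            rva - s->virtual_address < s->raw_data_size) {
            *offset = s->st_raw_data_ptr + (rva - s->virtual_address);
            return 1;
        }
    }
    return 0;
}
```

A tool that reads the wrapped file directly, without instantiating it, would then add the position of data segment 1's content within the WebAssembly module to this offset to obtain a file position.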

#### Sections

Immediately following the section table are the sections. These are copied verbatim from the PE file.

## Rationale
### Rationale

The intention is to include only the information necessary for the runtime to locate the metadata
root, and to resolve the RVA references in the metadata (for locating data declarations and method IL).
@@ -0,0 +1,148 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.

using System;
using System.Collections.Immutable;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Text;

namespace Microsoft.NET.WebAssembly.Webcil;

internal class WasmModuleReader : IDisposable
{
    public enum Section : byte
    {
        // order matters: enum values must match the WebAssembly spec
        Custom,
        Type,
        Import,
        Function,
        Table,
        Memory,
        Global,
        Export,
        Start,
        Element,
        Code,
        Data,
        DataCount,
    }

    private readonly BinaryReader _reader;

    private readonly Lazy<bool> _isWasmModule;

    public bool IsWasmModule => _isWasmModule.Value;

    public WasmModuleReader(Stream stream)
    {
        _reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true);
        _isWasmModule = new Lazy<bool>(this.GetIsWasmModule);
    }

    public void Dispose()
    {
        Dispose(true);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            _reader.Dispose();
        }
    }

    protected virtual bool VisitSection (Section sec, out bool shouldStop)
    {
        shouldStop = false;
        return true;
    }

    private const uint WASM_MAGIC = 0x6d736100u; // "\0asm"

    private bool GetIsWasmModule()
    {
        _reader.BaseStream.Seek(0, SeekOrigin.Begin);
        try
        {
            uint magic = _reader.ReadUInt32();
            if (magic == WASM_MAGIC)
                return true;
        } catch (EndOfStreamException) {}
        return false;
    }

    public bool Visit()
    {
        if (!IsWasmModule)
            return false;
        _reader.BaseStream.Seek(4L, SeekOrigin.Begin); // skip magic

        uint version = _reader.ReadUInt32();
        if (version != 1)
            return false;

        bool success = true;
        while (success) {
            success = DoVisitSection (out bool shouldStop);
            if (shouldStop)
                break;
        }
        return success;
    }

    private bool DoVisitSection(out bool shouldStop)
    {
        shouldStop = false;
        byte code = _reader.ReadByte();
        Section section = (Section)code;
        if (!Enum.IsDefined(typeof(Section), section))
            return false;
        uint sectionSize = ReadULEB128();

        long savedPos = _reader.BaseStream.Position;
        try
        {
            return VisitSection(section, out shouldStop);
        }
        finally
        {
            _reader.BaseStream.Seek(savedPos + (long)sectionSize, SeekOrigin.Begin);
        }
    }

    protected uint ReadULEB128()
    {
        uint val = 0;
        int shift = 0;
        while (true)
        {
            byte b = _reader.ReadByte();
            val |= (b & 0x7fu) << shift;
            if ((b & 0x80u) == 0) break;
            shift += 7;
            if (shift >= 35)
                throw new OverflowException();
        }
        return val;
    }

    protected bool TryReadPassiveDataSegment (out long segmentLength, out long segmentStart)
    {
        segmentLength = 0;
        segmentStart = 0;
        byte code = _reader.ReadByte();
        if (code != 1)
            return false; // not passive
        segmentLength = ReadULEB128();
        segmentStart = _reader.BaseStream.Position;
        // skip over the data
        _reader.BaseStream.Seek (segmentLength, SeekOrigin.Current);
        return true;
    }
}
@@ -42,6 +42,8 @@ FilePosition SectionStart

private string InputPath => _inputPath;

public bool WrapInWebAssembly { get; set; } = true;

private WebcilConverter(string inputPath, string outputPath)
{
_inputPath = inputPath;
@@ -62,6 +64,26 @@ public void ConvertToWebcil()
}

using var outputStream = File.Open(_outputPath, FileMode.Create, FileAccess.Write);
if (!WrapInWebAssembly)
{
WriteConversionTo(outputStream, inputStream, peInfo, wcInfo);
}
else
{
// if wrapping in WASM, write the webcil payload to memory because we need to discover the length

// webcil is about the same size as the PE file
using var memoryStream = new MemoryStream(checked((int)inputStream.Length));
WriteConversionTo(memoryStream, inputStream, peInfo, wcInfo);
memoryStream.Flush();
var wrapper = new WebcilWasmWrapper(memoryStream);
memoryStream.Seek(0, SeekOrigin.Begin);
wrapper.WriteWasmWrappedWebcil(outputStream);
}
}

public void WriteConversionTo(Stream outputStream, FileStream inputStream, PEFileInfo peInfo, WCFileInfo wcInfo)
{
WriteHeader(outputStream, wcInfo.Header);
WriteSectionHeaders(outputStream, wcInfo.SectionHeaders);
CopySections(outputStream, inputStream, peInfo.SectionHeaders);
@@ -210,7 +232,7 @@ private static void WriteStructure<T>(Stream s, T structure)
}
#endif

private static void CopySections(FileStream outStream, FileStream inputStream, ImmutableArray<SectionHeader> peSections)
private static void CopySections(Stream outStream, FileStream inputStream, ImmutableArray<SectionHeader> peSections)
{
// endianness: ok, we're just copying from one stream to another
foreach (var peHeader in peSections)