Roadmap to WebAssembly support #12002

Open
lbguilherme opened this issue Apr 15, 2022 · 6 comments

Comments

@lbguilherme
Contributor

lbguilherme commented Apr 15, 2022

This is intended as an umbrella issue to organize high-level WebAssembly progress and goals.

Why?

WebAssembly is a new standard for a compilation target that is quickly growing in popularity, not only on the Web. It offers portability to run anywhere with near-native speed (web browsers, cloud servers, embedded devices, plugins, blockchains, etc), it allows different languages to interoperate in a convenient format, and it is secure and verifiable before execution. Startup time is also faster than a Docker container. For more details please refer to this excellent article: Pay attention to WebAssembly by Harshal Sheth.

How?

I believe Crystal should aim to support WebAssembly as a first-class platform both for writing complete applications and for writing plugins for existing applications. By complete application, I mean a Crystal app that interacts primarily by IO operations with sockets or files and has a clear start and finalization lifetime, like an ordinary process. This would primarily happen with the WASI library interface. By plugin, I mean a Crystal module that imports and exports some functions that can be called by the loader application, interacting with it. Unlike a shared library on native targets (which Crystal isn't good at targeting because the GC and stdlib like to have full control over the process's IO and memory), a WebAssembly module is isolated from the application that loads it, with independent memory and system operations.

Some languages that target WebAssembly very well are:

  • Rust
  • AssemblyScript
  • C
  • C++
  • Go
  • Grain

Many other languages (like Python, Java, or .NET) support WebAssembly as well, some through interpreters. Here is an up-to-date list: https://github.com/appcypher/awesome-wasm-langs.

There are three common targets for WebAssembly:

  • WASI: A module that imports the functions from the WASI interface and/or the wasi-libc to interact with files, sockets, standard IO, environment variables, clock, etc. It is POSIX-like, but very restrictive and sandboxed. Runtimes like Wasmer or Wasmtime offer native integration with WASI. It is also common on some cloud serverless platforms. Most, but not all, of the stdlib can be implemented with WASI.
  • Unknown: A module that doesn't import anything by default and is expected to run anywhere. It has no IO or system operations of any kind; it is limited to computation and memory operations. It can import functions from some custom runtime, but the primary goal is to export some functions to be called by other applications. This is common in plugins and on the Web, to speed up algorithms that don't depend on the system, like media processing, parsers, etc. Only a limited subset of the stdlib would be available on this target.
  • JavaScript: Many runtimes that execute WebAssembly also execute JavaScript, and they offer a lot of functionality that is not exposed to the module by default. Some compilers adopted the practice of generating both a WebAssembly module with the majority of the application and a JavaScript glue code to load the WebAssembly and ease the interop between the languages. Emscripten does that for C/C++ and wasm-bindgen for Rust does the same. This could allow almost the entire stdlib to work, and also allow calling JavaScript from Crystal and the other way around. I currently believe this target is better suited as a shard instead of part of the core compiler and I'm currently working on an implementation of it: https://github.com/lbguilherme/crystal-js.

Additionally, Crystal's stdlib depends on some libraries to be able to fully run: libc, libevent, libgc, libpcre, libgmp, libxml, etc. Those need to be compiled to WebAssembly and then linked to the Crystal app during the build process. It is still unclear if all of them support WebAssembly.

What are the challenges ahead?

The first step was released with Crystal 1.4, a few days ago: targeting WebAssembly and offering a WASI target with a subset of the stdlib working. It is still experimental and needs work to be finalized. See #10870.

Fibers and concurrency

In WebAssembly the stack is protected and cannot be read or manipulated in any way. In fact, LLVM creates a shadow stack in the main memory to store stack variables whose pointer is taken at some point. For concurrency with Fibers, channels, and non-blocking IO, Crystal needs to keep multiple stacks and switch between them. This simply isn't possible in plain WebAssembly.

Here we can take advantage of the Binaryen Asyncify code transformation pass. It does static code transformations on a compiled WebAssembly file to allow the stack to be manipulated and swapped, at the cost of some performance and code size. The comments at the top of the source file have some nice explanations of how it works.

The Fiber.swapcontext method would be implemented by storing the next Fiber in a global variable and then beginning the unwinding process with Asyncify. The program entry point (fun main) would need to catch this unwinding, stop it, and then start the rewinding of the next Fiber (stored in that global variable). Finally, Fiber.swapcontext of the new Fiber would catch and stop the rewinding process.

The downside is that every exported function will need to be wrapped in some kind of Scheduler.enable do ... end block to allow Fiber swaps to happen inside it.
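The control flow described above can be modelled outside of WebAssembly. The following is only a rough sketch in Python, not Crystal code and not the Asyncify API: generators stand in for fibers, `yield` stands in for an unwind back to the entry point, and `next()` stands in for a rewind. The names `Trampoline`, `swapcontext`, `run` and `run_demo` are made up for this illustration. The `run` loop plays the role of the wrapper that every exported function would need.

```python
class Trampoline:
    """Toy model of an entry point that catches unwinds and rewinds the next fiber."""

    def __init__(self):
        # Plays the role of the global variable that names the fiber to resume.
        self.next_fiber = None

    def swapcontext(self, target):
        # Record where to go next, then "unwind" back to the entry point.
        self.next_fiber = target
        yield  # execution continues here once this fiber is "rewound"

    def run(self, entry):
        # Entry point: keep rewinding whichever fiber was requested last.
        self.next_fiber = entry
        while self.next_fiber is not None:
            fiber, self.next_fiber = self.next_fiber, None
            try:
                next(fiber)  # "rewind" into the fiber until it unwinds again
            except StopIteration:
                pass  # the fiber finished without requesting another swap


def run_demo():
    sched = Trampoline()
    log = []
    fibers = {}

    def ping():
        log.append("ping 1")
        yield from sched.swapcontext(fibers["pong"])
        log.append("ping 2")
        yield from sched.swapcontext(fibers["pong"])

    def pong():
        log.append("pong 1")
        yield from sched.swapcontext(fibers["ping"])
        log.append("pong 2")

    fibers["ping"] = ping()
    fibers["pong"] = pong()
    sched.run(fibers["ping"])
    return log
```

Calling `run_demo()` interleaves the two fibers. In the real design the saving and restoring of locals would be done by the Asyncify-instrumented code rather than by a language feature like generators.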

Garbage Collector

LibGC (bdwgc) can be compiled into WebAssembly, but it requires inspecting the stack to work properly. As we saw before, the stack can't be inspected or manipulated directly. Asyncify's unwind and rewind operations work by storing every stack variable in a memory buffer. Thus all we need to do is to unwind the current Fiber whenever the GC needs to run and then rewind the same Fiber back again. We again need to ensure every exported function is wrapped in a block to handle the unwinding and rewinding of Fibers.
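To illustrate why having the stack contents in a memory buffer is enough, here is a toy model of a conservative mark phase in Python. It is a sketch under simplifying assumptions, not bdwgc's actual algorithm: the heap is a dictionary from object address to the words stored in that object, the unwound stack is a plain list of words, and any word that equals a known object address is treated as a pointer. The names `conservative_mark`, `heap` and `saved_stack` are hypothetical.

```python
def conservative_mark(heap, saved_stack):
    """Return the set of object addresses reachable from the saved stack words.

    heap: dict mapping an object's address to the list of words it contains.
    saved_stack: words copied out of the stack during the unwind.
    """
    marked = set()
    # Roots: every stack word that happens to look like an object address.
    worklist = [word for word in saved_stack if word in heap]
    while worklist:
        addr = worklist.pop()
        if addr in marked:
            continue
        marked.add(addr)
        # Scan the object's own words for further possible pointers.
        worklist.extend(word for word in heap[addr] if word in heap)
    return marked


def run_demo():
    heap = {
        0x1000: [0x2000, 7],  # points to 0x2000 and holds a plain integer
        0x2000: [],           # leaf object
        0x3000: [0x1000],     # points to a live object but is itself unreachable
    }
    saved_stack = [42, 0x1000]  # one integer, one pointer-looking word
    return conservative_mark(heap, saved_stack)
```

In this model `0x3000` is never marked, so a sweep phase could reclaim it, while both `0x1000` and `0x2000` survive because a stack word refers to the first one.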

There is a proposal to implement a WebAssembly native GC in the works, but I don't expect it to be supported everywhere anytime soon.

Also, a hard-coded maximum memory size needs to be defined, as there is no way to figure out the maximum memory available.

Exceptions

There is an ongoing proposal for WebAssembly Exceptions. It is already supported by the Chromium-based browsers, by Firefox behind a flag in the nightly version and by LLVM's codegen. Unfortunately, it is still not widely used or well documented.

We can:

  1. Exit on exceptions without the ability to catch them (current behavior). Not ideal.
  2. Implement the native wasm exception targeting behind a flag, but the final module will only run on runtimes that support it. Ideal for the future.
  3. Implement exceptions on top of Asyncify again. begin ... rescue can be implemented by unwinding and rewinding the stack, and keeping a copy of the memory buffer, and raising can be implemented by unwinding, discarding, and rewinding into the previously saved state, similar to how setjmp / longjmp work. This is very taxing on performance but works everywhere.

CallStack

We might be able to raise and rescue exceptions, but we are still unable to obtain a nice stack trace from them, as the stack can't be inspected. Currently, the only way to support this is by invoking JavaScript.

Threads

WebAssembly modules can run in multiple threads and support shared memory and atomic operations. But WASI doesn't provide a way to start a thread. Creating threads can only be performed from JavaScript currently.

Signals and Processes

Those aren't supported and don't make much sense with WebAssembly. Processes can be somewhat emulated with JavaScript if the other process is also a wasm module.

EventLoop

It is likely that libevent2 can be compiled to WASI since wasi-libc implements the poll function. But if that's not the case, then the event loop can be implemented on top of WASI's poll_oneoff function. It supports subscribing for clock events or for a file descriptor to be ready for reads/writes.
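As a rough picture of what such a loop does, here is a minimal sketch in Python built on the POSIX `poll` call via the standard `select.poll` wrapper. It is not WASI code and it does not call `poll_oneoff`; it only shows the same two kinds of subscription, clock deadlines and file descriptor readiness, served by a single blocking wait. The class `MiniEventLoop` and its methods are invented for this example, and it assumes a POSIX system.

```python
import math
import os
import select
import time


class MiniEventLoop:
    """Toy event loop: one poll() call waits for fd readiness or the next timer."""

    def __init__(self):
        self._poller = select.poll()
        self._readers = {}  # fd -> callback to run once the fd is readable
        self._timers = []   # list of (deadline, callback)

    def on_readable(self, fd, callback):
        self._readers[fd] = callback
        self._poller.register(fd, select.POLLIN)

    def call_later(self, delay_s, callback):
        self._timers.append((time.monotonic() + delay_s, callback))

    def run_once(self):
        # Block until the nearest timer deadline, or indefinitely if there is none.
        timeout_ms = None
        if self._timers:
            nearest = min(deadline for deadline, _ in self._timers)
            timeout_ms = max(0, math.ceil((nearest - time.monotonic()) * 1000))
        for fd, events in self._poller.poll(timeout_ms):
            if events & select.POLLIN:
                callback = self._readers.pop(fd)
                self._poller.unregister(fd)
                callback(fd)
        # Fire every timer whose deadline has passed.
        now = time.monotonic()
        due = [cb for deadline, cb in self._timers if deadline <= now]
        self._timers = [(d, cb) for d, cb in self._timers if d > now]
        for cb in due:
            cb()

    def run(self):
        while self._readers or self._timers:
            self.run_once()


def run_demo():
    loop = MiniEventLoop()
    received = []
    read_fd, write_fd = os.pipe()
    # The reader wakes up only after the timer has written to the pipe.
    loop.on_readable(read_fd, lambda fd: received.append(os.read(fd, 16)))
    loop.call_later(0.01, lambda: os.write(write_fd, b"hello"))
    loop.run()
    os.close(read_fd)
    os.close(write_fd)
    return received
```

In a Crystal implementation the callbacks would instead resume the Fiber that is waiting on the file descriptor or sleeping until the deadline.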

Ecosystem interop

Existing shards written in pure Crystal should work unchanged unless they depend on some unimplemented part of the stdlib. Shards that depend on native libraries should work as long as the underlying library can be compiled to WebAssembly as well. Some new shards that use WebAssembly-specific functionality to interoperate with other languages will likely arise.

Run the standard library spec as a CI step

The spec suite can already run (depending on mocking Fiber.yield) and it should shed light on which parts of the standard library work and which don't. Running it on CI will help prevent regressions in what already works once WebAssembly graduates to a supported target.

@Fryguy
Contributor

Fryguy commented Apr 15, 2022

Typo?

-In WebAssembly the stack of protected and can be read or manipulated in any way.
+In WebAssembly the stack is protected and cannot be read or manipulated in any way.

@lbguilherme
Contributor Author

Fixed 😅

@zw963
Contributor

zw963 commented Oct 12, 2022

Adding the link to the forum WASM discussion here for convenience.

https://forum.crystal-lang.org/t/trying-out-wasm-support/4508/1

@straight-shoota
Member

straight-shoota commented Jun 6, 2023

WASIX is a newly announced superset of WASI with some additional features that we've been missing (e.g. threads and async IO): https://wasix.org/
Will need to investigate how we could utilize it in Crystal.

@maxfierke
Contributor

As exciting as WASIX seems (and it looks really cool!), it does appear to be a set of vendor extensions rather than a true standard, so it might be something that would be best explored as a shard rather than within the compiler or stdlib itself (at least until parts of it become part of a WASI preview or something, or it's implemented in runtimes other than Wasmer).

Relatedly, I noticed yesterday that Mitchell Hashimoto has started work on an event-loop library that includes support for WASI: https://github.com/mitchellh/libxev

@theogbob

Any updates regarding this?
