wanderer commented Jun 16, 2019

This PR creates interface definitions for WASI using the Cap'n Proto IDL. This is an alternative to using WebIDL (as mentioned in #31), and hopefully this PR will flesh out the pros and cons of using either IDL.

The immediate advantages of Cap'n Proto are:

  • It is much simpler and cleaner than WebIDL
  • It is object-capability focused

To start with, this PR only creates the idealized representation of WASI in Cap'n Proto. As defined, we can generate Wasm function definitions in parity with what we have now (some annotations need to be added to generate the current naming conventions, as well as some of the argument positions with regard to returning structs). At this point this PR doesn't suggest using Cap'n Proto serialization for representing data structures in memory or defining full bindings (see WebAssembly/design#1274 for that); this is only about defining the interface. For now, all data structures would still be encoded by the C ABI.

I think once we have an IDL it should be much easier to iterate and make progress on evolving interface specs. My hope is that we can use an IDL that is easy for people to contribute to and maintain!

Thanks to @erights and @kentonv for input!

@@ -0,0 +1,21 @@
@0xb47a0b67171659f7;
Member

what are these?

Contributor

that is the unique file id: https://capnproto.org/language.html#unique-ids
the @0, @1, @2, etc. are used to define the canonical field/method order for serialization.

Author

We could possibly use these in the future for versioning the interfaces, but I wanted to start with one discussion at a time.

schema/README.md Outdated
```
(func $returnTwoInts
(param $someInput i64)
(param $ptr_fist i32)
```

*first

argc :Size, # The number of arguments.
argv_buf_size :Size # The size of the argument string data.
);
} No newline at end of file

missing trailing newline

);

# Return command-line argument data sizes.
sizesGet @1 () -> (
Member

devsnek commented Jun 16, 2019

assuming i can use these files as some sort of data to generate information i can use in my library or language or whatever, shouldn't these functions be snake_case? (and contain some data about being in the wasi_unstable module?)

Author

So there are a couple of routes to go here. Cap'n Proto schema style should be camelCase, but the doc and interface generators could emit snake_case. We can also add annotations to generate the current naming if there are other discrepancies.
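As a rough illustration of the generator-side renaming described here, the following Python sketch converts camelCase schema names to snake_case. The `wasm_name` helper and the `args` prefix are assumptions made for the example; the PR does not define how an interface prefix would be chosen.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase identifier to snake_case."""
    # Insert an underscore before each uppercase letter that follows a
    # lowercase letter or digit, then lowercase the whole string.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def wasm_name(interface_prefix: str, method: str) -> str:
    """Build a flat function name from a (hypothetical) interface prefix."""
    return f"{interface_prefix}_{camel_to_snake(method)}"


print(camel_to_snake("sizesGet"))         # sizes_get
print(wasm_name("args", "sizesGet"))      # args_sizes_get
print(camel_to_snake("pollFdReadwrite"))  # poll_fd_readwrite
```

Names that do not follow this mechanical rule would be the cases where an annotation is needed.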

Member

sunfishcode left a comment

Cool. Could you post the generated interfaces somewhere? When you say this generates interfaces "in parity" with what we have now, I'd be interested in seeing the specifics. As an example, could you show what the generated interface for fd_read looks like?

We should also note that webidl-bindings, aka "☃Bindings", is continuing to evolve. See Luke et al.'s presentation at last week's CG meeting. Of particular interest to WASI are that (a) WASI is one of that proposal's focus use cases, and (b) there is now a proposed refactoring (see the "Is the center node really "Web IDL"?" and following slides) where WASI and other non-Web users wouldn't have to deal with the full complexity of Web IDL. (As a full disclosure, I myself have been involved with Luke and others on parts of this proposal.)

Here are some additional notes at a first readthrough:

Some pros of Cap'n Proto IDL:

  • Cap'n Proto IDL is defined and has many tools for working with it today.
  • Users who do want to serialize the API can easily do so.

Some cons of Cap'n Proto IDL in this context:

  • Cap'n Proto is focused on wire-format compatibility, but for WebAssembly APIs the focus needs to be on API compatibility. Cap'n Proto's required ID numbers and struct field numbers aren't relevant to WebAssembly APIs, while WebAssembly APIs care about names for interfaces, functions, and other things, which would need to be annotations in Cap'n Proto. It looks like it's probably possible to do everything we'd need, but there will be a lot of extra noise.
  • Cap'n Proto NUL-terminates strings, which I assume means they can't contain embedded NULs. In WebAssembly, the existing convention is to permit embedded NULs in strings.
  • Cap'n Proto wouldn't as easily accommodate WebAssembly-specific features, such as specifying allocation functions and handling allocation automatically, e.g. alloc-utf8-mem-str in ☃Bindings, or referencing WebAssembly globals.

# If __WASI_RIGHT_FD_READ is set, includes the right to invoke __wasi_poll_oneoff() to subscribe to __WASI_EVENTTYPE_FD_READ.
# If __WASI_RIGHT_FD_WRITE is set, includes the right to invoke __wasi_poll_oneoff() to subscribe to __WASI_EVENTTYPE_FD_WRITE.
pollFdReadwrite @27 :Bool;
sockShutdown @28 :Bool; # The right to invoke __wasi_sock_shutdown().
Member

Does Cap'n Proto pack these Bools into a bitfield, as is done in the current WASI Core API?

Author

So I wasn't thinking of using Cap'n Proto's serialization directly in memory. But yes, that's the idea here: a struct of all Bools should be encoded as a bitfield. To make it more explicit we could add annotations.

kentonv commented Jun 17, 2019

Hi, Cap'n Proto author here. I wasn't involved in this proposal, but I find it interesting. I don't think I'm in a position to say if it's the right thing for WASI (since I don't know enough about WASI or the alternative options being considered). But I'm happy to help by answering questions about Cap'n Proto.

To that end:

> Cap'n Proto NUL-terminates strings, which I assume means they can't contain embedded NULs.

Incorrect: Cap'n Proto strings can contain embedded NULs. The size of the string is directly encoded on the wire, and this is how Cap'n Proto determines the string extent, not based on the NUL terminator. The NUL terminator is required so that programs which interact with C interfaces are not forced to make extra copies (though such programs should of course consider what happens if a NUL appears inside the string body).
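To make the distinction concrete, here is a minimal Python sketch of a length-delimited string that also carries a trailing NUL. The 4-byte length prefix is an illustrative layout chosen for the example and is not the actual Cap'n Proto wire format; the helper names are hypothetical.

```python
import struct


def encode_text(s: str) -> bytes:
    """Length-prefixed UTF-8 plus a trailing NUL (illustrative layout only)."""
    body = s.encode("utf-8")
    # The length counts the body only; the NUL is a convenience for C callers.
    return struct.pack("<I", len(body)) + body + b"\x00"


def decode_text(buf: bytes) -> str:
    """Recover the string using the stored length, ignoring the terminator."""
    (n,) = struct.unpack_from("<I", buf, 0)
    return buf[4:4 + n].decode("utf-8")


def c_strlen_view(buf: bytes) -> bytes:
    """What a C function relying on strlen() would see: bytes up to the first NUL."""
    body = buf[4:]
    return body[:body.index(b"\x00")]


msg = "ab\x00cd"
wire = encode_text(msg)
print(decode_text(wire) == msg)  # True: the embedded NUL survives
print(c_strlen_view(wire))       # b'ab': a NUL-scanning consumer truncates
```

The second call shows the caveat mentioned above: a consumer that scans for the terminator sees a truncated string when a NUL appears in the body.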

From the schema language's point of view, a Text value can be any array of 16-bit codepoints (though preferably one that is valid UTF-16 representing 32-bit codepoints). Cap'n Proto serialization chooses WTF-8 + required NUL terminator as the encoding for such values, but this is a property of the serialization and need not constrain alternative use cases.

> Cap'n Proto wouldn't as easily accommodate WebAssembly-specific features, such as specifying allocation functions and handling allocation automatically

Cap'n Proto supports an extensible annotation syntax which is commonly used to specify features relevant to alternative serializations and use cases. For example, Cap'n Proto has a JSON serializer that supports a bunch of annotations for specifying idiomatic JSON patterns that don't otherwise easily fit into Cap'n Proto's type system:

https://github.com/capnproto/capnproto/blob/master/c++/src/capnp/compat/json.capnp

That said, it is true that Cap'n Proto won't be as good a fit as a custom schema language optimized for your specific use case. (But I'd guess WebIDL is not a perfect fit either, since it also was designed for a different use case.)

> Does Cap'n Proto pack these Bools into a bitfield, as is done in the current WASI Core API?

Yes it does, although this is again a property of the serialization, not of the schema language. An alternative serialization (or non-serialization-oriented use case) can do whatever it wants.
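One possible non-serialization encoding is sketched below in Python: each Bool field is mapped to a bit in a single integer. Using the field ordinal as the bit position is an assumption made for this example, and only the two fields visible in the diff hunk above (`pollFdReadwrite @27`, `sockShutdown @28`) are included.

```python
from typing import Dict

# Field name -> ordinal, taken from the two fields shown in the diff hunk.
# Treating the ordinal as the bit position is an assumption of this sketch.
RIGHTS_ORDINALS: Dict[str, int] = {"pollFdReadwrite": 27, "sockShutdown": 28}


def pack_rights(flags: Dict[str, bool]) -> int:
    """Pack a struct of Bools into one integer, one bit per field."""
    bits = 0
    for name, enabled in flags.items():
        if enabled:
            bits |= 1 << RIGHTS_ORDINALS[name]
    return bits


def unpack_rights(bits: int) -> Dict[str, bool]:
    """Expand the integer back into named Bool fields."""
    return {name: bool((bits >> ordinal) & 1)
            for name, ordinal in RIGHTS_ORDINALS.items()}


packed = pack_rights({"pollFdReadwrite": True, "sockShutdown": False})
print(hex(packed))            # 0x8000000
print(unpack_rights(packed))  # {'pollFdReadwrite': True, 'sockShutdown': False}
```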

erights commented Jun 17, 2019

> Yes it does, although this is again a property of the serialization, not of the schema language. An alternative serialization (or non-serialization-oriented use case) can do whatever it wants.

Yes. What Martin and I are proposing is Cap'n Proto as the Schema/IDL language for WASI interfaces in general, where the normal WASI bindings to be generated would start with the most natural direct mapping to the WASM type system and the guidelines at https://github.com/WebAssembly/tool-conventions/blob/master/BasicCABI.md

I find Cap'n Proto to be a much more natural, readable, and understandable language than anything that could result from any process that starts with WebIDL. As a non-web IDL, WebIDL really is rather horrible.

erights commented Jun 17, 2019

(Note that I updated the URL in my previous comment.)

WASI has already confused its messaging, causing pervasive misunderstanding even among those, like me, who have been involved from its beginnings. In "wasi-core" were these APIs which are really specialized to POSIX-like systems. I inferred that WASI was specifically about trying to create ocap abstractions of POSIX-like services, in the same sense that Node or the Java libraries create object abstractions of POSIX-like services. I only found out otherwise at the WASM-on-Blockchain event.

Because the main WASM engines are currently from browser makers, and primarily engineered to add value to browsers, there is a similar potential misunderstanding lurking. If WASI adopts an IDL that is good for browsers and terrible for anything else, I think that will amplify that misunderstanding.

Cap'n Proto is obviously neutral to both, is already engineered to be ocap friendly, and is actually good.

dumblob commented Jun 17, 2019

Just an idea: maybe we could align some bits of this proposal (incl. e.g. parametrization of delimiting constants, lengths, etc.) also with sproto as it's very similar to Cap'n Proto, but seems slightly simpler. That could make this pull more robust.

@wanderer
Author

> Cool. Could you post the generated interfaces somewhere? When you say this generates interfaces "in parity" with what we have now, I'd be interested in seeing the specifics.

@sunfishcode I haven't written a generator yet, but I think it will be straightforward.

> As an example, could you show what the generated interface for fd_read looks like?

so read is defined as

interface File extends (FileDescriptor) {
  # Read from a file descriptor.
  # Note: This is similar to readv in POSIX.
  read @4 (
    iovs :List(Iovec), # List of scatter/gather vectors to which to store data.
  ) -> (
    error :Errno,
    nread :Size
  );
}

should be translated to the signature (I think; please correct me if I'm wrong)

(func $fd_read (param $fd i32) (param $iovs i32) (param $iovs_len i32) (param $nread i32) (result i32))

To do this:

  1. The first parameter should also be the capability for the method (all methods should have caps, but the ones that currently don't, such as env_vars, can be annotated).
  2. :List(<struct>): every list gets translated into (param $<struct> i32) (param $<struct>_len i32), so here iovs is a pointer to a list of iovec_t structs and iovs_len is its length.
  3. nread :Size needs to be a pointer in the arguments, since we don't have multiple return values. Any returned value after the first should be appended to the argument list as a pointer.
  4. Naming: I didn't spend a lot of time thinking about how to generate the current names accurately, but all the camelCase names should be turned into snake_case. If this breaks somewhere, we can fall back to annotations and later evolve the naming to be more natural.
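The four rules above can be sketched as a small Python lowering function. This is not the generator (none has been written); the `Method` class, the `lower_method` helper, and the hand-supplied `fd` prefix and capability name are all hypothetical. The sketch also assumes every value lowers to `i32` (pointers, sizes, and errno on wasm32) and uses the `result` keyword of the WebAssembly text format for the return value.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Method:
    name: str                       # camelCase schema name, e.g. "read"
    params: List[Tuple[str, str]]   # (name, schema type)
    results: List[Tuple[str, str]]  # (name, schema type)


def snake(name: str) -> str:
    """Rule 4: camelCase -> snake_case."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def lower_method(prefix: str, cap_name: str, m: Method) -> str:
    """Lower one schema method to a wasm function signature string."""
    parts = [f"(param ${cap_name} i32)"]           # rule 1: capability first
    for pname, ptype in m.params:
        parts.append(f"(param ${snake(pname)} i32)")
        if ptype.startswith("List("):              # rule 2: pointer + length
            parts.append(f"(param ${snake(pname)}_len i32)")
    for rname, _ in m.results[1:]:                 # rule 3: extra results become out-pointers
        parts.append(f"(param ${snake(rname)} i32)")
    if m.results:
        parts.append("(result i32)")               # first result is returned directly
    return f"(func ${prefix}_{snake(m.name)} " + " ".join(parts) + ")"


read = Method("read",
              [("iovs", "List(Iovec)")],
              [("error", "Errno"), ("nread", "Size")])
print(lower_method("fd", "fd", read))
```

For the `read` method this prints a signature with the same parameter order as the one given above: the capability, the list pointer and its length, then the out-pointer for `nread`.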

@sunfishcode
Member

> WASI has already confused its messaging, causing pervasive misunderstanding even among those, like me, who have been involved from its beginnings. In "wasi-core" were these APIs which are really specialized to POSIX-like systems. I inferred that WASI was specifically about trying to create ocap abstractions of POSIX-like services, in the same sense that Node or the Java libraries create object abstractions of POSIX-like services. I only found out otherwise at the WASM-on-Blockchain event.

I agree the "WASI Core" name and POSIX focus of our documentation has caused confusion. I'm working on correcting that.

> Because the main WASM engines are currently from browser makers, and primarily engineered to add value to browsers, there is a similar potential misunderstanding lurking. If WASI adopts an IDL that is good for browsers and terrible for anything else, I think that will amplify that misunderstanding.

I'd actually point to those POSIX-style APIs, the ones you rightly observed we focused too much on, as an example of how we're not just doing what's good for browsers here. Synchronous POSIX-flavored filesystem APIs are about as opposite of "good for browsers" as one can get.

> actually good

A lot of the work in ☃Bindings concerns things that need to happen at the wasm engine level to connect wasm to native APIs efficiently, both now and with reference types and later with full GC types. In fact, the IDL parts of this aren't really fleshed out yet.

The Cap'n Proto IDL proposal here is leading with the IDL, and doesn't yet address other aspects of bindings. As such, the two efforts may be fairly complementary. Is there a way to combine the best parts of both efforts? I recognize that on a practical level, this is difficult at this moment since ☃Bindings doesn't have real documentation yet, so we may not be able to answer this right away, but it's something I'm thinking about.

Brainstorming along these lines a little, how important is it for this proposal to stay compatible with upstream Cap'n Proto here? If we changed the syntax, we could avoid a lot of annotations and ID numbers. We could also add things like wasm module names, and generally have tighter integration without having everything be annotations. If I understand @wanderer here, it sounds like we'll need to have our own generators in any case.

@PoignardAzur

I know this is a little hypocritical for me to say, but this really feels like a solution in need of a problem.

I don't see who is expected to bind to these definitions:

  • WASI VMs will mostly use the C API of wasm, so they'll use the C definitions directly.
  • Programs and libraries being compiled to wasm will mostly use the C standard library, whose implementations over WASI will mostly use the C definitions.
  • Programs that are low-level enough to call WASI syscalls directly will most likely have a way to bind to C functions before they have a way to bind to Capnproto.

If the wasm community can agree on an interop IDL, then it might be worth writing definitions for that IDL, but I'm not sure Capnproto is going to be it.

The thing is, nobody has written a comprehensive list of which traits would be desirable in an ideal wasm IDL (though I've tried to get us started); it's possible Capnproto's IDL would have some but not all of these traits, in which case it might make more sense to use (yet another) brand new IDL, and write a Capnproto converter.

erights commented Jun 18, 2019

I am not attached to Cap'n Proto IDL taken literally. @wanderer's work demonstrates how Cap'n Proto could be used both to express current bindings and, without changing the IDL, to generate the future bindings we'll need so the capabilities are virtualizable. We could also generate the adapters providing the old style bindings as adapters to the new style. (We have not yet documented the new style bindings, which will need more wasm support, though I have talked to both @sunfishcode and @wanderer about this necessary transition.)

Any Cap'n Proto-like IDL that has the virtues of Cap'n Proto without Cap'n Proto features that are irrelevant to us would be fine with me. In particular, the IDL's types should map naturally to WASM types, along the lines of https://github.com/WebAssembly/tool-conventions/blob/master/BasicCABI.md . However, we shouldn't be gratuitously incompatible with Cap'n Proto just for the sake of being different. Ideally, it would be a subset of Cap'n Proto that we define, that has a natural mapping to Cap'n Proto itself.
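As a rough sketch of what "map naturally to WASM types" could mean, the Python table below lowers schema primitive types to WebAssembly value types. The specific choices (narrow integers and Bool widened to `i32`, everything non-scalar passed as an `i32` linear-memory pointer on wasm32) are assumptions made for illustration, not something this thread or the linked document is quoted as specifying.

```python
# Assumed lowering of schema primitive types to WebAssembly value types.
PRIM_TO_WASM = {
    "Bool": "i32",
    "Int8": "i32", "Int16": "i32", "Int32": "i32",
    "UInt8": "i32", "UInt16": "i32", "UInt32": "i32",
    "Int64": "i64", "UInt64": "i64",
    "Float32": "f32", "Float64": "f64",
}


def wasm_type(schema_type: str) -> str:
    """Return the wasm value type used to pass a value of the given schema type."""
    # Anything that is not a scalar above (Text, Data, List(...), structs)
    # is assumed to be passed as an i32 pointer into linear memory.
    return PRIM_TO_WASM.get(schema_type, "i32")


for t in ("UInt8", "Int64", "Float64", "Text", "List(Iovec)"):
    print(t, "->", wasm_type(t))
```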

erights commented Jun 18, 2019

To be clear, both @wanderer and I are not suggesting adoption of the Cap'n Proto serialization format or version compat rules. We suggest using the syntax and semantics of an existing IDL, aside from support for version transition, and writing a new generator to generate natural wasm bindings from it.

If we do this, I'm sure that there will be other contexts where we are concerned to define ABI compat rules across version transitions. Cap'n Proto's are well thought out. But this is not what we're proposing for WASI's current purposes.

erights commented Jun 24, 2019

Just went through the Cap'n Proto Schema examples, and wrote down a BNF for the subset I think is relevant, both for WASM/WASI and for Agoric's remote object protocol, CapTP. The following BNF accepts too much, but gives an AST for doing post-parse validation. Alternatively, we could make more distinctions in the BNF.

The notation "X**Y" means "zero or more Xs separated by Ys", i.e., "(X (Y X)*)?".

start ::= typeDecl*;

typeDecl ::= enumDecl | structDecl | interfaceDecl | constDecl;

type ::= primType | "List" | "AnyPointer" | IDENT | type "(" type**"," ")";

primType ::= "Void" | "Bool" | intType | floatType | "Text" | "Data";

// Only "BigInt" is relevant to CapTP
intType ::= "Int8" | "Int16" | "Int32" | "Int64"
|           "UInt8" | "UInt16" | "UInt32" | "UInt64"
|           "BigInt";

// Only "Float64" is relevant to CapTP
floatType ::= "Float32" | "Float64" | "Float128";

enumDecl ::= "enum" type "{" (IDENT ";")* "}";

structDecl ::= "struct" type "{" memberDecl* "}";

memberDecl ::= paramDecl ";"
|              "union" "{" memberDecl* "}"
|              typeDecl;

interfaceDecl ::= "interface" type super? "{" methodDecl* "}";

super ::= "extends" "(" type**"," ")";

methodDecl ::= IDENT "(" paramDecl**"," ")" ("->" "(" resultDecl**"," ")")? ";";

paramDecl ::= IDENT ":" type ("=" expr)?;

resultDecl ::= IDENT ":" type;

constDecl ::= "const" IDENT ":" type "=" expr ";";

expr ::= ...

The unions above are implicitly tagged. If we're willing to leave the Cap'n Proto subset, we could instead do what Rust does: extend enums so each branch of the enum has its own memberDecl*
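To show how the `X**Y` notation and the `methodDecl` rule could be read by a hand-written parser, here is a minimal recursive-descent sketch in Python. It is an illustration only: the `Parser` class and `parse_method` helper are hypothetical, default values (`"=" expr`) are left out, and, like the grammar above, it does not accept `@N` ordinals.

```python
import re
from typing import Callable, List, Optional, Tuple

TOKEN = re.compile(r"\s*(->|[A-Za-z_][A-Za-z0-9_]*|[():,;])")
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(src: str) -> List[str]:
    tokens, pos, src = [], 0, src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens: List[str]) -> None:
        self.toks, self.i = tokens, 0

    def peek(self) -> Optional[str]:
        return self.toks[self.i] if self.i < len(self.toks) else None

    def eat(self, expected: str) -> None:
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
        self.i += 1

    def ident(self) -> str:
        tok = self.peek()
        if tok is None or not IDENT.fullmatch(tok):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        self.i += 1
        return tok

    def sep_by(self, item: Callable[[], object], sep: str, closer: str) -> list:
        """X**Y: zero or more X separated by Y, i.e. (X (Y X)*)?"""
        items = []
        if self.peek() == closer:      # the "zero" case
            return items
        items.append(item())
        while self.peek() == sep:
            self.eat(sep)
            items.append(item())
        return items

    def type_(self) -> str:
        # type ::= IDENT | type "(" type**"," ")"
        name = self.ident()
        if self.peek() == "(":
            self.eat("(")
            args = self.sep_by(self.type_, ",", ")")
            self.eat(")")
            return f"{name}({', '.join(args)})"
        return name

    def decl(self) -> Tuple[str, str]:
        # paramDecl / resultDecl without the optional default value
        name = self.ident()
        self.eat(":")
        return (name, self.type_())

    def method(self):
        name = self.ident()
        self.eat("(")
        params = self.sep_by(self.decl, ",", ")")
        self.eat(")")
        results = []
        if self.peek() == "->":
            self.eat("->")
            self.eat("(")
            results = self.sep_by(self.decl, ",", ")")
            self.eat(")")
        self.eat(";")
        return name, params, results


def parse_method(src: str):
    return Parser(tokenize(src)).method()


print(parse_method("read (iovs :List(Iovec)) -> (error :Errno, nread :Size);"))
```

The returned tuple is the kind of loose AST on which post-parse validation could then run.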

@aardappel

Given that you're not going to be using Cap'n Proto as a serialization system, but just borrowing its IDL parser (which is a tiny part of the project), I'd say you're better off designing an IDL made specifically for WASM. It's not that much work, and you're not risking carrying along all sorts of serialization-isms that are not relevant for this purpose.

@jgravelle-google

⛄️Bindings is essentially trying to do that. In particular the approach I want us to take here is one of starting from scratch, and only merging with WebIDL where it makes sense to, without being constrained by the existing design of WebIDL at all.

> We suggest using the syntax and semantics of an existing IDL, aside from support for version transition, and writing a new generator to generate natural wasm bindings from it.

That's essentially my plan for supporting WebIDL: create a WebIDL->⛄️Bindings generator, similar to how Rust's web-sys crate generates Rust APIs annotated with wasm_bindgen from WebIDL.

> I'd say you're better off designing an IDL made specifically for WASM.

That's also something we'll come up with, a text IDL for generators to target, most likely.

@wanderer
Author

this has been replaced with #64

wanderer closed this Jul 18, 2019