Proposal: Update WIT syntax/semantics for packages/dependencies/use and alter binary format for externnames #193

Open
@alexcrichton

I've been talking with @lukewagner and @guybedford recently about #177 and how
best to tackle that in components and WIT, and I'm opening this issue to propose
changes to both WIT and the component model binary format to solve the issue at
hand and a few neighboring concerns as well. As a heads up, what I'm proposing
here is a breaking change relative to the currently implemented and accepted WIT
syntax, and it's an addition to the binary syntax (while also ignoring some
preexisting binary syntax).

Motivation and Existing Problems

The main motivation of these changes is that if you have a world such as:

// today's syntax
world {
    import foo: self.foo
}

then any dependency of foo will be "inlined" into the world as well:

// today's syntax
world {
    import foo-dep1: self.foo-dep1
    import foo-dep2: self.foo-dep2
    import foo: self.foo
}

This interacts poorly with transitive dependencies where, for example, the
proxy world for WASI depends on a types interface in the wasi-http package,
so the world ends up having import types: ... which is quite a "bland" name
without any context in it.

One way to fix this is to consider an "id" more often than a kebab-name. For
example the "id" of the relevant interface is wasi:http/types which is a much
more contextually relevant name. The problem with this, however, is that WIT
doesn't have a great means by which to discover its own package name or
namespace. For example when developing the wasi-http package itself you'd
simply have:

// wit/types.wit

default interface types {
  // ...
}

So there's no context available to say either wasi or http. This is sort of
solved with the deps folder the wit-parser crate implements along the lines
of:

// wit/deps/http/types.wit

default interface types {
  // ...
}

where here the parser at least knows the package name is http. Still, however,
it doesn't know anything about wasi.

Another problem with the deps/* folder syntax is that packages are named after
their directory names, meaning that there's no way to have two packages of the
same name: for example, two packages from different registries, or two
versions of the same package from one registry.

In general many of the above problems are somewhat solvable in isolation, but
I'm hoping to solve them all at once with updates to the WIT syntax, binary
format, and translation to wasm. One final point worth addressing here, or at
least planning for, is introducing the concept of versioning for future use.
This isn't intended to be comprehensive just yet, but should hopefully help set
up some foundations.

Changes to WIT

There are a number of syntactical changes to WIT described here, and this
section goes through them in an order that builds up to a final picture at
the end.

Package names in files

The first change I'd like to propose to WIT is a requirement that all *.wit
files will start at the top with a package ... statement. For example you
could write:

package my-package

This will solve the "what's the name of my package" problem from above because
the name is now explicitly declared inline. This is additionally where we can
have other semantic information such as versioning. An example of wasi-http
might be:

package wasi:http@0.1

// ...

Here wasi: is listed as a "namespace", the name of the package is http, and
the version is specified as 0.1.

Valid package names will be:

  • kebab-name - I'm just doing something
  • kebab-name@1.0 - I'm doing something slightly more serious
  • kebab-namespace:kebab-name - I'm developing for a registry
  • kebab-namespace:kebab-name@1.0 - I'm very serious at this point (most WASI
    repos will probably look like this)

For now the version could probably be any semver (e.g. up to 1.0-alpha1+test
or something like that).
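
A minimal sketch of how a tool might split the package names listed above into
namespace, name, and version. This is not taken from wit-parser; the function
name `parse_package_name`, the `PackageName` tuple, and the exact kebab-name
pattern are assumptions made for illustration, and the version is kept as an
opaque string rather than validated as semver.

```python
import re
from typing import NamedTuple, Optional

# Assumption: a kebab-name is lowercase alphanumeric words joined by single
# hyphens. The real WIT grammar may differ.
KEBAB = r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*"
PACKAGE_RE = re.compile(
    rf"^(?:(?P<namespace>{KEBAB}):)?(?P<name>{KEBAB})(?:@(?P<version>\S+))?$"
)


class PackageName(NamedTuple):
    namespace: Optional[str]
    name: str
    version: Optional[str]  # opaque string, not validated as semver


def parse_package_name(text: str) -> PackageName:
    """Split `namespace:name@version` into its three parts."""
    match = PACKAGE_RE.match(text)
    if match is None:
        raise ValueError(f"invalid package name: {text!r}")
    return PackageName(match["namespace"], match["name"], match["version"])
```

With this sketch, `parse_package_name("wasi:http@0.1")` yields the namespace
`wasi`, the name `http`, and the version `0.1`, while a bare `my-package` has
neither a namespace nor a version.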

The package ... header would be required at the top of all *.wit files that
make up a package. Each package ... directive would also be required to match
all the others in the package, including the version.

Package Organization

WIT today is structured so that a Package has a set of Documents, where each
document is a file (typically *.wit). Each Document can have a set of
interfaces and worlds, where at most one world and one interface can be flagged
as default. I'm proposing to change this instead to:

  • A Package is a collection of interfaces and worlds.
  • Packages can consist of multiple *.wit files but the interfaces/worlds in a
    package are unioned together. This means you can't define interface foo in
    two files in the same package. Packages can always be represented with one
    file.
  • The default keyword goes away entirely.

These changes are intended to assist with being able to auto-assign a unique ID
to any interface within a package. They will also affect use paths described
below. When combined with the above change to package ... headers some
examples of inferred IDs are:

// wit/types.wit

package wasi:http

interface types { // id = "wasi:http/types"
  // ...
}

// wit/proxy.wit

package wasi:http

world proxy { // id = "wasi:http/proxy"
  // ...
}

The "ID" is inferred to be the package name without the version, plus a slash,
plus the name of the world or interface. The version of the package additionally
becomes the version of the interface.
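
The ID inference described above can be sketched as follows. The helper name
`interface_id` is hypothetical, and the sketch assumes the package header has
already been read as a single string such as `wasi:http@1.0.0`.

```python
from typing import Optional, Tuple


def interface_id(package: str, item: str) -> Tuple[str, Optional[str]]:
    """Return (id, version) for a world or interface named `item`.

    The ID is the package name without its version, a slash, and the item
    name. The package version, if any, carries over to the item.
    """
    name, _, version = package.partition("@")
    return f"{name}/{item}", (version or None)
```

For the two files shown earlier, `interface_id("wasi:http", "types")` gives the
ID `wasi:http/types` with no version, and a versioned package header passes its
version through unchanged.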

The rationale for removing default is more clear with the changes to use
below, and otherwise all WIT files being merged together means that for all
interface and world items there's a unique ID within the package.

Changes to use

Today the use statement looks like:

interface foo {
  use self.my-interface.{...}                 // import from same-file interface
  use pkg.other-document.my-interface.{...}   // import from other file, same package interface
  use my-dependency.their-interface.{...}     // foreign dependency
}

With the updates to package structure above these forms largely no longer make
sense. Instead this proposes making IDs more first-class in WIT syntax. The
use statement will still be somewhat similar grammatically where it will look
like:

use-statement ::=  'use' interface '.' '{' names '}'

where the major change is how an interface is specified. To explain that,
here are a few examples. The first is importing from another interface in the
same file:

interface foo {
  // ..
}

interface bar {
  use foo.{...}
}

Here the interface is just the bare name foo as that's what's in scope. If
foo were in a separate file then it needs to be explicitly imported in the
outer scope to be used.

// wit/foo.wit
interface foo {
  // ..
}

// wit/bar.wit
use foo

interface bar {
  use foo.{...}
}

Here a bare use foo at the top level pulls the interface foo into scope. Due
to the name being foo, that means "find an interface named foo elsewhere in
this package". In this situation it's found in wit/foo.wit.

Importing from a foreign dependency now happens through the ID of that
dependency. For example, with a deps structure, it would look like:

// wit/deps/the-dependency/foo.wit
package the-dependency

interface foo {
  // ..
}

// wit/foo.wit
use the-dependency/foo

interface bar {
  use foo.{...}
}

Here the use the-dependency/foo statement will look for a package with the
name the-dependency which is found within the deps folder in this case. Here
the name the-dependency matches the package declaration. For wasi-http
this would look like:

// wit/deps/http/types.wit
package wasi:http@1.0.0

interface types {
  // ..
}

// wit/foo.wit
use wasi:http/types

interface bar {
  use types.{...}
}

Here it can be seen that wasi: namespacing syntax is allowed as well.
Furthermore it can be seen that the name of the folder holding the wasi-http
package, http, does not have to exactly match the package specifier
wasi:http@1.0.0, it can be named anything.

This leads us to an example where you can depend on two packages of the same
name from different registries:

// wit/deps/http1/types.wit
package wasi:http@1.0.0

interface types {
  // ..
}

// wit/deps/http2/types.wit
package bytecodealliance:http@1.0.0

interface types {
  // ..
}

// wit/foo.wit
use wasi:http/types as wasi-types
use bytecodealliance:http/types as ba-types

interface bar {
  use wasi-types.{...}
  use ba-types.{...}
}

The as syntax will be allowed to rename dependencies locally within a file's
context to avoid name collisions. The inferred ID of the wasi-types and
ba-types interfaces will still be unique as they're derived from unique
package IDs.

Finally this can also be used to import multiple versions of the same package:

// wit/deps/http1/types.wit
package wasi:http@1.0.0

interface types {
  // ..
}

// wit/deps/http2/types.wit
package wasi:http@2.0.0

interface types {
  // ..
}

// wit/foo.wit
use wasi:http/types@1.0.0 as types1
use wasi:http/types@2.0.0 as types2

interface bar {
  use types1.{...}
  use types2.{...}
}

The @version syntax is allowed in use statements to disambiguate which
version is being imported if multiple exist. It's an error for there to be two
candidates without a version specified, for example. If only one version is
available it's inferred to be that version (as is the case for all above
examples).
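
A rough sketch of that disambiguation rule, assuming the resolver already has
a map from package name to the versions found among the dependencies. The
function `resolve_package` and the shape of `available` are hypothetical and
only illustrate the rule; they are not part of the proposal.

```python
from typing import Dict, List, Optional


def resolve_package(
    available: Dict[str, List[str]], name: str, version: Optional[str] = None
) -> str:
    """Pick which version of package `name` a `use` statement refers to.

    `available` maps a package name (e.g. "wasi:http") to the versions
    found among the dependencies.
    """
    versions = available.get(name, [])
    if not versions:
        raise LookupError(f"package {name!r} not found")
    if version is None:
        # No @version given: only valid when exactly one candidate exists.
        if len(versions) > 1:
            raise LookupError(f"{name!r} is ambiguous; specify one of {versions}")
        return versions[0]
    if version not in versions:
        raise LookupError(f"{name!r} has no version {version!r}")
    return version
```

Under these assumptions, a `use` of a package with a single available version
resolves without any `@version`, while a package present at both 1.0.0 and
2.0.0 raises an error unless one of them is named explicitly.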

Finally, all of the above can be slightly more "sugary" as well. The top-level
use can also be used inline within worlds and interfaces. For example this:

use wasi:http/types

interface foo {
  use types.{...}
}

is equivalent to:

interface foo {
  use wasi:http/types.{...}
}

This is provided when something is only needed in one location to avoid the top
level use and assigning a name to it. Note, though, that this can get sort of
sigil-heavy with use wasi:http/types@1.0.0.{...} so it's not intended to be
super widespread. Similarly, this:

// wit/foo.wit
interface foo {
  // ..
}

// wit/bar.wit
use foo

interface bar {
  use foo.{...}
}

is equivalent to:

// wit/foo.wit
interface foo {
  // ..
}

// wit/bar.wit
interface bar {
  use foo.{...}
}

so a use foo statement to import a package-sibling interface is not
required.

Changes to world

Currently imports and exports to a world are of the form:

world foo {
    import kebab-name: self.the-interface
}

This suffers from the downside mentioned originally where each interface must be
assigned a kebab-name. The new syntax for worlds will look like:

world-import ::= 'import' ( kebab-name ':' )? interface

This means that the kebab-name is now optional. It'll be required for unnamed
interfaces such as import foo: interface { ... } but otherwise it's now
possible to do:

// wit/proxy.wit
package wasi:http

use types

world proxy {
  import types
}

Here there's no kebab-name. This means that the "name" of the import will be the
ID of the interface, which in this case is wasi:http/types (as imported from a
sibling wit/types.wit). A kebab-name can be explicitly listed, but often won't
be required any more.

For example a Fastly-specific world might look like:

// wit/world.wit
package fastly:compute

world app {
  import wasi:http/types
  import wasi:clocks/monotonic-clock
  import wasi:random/random
  // .. more WASI imports ..

  import fastly:http/upstream
  import fastly:http/caching

  // ...
}

Here no kebab-names are necessary and everything, including their transitive
dependencies, will be assigned names based on IDs. Transitive dependencies will
always be inferred to be named by their identifier to avoid clashes between
transitive dependencies.
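
The naming rule for world imports can be sketched as below. The function
`world_import_names`, the `uses` dependency map, and the choice to list
dependencies before their dependents are all assumptions for illustration; the
proposal itself only states that explicit imports use their kebab-name when
given and that transitive dependencies are always named by ID.

```python
from typing import Dict, List, Optional, Set, Tuple


def world_import_names(
    imports: List[Tuple[Optional[str], str]],
    uses: Dict[str, List[str]],
) -> List[str]:
    """Return the import names of a world, dependencies first.

    `imports` holds (optional kebab-name, interface ID) pairs as written in
    the world; `uses` maps an interface ID to the IDs it depends on.
    """
    names: List[str] = []
    seen: Set[str] = set()

    def visit(interface: str, kebab: Optional[str]) -> None:
        if interface in seen:
            return
        seen.add(interface)
        for dep in uses.get(interface, []):
            visit(dep, None)  # transitive deps are always named by ID
        names.append(kebab if kebab is not None else interface)

    for kebab, interface in imports:
        visit(interface, kebab)
    return names
```

One simplification to note: if an interface is first reached as a transitive
dependency and later imported explicitly with a kebab-name, this sketch keeps
the ID-based name and ignores the later kebab-name.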

Changes to the binary format

Currently the binary format has:

externname  ::= n:<name> ea:<externattrs>
externattrs ::= 0x00
              | 0x01 url:<URL>

This proposes changing this to:

externname  ::= n:<name> 0x00                 => kebab-name n
              | n:<name> 0x01 url:<URL>       => kebab-name n (ignore the url)
              | n:<name> 0x02 v:<version>     => where n =~ /(kebab:)?kebab/kebab/

version ::= 0x00                              => no version
          | 0x01 semver:<string>              => version `semver`

Here the 0x00 production is still interpreted as "here's a kebab name for the
thing". The 0x01 production is reinterpreted to ignore the URL for now (for lack
of a better idea of what to do with it). This could perhaps get removed during a
final 1.0 break. The 0x02 production is the ID-based form of import which refers
to "I'm importing an interface". I know @lukewagner is also interested in
perhaps a 0x03 form of import meaning "I'm importing a specific component
implementation" with perhaps more options too, but I don't think that affects
this proposal in particular.
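
As a sketch of what the 0x00 and 0x02 productions would produce on the wire,
assuming that `<name>` and `<string>` are encoded the way core WebAssembly
encodes names (an unsigned LEB128 byte length followed by UTF-8 bytes). The
helper names here are hypothetical, and the ignored 0x01 URL form is omitted.

```python
from typing import Optional


def leb128_u32(value: int) -> bytes:
    """Encode an unsigned integer as LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(text: str) -> bytes:
    """Assumed encoding: LEB128 byte length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return leb128_u32(len(data)) + data


def encode_kebab_externname(name: str) -> bytes:
    """0x00 production: a plain kebab-name."""
    return encode_string(name) + b"\x00"


def encode_id_externname(interface_id: str, version: Optional[str] = None) -> bytes:
    """0x02 production: an interface ID followed by a version."""
    encoded = encode_string(interface_id) + b"\x02"
    if version is None:
        return encoded + b"\x00"  # version ::= 0x00 (no version)
    return encoded + b"\x01" + encode_string(version)
```

Under these assumptions, an import of `wasi:http/types` at version 1.0.0 would
be the length-prefixed ID, the 0x02 tag, the 0x01 version tag, and then the
length-prefixed version string.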

A text format for these could perhaps look like:

(import "kebab-name" (...))                           ;; just a name
(import "my-package/types" (version) (...))           ;; no version
(import "wasi:http/types" (version "1.0.0") (...))    ;; everything

I think that's everything I wanted to cover for this, and I'm interested in
hearing feedback from others using WIT who have thoughts on syntax,
bikeshedding, the breakage involved here, etc.
