WA1 is the working name for a language that compiles down to Web Assembly. It is a research project. From it has come two things: the prototype language that is implemented in this project and described in this document; and the design document WHITEPAPER.MD. This white paper contains the design for a much larger scope project. It couldn't have been written without the hard slog of putting this code in place.
I would like to take the opportunity to thank the authors of all the open source libraries that I use. For a full attribution list, please see the LICENSE.md.
Yes, it needs a fancy name. At some point it will get that.
WA1 is a strongly-typed expression-based language, which borrows from TypeScript, Rust and Scala. Here we use 'expression-based' to mean that, generally, the last value of a block is its return value, and you can compose expressions freely.
Because we are not monsters, classic control flow also works, so you can still use return
if you want.
So, a simple function is written like this. In simple build mode, use the export keyword to make it visible outside web assembly.
export fn mul(x: Number, y: Number) -> Number {
x * y
}
But this is also fine:
fn mul3(x: Number, y: Number, z: Number) -> Number {
return x * y * z
}
if
is an expression too!
export fn negMax(x: Number, y: Number) -> Number {
if(x > y)
-x
else {
-y
}
}
But it doesn't need to be.
export fn bang(var x: Number) -> Number {
if(x <= 0)
return 0
let var out: Number = 1.0
while(x > 0) {
out *= x
x -= 1
}
out
}
There are some examples in the tests/simple/one.wa1
file which contains our original test!
To build the compiler, first you need Rust. Install cargo, and make sure you are up to date with rustup update
. Then, run
cargo build
in the root. To run the compiler, you need to pass the compiler some arguments.
cargo run -- build-simple tests/simple/one.wa1 -o=tests/simple/out.wasm
will parse tests/simple/one.wa1
and generate tests/simple/out.wasm
.
To run, you will need some way of running wasm.
We ship a runner built on wasmtime for easy testing. To run, call
cargo run --bin test-runner -- tests/simple/out.wasm -f 'fourTimes(8);'
This will run the init function (always called __wasm_call_ctors
for compatibility with existing WASM conventions) and then the entry point you give it.
One way is to use a webpage that looks something like this:
<html>
<body>
<script>
fetch('out.wasm').then(response =>
response.arrayBuffer()
).then(bytes =>
WebAssembly.instantiate(bytes, {imports: {}})
).then(results => {
results.instance.exports.__wasm_call_ctors();
window.addd = results.instance.exports.addd; //or whatever is exported from the file
});
</script>
</body>
</html>
Serve it with http-server; install that with
npm install -g http-server
and manually call in your browser in the debugger by typing the function name (called addd
out of the box). Note that you have to call
__wasm_call_ctors
manually. We don't use start functions, because the wasm linker tools warn against it.
Local and global consts are declared like this.
let s = expr
The type of s
is inferred from the expression you assign it.
Variables use the var
keyword, meaning that you can change what the variable contains.
let var t = 5
t += 1
You don't need 'em; a return character is equivalent. The things in javascript, like getting weird problems with return
, are not an issue here.
- Number - a 64 bit float.
- Int - An integer. You specify bounds as generic arguments, so
Int<-2147483648, 2147483647>
is a 32 bit signed int. You should rarely need to do that, though; by default you get a 32 bit signed, and most things will take that. The compiler tries to do most safe casting for you. This is... a tiny bit crazy and I may not keep it. Ints that stay within the safe 64 bit float range can be auto-casted to Number. - Bool - A boolean.
As mentioned above, numeric types will auto-cast. If you need to manually invoke casting, the 'as' keyword is your friend.
//totally spurious example
let a = 9 as Number
Currently there are four legal int member functions. These are somewhat bodged today.
let i = 256
let clz = i.countLeadingZeros()
let ctz = i.countTrailingZeros()
let shl = i.shiftLeft(3)
let shr = i.shiftRight(3)
Functions are declared as follows. Use thin arrows for explicit return type declaration, fat arrows will let the type engine figure it out.
//explicit return type
fn name(arg: Type) -> Type {
body
}
//implicit return type
fn name(arg: Type) => {
body
}
The deduction algorithm for return types is very simple. Don't be surprised if you need to help it.
Astonishingly, generic functions are supported. Here is the identity function. Not all types can be applied, but that's a work in progress. Stonishingly, expect bugs (but I am fixing them.)
fn id<T>(x: T) -> T { x }
To use them you can provide the types, or have it deduce them.
let n1: Number = 4
let n2 = id<Number>(n1)
let n3 = id(n1)
The deduction algorithm is very simple, and just matches function argument types against what it's been given. This means that it won't be able to deduce a type variable that is only used as a return type.
They are not actually true generics. They specialize for some types -- at the moment, the numeric types -- which means I can avoid a box. It also means they are sort of a mix between templates and generics. Some days I like saying 'genemplates', other days 'templerics' is more fun.
The return value of a block is the last expression. So a function that added two numbers would be this:
fn add(x: Number, y: Number) => x + y
This works because a function expects a block; a single expression or a squiggly bracket enclosed list of expressions is a block; the last expression of a block is what it returns.
if
is an expression, if it has an else branch.
let a = if(b) 3; else 4
Type inference will do some work for you if the branches are a bit different, but will complain if they are radically different.
You don't have to have an else branch, mind; you just can't use it as an expression.
if(k != 4)
j = true
while
is not an expression.
while(a > 3) {
a -= 1
if(a == 100)
break
}
break
and continue
work as you would expect.
return
causes early return of the function you are in.
return 4;
You don't need it at the end of a function (and is considered bad practice), but it is useful for flow control in complicated functions. If you combine fat arrow with return, we try and guess based on all the returns in the function.
You can, obviously, define your own types, built from the system types. This is still very much a work in progress.
To create a type alias, use the alias
keyword. A type alias is what it sounds like, a different name for the same type, not a different type.
alias __size_t = Int<0, 4294967295>
The type keyword is the simplest (and currently only) true user type. It wraps an existing type, and then supplements it with extra member functions.
Member functions have access to a this
local variable, which is of the type defined. To copy the inner data, use this.getInner()
. To copy the inner
data such that you can mutate this copy, use this.getInnerMut()
.
type One = Int {
constructor(x: Int) => x
fn add(r: Int) -> Int this.getInner() + r
}
It's important to understand that getInner
and friends copy. WA1 is pass-by-value by default. This means in the above example, you will get an int in the add
function, but it's a local variable which is a copy of the this
value. There is no setInner
function.
The constructor
keyword is the sole way of constructing. This essentially casts an object of the inner type into the outer type. This is subject to change.
You use the new
keyword to construct.
fn testOne() => {
let oneOne = new One(3)
assert.assert(oneOne.add(4) == 7)
}
Member functions may be private (put the private
keyword before the fn
).
- Private means that you can only call the function in a member function of the same type. This means you can call other instance's private functions.
- Protected is not yet implemented.
Traits in WA1 are purely compile-time constructs. They are a set of member functions that a type variable can assert that it supports.
They are declared like this.
trait TraitOne{
fn a() -> Void
fn b(x: Int) -> Int
fn c() -> TraitOne
}
Note the use of the trait name as a return value. This is completely fine, but traits are not actually types. Instead, the TraitOne
return type is a generic type variable that will be resolved later.
You use them like this.
//generic function that asserts a type variable conforms to a trait, and calls a function on it
fn fb<T: TraitOne>(x: T) -> Int x.b(1)
In theory you can specify a set of traits that the type variable supports. We borrow the lovely union and intersection syntax from TypeScript for this. So, in a magical world this might work:
fn fb<T: TraitOne & TraitTwo>(x: T) -> Int x.b(1)
This means that the type variable T supports the intersection of TraitOne
and TraitTwo
, which means the union of their member
functions. Likewise the |
operator is the union of traits, which means the intersection of their member functions. Read the
operator as you would a boolean - in the above example T
has TraitOne
and TraitTwo
so it gets both sets.
I say 'magical world' because I haven't written any tests for it so it probably has bugs.
In order to implement a trait, you have to use an implement
keyword. You don't have to provide all the functions: if the
type already implements the functions, just don't include them in the implement
block.
type TypeOne = Int{
constructor(x: Int) => x
fn a() -> Void {}
fn inner() -> Int this.getInner()
}
implement TraitOne for TypeOne {
fn b(x: Int) -> Int {
this.getInner() + x
}
fn c() -> TypeOne {
new TypeOne(this.getInner() * 2)
}
}
I dislike the word implement
. I think I may switch to express
.
Then, using them feels like a regular generic function invocation.
fn testOne() -> Void {
let x = new TypeOne(8)
assert.assert(fb(x) == 9)
}
These are, of course, inspired by Haskell's type classes and are a little closer to them than Rust or Scala's traits given they are pure compile-time constructs. But they are also obviously much less powerful beasts, for good or for ill.
They almost certainly don't work in a million weird ways, because I literally just finished typing them, so don't expect happiness.
Inspired by TypeScript, WA1 contains type guards. In TypeScript these are used to tell one JavaScript blob from another. Here, we use them to do limited scope down-casting. What this essentially means is this. Given a bit of code of the form:
let x: Bar = whatever()
if(isFoo(x)) {
x.methodOnFoo()
}
any local access of x in the block will automatically cast x to a Foo. This lasts until the block is existed, or x is assigned to. Creating a type guard function is an unsafe operation, so needs to be done in unsafe mode.
Type guards also work on integers. Comparison operators will narrow integer ranges, casting if necessary.
Instead of using the simple build mode described above, you may use a full build mode. For this we generate WASM object files, which, because they are a standard, can then be linked with an external linker. A very simple example exists in the tests folder.
The linker we use is wasm-ld, given that is the only working WASM linker. You will need to install it somehow. On Ubuntu you can get that by calling
sudo apt install lld
On Windows probably the easiest is to install a pre-built binary from the releases page, here. MacOS you are on your own, I am afraid.
I found building it from source was not impossible, but I imagine you may be able to install from the above link. Once you have an exe name (and in in Ubuntu
it is wasm-ld-9
, surprisingly) you need to edit the build-wsb.json
file you are building to point to the right location. Yes, I know. This should get better
as either the LLVM WASM toolchain matures, or I finally build a configuration tool.
When you are set up, call this to run the sample.
cargo run -- build tests/linker/build-wsb.json --clean
Here, the 'clean' arg will always force a build. You can omit if you wish. The build-wsb.json format is subject to change. However, at the moment, here it is.
{
"entry_point": {
"file_name": "two.wa1",
"is_unsafe": false
},
"source_files": [
{
"file_name": "one.wa1",
"is_unsafe": false
}
],
"src_path": "./src",
"out_path": "./out",
"module_name": "linker",
"wasm_exe": "wasm-ld-9'
}
The entry_point
is the place where we do the module exports. So anything exported in there will be exported to the final wasm.
Anything exported from the source_files
will be visible to other files, but not outside the module.
The only working import syntax is
import {*} from "./one"
where the *
means to import everything. Once you import a function f
from ./m
it will appear as m.f
. That means
to test the example you would run
cargo run --bin test-runner -- tests/linker/out/linker.wasm -f 'linker.hello(1, 2);'
Unsafe mode is designed to be a mode that library writers can write low-level code. It feels like C in that it is not much more than structured WASM. In order to use it you need to somehow pass the unsafe arg to the compiler. Syntax is deliberately scary.
A leading __
is pronounced 'unsafe', by the way.
__Ptr
type. Pointer to raw memory.mut __Ptr
is a pointer where you can change what it points to.__Ptr
is a readonly thing.- intrinsics - wrap low level wasm calls
__memorySize()
-memory.size
instruction__memoryGrow(0, numPages)
-memory.grow
instruction__trap()
-unreachable
instruction
__Option
- is a essentially a nullable pointer. You can have it be a__Null
or__Some<T>
. It will end up being the core of the real option. It's kind of horrible because it's pretending to be a union but it really is a nullable pointer.__struct
type - to get one you have two horrible choices. You can cast from a__Ptr
(this is unsafe. The clue is in the name...), or use__static
(which is a C-style static allocation). To define one, type__struct Hello { a: Int; }
. The static creation is,__static mut Hello {a: 3}
. Omit themut
for it and the things it holds to be immutable. A__struct
is and will always be a raw pointer to memory, used for writing the allocator and other low level things.__typeguard
This keyword lets you define type guards. Here is an example.
export fn __Option_isNull<T: __struct T>(x: __Option<T>) -> Bool __typeguard {
true => __Null
false => __Some<T>
} {
x == __Null
}
- Typing
- every expr needs a return type!
- functions decls are exprs, not statements
- Infer return types
- Handle
never
properly - i.e. complain about dead code - do proper auto-widening. means you can widen anything to unknown. Needed for templates, I think.
- means subsuming the number code, which is probably a good thing
- typescript-style const types, where a = 0 means typeof a == 0
- move the bin op types from the ast
- clean up the cast code
- bool literals
- check __Option equality
- consts properly rolled into the type system (aka the Konst Waaagh)
- privacy
- be honest about int and ptr member functions (currently they are bodged)
- Expression based language
- blocks return a value
- no need for a return statement (but still supported - what is this, scala?)
- if as an expression
- Error handling
- all exprs have a loc
- all errors have a loc
- Control flow
- if
- while
- simple for loops
- assignments and consts
- global variables
- match statements
- implement this for strings
- assign is void
- Strings
- a 'char' is a unicode grapheme cluster - see this - this guy has some cool rust libs
- Optimise steps. These should be on even on O0
- use the 'consume' code to not write drops
- remove returns right before end
- change tee + drop into set
- remove get + drop
- remove casts straight after consts, and just have the appropriately typed const inline
- == 0 is eqz
- globals that are init to a const should use the wasm global init mechanism
- when calling __static, copy true data into the data section and drop the initializer expression
- tail recursion
- Function calling
- simple static function call
- functions as first class objects (but not closures)
- allow tuples to be used as function args (so e.g. instead of
f(1, 2)
, we also allowlet t = (1, 2); f t
)
- Known bugs
- fairly sure prefix unary operators are wrong - may need to start at a precedence
- the lexer doesn't parse negative numbers!
- import order is not correct - imports need to be first - need a secondary mapping. Claim fixed, needs to be tested
- type guards - in unguarded loops
- Inlining
- inline numeric constants?
- or full blown inlining, of which this is just a special case?
- initially write a function that inlines code that contains no
return
s as an ast operation - copy any non-variable exprs into local variables
- initially write a function that inlines code that contains no
- Purity
- Record (some? one?) of states pre expr
- globally pure (no non-const global access)
- locally pure (true purity)
- locally impure, globally pure (essentially reassignment to a local variable - referential integrity but not proper FP)
- changing passed in object
- side effecting (e.g. writing a mutable global, writing a file)
- non-deterministic (e.g. currenttime, memorysize, reading a global or file)
- evaluate globally and locally pure functions if they have literals passed in. Is this... insane?
- Record (some? one?) of states pre expr
- A runtime
- a simple allocator
- a small object allocator
- a gc based on the rajan & bacon paper
- but with assignment counter
- Structs, ptrs
- structs - initial code
- union types are discriminated unions like Rust enums - yeah!
- typescript lets us call members that are in all the types in the unions. This is very convenient, but hard here. Proposal: generate accessor functions for all union members, route to the appropriate type, insert that. Not free, but not terrible either.
- option is union T | null... maybe?
- means map over a union is well understood, I guess
- make small ints fit into small spaces
- opt: booleans are 1 bit in structs, 32 bits on the heap?
- Types
- tuples - are just collections of locals, so pass by value (needs wasm multi value for some things)
- tuples - need a syntax for tuple 1. Probably
Tuple<int>(3)
- type keyword is a first class type, not an alias (might as well, eh? I always disliked that Scala and TS do that)
- Closure generation and usage
- closure capture
- call function with closure
- optimization - don't capture consts, you have all the information to roll them inline. This is awesome because I don't think JS can do this
- templerics
- for funcs generate meta code to turn into templates
- for types generate meta types
- implicit type args on instantiation
- support fat arrow
- partial instantiation - specifically
fn f1<T>(x: T) => {} fn f2<U>(x: U) => f1<U>(x)
currently generates an UnresolvedTypeArg error
- containers
- arrays
- pass by value. increment rc on creation, if function contains a modification, not otherwise
- 'ref' keyword for pass by ref
- if a function takes an object and modifies it and returns that object, mark it. When you do let a = f(a) you should be able to remove ref count incs
- a function that takes an object by ref and modifies it is marked. it will not inc the ref count; instead, the calling function does.
- hashmaps (probably called 'objects' to be JS friendly)
- map/foreach are macros
- more complex for loops (in, of)
- arrays
- Numbers
- there is an i32 type, an i64 type and an 164 type; the compiler auto promotes i32 to the other two types
- colors are a special type
- Imports and exports
- syntax
- import {*} as x from y => means will get everything from y as x.a, x.b
- import {a} as x from y => means will get x.a
- import {*} from y => means will get y.a, y.b, etc
- import {a} from y => means will get y.a
- A stage 1 parser that will generate a list of file exports - globals, functions, types
- A file format for that
- When import commands are run, load that, pull the imports in
- Functions to be inlined
- types - I think import doesn't work properly
- syntax
- Linker
- object file format that contains
- use wasm-ld
-
fatal-warnings
- tool chain
- npm
- binaryen optimizer