Generate TypeScript types for PEGjs/Peggy grammars directly from the grammar files.
Input grammar:
a = "a" b:b? { return ["a", b]; }
b = "b" a:a? { return ["b", a]; }
Autogenerated types:
type A = ["a", B | null];
type B = ["b", A | null];
This project uses npm workspaces. The type generation code is found in packages/peggy-to-ts.
peggy-to-ts exports a TypeExtractor class. It is used as follows:
const typeExtractor = new TypeExtractor(grammarSourceCode);
console.log("The generated types are:", await typeExtractor.getTypes());
By default, only the type of the first rule is exported. To export the types of other
rules, pass in an array of allowedStartRules by name. E.g., typeExtractor.getTypes(["rule1", "rule5"])
will export the types Rule1 and Rule5.
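The mapping from a rule name such as rule1 to a type name such as Rule1 can be pictured with the small sketch below. The helper name toTypeName and its exact splitting rules are assumptions made for illustration; the conversion that peggy-to-ts actually performs may differ in its details.

```typescript
// Hypothetical helper (not part of peggy-to-ts): converts a grammar rule
// name into a capitalized type name by splitting on non-alphanumeric
// characters and upper-casing the first letter of each part.
function toTypeName(ruleName: string): string {
  return ruleName
    .split(/[^A-Za-z0-9]+/)
    .filter((part) => part.length > 0)
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

const exportedTypeNames = ["rule1", "rule5"].map((name) => toTypeName(name));
```

Under these assumptions, a name like my_rule would become MyRule, and a name that is already capitalized is left unchanged.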
If your grammar's actions (the things that look like { return ... }) contain arbitrary
JavaScript, TypeScript might not be able to guess what the types for that action are. You
can provide hints to TypeScript via casting, using TypeScript's as syntax. For example,
rule = x:.* { return mySpecialFunction(x); }
could be refined via
rule = x:.* { return mySpecialFunction(x) as number[] }
This would allow peggy-to-ts to determine that the type of rule is number[].
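The effect of the cast can be seen in plain TypeScript, independently of any grammar. In the sketch below, mySpecialFunction is a hypothetical stand-in declared to return any, which is one way an action's type can end up being unknown to the compiler.

```typescript
// Hypothetical stand-in for a helper whose return type TypeScript cannot see through.
function mySpecialFunction(x: string[]): any {
  return x.map((s) => s.length);
}

// Without a cast, the inferred return type is `any`.
function actionWithoutCast(x: string[]) {
  return mySpecialFunction(x);
}

// With a cast, the inferred return type is `number[]`.
function actionWithCast(x: string[]) {
  return mySpecialFunction(x) as number[];
}

type WithoutCast = ReturnType<typeof actionWithoutCast>; // any
type WithCast = ReturnType<typeof actionWithCast>; // number[]

const untyped: WithoutCast = actionWithoutCast(["a"]);
const lengths: WithCast = actionWithCast(["a", "bc"]); // lengths has type number[]
```

Note that the cast is not checked at runtime; it only tells the compiler which type to assume.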
To build, run
npm install
npm run build
This will build both the playground and peggy-to-ts.
peggy-to-ts uses ts-morph to extract TypeScript types from the grammar file. It works in stages.
1. Extract all the rules from the grammar file.
2. Rename the rules and all references to be CamelCase (if that option is set).
3. Recursively create the type of all non-action rules (e.g. Rule1 / Rule2 can directly turn into the type Rule1 | Rule2).
4. Guess the types of Action rules, i.e., the rules with JavaScript return values.
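The recursive step for non-action rules can be illustrated with a toy model. The Expr type and the typeOf function below are hypothetical and far simpler than Peggy's real grammar AST; they only cover literals, rule references, optionals, sequences, and choices, and they produce the type as a string.

```typescript
// Hypothetical, simplified model of a grammar expression (not Peggy's AST).
type Expr =
  | { kind: "literal"; value: string }
  | { kind: "ruleRef"; name: string }
  | { kind: "optional"; expr: Expr }
  | { kind: "sequence"; elements: Expr[] }
  | { kind: "choice"; alternatives: Expr[] };

// Recursively builds the type of an expression that has no action.
function typeOf(expr: Expr): string {
  switch (expr.kind) {
    case "literal":
      return JSON.stringify(expr.value); // a literal matches itself: "a"
    case "ruleRef":
      return expr.name; // refers to the type generated for that rule
    case "optional":
      return `${typeOf(expr.expr)} | null`; // a failed optional yields null
    case "sequence":
      return `[${expr.elements.map((e) => typeOf(e)).join(", ")}]`; // tuple
    case "choice":
      return expr.alternatives.map((e) => typeOf(e)).join(" | "); // union
  }
}

// Rule1 / Rule2
const choiceType = typeOf({
  kind: "choice",
  alternatives: [
    { kind: "ruleRef", name: "Rule1" },
    { kind: "ruleRef", name: "Rule2" },
  ],
});

// "a" B?
const sequenceType = typeOf({
  kind: "sequence",
  elements: [
    { kind: "literal", value: "a" },
    { kind: "optional", expr: { kind: "ruleRef", name: "B" } },
  ],
});
```

A fuller version would also need to handle repetition, character classes, and parenthesization of nested unions, which this sketch leaves out.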
Step 4 is where the hard work is. First, a function is created with a generic type parameter for each named value in the rule and a function body equal to the action body. For example, consider the rule:
Foo = x:Bar? y:Baz { return x ? x : y; }
A function is created:
function tmpFunction<T_0 extends Bar | null, T_1 extends Baz>(x: T_0, y: T_1) {
return x ? x : y;
}
Then, TypeScript (via ts-morph) is asked about the function's return type. In this case it would be T_0 | T_1. Then, the generic parameter types are replaced with their named types, so we arrive at
type Foo = Bar | Baz;
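The final substitution can be pictured as a textual replacement of the placeholder names. The sketch below is an assumption about how such a step might look, not the code used by peggy-to-ts: substituteGenerics is a hypothetical helper that works on the printed return type and maps each T_n placeholder to the name given for it.

```typescript
// Hypothetical helper: replaces placeholder type parameters such as T_0
// in a printed type with the named types they stand for.
function substituteGenerics(
  returnType: string,
  bindings: Map<string, string>
): string {
  // \b keeps T_1 from matching inside T_10; unknown names are left as-is.
  return returnType.replace(/\bT_\d+\b/g, (name) => bindings.get(name) ?? name);
}

const fooType = substituteGenerics(
  "T_0 | T_1",
  new Map([
    ["T_0", "Bar"],
    ["T_1", "Baz"],
  ])
);
const fooDeclaration = `type Foo = ${fooType};`;
```

A purely textual replacement like this would need extra care (for example, adding parentheses) when a substituted type is itself a union that lands inside an array or tuple type.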
Unfortunately, TypeScript's generic types have to be eliminated from the final result because generics in TypeScript cannot be recursive. For example,
type A = ["a", B];
type B = ["b", A];
is an acceptable type, but
type Mixed<X,Y> = [X,Y];
type A = Mixed<"a", B>;
type B = Mixed<"b", A>;
is not an allowed type in TypeScript. So, peggy-to-ts goes out of its way to eliminate
generics from the types that it generates.
PEGjs/Peggy actions are allowed to perform arbitrary transforms. Sometimes these transforms will result in TypeScript types that are recursive in a way that TypeScript cannot handle. For example,
Start
= "a"
/ "(" s:Start ")" { return s; }
results in
type Start = "a" | Start;
Of course, we know by looking at the grammar that the correct type is type Start = "a".
However, peggy-to-ts is not currently able to deduce this. (If you have an idea about how
to improve this case, please make a PR!)
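As one possible starting point, and only for the simplest form of the problem, a union that lists its own name as a direct member could have that member dropped, since it adds no new values. The sketch below is not something peggy-to-ts does; dropSelfReference is a hypothetical helper that operates on union members given as strings and ignores indirect or nested self-references.

```typescript
// Hypothetical helper: removes direct self-references from the top-level
// members of a union type, e.g. Start = "a" | Start becomes Start = "a".
function dropSelfReference(typeName: string, members: string[]): string[] {
  const rest = members.filter((member) => member !== typeName);
  // If nothing else remains, leave the members untouched rather than
  // producing an empty union.
  return rest.length > 0 ? rest : members;
}

const startMembers = dropSelfReference("Start", ['"a"', "Start"]);
const startDeclaration = `type Start = ${startMembers.join(" | ")};`;
```

Self-references hidden inside tuples, or reached through other rules, would need a more careful analysis than this.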