A Pratt parser, based on Matklad's tutorial.
Bug reports welcome; I wouldn't exactly say this is well-tested, although it has worked correctly on the two things I've used it for so far.
See the example for how you should use this. In brief:
- Define a lexer somehow (which produces a stream of
Token
s, where aToken
is a type you define). - Define what it means for a lexeme to be atomic (this is
Expr.atom
). Atomic tokens can appear on their own to form an expression all by themselves, or they can be combined or modified using operations specified by non-atomic tokens. - Define how the various non-atomic lexemes behave (this is
Parser.withUnaryPrefix
and friends) to combine subexpressions into larger expressions.
We supply:
Parser.withUnaryPrefix
, which specifies e.g. that the token!
can appear as a unary prefix of another expression, and that when it does, it indicates (e.g.) negation.Parser.withUnaryPostfix
, which specifies e.g. that the token!
can appear as a unary suffix of another expression, and that when it does, it indicates (e.g.) the factorial.Parser.withInfix
, which specifies e.g. that the token+
can appear between two expressions, and that when it does, it indicates (e.g.) addition.Parser.withBracketLike
, which specifies e.g. that the token(
can appear before an expression, and that it's terminated by e.g.)
. You can also implementif/then/else
this way: that's just a weird kind of bracket and comma (and theelse
is kind of mixfix: it needs to consume one expression after itself, by contrast with the closing bracket)
which does not).
When specifying how expressions combine, you also provide precedences; numerically larger precedences are higher-precedence (so they bind more tightly).
Precedences are given as pairs, so that you can define e.g. left-associativity (resp. right-associativity) by having the infix operator bind more strongly to the right (resp. left).
For example, if +
has precedence (10, 5), so it binds strongly on the left and weakly on the right:
a + b + c
is morallya+ b+ c
because the+
binds strongly to the left;a+ b+ c
can only be bracketed asa+ (b+ c)
because the rightmost+
wants to stick to theb
more than it wants to stick to thec
;- so
a + b + c
is equal toa + (b + c)
, i.e.+
is right-associative.