-
Notifications
You must be signed in to change notification settings - Fork 233
Moving to menhir
Menhir is ocaml-based LR(1) parser generator with a lot of interesting features such as parametrized production rules, nice syntactic additions over the mainstream yacc syntax and really good error reporting of parser conflicts. Being based on ocamlyacc the syntax is pretty close to fsyacc.
The main motivation for this major change are :
- having a simpler parser (a code easier to dive in and modify), closer to the expected bnf of FStar
- having better maintenance tools (i.e for analysing shift/reduce, reduce/reduce and end-of-stream conflicts)
- having better error reporting of syntactic errors (not done yet...)
Until now, the main parser was written in fsyacc and sed-compiled into a ocamlyacc-compatible file which was used for the ocaml based FStar. The introduction of menhir swap the situation : the official parser is written in a menhir compatible file format and then preprocessed to ocamlyacc and fsyacc.
If you are not modifying the parser, there is currently no change to your workflow. You only need menhir (version 20161115
or later) if you start modifying the parser. In that case do not forget to regenerate the ocamlyacc and fsyacc parsers by calling make -C src/ocaml-output parser
before pushing your changes.
To recapitulate we now have 3 different parsers :
-
src/parser/parse.mly
is the original menhir parser -
src/parser/ml/parse.mly
is the preprocessed-from-menhir ocamlyacc parser -
src/parser/parse.fsy
is the sed-processed-from-ocamlyacc fsyacc parser
and the rules to (re)generate them are in src/ocaml-output/Makefile
.
A lot of small syntax fixes have been incorporated in this new parser.
A long-standing request for FStar : having the possibility of writing
let f (#a b #c : Type) = ...
instead of the more cumbersome
let f (#a : Type) (b : Type) (#c : Type) =...
In the case of a multibinder with a refinement (x y : t { phi })
no
variable is bound in phi
(which is different from the one-binder case).
So (x y : t { phi })
is elaborated to (x:(z:t { phi })) (x:(z:t { phi }))
with z fresh,
whereas (x : t { phi })
is elaborated to (x:(x:t { phi }))
val
is now syntactically closer to let
and accepts binders before the type ascription e.g.
val f : (x:int) -> (b:bool) -> unit
can be written
val f (x:int) (b:bool) : unit
There is still a restriction that these parameters are binders and not patterns so you still
can't write val g (Some x: option int) : unit
whereas this syntax is valid for let
.
The natural syntax val ( op ) : ...
and let ( op ) x y = ...
is now valid in order to declare and define infix operators.
The wildcard _
can now be used in most binding constructs for example fun _ -> ...
.
There is currently one specific restriction in let
-multibinders : in that case all the binders must be named (so let f (x _ : int) = 42
is forbidden).
Fstar now accept refinements annotations on arbitrary patterns such as
let h (Some (x,y):(option (int * int)){x + y = 3}) = x
This annotation is desugared to the following code
let f (Some (x,y):(z:(option (int * int)){ match z with | Some (x,y) -> x + y = 3 | _ -> False})) = x
The ascription operator <:
is now tighter than before for example
(fun x -> x <: int)
is now parsed as fun x -> (x <: int)
Note that in the previous parser, the rule for ascription was ( term <: type )
and that the parens are no longer necessary.
There are also some regressions/syntactic fixes which might break existing code :
-
(| t |)
is no longer accepted as a variant of( t )
in particular the correct syntax for defining a dependent pair type is(x:a & b(x))
(if the parens are necessary) - The focusing let has been slightly modified :
let ~> rec bidule = ...
has been replaced bylet rec ~> bidule = ...
- The stratified specific
=
qualifier is not available anymore on binders i.e. the deprecatedfun (=x:t) -> ...
must be changed tofun ($x:t) -> ...
- It is now possible to put parens around implicit parameters in types e.g.
val u : (#a:Type) -> (x:a) -> Tot unit