Options and directives
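Options and directives appear in the first part of the grammar file. They configure the generated parser: its header code, extra member variables, output language, token precedence, lexical rules, and so on. As in yacc, every directive begins with '%'.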
%header { [content] }
Defines the header of the generated code, just like the prologue in yacc. [content] is inserted verbatim at the beginning of the generated code. Usually [content] consists of statements like import ... from ... or require('...') that import modules needed by the semantic actions.
%extra_arg { [args] }
Defines extra member variables for the generated parser object. [args] is inserted directly into the body of the parser's definition, so the variables can be accessed in the semantic action blocks.
The specific format of [args] depends on the output language. For TypeScript and JavaScript, the generated parser is a big closure object, so you could write something like this:
%extra_arg {
var foo;
var bar;
}
%init { [init args] }{ [init body] }
Defines the arguments and extra body of the parser's initializing method. The generated parser has a method init(), which initializes the parser and takes no arguments by default. However, it may be necessary to initialize the variables defined with %extra_arg, or to let the semantic actions access other objects without using global variables; this directive serves both purposes.
For example, the following definition:
%extra_arg {
var foo: number;
}
%init {foo1: number}{
foo = foo1;
}
would result in something like this:
...
var foo: number;
...
init(foo1: number){
... some statements that initialize internal variables ...
foo = foo1;
}
...
As you might have noticed, the format of [init args] and [init body] also depends on the output language. In the example above, the output is TypeScript.
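The following hand-written sketch illustrates why the variables from %extra_arg and the body from %init can see each other: everything lives in one closure. It is not the generator's actual output; createParser, stack, and scale are hypothetical names chosen for illustration, and only foo, foo1, and init() come from the example above.

```typescript
// Hypothetical sketch of the closure pattern; NOT the generator's real output.
function createParser() {
    // Spliced in from: %extra_arg { var foo: number; }
    var foo: number;

    // Stand-in for the parser's internal state.
    var stack: number[] = [];

    return {
        // The parameter list comes from: %init {foo1: number}
        init(foo1: number): void {
            stack = [];   // stand-in for the built-in initialization
            foo = foo1;   // spliced in from the %init body
        },
        // Stand-in for a semantic action that reads the extra variable.
        scale(value: number): number {
            const result = value * foo;
            stack.push(result);
            return result;
        },
    };
}

const parser = createParser();
parser.init(3);
console.log(parser.scale(14)); // prints 42
```

Because foo is captured by the closure rather than stored globally, two parser instances created this way would each keep their own copy.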
%type {[type]}
Defines the type of the semantic values on the semantic stack, which is the type of the values produced by the terminals and nonterminals of the grammar rules. See Grammar rules for more information about this.
This directive is only necessary when the output language is statically typed, such as TypeScript. If this directive is not present, a default type is used. For TypeScript, the default type is any.
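As a rough illustration of what a declared semantic type buys you, the sketch below types a semantic stack as number, which is what %type {number} might translate to. The names SemanticValue, semanticStack, and reduceSum are hypothetical, and the real generated parser may lay out its stack differently.

```typescript
// Hypothetical sketch; the generator's real stack layout may differ.
type SemanticValue = number; // what %type {number} might translate to

const semanticStack: SemanticValue[] = [];

// Stand-in for reducing by a rule such as: sum -> sum '+' term.
// Simplification: only the two operand values are kept on the stack.
function reduceSum(stack: SemanticValue[]): void {
    const right = stack.pop();
    const left = stack.pop();
    if (left === undefined || right === undefined) {
        throw new Error("semantic stack underflow");
    }
    stack.push(left + right); // type-checked: both operands are numbers
}

semanticStack.push(40, 2);
reduceSum(semanticStack);
console.log(semanticStack[0]); // prints 42
```

With the default type any, the same code would compile even if an action pushed a string by mistake, so declaring the type moves that class of error to compile time.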
%option { ( [option name] = "[option value]" )* }
Defines miscellaneous options. The available options may depend on the output language. Currently supported options are:
Option | Description | Default |
---|---|---|
className | the generated parser's class name | Parser |
%output "[output language]"
Specifies the output language. Supported languages are:
- typescript (default)
- javascript
More languages will be added in future releases.
%token <[token name]>
Defines a token directly, without any lexical rule; it simply lets the parser know that such a token exists. This directive is usually used to define tokens that are emitted from the semantic action blocks. See Actions for more information.
%token_hook ([argument]){[body]}
Defines a function that is called every time a token is emitted. If this function returns true, the token just emitted is ignored by the parser. [argument] is the emitted token.
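A minimal sketch of the filtering behaviour described above, written as plain TypeScript. The Token interface, its name and text fields, and the filterTokens helper are assumptions made for illustration; the actual token object passed to the hook may look different.

```typescript
// Hypothetical token shape; the real token object may have different fields.
interface Token {
    name: string;
    text: string;
}

// Stand-in for the function defined by %token_hook (token){ ... }.
// Returning true tells the parser to ignore the token.
function tokenHook(token: Token): boolean {
    return token.name === "WHITESPACE" || token.name === "COMMENT";
}

// Stand-in for the parser side: drop every token the hook rejects.
function filterTokens(tokens: Token[]): Token[] {
    return tokens.filter(t => !tokenHook(t));
}

const emitted: Token[] = [
    { name: "ID", text: "x" },
    { name: "WHITESPACE", text: " " },
    { name: "COMMENT", text: "/* note */" },
    { name: "NUM", text: "1" },
];
console.log(filterTokens(emitted).map(t => t.name).join(" ")); // prints "ID NUM"
```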
%left [token] (, [token])*
%right [token] (, [token])*
%nonassoc [token] (, [token])*
Define the associativity and precedence of tokens, where [token] can be <[token name]> or a string that is the alias of the token. These directives are mainly used to eliminate shift/reduce conflicts, and they provide a simpler way to write operator-precedence grammars such as math expressions. See The parser for a more concrete explanation.
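To make the conflict resolution concrete, here is a small sketch of the classic yacc-style rule for deciding between shift and reduce. It assumes the yacc convention that tokens declared later bind tighter, and that a rule takes the precedence of its operator token; whether this generator follows exactly the same convention is an assumption, and the table and resolve function are hypothetical.

```typescript
type Assoc = "left" | "right" | "nonassoc";
type Decision = "shift" | "reduce" | "error";

interface Prec {
    level: number; // higher binds tighter (yacc convention, assumed here)
    assoc: Assoc;
}

// Roughly what these declarations would produce, in this order:
//   %nonassoc "<"   %left "+"   %left "*"   %right "^"
const table: Record<string, Prec> = {
    "<": { level: 0, assoc: "nonassoc" },
    "+": { level: 1, assoc: "left" },
    "*": { level: 2, assoc: "left" },
    "^": { level: 3, assoc: "right" },
};

// The parser has seen `a ruleOp b` and the next token is `lookahead`.
function resolve(ruleOp: string, lookahead: string): Decision {
    const rule = table[ruleOp];
    const next = table[lookahead];
    if (rule === undefined || next === undefined) {
        throw new Error("token has no declared precedence");
    }
    if (next.level > rule.level) return "shift";  // lookahead binds tighter
    if (next.level < rule.level) return "reduce"; // rule binds tighter
    // Same level: associativity decides.
    if (rule.assoc === "left") return "reduce";
    if (rule.assoc === "right") return "shift";
    return "error"; // nonassoc: chaining such as a < b < c is rejected
}

console.log(resolve("+", "*")); // prints "shift":  a + b * c parses as a + (b * c)
console.log(resolve("+", "+")); // prints "reduce": a + b + c parses as (a + b) + c
console.log(resolve("^", "^")); // prints "shift":  a ^ b ^ c parses as a ^ (b ^ c)
```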
%lex ( < [lexical state] (, [lexical state])* > )? { [lexical rules] }
Defines lexical rules. See Lexical rules for the specification of this directive.