Install Python 3.6 or later, together with pip. See https://www.python.org/downloads/ and https://www.geeksforgeeks.org/how-to-install-pip-on-windows/.
An example of a parser for the Python3 target consists of the following files.
- An Antlr4 grammar, e.g., Expr.g4:
    grammar Expr;
    start_ : expr (';' expr)* EOF;
    expr : atom
         | ('+' | '-') expr
         | expr '**' expr
         | expr ('*' | '/') expr
         | expr ('+' | '-') expr
         | '(' expr ')'
         ;
    atom : INT ;
    INT : [0-9]+ ;
    WS : [ \t\n\r]+ -> skip ;
- Driver.py:
The driver code opens a file, creates a lexer, token stream, and parser, then calls the parser's start rule.
    import sys
    from antlr4 import *
    from ExprLexer import ExprLexer
    from ExprParser import ExprParser

    def main(argv):
        # Read the file named on the command line.
        input_stream = FileStream(argv[1])
        lexer = ExprLexer(input_stream)
        stream = CommonTokenStream(lexer)
        parser = ExprParser(stream)
        # Parse the input, starting at the start_ rule.
        tree = parser.start_()

    if __name__ == '__main__':
        main(sys.argv)
- requirements.txt:
This file lists the packages that the program requires. pip downloads the required packages. The file must include a reference to the Antlr Python3 runtime.
    antlr4-python3-runtime==4.13.0
- A build script, e.g., build.sh:
You should provide a script that builds the program.
    pip install -r requirements.txt
    antlr4 -Dlanguage=Python3 Expr.g4
The version of the Antlr tool used to generate the parser must match the version of the Antlr Python3 runtime, e.g., both 4.13.0. Using a build script helps avoid common errors such as a version mismatch.
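As a rough illustration of the version check described above, the following sketch reads the runtime version pinned in requirements.txt and compares it with the tool version you intend to use. The function names pinned_runtime_version and versions_match are hypothetical, and the tool version is assumed to be supplied by you as a plain string; the sketch does not query the Antlr tool itself.

```python
import re

def pinned_runtime_version(requirements_text):
    """Return the version pinned for antlr4-python3-runtime, or None if absent."""
    for line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        match = re.fullmatch(
            r"antlr4-python3-runtime\s*==\s*([0-9][0-9A-Za-z.]*)", line)
        if match:
            return match.group(1)
    return None

def versions_match(requirements_text, tool_version):
    """True when the pinned runtime version equals the given tool version."""
    return pinned_runtime_version(requirements_text) == tool_version

# Example with the requirements.txt content shown earlier (assumed tool version 4.13.0).
sample = "antlr4-python3-runtime==4.13.0\n"
print(versions_match(sample, "4.13.0"))  # True
print(versions_match(sample, "4.12.0"))  # False
```

A check like this could run at the top of a build script so that a mismatch is reported before the parser is generated.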
For a list of antlr4 tool options, please visit ANTLR Tool Command Line Options.
- Input, e.g., input.txt:
    -(1 + 2)/3;
    1;
    2+3;
    8*9
- A run script, which runs your program.
    python Driver.py input.txt
Tree traversal is used to implement static or dynamic program analysis. Antlr generates two types of tree traversals: visitors and listeners.
It helps to understand when to choose a visitor and when to choose a listener. For further information, see https://tomassetti.me/listeners-and-visitors/.
A visitor is the best choice when computing only a single synthesized attribute or when you want to control the order of parse tree nodes visited. Alternatively, a listener is the best choice when computing both synthesized and inherited attributes.
In many situations, they are interchangeable.
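To make the attribute terminology concrete, here is a minimal sketch in plain Python that does not use the Antlr runtime. Node, evaluate, AttributeListener, and walk are hypothetical names invented for this illustration. The visitor-style function returns a synthesized value (computed from the children), while the listener-style class also tracks an inherited value (the nesting depth, passed down from the parent).

```python
class Node:
    """Hypothetical stand-in for a parse tree node (not an Antlr class)."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def evaluate(node):
    """Visitor style: each call returns a value synthesized from its children."""
    if not node.children:
        return int(node.label)
    left, right = (evaluate(child) for child in node.children)
    return left + right if node.label == "+" else left * right

class AttributeListener:
    """Listener style: callbacks return nothing, so state lives on the listener."""
    def __init__(self):
        self.depth = 0       # inherited attribute: flows from parent to child
        self.depth_of = {}   # node -> depth recorded on entry
        self.values = []     # stack of synthesized values

    def enter(self, node):
        self.depth_of[node] = self.depth
        self.depth += 1

    def exit(self, node):
        self.depth -= 1
        if not node.children:
            self.values.append(int(node.label))
        else:
            right = self.values.pop()
            left = self.values.pop()
            self.values.append(left + right if node.label == "+" else left * right)

def walk(listener, node):
    """Depth-first walk that fires enter before and exit after the children."""
    listener.enter(node)
    for child in node.children:
        walk(listener, child)
    listener.exit(node)

# Tree for (1 + 2) * 3
one, two, three = Node("1"), Node("2"), Node("3")
tree = Node("*", [Node("+", [one, two]), three])

print(evaluate(tree))  # 9
listener = AttributeListener()
walk(listener, tree)
print(listener.values, listener.depth_of[one], listener.depth_of[three])  # [9] 2 1
```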
Antlr visitors generally implement a post-order tree walk. If you write visit... methods, each method must contain code to visit the children in the order you want. For a post-order tree walk, visit the children first.
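The point about visit order can be sketched without the Antlr runtime. In the toy code below, Node, PostOrderVisitor, and RightToLeftVisitor are hypothetical names for illustration only. Each visit method decides for itself when and in which order to visit the children, which is the same responsibility a hand-written visit... method has.

```python
class Node:
    """Hypothetical stand-in for a parse tree node (not an Antlr class)."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

class PostOrderVisitor:
    """Visits the children left to right, then records the node itself."""
    def __init__(self):
        self.order = []

    def visit(self, node):
        for child in node.children:
            self.visit(child)
        self.order.append(node.label)

class RightToLeftVisitor(PostOrderVisitor):
    """Same walk, but the visit method chooses to go right to left."""
    def visit(self, node):
        for child in reversed(node.children):
            self.visit(child)
        self.order.append(node.label)

# Tree for 1 + (2 * 3)
tree = Node("+", [Node("1"), Node("*", [Node("2"), Node("3")])])

left_to_right = PostOrderVisitor()
left_to_right.visit(tree)
print(left_to_right.order)  # ['1', '2', '3', '*', '+']

right_to_left = RightToLeftVisitor()
right_to_left.visit(tree)
print(right_to_left.order)  # ['3', '2', '*', '1', '+']
```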
To implement a visitor, add the -visitor option to the antlr4 command, e.g., antlr4 -Dlanguage=Python3 -visitor Expr.g4. Create a class that inherits from the generated visitor, then add visit methods that implement the analysis. Your driver code should call the visit() method for the root of the parse tree.
For example, the following code implements an expression evaluator for the Expr.g4 grammar using a visitor.
- Driver.py:
    import sys
    from antlr4 import *
    from ExprLexer import ExprLexer
    from ExprParser import ExprParser
    from VisitorInterp import VisitorInterp

    def main(argv):
        input_stream = FileStream(argv[1])
        lexer = ExprLexer(input_stream)
        stream = CommonTokenStream(lexer)
        parser = ExprParser(stream)
        tree = parser.start_()
        if parser.getNumberOfSyntaxErrors() > 0:
            print("syntax errors")
        else:
            # Evaluate the parse tree, starting at the root.
            vinterp = VisitorInterp()
            vinterp.visit(tree)

    if __name__ == '__main__':
        main(sys.argv)
- VisitorInterp.py:
    import sys
    from antlr4 import *
    from ExprParser import ExprParser
    from ExprVisitor import ExprVisitor

    class VisitorInterp(ExprVisitor):
        def visitAtom(self, ctx:ExprParser.AtomContext):
            return int(ctx.getText())

        def visitExpr(self, ctx:ExprParser.ExprContext):
            if ctx.getChildCount() == 3:
                # Either '(' expr ')' or expr op expr.
                if ctx.getChild(0).getText() == "(":
                    return self.visit(ctx.getChild(1))
                op = ctx.getChild(1).getText()
                v1 = self.visit(ctx.getChild(0))
                v2 = self.visit(ctx.getChild(2))
                if op == "+": return v1 + v2
                if op == "-": return v1 - v2
                if op == "*": return v1 * v2
                if op == "/": return v1 / v2
                # The grammar also accepts '**', so evaluate it here.
                if op == "**": return v1 ** v2
                return 0
            if ctx.getChildCount() == 2:
                # Unary plus or minus.
                opc = ctx.getChild(0).getText()
                if opc == "+":
                    return self.visit(ctx.getChild(1))
                if opc == "-":
                    return - self.visit(ctx.getChild(1))
                return 0
            if ctx.getChildCount() == 1:
                return self.visit(ctx.getChild(0))
            return 0

        def visitStart_(self, ctx:ExprParser.Start_Context):
            # Children alternate between expr and ';' (or EOF), so step by 2.
            for i in range(0, ctx.getChildCount(), 2):
                print(self.visit(ctx.getChild(i)))
            return 0
Antlr listeners are driven by a depth-first, left-to-right tree traversal. The enter and exit methods are called during the traversal. A parse tree node is visited twice: first for the enter method, then for the exit method after all children have been walked.
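The enter/exit sequence can be traced with a small sketch in plain Python. It does not use the Antlr runtime; Node, TraceListener, and walk are hypothetical names that only mimic the shape of a listener walk.

```python
class Node:
    """Hypothetical stand-in for a parse tree node (not an Antlr class)."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

class TraceListener:
    """Records every enter and exit event in the order it happens."""
    def __init__(self):
        self.events = []

    def enter(self, node):
        self.events.append("enter " + node.label)

    def exit(self, node):
        self.events.append("exit " + node.label)

def walk(listener, node):
    """Fire enter, walk the children left to right, then fire exit."""
    listener.enter(node)
    for child in node.children:
        walk(listener, child)
    listener.exit(node)

# Tree for 1 + 2
tree = Node("+", [Node("1"), Node("2")])
trace = TraceListener()
walk(trace, tree)
for event in trace.events:
    print(event)
# enter +
# enter 1
# exit 1
# enter 2
# exit 2
# exit +
```

Because the exit event for a node fires only after all of its children have exited, values computed for the children are available by the time the parent's exit method runs.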
To implement a listener, add the -listener option to the antlr4 command. The tool generates a listener by default, so the option only makes this explicit. Add a class that inherits from the generated listener with code that implements the analysis.
The following example implements an expression evaluator using a listener.
- Driver.py:
    import sys
    from antlr4 import *
    from ExprLexer import ExprLexer
    from ExprParser import ExprParser
    from ListenerInterp import ListenerInterp

    def main(argv):
        input_stream = FileStream(argv[1])
        lexer = ExprLexer(input_stream)
        stream = CommonTokenStream(lexer)
        parser = ExprParser(stream)
        tree = parser.start_()
        if parser.getNumberOfSyntaxErrors() > 0:
            print("syntax errors")
        else:
            # Walk the parse tree, firing the listener's enter and exit methods.
            linterp = ListenerInterp()
            walker = ParseTreeWalker()
            walker.walk(linterp, tree)

    if __name__ == '__main__':
        main(sys.argv)
- ListenerInterp.py:
    import sys
    from antlr4 import *
    from ExprParser import ExprParser
    from ExprListener import ExprListener

    class ListenerInterp(ExprListener):
        def __init__(self):
            # Maps each parse tree node to its computed value.
            self.result = {}

        def exitAtom(self, ctx:ExprParser.AtomContext):
            self.result[ctx] = int(ctx.getText())

        def exitExpr(self, ctx:ExprParser.ExprContext):
            if ctx.getChildCount() == 3:
                if ctx.getChild(0).getText() == "(":
                    self.result[ctx] = self.result[ctx.getChild(1)]
                else:
                    opc = ctx.getChild(1).getText()
                    v1 = self.result[ctx.getChild(0)]
                    v2 = self.result[ctx.getChild(2)]
                    if opc == "+":
                        self.result[ctx] = v1 + v2
                    elif opc == "-":
                        self.result[ctx] = v1 - v2
                    elif opc == "*":
                        self.result[ctx] = v1 * v2
                    elif opc == "/":
                        self.result[ctx] = v1 / v2
                    elif opc == "**":
                        # The grammar also accepts '**', so evaluate it here.
                        self.result[ctx] = v1 ** v2
                    else:
                        # Store the fallback on the listener, not on ctx.
                        self.result[ctx] = 0
            elif ctx.getChildCount() == 2:
                opc = ctx.getChild(0).getText()
                if opc == "+":
                    v = self.result[ctx.getChild(1)]
                    self.result[ctx] = v
                elif opc == "-":
                    v = self.result[ctx.getChild(1)]
                    self.result[ctx] = - v
            elif ctx.getChildCount() == 1:
                self.result[ctx] = self.result[ctx.getChild(0)]

        def exitStart_(self, ctx:ExprParser.Start_Context):
            # Children alternate between expr and ';' (or EOF), so step by 2.
            for i in range(0, ctx.getChildCount(), 2):
                print(self.result[ctx.getChild(i)])
Further information can be found in The Definitive ANTLR 4 Reference.
The examples from the ANTLR 4 book converted to Python are here.
There are many examples of grammars for the Python3 target in the grammars-v4 GitHub repository.