A compiler for a small subset of Java.
The compiler was written as the main project in an undergraduate compiler course.
Minijavac parses Minijava source code, reports syntax and semantic errors, and checks to ensure that variables are initialized before they are used.
If the source code is error free, Java bytecode (class files) is generated in the current working directory.
As a language addition, Minijava developers can use the power operator (2³ ~ 2**3).
The jar file can be found here
minijavac requires Java 1.8 to run.
Minijava source code can be compiled as follows:
java -jar minijavac.jar [source-file-name]
The generated bytecode is backwards compatible with Java 1.1 and will run on any version of Java:
java [main-class-name]
Compilers operate in three main stages:
- Parse source files to generate a parse tree. Report any syntax errors.
- Traverse the parse tree, building a symbol table from the source code and reporting all semantic errors.
- Generate code in the target language (Java bytecode in this case).
To parse Minijava source code, a parser was generated with the Antlr4 parser generator tool. Antlr generates a parser in Java for the given Minijava grammar. The parser attempts to generate a parse tree from the Minijava source code. If any syntax errors are encountered during parsing, an error message is printed with the offending tokens underlined. If there are no syntax errors, the parser builds a parse tree and the compiler proceeds to semantic analysis.
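The underlining of offending tokens can be pictured with a short sketch. The class and method names here (`ErrorUnderliner`, `underline`) are hypothetical and not taken from the minijavac sources; the sketch assumes the column and length of the offending token are already known, which in an Antlr-based parser would typically come from the token handed to an error listener.

```java
// Hypothetical helper for illustration; not part of minijavac.
final class ErrorUnderliner {
    /**
     * Returns the source line followed by a second line that places carets
     * under the columns [start, start + length). Columns are 0-based.
     */
    static String underline(String sourceLine, int start, int length) {
        StringBuilder sb = new StringBuilder(sourceLine).append('\n');
        for (int i = 0; i < start; i++) {
            // Copy tabs so the carets stay aligned with the text above.
            boolean isTab = i < sourceLine.length() && sourceLine.charAt(i) == '\t';
            sb.append(isTab ? '\t' : ' ');
        }
        // Always draw at least one caret, even for zero-length tokens such as EOF.
        for (int i = 0; i < Math.max(1, length); i++) {
            sb.append('^');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The statement is missing its right-hand side; the ';' at column 8 is the offending token.
        System.out.println(underline("int x = ;", 8, 1));
    }
}
```

Keeping the helper free of any parser types makes it easy to exercise on its own; the caller is responsible for mapping a token to a line, a column and a length.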
Semantic analysis proceeds in the following fashion.
- A symbol table is constructed by traversing the parse tree. During this phase, the program is checked for:
- The program is type checked to grant as many static typing guarantees as it can. E.g. x = x + y; is checked to ensure that:
  - x and y are both integers (+ operates only on ints in Minijava).
  - The left-hand side of the assignment, x, has the same type as the expression on the right-hand side.
- Variables are checked to ensure that they are initialized before use.
For a given method f() taking no arguments and returning an integer:
int f() {
    int x;    // declaration
    x = 0;    // initialization
    return x; // use
}
should compile, while
int f() {
    int x;    // declaration
    return x; // error, variable x may not have been initialized.
}
should print an initialization-before-use error.
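The type check and the initialization check described above can be sketched together on a simplified model. Everything here is an assumption made for illustration: the class `SemanticSketch`, its methods, and the error wording are hypothetical, types are plain strings, and statements are fed in by hand instead of being read from the parse tree that minijavac actually traverses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration; not the minijavac implementation.
final class SemanticSketch {
    private final Map<String, String> types = new HashMap<>(); // symbol table: name -> declared type
    private final Set<String> initialized = new HashSet<>();   // variables assigned so far
    final List<String> errors = new ArrayList<>();

    /** Models a declaration: `type name;` */
    void declare(String type, String name) {
        types.put(name, type);
    }

    /** Models reading a variable; returns its type, or null if it is undeclared. */
    String use(String name) {
        String type = types.get(name);
        if (type == null) {
            errors.add("cannot find symbol " + name);
        } else if (!initialized.contains(name)) {
            errors.add("variable " + name + " may not have been initialized");
        }
        return type;
    }

    /** Type of `left + right`. A null operand means an error was already reported. */
    String plus(String left, String right) {
        if (left == null || right == null) {
            return null;
        }
        if (!left.equals("int") || !right.equals("int")) {
            errors.add("operands of + must be int, found " + left + " and " + right);
            return null;
        }
        return "int";
    }

    /** Models `target = <expression of type exprType>;` */
    void assign(String target, String exprType) {
        String targetType = types.get(target);
        if (targetType == null) {
            errors.add("cannot find symbol " + target);
            return;
        }
        if (exprType != null && !exprType.equals(targetType)) {
            errors.add("cannot assign " + exprType + " to " + targetType + " " + target);
        }
        initialized.add(target);
    }

    public static void main(String[] args) {
        SemanticSketch ok = new SemanticSketch();
        ok.declare("int", "x");  // int x;
        ok.assign("x", "int");   // x = 0;
        ok.use("x");             // return x;
        System.out.println(ok.errors);  // prints []

        SemanticSketch bad = new SemanticSketch();
        bad.declare("int", "x"); // int x;
        bad.use("x");            // return x;
        System.out.println(bad.errors); // prints [variable x may not have been initialized]
    }
}
```

With this model, x = x + y; corresponds to `assign("x", plus(use("x"), use("y")))`, so both operands are read before x is marked as initialized. The sketch only handles straight-line code; branches and loops would need a flow-sensitive analysis that tracks which variables are assigned on every path.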
Code generation is where the [magic](CodeGenerator Documentation) happens. The parse tree is traversed a final time in order to generate bytecode.
[//]: # (Generating bytecode, even with ASM, is akin to printing assembly to a file.)
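To give a feel for what a final traversal that emits bytecode looks like, here is a minimal sketch that walks a tiny expression tree in post-order and collects JVM instruction mnemonics as text. It is not the minijavac code generator: the `CodegenSketch` class and its node types are hypothetical, and it builds strings where a real generator would call the ASM library to write a class file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical expression tree and emitter for illustration; not part of minijavac.
final class CodegenSketch {
    interface Expr {
        /** Appends the instructions that leave this expression's value on the operand stack. */
        void emit(List<String> out);
    }

    static final class IntLit implements Expr {
        private final int value;
        IntLit(int value) { this.value = value; }
        public void emit(List<String> out) { out.add("ldc " + value); }
    }

    static final class LocalVar implements Expr {
        private final int slot; // index of the local variable in the method frame
        LocalVar(int slot) { this.slot = slot; }
        public void emit(List<String> out) { out.add("iload " + slot); }
    }

    static final class Add implements Expr {
        private final Expr left;
        private final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        public void emit(List<String> out) {
            // Post-order: both operands are pushed before the operator runs.
            left.emit(out);
            right.emit(out);
            out.add("iadd");
        }
    }

    /** Instructions for `return <expr>;` in a method that returns int. */
    static List<String> compileReturn(Expr expr) {
        List<String> out = new ArrayList<>();
        expr.emit(out);
        out.add("ireturn");
        return out;
    }

    public static void main(String[] args) {
        // return x + 1;  assuming x lives in local variable slot 1
        for (String insn : compileReturn(new Add(new LocalVar(1), new IntLit(1)))) {
            System.out.println(insn);
        }
    }
}
```

Because the JVM is a stack machine, each node only has to emit code for its children and then its own instruction, so the generator mirrors the shape of the tree.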
Building the tool from source requires the Antlr4 Java library and the ASM Java library.
- Clone the project from the git repository.
- Unzip asm-5.0.3-bin.zip and add asm-5.0.3/lib/all/asm-all-5.0.3.jar and the Antlr jar to your classpath.
- Generate the lexer and parser by feeding Antlr4 the Minijava grammar:
    java -jar ~/path/to/antlr-4.4-complete.jar -visitor Minijava.g4
- Compile with the Java 1.8 compiler:
    javac *.java
And you're done. If you want to package it as a jar file:
jar cf minijavac.jar *.class