ASTExtractor is an Abstract Syntax Tree (AST) extractor for Java source code, based on the Eclipse compiler. The tool functions as a wrapper of the Eclipse compiler and allows exporting the AST of source code files or projects in XML and JSON formats. The tool has a command line interface and can also be used as a library. The documentation is available at http://thdiaman.github.io/ASTExtractor/
Execute as:
java -jar ASTExtractor.jar -project="path/to/project" -properties="path/to/propertiesfile" -repr=XML|JSON
java -jar ASTExtractor.jar -file="path/to/file" -properties="path/to/propertiesfile" -repr=XML|JSON
The -properties option sets the location of the properties file (by default no properties file is used, so all syntax tree nodes are returned), and the -repr option selects the representation of the tree (default is XML).
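The command line above can also be assembled from a script. The following is a minimal sketch, not part of ASTExtractor itself: the helper names build_command and run_extractor are hypothetical, and it assumes that the tool writes the tree to standard output.

```python
import subprocess


def build_command(jar_path, target, is_project=False, properties=None, representation="XML"):
    """Build the argument list for the ASTExtractor command line (hypothetical helper)."""
    if representation not in ("XML", "JSON"):
        raise ValueError("representation must be 'XML' or 'JSON'")
    # -project is used for whole projects, -file for single source files.
    target_flag = "-project" if is_project else "-file"
    command = ["java", "-jar", jar_path, f"{target_flag}={target}"]
    # The properties file is optional; without it all nodes are returned.
    if properties is not None:
        command.append(f"-properties={properties}")
    command.append(f"-repr={representation}")
    return command


def run_extractor(command):
    """Run the tool and return its standard output.

    Assumption: the tree is printed to standard output.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


command = build_command(
    "ASTExtractor.jar",
    "path/to/file",
    properties="path/to/propertiesfile",
    representation="JSON",
)
print(command)
```

Because the arguments are passed as a list, no shell quoting is needed around the paths.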
Import the library in your code. Set a location for the properties file using
ASTExtractorProperties.setProperties("ASTExtractor.properties");
- For folders containing java files:
String ast = ASTExtractor.parseFolder("path/to/folder/");
- For java files:
String ast = ASTExtractor.parseFile("path/to/file.java");
- For contents of java files (i.e. strings):
String ast = ASTExtractor.parseString(""
+ "import org.myclassimports;\n"
+ "\n"
+ "public class MyClass {\n"
+ " private int myvar;\n"
+ "\n"
+ " public MyClass(int myvar) {\n"
+ " this.myvar = myvar;\n"
+ " }\n"
+ "\n"
+ " public int getMyvar() {\n"
+ " return myvar;\n"
+ " }\n"
+ "}\n"
);
If JSON is required as the output representation, call these functions with a second string argument, which can be either "XML" or "JSON".
ASTExtractor also has Python bindings. To use the Python wrapper, first import the library and initialize an ASTExtractor object with the path to the jar of the library and the path to the properties file of the library:
ast_extractor = ASTExtractor("path/to/ASTExtractor.jar", "path/to/ASTExtractor.properties")
After that, you can use it as follows:
- For folders containing java files:
ast = ast_extractor.parse_folder("path/to/folder/")
- For java files:
ast = ast_extractor.parse_file("path/to/file.java")
- For contents of java files (i.e. strings):
ast = ast_extractor.parse_string(
"import org.myclassimports;\n" +
"\n" +
"public class MyClass {\n" +
" private int myvar;\n" +
"\n" +
" public MyClass(int myvar) {\n" +
" this.myvar = myvar;\n" +
" }\n" +
"\n" +
" public int getMyvar() {\n" +
" return myvar;\n" +
" }\n" +
"}\n"
)
If JSON is required as the output representation, call these functions with a second string argument, which can be either "XML" or "JSON".
Note that after using the library, you have to close the ASTExtractor object by calling its close function, i.e.:
ast_extractor.close()
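Since close has to be called even when parsing raises an exception, one option is to wrap the object with contextlib.closing from the Python standard library, which calls close() on exit from the with block. The sketch below uses a hypothetical stand-in class (StandInExtractor) in place of the real wrapper so that it runs without the jar; the string it returns is a placeholder, not actual ASTExtractor output.

```python
from contextlib import closing


class StandInExtractor:
    """Hypothetical stand-in for the wrapper object; only close() matters here."""

    def __init__(self):
        self.closed = False

    def parse_string(self, source):
        # Placeholder result, not real ASTExtractor output.
        return "<placeholder/>"

    def close(self):
        self.closed = True


extractor = StandInExtractor()
# closing() guarantees that close() is called when the block exits,
# whether it ends normally or through an exception.
with closing(extractor) as ast_extractor:
    ast = ast_extractor.parse_string("class A {}")

print(extractor.closed)
```

With the real wrapper, the same pattern would apply to the object created from the jar and properties paths, assuming its close function takes no arguments as shown above.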
An Abstract Syntax Tree can be very complex, including details for every identifier of the code.
In ASTExtractor, the complexity of the tree can be controlled using the ASTExtractor.properties
file. In this file, the user can select the nodes that should not appear in the final tree (OMIT)
and the nodes that should not be analyzed further, i.e. that should be forced to be leaf nodes (LEAF).
The default options are shown in the following example ASTExtractor.properties file:
LEAF = PackageDeclaration, ImportDeclaration, ParameterizedType, ArrayType, VariableDeclarationFragment
OMIT = Javadoc, Block
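To check which node types a given properties file lists before running the tool, the file can be read with a few lines of Python. This is an illustrative sketch of the file format shown above, not the parser that ASTExtractor uses internally; the function name read_node_lists is hypothetical, and treating lines that start with # as comments is an assumption.

```python
def read_node_lists(text):
    """Parse 'KEY = a, b, c' lines into a dict mapping each key to a list of node names."""
    lists = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines (assumed to start with '#'),
        # and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        lists[key.strip()] = [name.strip() for name in value.split(",") if name.strip()]
    return lists


example = """\
LEAF = PackageDeclaration, ImportDeclaration, ParameterizedType, ArrayType, VariableDeclarationFragment
OMIT = Javadoc, Block
"""

settings = read_node_lists(example)
print(settings["LEAF"])
print(settings["OMIT"])
```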