Read support for remaining LLVM IR language concepts #15
Comments
@quarnster Do you have any idea how we may get rid of the intermediate step of translating into and out of the AST representation? I really wish we could figure out a way, as this extra step will raise the threshold for keeping in sync with updates to the LLVM IR assembly language, as time goes by and LLVM develops upstream. |
The intention is to keep the AST package minimal for now, and support only the LLVM IR assembly directives currently defined by the llir/llvm/ir package. Updates #15.
Tracking of read support for individual LLVM IR instructions. Last update: 2016-09-20 This comment has been superseded by #15 (comment) |
@mewmew Is there anything specific I can try to help with? |
While not a contrived example, a94bf85 implemented support for fixing the dummy values of Ping me before starting any larger work, so we don't duplicate our efforts. The good thing is, we can easily split the work based on the categories of instructions. I've recently been working on the other operators category. |
Is this particular topic still under active development? Is there any kind of an ETA on when it will be useable? |
Hi @chrisbdaemon, It is very much indeed.
Current status
Status as of 2016-12-10
All instructions covered by the LLVM IR subset defined in [https://godoc.org/github.com/llir/llvm/ir] are now supported by the parser. The parser branch has now been merged with master.
Types
Instructions
Binary instructions
Bitwise instructions
Vector instructions
Aggregate instructions
Memory instructions
Conversion instructions
Other instructions
Terminators
Constants
Constant expressions
Binary expressions
Bitwise expressions
Vector expressions
Aggregate expressions
Memory expressions
Conversion expressions
Other expressions
|
Before the v0.2 release, we should try to implement support for the subset of LLVM IR produced by clang for regular C programs. In particular, anything that we currently strip before parsing, should be implemented natively by the BNF specification. This includes
Help fill out this list as you encounter more. Our goal is to remove the need for Note, the intention is to provide support for parsing source files including these directives, not to define the data types holding information about these directives, for the v0.2 release of the LLVM IR library. Future releases of the LLVM IR library will track the development of a sane data representation for these directives. For now, we simply want to be able to parse complex LLVM IR files, but only want to retain the most relevant information. |
Add support for unnamed function parameters. Example:
; Function Attrs: nounwind uwtable
define i32 @add(i32, i32) #0 {
%3 = alloca i32, align 4
%4 = alloca i32, align 4
store i32 %0, i32* %3, align 4
store i32 %1, i32* %4, align 4
%5 = load i32, i32* %3, align 4
%6 = load i32, i32* %4, align 4
%7 = add nsw i32 %5, %6
ret i32 %7
} |
Now capable of parsing c4.ll. Now capable of parsing sqlite3.ll, except for fence instructions. Updates #15.
The LLVM IR library read support is moving closer to being usable for real-world LLVM IR assembly files. For now, we've been playing with adding read support for c4.ll and sqlite3.ll, produced by Clang from c4.c and sqlite3.c, respectively. The Gocc generated parser is now very close to handling these files. For c4.ll when optimized with For sqlite3.ll a single instruction remains to be implemented, namely fence. These are exciting times!! This work is tracked in tandem with the development of an LLVM IR to Go decompiler, which uses these files as a first semi-real-world test case : ) |
Left to do as of 2017-05-06:
Intended for version 0.3
|
* Add Switch operation.
* Finish switch branch.
Squashed commit of the following:
commit 49d5280dab8a4dbd13419360bc0bf992da444b4e Author: Jan Mercl <0xjnml@gmail.com> Date: Sat Apr 29 16:07:46 2017 +0200 WIPS
commit 127b35dd82c0980c591848280619ef6f6de5dd3f Author: Jan Mercl <0xjnml@gmail.com> Date: Sat Apr 29 15:31:35 2017 +0200 WIPS
commit 0205c8b377ff506fce92e1c128ede521b3d63d8e Author: Jan Mercl <0xjnml@gmail.com> Date: Sat Apr 29 15:19:27 2017 +0200 WIPS
commit 3acb0a215a1c61adc715d8b275f264756891cb83 Author: Jan Mercl <0xjnml@gmail.com> Date: Sat Apr 29 15:14:38 2017 +0200 WIPS
commit 98f520c9c30ad150affecda390e52e13c906e344 Author: Jan Mercl <0xjnml@gmail.com> Date: Sat Apr 29 03:04:06 2017 +0200 WIPS
commit 09a19e92dbc2142a94c3dd5a5ea491d410990260 Author: Jan Mercl <0xjnml@gmail.com> Date: Sat Apr 29 02:37:28 2017 +0200 WIPS
commit 7ed7ce62f9f0381673b75e766ac4e96c109e6029 Author: Jan Mercl <0xjnml@gmail.com> Date: Sat Apr 29 00:20:55 2017 +0200 Try to fix "..\cc\testdata\tcc-0.9.26\tests\tests2\40_stdio.c:5:14: undefined fopen" on Windows.
commit 401ecab6670c777da18610f170a0621d4638a6b7 Author: Jan Mercl <0xjnml@gmail.com> Date: Fri Apr 28 22:27:25 2017 +0200 WIPS
commit f409c2dadc61ffef89d13f2c81ea6bdb17b6335c Author: Jan Mercl <0xjnml@gmail.com> Date: Fri Apr 28 22:05:38 2017 +0200 WIPS
We are currently preparing for the v0.2 release of llir/llvm. Within the next few days we will check in the source code of the Gocc generated parser into the source tree, thus making the project go-getable. #15 (comment) tracks the last few pieces of the LLVM IR assembly language which are currently not part of the BNF grammar. A few of those language concepts have been intentionally postponed to a future release, as their occurrence in LLVM IR source files is not too common. Cheers /u & i |
@chrisbdaemon The source code of the Gocc generated parser has now been checked into the source tree, thus making the llir/llvm packages go-getable. The remaining LLVM IR language concepts are tracked for the version 0.3 release (#15 (comment)). |
Read and write support for all LLVM IR constructs as of LLVM 7.0 has now been implemented. Therefore, we may now close this issue. If you find any constructs that are not supported, feel free to comment here or open a new issue. |
The intention is to provide read support for LLVM IR assembly using a Gocc generated lexer and parser from a BNF grammar of the LLVM IR assembly language.
The BNF grammar is located at ast/internal/ll.bnf. The grammar is kept in an internal directory because the lexer and parser packages generated by Gocc are considered internal packages, and should not be used by end-users directly. Instead, high-level libraries will make use of these internal packages to parse LLVM IR assembly into the data structures of the llir/llvm/ir package.
Since LLVM IR makes use of unnamed local variables and basic blocks, a context is required to keep track of and map local IDs to their associated values. A bit unfortunate, but this essentially means we cannot use syntax directed translation to translate directly from LLVM IR assembly to the data structures of the ir package. Instead, we must introduce an intermediate step which keeps the necessary information around for us to create and make use of this contextual information. Said and done, the current approach is to define an ast package for LLVM IR assembly, which will later be traversed to create the aforementioned context and translate AST nodes into their corresponding ir data types.
To get a feel for what the production action expressions of Gocc look like, see the following example.
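To illustrate why a context is needed, here is a minimal Go sketch of a two-pass translation. The types `astInst` and `irInst` and the function `translate` are hypothetical simplifications for illustration only; they are not the actual ast or ir package APIs. The first pass records every definition under its local ID, and the second pass resolves operands through that map, which also handles uses that appear before their definition.

```go
package main

import "fmt"

// astInst is a hypothetical, simplified AST node: operands are still
// plain local IDs, exactly as written in the assembly.
type astInst struct {
	Name     string
	Operands []string
}

// irInst is a hypothetical, simplified IR node: operands point directly
// at the values that define them.
type irInst struct {
	Name     string
	Operands []*irInst
}

// translate converts AST nodes into IR nodes in two passes.
func translate(insts []astInst) ([]*irInst, error) {
	// Pass 1: create one IR value per definition and index it by local ID.
	ctx := make(map[string]*irInst, len(insts))
	out := make([]*irInst, len(insts))
	for i, a := range insts {
		if _, dup := ctx[a.Name]; dup {
			return nil, fmt.Errorf("local ID %q defined twice", a.Name)
		}
		v := &irInst{Name: a.Name}
		ctx[a.Name] = v
		out[i] = v
	}
	// Pass 2: resolve operand IDs through the context.
	for i, a := range insts {
		for _, id := range a.Operands {
			v, ok := ctx[id]
			if !ok {
				return nil, fmt.Errorf("use of undefined local ID %q", id)
			}
			out[i].Operands = append(out[i].Operands, v)
		}
	}
	return out, nil
}

func main() {
	// %5 uses %6 before it is defined, as a phi in a loop header might.
	insts := []astInst{
		{Name: "%5", Operands: []string{"%6"}},
		{Name: "%6", Operands: []string{"%5"}},
	}
	ir, err := translate(insts)
	if err != nil {
		panic(err)
	}
	fmt.Println(ir[0].Operands[0].Name)
}
```

A single-pass, syntax-directed translation would have no value to attach when it reaches the operand `%6` of `%5`, which is the problem the intermediate AST works around.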
Help wanted
If anyone manages to figure out a clean way for us to skip this step (i.e. not having to translate from BNF grammar to AST, then from AST to ir data types; but instead, translating directly from BNF grammar to ir data types), and go directly from the BNF grammar to the ir package data types using production action expressions, please let us know. This would facilitate the maintainability and future development of this package a lot!