Let's build a Lua compiler from the ground up in Lua. We will develop our own tools from scratch along the way, which is great because Lua's standard library is small, so we get to do a lot of platform work as well :)
I will focus more heavily on the parsing side of this compiler, as that is the low-hanging fruit of compiler construction that I haven't had a chance to explore deeply yet.
See https://github.com/leegao/Lua-In-Lua/blob/master/lua/grammar.ylua for the grammar; the parser language is written in the same language that the parser parses (yo dawg). Note that the grammar in Lua's language specification is not LL(n) for any finite n, which means that even an oracular lookahead machine cannot parse it without angelic nondeterminism. To get around this, we use an extremely clever idea: we relax the grammar so that it is parsable in LL(3), and we use semantic actions at parse time to reject invalid trees. This mix of "dynamic" and "static" parsing analysis allows us to get a full Lua parser.
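As a rough illustration of the relax-then-restrict idea, here is a minimal hand-written sketch. It is not the code in grammar.ylua. It assumes a toy token list of plain strings and a made-up `parse_statement` function. The relaxed rule accepts any suffixed expression at the start of a statement, and the semantic checks afterwards reject trees that are not valid statements (a bare `a.b`, or an assignment to a call).

```lua
-- Hypothetical sketch: `parse_statement` and the node shapes are invented
-- for illustration and are not part of the project.
local function parse_statement(tokens)
  local pos = 1
  local function peek() return tokens[pos] end
  local function next_token()
    local t = tokens[pos]
    pos = pos + 1
    return t
  end
  local function name()
    local t = next_token()
    assert(type(t) == "string" and t:match("^[%a_][%w_]*$"), "expected a name")
    return t
  end

  -- Relaxed rule: suffixed ::= Name { '.' Name | '(' ')' }
  -- It does not care yet whether the result is a variable or a call.
  local function suffixed()
    local node = { kind = "name", name = name() }
    while true do
      if peek() == "." then
        next_token()
        node = { kind = "index", object = node, key = name() }
      elseif peek() == "(" then
        next_token()
        assert(next_token() == ")", "expected ')'")
        node = { kind = "call", callee = node }
      else
        return node
      end
    end
  end

  local left = suffixed()
  if peek() == "=" then
    next_token()
    -- Semantic check: a call is not assignable.
    if left.kind == "call" then
      return nil, "cannot assign to a function call"
    end
    return { kind = "assign", target = left, value = suffixed() }
  end
  -- Semantic check: a bare expression is only a statement if it is a call.
  if left.kind ~= "call" then
    return nil, "expression is not a statement"
  end
  return { kind = "callstat", call = left }
end

local call_stat = parse_statement({ "a", ".", "b", "(", ")" })      -- kind "callstat"
local assign_stat = parse_statement({ "a", ".", "b", "=", "c" })    -- kind "assign"
local rejected, reason = parse_statement({ "a", ".", "b" })         -- nil plus a message
```

The point of the design is that the grammar itself never has to decide up front between "assignment" and "call", which is the decision that needs unbounded lookahead.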
To see the compiler in action, run hello.lua. The output will be a list of opcodes that is generated based on ll1/ll1.lua.
Status Report:
- Regular expressions recognizer generator: completed!
- Lexer generator: backend completed, pending Parser to self-host the frontend as well.
- LL(1) Parser generator: completed; it uses graph reduction to efficiently compute the fixed point.
- Add nullable elimination transformation - (grammar cannot have inherent nullables anymore)
- Eliminate production cycles (A ⩲> A)
- Eliminate immediate left recursion
- Create left-recursion folding mechanism
- Create left-factor elimination
- Self-hosting the Lexer.
- Self-hosting the Parser.
- Added support for extended BNF grammars
- Added support for oracular lookaheads
- Create a Lua parser in LL(1)
- Add semantic actions to the Lua parser
- Add a tokenizer for Lua
- Lua 5.2 bytecode deserializer
- AST -> Bytecode translation
- AST -> Bytecode compiler.
- Able to compile Hello World!
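
For readers unfamiliar with the "fixed point" mentioned for the LL(1) parser generator above, here is a minimal sketch of computing nullable and FIRST sets. It uses plain repeated iteration until nothing changes, which is a simpler stand-in for the graph reduction the project uses. The grammar representation (a table from nonterminal to a list of productions) and the function name `first_sets` are assumptions made for this example.

```lua
-- Hypothetical representation: grammar[nonterminal] is a list of productions,
-- each production is a list of symbols. A symbol that is a key of `grammar`
-- is a nonterminal; anything else is treated as a terminal.
local function first_sets(grammar)
  local first, nullable = {}, {}
  for nonterminal in pairs(grammar) do first[nonterminal] = {} end

  local changed = true
  while changed do -- naive fixed-point iteration, not graph reduction
    changed = false
    for nonterminal, productions in pairs(grammar) do
      for _, production in ipairs(productions) do
        local all_nullable = true
        for _, symbol in ipairs(production) do
          if grammar[symbol] then
            -- Nonterminal: copy its FIRST set, continue only if it is nullable.
            for terminal in pairs(first[symbol]) do
              if not first[nonterminal][terminal] then
                first[nonterminal][terminal] = true
                changed = true
              end
            end
            if not nullable[symbol] then
              all_nullable = false
              break
            end
          else
            -- Terminal: it starts this production, nothing after it matters.
            if not first[nonterminal][symbol] then
              first[nonterminal][symbol] = true
              changed = true
            end
            all_nullable = false
            break
          end
        end
        if all_nullable and not nullable[nonterminal] then
          nullable[nonterminal] = true
          changed = true
        end
      end
    end
  end
  return first, nullable
end

-- E -> T Etail ; Etail -> '+' T Etail | (empty) ; T -> id | '(' E ')'
local grammar = {
  E = { { "T", "Etail" } },
  Etail = { { "+", "T", "Etail" }, {} },
  T = { { "id" }, { "(", "E", ")" } },
}
local first, nullable = first_sets(grammar)
```

With this grammar, `first.E` contains `id` and `(`, `first.Etail` contains `+`, and only `Etail` is nullable.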
In progress:
- Operator precedence in parser
- Lua 5.2 bytecode interpreter
TODO:
- Standard library
- Self-host the toolchain (5.2 compatibility)
Stretch:
- LR(0-1) Parser, which will also use a similar mechanism
- Type-induction.
- Type inference.
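
As a reference for the "Eliminate immediate left recursion" item in the status report, here is a sketch of the textbook transformation: `A -> A alpha | beta` becomes `A -> beta A'` and `A' -> alpha A' | (empty)`. This is not necessarily how the project does it. In particular, the textbook version introduces an empty production, which the project's left-recursion folding may avoid. The function name and grammar representation are assumptions made for this example, and it assumes production cycles (`A -> A`) have already been removed.

```lua
-- Hypothetical helper: `productions` is a list of productions for `name`,
-- each production a list of symbols. Returns a table of rewritten rules.
local function eliminate_immediate_left_recursion(name, productions)
  local recursive, others = {}, {}
  for _, production in ipairs(productions) do
    if production[1] == name then
      -- A -> A alpha: remember alpha (everything after the leading A).
      local alpha = {}
      for i = 2, #production do alpha[#alpha + 1] = production[i] end
      recursive[#recursive + 1] = alpha
    else
      others[#others + 1] = production
    end
  end
  if #recursive == 0 then
    return { [name] = productions } -- nothing to do
  end

  local tail = name .. "'"
  local result = { [name] = {}, [tail] = {} }
  -- A -> beta A'
  for _, beta in ipairs(others) do
    local rewritten = {}
    for _, symbol in ipairs(beta) do rewritten[#rewritten + 1] = symbol end
    rewritten[#rewritten + 1] = tail
    table.insert(result[name], rewritten)
  end
  -- A' -> alpha A'
  for _, alpha in ipairs(recursive) do
    alpha[#alpha + 1] = tail
    table.insert(result[tail], alpha)
  end
  -- A' -> (empty)
  table.insert(result[tail], {})
  return result
end

-- E -> E '+' T | T
local rewritten = eliminate_immediate_left_recursion("E", {
  { "E", "+", "T" },
  { "T" },
})
```

Here `rewritten.E` holds the single production `T E'`, and `rewritten["E'"]` holds `+ T E'` followed by the empty production.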