The lexer scans the input and converts it into a stream of tokens.
Examples of tokens:
- keywords: if, else, for, func
- operators: +, -, /, *, %, ++, --, ...
- grammar separators: {, }, [, ], (, ), ...
- types: string, number, ...
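To make the token stream concrete, here is a minimal regex-based lexer sketch in Python. The category names (KEYWORD, OPERATOR, SEPARATOR, ...) and the `tokenize` function are assumptions chosen for illustration, not the project's actual lexer, and only a subset of the operators is covered.

```python
import re

# Hypothetical keyword set, mirroring the examples listed above.
KEYWORDS = {"if", "else", "for", "func"}

# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("NUMBER",    r"\d+(?:\.\d+)?"),
    ("STRING",    r'"[^"\n]*"'),
    ("IDENT",     r"[A-Za-z_]\w*"),
    ("OPERATOR",  r"\+\+|--|[+\-*/%]"),
    ("SEPARATOR", r"[{}\[\]()]"),
    ("SKIP",      r"\s+"),
    ("MISMATCH",  r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (kind, text) pairs for the given source string."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue  # whitespace carries no meaning here
        if kind == "MISMATCH":
            raise SyntaxError(f"unexpected character {text!r} at offset {match.start()}")
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # reserved words are identifiers found in KEYWORDS
        tokens.append((kind, text))
    return tokens
```

Calling `tokenize("if (x) { y++ }")` yields a KEYWORD, then separators and identifiers, with `++` recognized as a single OPERATOR token because the two-character patterns are tried before the single-character class.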
The parser's goal is to read the lexer's tokens and:
- validate that the grammar is correct, reporting errors if any
- build the AST (Abstract Syntax Tree)
The main challenge is parsing expressions, e.g. (54(cos(3-4+23))).
Since the target VM is a stack VM with no registers (all local variables live on the stack), the parser converts each expression from infix notation to postfix notation while building its AST node.
Ex: 5 + 3 * 4 => 5 3 4 * +
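One common way to perform this conversion is the shunting-yard algorithm. The sketch below is an illustration under stated assumptions rather than the project's parser: it works on a flat list of string tokens instead of AST nodes, the `PRECEDENCE` table and `to_postfix` name are hypothetical, all operators are treated as left-associative, and function calls such as cos(...) are not handled.

```python
# Assumed precedence table; a higher number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "%": 2}


def to_postfix(tokens):
    """Convert a list of infix tokens (operands, operators, parentheses) to postfix."""
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators that bind at least as tightly (left-associativity).
            while stack and stack[-1] != "(" and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise SyntaxError("unbalanced ')'")
            stack.pop()  # discard the matching "("
        else:
            output.append(tok)  # operand: goes straight to the output
    while stack:
        op = stack.pop()
        if op == "(":
            raise SyntaxError("unbalanced '('")
        output.append(op)
    return output
```

For example, `to_postfix("5 + 3 * 4".split())` keeps the operands in source order and emits `*` before `+`, which is the order in which a stack VM needs to execute them.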
The parser then simply returns the AST of the entire program.
The compiler's goal is to walk the AST and generate valid instructions for the VM, along with the extra metadata the VM needs to run properly.
It also has to do some extra work, mainly:
- handling local/global variables
- function calling conventions
- statically computing the values of SP and SBP
- computing addresses for VM branching ahead of time (if, else, loops, ...)
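The last point, computing branch addresses ahead of time, is often done by emitting a placeholder jump and patching its target once the size of the branch body is known. The sketch below illustrates that idea only; the tuple-based AST shape, the opcode names (PUSH, ADD, JMP, JMP_IF_FALSE) and the `compile_node` function are all hypothetical, and variables, calls and SP/SBP bookkeeping are left out.

```python
def compile_node(node, code):
    """Append instructions for `node` to `code`. Nodes are hypothetical tuples."""
    kind = node[0]
    if kind == "num":                      # ("num", value)
        code.append(("PUSH", node[1]))
    elif kind == "binop":                  # ("binop", op, left, right)
        compile_node(node[2], code)        # operands first, operator last (postfix)
        compile_node(node[3], code)
        code.append(({"+": "ADD", "-": "SUB", "*": "MUL"}[node[1]],))
    elif kind == "if":                     # ("if", cond, then_branch, else_branch)
        compile_node(node[1], code)
        jump_to_else = len(code)
        code.append(None)                  # placeholder, patched below
        compile_node(node[2], code)
        jump_to_end = len(code)
        code.append(None)                  # placeholder, patched below
        # The else branch starts at the current end of the code.
        code[jump_to_else] = ("JMP_IF_FALSE", len(code))
        compile_node(node[3], code)
        # Execution of the then branch must skip over the else branch.
        code[jump_to_end] = ("JMP", len(code))
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return code
```

Compiling `("if", ("num", 1), ("num", 10), ("num", 20))` with an empty list produces five instructions in which both jump targets are absolute indices into the instruction list, so the VM never has to search for a branch destination at run time.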
The VM simply executes the generated bytecode. However, the design of the VM heavily impacts the compiler:
any change to the VM requires changes to the compiler.
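As a rough illustration of how small the execution loop of a stack VM can be, here is a sketch of a fetch/decode/execute loop. The instruction format (tuples of an opcode name plus operands), the opcode names and the `run` function are assumptions made for this example and do not describe the project's real bytecode.

```python
def run(code):
    """Execute a list of hypothetical instruction tuples and return the final stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction to execute
    while pc < len(code):
        op, *args = code[pc]
        pc += 1
        if op == "PUSH":
            stack.append(args[0])
        elif op in ("ADD", "SUB", "MUL"):
            right = stack.pop()  # operands come off in reverse order
            left = stack.pop()
            if op == "ADD":
                stack.append(left + right)
            elif op == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        elif op == "JMP":
            pc = args[0]  # absolute target computed by the compiler
        elif op == "JMP_IF_FALSE":
            if not stack.pop():
                pc = args[0]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack
```

Running the postfix form of 5 + 3 * 4, i.e. `run([("PUSH", 5), ("PUSH", 3), ("PUSH", 4), ("MUL",), ("ADD",)])`, leaves a single value on the stack. The coupling is visible even in this toy version: renaming an opcode or switching jumps from absolute to relative targets here would force a matching change in whatever emits the instructions.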