introduce another IR for machine code instructions #9514
Labels: accepted, backend-self-hosted, frontend, proposal
Problem statement:
-femit-asm does not work in self-hosted backends.
Proposal to address the problems:
Instead of the self-hosted backend going straight from AIR => machine code, it would emit another IR. Let's call it MIR, for Machine IR. Its instructions would correspond almost one-to-one with machine code instructions, but MIR could then be lowered into either machine code or assembly code. This would help with debugging and working on the self-hosted codegen backends. Additionally, there could be another pass over the MIR that converts instructions into smaller encodings based on offset calculations, generating better code.
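To make the idea concrete, here is a minimal sketch in Python (not Zig, and not the compiler's actual data structures). It assumes a toy x86 MIR with only three instructions. All names (`MovImm32`, `Jmp`, `Ret`, `emit_asm`, `emit_bytes`, `select_jump_sizes`) are hypothetical. The same MIR list is lowered either to assembly text or to machine code bytes, and a small pass picks the 2-byte `jmp rel8` form whenever the computed offset fits.

```python
from dataclasses import dataclass

# Hypothetical toy x86 MIR: each record maps almost one-to-one to an instruction.
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}

@dataclass
class MovImm32:
    reg: str
    imm: int

@dataclass
class Jmp:
    target: int  # index of the MIR instruction to jump to

@dataclass
class Ret:
    pass

def size_of(inst, is_near):
    if isinstance(inst, MovImm32):
        return 5                      # opcode byte + imm32
    if isinstance(inst, Jmp):
        return 5 if is_near else 2    # E9 rel32 vs. EB rel8
    return 1                          # ret

def layout(mir, near):
    """Byte offset of every instruction, plus one trailing entry for the end."""
    offsets = [0]
    for i, inst in enumerate(mir):
        offsets.append(offsets[-1] + size_of(inst, i in near))
    return offsets

def select_jump_sizes(mir):
    """Start with every jump short; widen those whose offset does not fit in 8 bits."""
    near = set()
    while True:
        offsets = layout(mir, near)
        grown = False
        for i, inst in enumerate(mir):
            if isinstance(inst, Jmp) and i not in near:
                disp = offsets[inst.target] - offsets[i + 1]
                if not -128 <= disp <= 127:
                    near.add(i)
                    grown = True
        if not grown:
            return near

def emit_bytes(mir):
    """Lower MIR to machine code."""
    near = select_jump_sizes(mir)
    offsets = layout(mir, near)
    out = bytearray()
    for i, inst in enumerate(mir):
        if isinstance(inst, MovImm32):
            out += bytes([0xB8 + REGS[inst.reg]])
            out += (inst.imm & 0xFFFFFFFF).to_bytes(4, "little")
        elif isinstance(inst, Jmp):
            # Displacement is relative to the end of the jump instruction.
            disp = offsets[inst.target] - offsets[i + 1]
            if i in near:
                out += b"\xE9" + disp.to_bytes(4, "little", signed=True)
            else:
                out += b"\xEB" + disp.to_bytes(1, "little", signed=True)
        else:
            out += b"\xC3"
    return bytes(out)

def emit_asm(mir):
    """Lower the same MIR to Intel-syntax assembly text."""
    targets = {inst.target for inst in mir if isinstance(inst, Jmp)}
    lines = []
    for i, inst in enumerate(mir):
        if i in targets:
            lines.append(f".L{i}:")
        if isinstance(inst, MovImm32):
            lines.append(f"    mov {inst.reg}, {inst.imm}")
        elif isinstance(inst, Jmp):
            lines.append(f"    jmp .L{inst.target}")
        else:
            lines.append("    ret")
    if len(mir) in targets:
        lines.append(f".L{len(mir)}:")
    return "\n".join(lines)

mir = [MovImm32("eax", 42), Jmp(2), Ret()]
print(emit_asm(mir))
print(emit_bytes(mir).hex())  # b82a000000eb00c3
```

Because both lowerings walk the same instruction list, something like -femit-asm would only need a second emitter, not a second code generator.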
This also helps with inline assembly, which would emit MIR. In the LLVM backend, inline assembly would be lowered to MIR, which would then be lowered to LLVM-flavored inline assembly. This may seem convoluted, but consider that we want Intel syntax for our x86 inline assembly, yet LLVM only supports AT&T (there are too many bugs in the Intel dialect to say that it is supported). So this would let us have our own nice syntax and then lower it to what LLVM expects.
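As a rough illustration of that last step, the sketch below parses an Intel-syntax line into a small instruction record and prints it back in AT&T syntax. It is an assumption-laden toy: the names (`Inst`, `parse_intel`, `to_att`) are hypothetical, only 32-bit register and immediate operands are handled, and real LLVM inline assembly also involves constraint strings and escaping that are ignored here.

```python
from dataclasses import dataclass

# Only a handful of 32-bit registers, to keep the sketch small.
REG32 = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

@dataclass(frozen=True)
class Reg:
    name: str

@dataclass(frozen=True)
class Imm:
    value: int

@dataclass(frozen=True)
class Inst:
    mnemonic: str
    dst: object  # Reg or Imm
    src: object  # Reg or Imm

def parse_operand(text):
    text = text.strip()
    if text in REG32:
        return Reg(text)
    return Imm(int(text, 0))  # accepts decimal and 0x-prefixed hex

def parse_intel(line):
    """Parse 'mnemonic dst, src' (Intel operand order) into an Inst."""
    mnemonic, rest = line.strip().split(None, 1)
    dst, src = rest.split(",")
    return Inst(mnemonic, parse_operand(dst), parse_operand(src))

def att_operand(op):
    return f"%{op.name}" if isinstance(op, Reg) else f"${op.value}"

def to_att(inst):
    """Print in AT&T syntax: size suffix, sigils, and source before destination."""
    return f"{inst.mnemonic}l {att_operand(inst.src)}, {att_operand(inst.dst)}"

print(to_att(parse_intel("mov eax, 42")))   # movl $42, %eax
print(to_att(parse_intel("add eax, ebx")))  # addl %ebx, %eax
```

The intermediate `Inst` record plays the role MIR would play: the user-facing syntax and the syntax handed to LLVM are decoupled by it.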
Compilation Speed Performance Concerns
Things are only one pass currently because I wanted to optimize for compilation speed. However, I think this is the wrong way to look at the problem. Consider:
Design of MIR
There would be a different MIR dialect for each Instruction Set Architecture. For example there would be an x86 MIR which has all the x86 instructions and an ARM MIR which has all the ARM instructions.
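One way to picture per-ISA dialects is as separate instruction types that cannot be mixed. The Python sketch below is only an illustration under assumptions: it uses AArch64 as the ARM example, covers two instructions per ISA, and the names (`X86Mir`, `Aarch64Mir`, `lower`) are hypothetical.

```python
from enum import Enum

class X86Mir(Enum):
    NOP = bytes([0x90])
    RET = bytes([0xC3])

class Aarch64Mir(Enum):
    # Fixed-width 32-bit instructions, stored little-endian.
    NOP = (0xD503201F).to_bytes(4, "little")
    RET = (0xD65F03C0).to_bytes(4, "little")

def lower(dialect, mir):
    """Lower a list of instructions, rejecting any from a different dialect."""
    for inst in mir:
        if not isinstance(inst, dialect):
            raise TypeError(f"{inst!r} is not a {dialect.__name__} instruction")
    return b"".join(inst.value for inst in mir)

print(lower(X86Mir, [X86Mir.NOP, X86Mir.RET]).hex())              # 90c3
print(lower(Aarch64Mir, [Aarch64Mir.NOP, Aarch64Mir.RET]).hex())  # 1f2003d5c0035fd6
```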
The LLVM, WebAssembly, C, and SPIR-V backends have no need for MIR. The "machine code" in those cases is already high-level enough that no MIR is needed.
Fully implementing inline assembly and the full MIR instruction sets for each supported ISA will likely be done with large .zig files which are essentially data. I suspect this will be prohibitively slow and memory intensive for stage1 to handle, so I suggest we do a proof-of-concept with MIR in stage2 until we are fully self-hosted, and then after that we can complete the MIR instruction set listings.