The code generators should be AST to AST transformations #6
Comments
I think it really needs to be a new IR because it has domain-specific problems^Wopportunities and the wins from separating the code seem so much greater than the reuse of admittedly well-trod PNode stuff. The idea of making any change to PNode code (in order to support a new backend IR) sends a shiver down my spine. As a smaller and segregated entity, it can start off tight and evolve a little here and there without disrupting anything else.
using an IR will eliminate a whole class of errors:
it also enables so many use cases (eg a real REPL, JIT in the VM, wasm targeting...), I think it's obvious that's the way forward.
note that wasm can be generated from llvmIR (eg https://medium.com/@richardanaya/write-web-assembly-with-llvm-fbee788b2817) so there's no need for nim to worry about wasm, in theory.
IMO that's a no-brainer, it should be an unrelated IRNode that PNode transforms to. It gives maximum flexibility in evolving PNode and IRNode independently. I'm not worried about cost
I'm not seeing a problem. We just generate code that generates static data containing serialized |
Generally, transforming to an intermediate IR is a balance between information loss and simplicity - ie there are things that the language structurally enforces that will go missing in an IR form, depending on how it's chosen. Often, optimizers then work backwards to recreate the missing information - a trivial example is liveness of variables - in Nim, a locally scoped variable in a block goes out of scope at block end, but in the LLVM IR it lives for the duration of the function (mostly) and additional annotations are needed to trace the liveness back and use it in optimizations.
I would generally not base an IR on LLVM - it's too low-level for representing the language in a way that is useful to many IR-to-IR transformations that could otherwise be made - for example, it would be nice to reason about ownership, lifetimes, callbacks, closures etc in the IR - inlining a closure that does not escape is a typical thing I'd do in such a transformation - RVO is another. It is also machine-specific (you can't port IR code between different machines) and uses a lot of pointer manipulations that would be confusing for backends. If you want to work with LLVM IR, just write your transformation in C++ and contribute it to the LLVM compiler - the work will be much more broadly applicable and useful.
I would also not base it on PNode - as @disruptek points out, it's better for many reasons if the two are separate, also because one can reason more clearly about what's allowed and what isn't when the IR is smaller and more dedicated. PNode is too focused on the precise textual representation of the language and on what macros are supposed to be working with - it has a lot of distractions that make it clunky and inconvenient in other contexts.
Finally, it's not impossible that each backend should have its own IR - ie representing
Every IR will need a generic way to transport additional information to the layer below it and back.
For example, to reason about alignment and object size, one needs to know what the C compiler thinks about type sizes etc, which depends on the compiler used and the flags passed to it. Likewise, features like |
If anything, https://rust-lang.github.io/rfcs/1211-mir.html would be a better source of inspiration for an IR - it's still powerful enough to do transformation but a lot simpler than the full language, making backend implementation a breeze (including C and nlvm/llvm) - it's designed to allow reasoning about the application in a way that's similar to what drnim tries to achieve - doing it this way would make it easier to feed back information to the user as well - not-nil analysis, static overflow checking etc. The LLVM IR is too low-level for this also.
LLVM IR is register-based - WASM is stack-based - this discrepancy causes issues when generating one from the other - again, a higher-level IR would likely be easier if you want separate backends for these.
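To make the register/stack difference concrete, here is a minimal sketch in Python (not Nim, and not tied to any real compiler). It lowers the same tiny expression tree once to stack-machine code and once to three-address code with virtual registers. The node types `Num` and `BinOp` and the mnemonics (`push`, `add`, `%t0`) are made up for illustration; they are not actual WASM or LLVM IR syntax.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str  # "+" or "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]

# Hypothetical mnemonics shared by both lowerings.
MNEMONIC = {"+": "add", "*": "mul"}


def to_stack(e: Expr) -> List[str]:
    """Post-order walk: operands are pushed, an operator pops two and pushes one."""
    if isinstance(e, Num):
        return [f"push {e.value}"]
    return to_stack(e.left) + to_stack(e.right) + [MNEMONIC[e.op]]


def to_registers(e: Expr) -> Tuple[List[str], str]:
    """Three-address form: each intermediate result gets a fresh virtual register."""
    code: List[str] = []

    def walk(n: Expr) -> str:
        if isinstance(n, Num):
            return str(n.value)
        lhs = walk(n.left)
        rhs = walk(n.right)
        reg = f"%t{len(code)}"  # one instruction per register, so len(code) is unique
        code.append(f"{reg} = {MNEMONIC[n.op]} {lhs}, {rhs}")
        return reg

    result = walk(e)
    return code, result


# 1 + 2 * 3
expr = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
```

Going from the tree to either form is a simple walk. Going from the register form to the stack form means rediscovering the tree shape (or spilling registers to locals), which is the kind of mismatch a higher-level tree IR would avoid.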
Just wanted to say that I'd love to see Nim move towards ast-to-ast backends, I played with the idea using wasm as a target and a simplified ast kind of like MIR is for Rust would surely help.
Currently the JS / C / C++ code generators produce the code directly as strings/ropes so the result cannot be optimized further. This design is a messy legacy and prevents certain bugs from being fixed easily. A much more elegant design is to give the target language an internal representation (AST-like) that we convert the Nim AST to. There would be a simple "IR to text" step automating trivialities such as separating arguments with commas. The IR needs an escape hatch like `cEmit` so that the `.emit` statement can continue to work. Also, some parts of the code generation might not map easily to a structured IR, such as the code generation for Nim's type information.
It's an open question whether the C IR should be based on Nim's PNode structure, but currently I am leaning heavily toward a new dedicated tree structure that naturally supports `goto` statements, for example.
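A rough sketch of what such a dedicated target IR and its "IR to text" step might look like, written in Python purely for illustration (this is not the Nim compiler's design). All node names here (`Ident`, `Lit`, `Call`, `ExprStmt`, `Label`, `Goto`, `RawEmit`) are hypothetical; `RawEmit` stands in for the `cEmit`-style escape hatch, and `Label`/`Goto` show that a dedicated tree can carry `goto` directly.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Ident:
    name: str


@dataclass
class Lit:
    value: int


@dataclass
class Call:
    fn: str
    args: List["Expr"]


Expr = Union[Ident, Lit, Call]


@dataclass
class ExprStmt:
    expr: Expr


@dataclass
class Label:
    name: str


@dataclass
class Goto:
    target: str


@dataclass
class RawEmit:
    """Escape hatch: target-language text passed through verbatim."""
    text: str


Stmt = Union[ExprStmt, Label, Goto, RawEmit]


def render_expr(e: Expr) -> str:
    if isinstance(e, Ident):
        return e.name
    if isinstance(e, Lit):
        return str(e.value)
    if isinstance(e, Call):
        # The printer, not the code generator, owns trivia such as commas.
        args = ", ".join(render_expr(a) for a in e.args)
        return f"{e.fn}({args})"
    raise TypeError(f"unknown expression node: {e!r}")


def render_stmt(s: Stmt) -> str:
    if isinstance(s, ExprStmt):
        return render_expr(s.expr) + ";"
    if isinstance(s, Label):
        return f"{s.name}:;"  # empty statement keeps the label valid C
    if isinstance(s, Goto):
        return f"goto {s.target};"
    if isinstance(s, RawEmit):
        return s.text
    raise TypeError(f"unknown statement node: {s!r}")


def render(stmts: List[Stmt]) -> str:
    return "\n".join(render_stmt(s) for s in stmts)


program = [
    Label("retry"),
    ExprStmt(Call("step", [Ident("x"), Lit(1)])),
    RawEmit("/* user emit */ x++;"),
    Goto("retry"),
]
```

Calling `render(program)` yields C-like text with one statement per line. Because the program stays a tree until that last step, an optimization pass could rewrite `program` before printing, which is not possible once the generator has already produced strings/ropes.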