-
Notifications
You must be signed in to change notification settings - Fork 694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AST to Stack Machine questions #755
Comments
I too was somewhat surprised about the stack machine implying that ast's are no longer a way of modeling the module. I've been working on level one compression (see decompression prototype for details). It has forced me to reconsider how to implement it. I do believe I found a good replacement for Asts (when considering) decompression, and will update the design documents in the near future to reflect the new direction. |
Given the above it's no surprise that the pivot to a stack machine was decided by people working on browser backends and opposed by people involved in transforming or displaying the Wasm AST. |
@KarlSchimpf I would even much like to see your what you found for your replacement. We have slightly different level one compression needs. In Ethereums case we have global knowledge of the code base so it would be nice to only send segments of the AST that are not contained yet in the blockchain (see more here) . I would like to try and figure out if your replacement can be reused in our usecase. |
It's not going to change things for the runtime AOT or JIT compiler if that is the concern. The wasm decoder for an optimizing compiler converts the wasm code into SSA no matter what the encoding. There might be some slight differences such as the initial number of phi nodes generated etc. The runtime compiled code does not deal with the decoder stack (perhaps only for debugging), although a very simple compiler might emit code that parallels the stack operation. If the Perhaps you could see how this style, using |
I am all for @JSStats proposal for adding |
I've written up a bit about the stack machine here: https://docs.google.com/document/d/1CieRxPy3Fp62LQdtWfhymikb_veZI7S9MnuCZw7biII/edit?usp=sharing |
Thanks @titzer for another useful stack machine writeup! Is it too late to not call it |
I don't like |
Even without this optimization the amount of space that void nodes consumes seems trivial even for a baseline compiler. It seems to have already been decided that wasm is not optimized for interpretation. Thus I think these reasons alone make only a marginal technical case. The key advantage for a baseline compiler appears to be that there are less live values on the stack, and this is a result of the set_local and store operators returning void, which is not a product of the stack machine design.
A baseline compiler is not going to allocate registers for void elements so this restriction makes no difference to the baseline compiler.
The
Does this ensure that wasm validators do not need to compare the full function values stack at control flow merge points, rather just the set of values retained on the stack. If so then this might also be a distinct feature of wasm. Do CIL and JVM validators need to compare all stack values? |
I believe that switching to a stack machine is short-sighted and a big mistake: The AST format could open up new possibilities for software, some of which are observable in Lispy languages like Scheme (I won't list them here). Instead, we're looking at locking the software world back into this 1960's model for another 50 years out of a misguided concern for optimization over power. It's like foregoing the arch because it's more work to craft, and instead coming up with a REALLY efficient way to fit square blocks together. Congratulations, we can build better pyramids, but will never grasp the concept of a cathedral. To really grasp my point, I BEG you all to watch the following two videos in full and think hard about what Alan Kay & Douglass Crockford have to say about new ideas, building complex structures, and leaving something better for the next generation: https://www.youtube.com/watch?v=PSGEjv3Tqo0 As Alan Kay states, what is simpler: something that's easier to process, but for which the software written on top of it is massive; or one that takes a bit more overhead, but allows for powerful new ways to model software and reduce complexity? I believe that an AST model is a major start in inventing "the arch" that's been missing in software, and with something that will proliferate the whole web ... how short-sighted it would be to give that up in favor of "optimizing" the old thing. Imagine if instead of JavaScript, the language of the web had been Java? Lambdas would not be mainstream; new ways of doing OOP would not be thought of; and all the amazing libraries that have been written because of the ad-hoc object modeling that JavaScript offers. Probably one of the messiest and inefficient languages ever written, yet one of the most powerful ever given. C'mon, let's do it a step more by making it binary, homoiconic, and self-modifying. ... and if you're brave enough, think about how Christopher Alexander's philosophy of "unfolding wholeness" applies so much more to an AST than to the stack-machines of the 1960's: Thanks. |
@d-cook But how does an AST encoding of the same low level operators enable 'powerful new ways to model software and reduce complexity'? I don't think it's just a matter of an AST versus a stack of values because both encodings can be decoded to SSA form canonicalizing away the difference so they are just different encodings of the same code. Wasm is not a high level language. It might add objects in future. Javascript does not have 64 bit integers, and it has lots of other constraints. There are lots of languages that might want to target wasm, and making these run more efficiently on the web might help explore more software options. Keeping wasm light weight is driven in part by security concerns, an important matter for the web. If you like Javascript then it is still expected to be supported on the web. |
@d-cook I do think that the browser guys pulled the trigger on the stack machine too early. They made the decision in private and basically told us "this is how it's gonna be" (the earliest explanation of it I can find in this repo is here). They observed that (1) for the final back-end, the stack machine was slightly simpler than the original AST design, (2) it provided occasional optimization opportunities, and (3) the stack machine generalizes nicely to processing multiple values (e.g. for code that deals with small structures). However, since the MVP won't support (3), I don't think they should have finalized the decision so early, without consulting the community group. How can we be sure that there was no simple AST-based solution to the multi-values problem, when the community was never given the opportunity to explore the solution space? Clearly, the stack machine is better for the back-end, but for other situations the AST seems better. This matters because ultimately there will be many more Wasm producers than consumers, and ASTs are nicer for producers to work with. One of the features of Wasm that I think is super important (although I'm not sure if MVP supports it) is run-time code generation. If you're generating code inside the browser, it's probably more convenient to produce an AST, in which case you would have to include a conversion layer from AST to stack machine, which will increase the download size. Some people won't want to increase the size, so there will be a "split" in best practice as two different approaches emerge, one optimized for download size and the other for convenience. We might have avoided that split if the browser guys had been willing to engage the community - but it's also possible that we wouldn't have discovered a way to get comparable benefits from an AST, and we would have switched to a stack machine anyway. Anyway, they made up their minds three months ago, and I think now we just have to live with it. The most popular Wasm back-end, binaryen, is still using an AST so perhaps in the long run that will form the basis of a standardized AST based on wasm.
Please note, wasm is designed to be very low-level so that it can produce high-performance machine code quickly and easily. I think that's part of why it doesn't explicitly model data structures as the CLR and JVM do. Another reason is that they're focusing on systems languages like C++ as a compilation target first; I guess that's because we already have JavaScript and they want to focus on the stuff for which JS is least-well-suited... So even if it were AST-based, generating wasm code directly wouldn't be especially convenient. Notably, wasm was not intended to be a source language for developers to directly write code in. There's some justification for this: producers of wasm code are expected to optimize the code themselves so that the web browser doesn't have to spend precious client-side CPU cycles on optimizations. If wasm were too high-level, it would have been less amenable to optimization. I suspect what you'd like to see is similar to what I called for over two years ago: a virtual machine with
Wasm MVP provides neither the necessary FPs nor IFs, though limited run-time code generation may be possible (I'm not sure - with some help from JS you can generate new modules at runtime but I'm not sure if those modules can interoperate seamlessly with the original program or be GC'd). The "future features" include various FPs but not IFs. You mentioned "the whole web". I believe that once a VM with FPs is added to a web browser, it will inevitably grow bigger than the web itself: it will become the standard operating environment for desktop/mobile apps too, because developers don't want to write two codebases for two different environments. Most developers will want to enable their code to run on the web, so why not use that same code in your non-web apps? All this really underscores the importance of getting Wasm right, so it's a little disappointing that IFs are not in the plan anywhere. The good news, though, is that we can build IFs on top of wasm. In some ways it's actually better to define IFs separately from wasm itself, because it will allow people to explore different interop concepts without forcing the web browser to permanently bake-in a single way of doing things. One downside is that unlike wasm, no company is funding work on interoperability (if there were such a company, I'd send in my resume today - I'm the guy behind Loyc.net btw). Another downside is that the "standard library" would have to be downloaded rather than included in the web browser, but hopefully local caching will take care of that problem in the long run - or at least, it will if one single library becomes very popular. |
On Thu, Nov 3, 2016 at 10:38 AM, David Piepgrass notifications@github.com
I implemented multi-values in both the interpreter and compiler in V8, and
Anyway, they made up their minds three months ago, and I think now we just
|
I'm trying to learn what "pivot to stack machine" mean. So this is an attempt to clear up understanding.
The reason to change to a full stack machine appears to be
The things that are not clear to me are
Also Related
The text was updated successfully, but these errors were encountered: