Skip to content

StackIR: Run StackIR during binary writing #6568

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 80 commits into from
May 9, 2024

Conversation

kripken
Copy link
Member

@kripken kripken commented May 2, 2024

Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir
that could be run like any other passes. After generating StackIR it was stashed on
the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then
it was used during binary writing. This PR switches things so that we optionally
generate, optimize, and print StackIR only during binary writing.

This is almost NFC, but there are some minor noticeable differences:

  1. We no longer print has StackIR in the text format when we see it is there. It
    will not be there during normal printing, as it is only present during binary writing.
    (but --print-stack-ir still works as before; as mentioned above it runs during writing).
  2. --generate/optimize/print-stack-ir change from being passes to being flags that
    control that behavior instead. As passes, their order on the commandline mattered,
    while now it does not, and they only "globally" affect things during writing.
  3. The C API changes slightly, as there is no need to pass it an option "optimize" to
    the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir which is
    set like other optimization flags on the PassOptions object, so we don't need the
    old option to those C APIs.

The benefits this means to bring are

  1. A step towards Run (basic) StackIR optimizations in all binary writes? #6509 which suggests we run StackIR opts more often. By doing
    them during binary writing we can make that change if we want it: it would be a
    small change to the binary writing process to always generate StackIR and
    optimize it.
  2. In Fix binary emitting of br_if with a refined value by emitting a cast #6510 we wanted to potentially depend on StackIR for lowering of br_if that
    needs extra work due to type system differences with the wasm spec. After this
    PR we can make such dependencies if we want them, both because of the previous
    point and also that we no longer "split" binary writing across the codebase: before
    the --generate-stack-ir pass began the binary writing process, in effect, and so it
    stashed StackIR for the writing later. If we want more things, like br_if lowering,
    then we'd have more to stash. Doing it all during binary writing avoids that, since
    nothing is done ahead of time.
  3. This simplifies a bunch of things, e.g., this PR removes the logic to check and
    invalidate StackIR during passes (no longer needed since it happens after passes
    run), and the new wasm EH lowering logic no longer needs to do anything special
    for StackIR (things just work properly: if we are optimizing, then during binary
    writing we'll emit and optimize StackIR).

In more detail, this moves passes/StackIR.cpp to wasm/wasm-stack-opts.cpp.
Those are all the StackIR opts and they are unchanged. Otherwise the binary
writing logic calls a StackIROptimizer that does those opts (if we are doing
them). The binary writing logic still generates and optimizes StackIR in parallel,
so this should have no downside in compile times.

Several tests change due to losing (; has StackIR ;) comments, as explained
above. A more significant change happens in e.g. O2_print-stack-ir as the
--print-stack-ir flag is a global setting, affecting writing, which happens at the
end, so the order of the outputs is reversed (before it was a pass that executed
according to the order on the commandline); the contents are unchanged though.

@tlively
Copy link
Member

tlively commented May 8, 2024

Could we simplify the printing situation by generating StackIR both in the binary writer and also separately on demand when printing, rather than requiring the printer to depend on the binary writer?

@kripken
Copy link
Member Author

kripken commented May 9, 2024

Could we simplify the printing situation by generating StackIR both in the binary writer and also separately on demand when printing, rather than requiring the printer to depend on the binary writer?

Good idea, yeah, I can try that direction. Will take some work though...

@kripken kripken marked this pull request as draft May 9, 2024 18:19
@kripken kripken marked this pull request as ready for review May 9, 2024 19:33
@kripken
Copy link
Member Author

kripken commented May 9, 2024

Ok, I still need to clean up some comments and such, but I think this now has the right shape following your suggestion. I went all the way now and removed all StackIR from wasm.h, after which there is a utility to generate it in wasm-stack.h that is used by both binary writing and printing.

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, removing StackIR entirely from wasm.h seems like a huge win, conceptually, at least. I'd be happy landing this.

@kripken
Copy link
Member Author

kripken commented May 9, 2024

Yeah, I think the wasm.h simplification makes the other downsides moot This is simpler now, even if it may be doing slightly more work. I took some more measurements now and still can't see a speed downside in practice, so I was overly worried, I guess. Landing.

@kripken
Copy link
Member Author

kripken commented May 9, 2024

(Also fuzzed and verified no code size/metadce changes on Emscripten.)

@kripken kripken merged commit 7b2e019 into WebAssembly:main May 9, 2024
13 checks passed
@kripken kripken deleted the stackir.last branch May 9, 2024 22:00
kripken added a commit that referenced this pull request Aug 21, 2024
#6859)

This is in quite ancient code, so it's a long-standing issue, but it got worse
when we enabled StackIR in more situations (#6568), which made it more
noticeable, I think.

For example, testing on test_biggerswitch in Emscripten, the LLVM part
is pretty slow too so the Binaryen slowdown didn't stand out hugely, but
just doing

wasm-opt --optimize-level=2 input.wasm -o output.wasm

(that is, do no work, but set the optimize level to 2 so that StackIR opts
are run) used to take 28 seconds (!). With this PR that goes down to less
than 1.
@gkdn gkdn mentioned this pull request Aug 31, 2024
CountBleck added a commit to CountBleck/assemblyscript that referenced this pull request Jun 3, 2025
In WebAssembly/binaryen#6568, StackIR was changed from a set of passes
to a global option. This meant that the `true` argument passed into
_BinaryenModuleAllocateAndWriteStackIR (to optimize the StackIR) no
longer had an effect. This commit restores that and also runs the
optimizations under the same {optimize,shrink}Levels as wasm-opt.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants