Skip to content

Conversation

@AndyAyersMS
Copy link
Member

Determine how to emit Wasm control flow from the JIT's control flow graph.

Relies on loop-aware RPO to determine the block order. Currently only handles the main method. Assumes irreducible loops have been fixed upstream (which is not yet guaranteed; bails out if not so).

Doesn't actually do any emission, just prints a textual description in the JIT dump (along with a dot markup version).

Uses only LOOP and BLOCK. Tries to limit the extent of BLOCK.

Run for now as an optional phase even if not targeting Wasm, to do some stress testing.

Contributes to #121178

Determine how to emit Wasm control flow from the JIT's control flow graph.

Relies on loop-aware RPO to determine the block order. Currently only
handles the main method. Assumes irreducible loops have been fixed
upstream (which is not yet guaranteed; bails out if not so).

Doesn't actually do any emission, just prints a textual description in
the JIT dump (along with a dot markup version).

Uses only LOOP and BLOCK. Tries to limit the extent of BLOCK.

Run for now as an optional phase even if not targeting Wasm, to
do some stress testing.

Contributes to dotnet#121178
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 6, 2025
@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Nov 6, 2025

FYI @dotnet/jit-contrib @SingleAccretion a first cut at wasm control flow. Eventually the text emission will end up being part of emitwasm (and in binary, with text disasm) and not part of this phase.

Left: sample control flow graph. Black are Wasm blocks, red are Wasm loops.

Right: sample Wasm instruction stream (block contents not shown, just BBxx).

image (49) BB01
FALLTHROUGH

BLOCK (13)
BLOCK (12)
BLOCK (8)
BLOCK (4)
BB04
BR_IF 0 (4)
FALLTHROUGH

LOOP (4)
BB33
BR_IF 2 (8)
FALLTHROUGH

BB07
BR_IF 0 (2)be
FALLTHROUGH

END (4) LOOP
END (4)
BLOCK (7)
BB09
BR_IF 0 (7)
FALLTHROUGH

BB10
BR_IF 1 (8)
FALLTHROUGH

BB12
FALLTHROUGH

END (7)
BB19
BR 1 (12)

END (8)
BB21

BB22
FALLTHROUGH

BB16
FALLTHROUGH-inv
BR_IF-inv 1 (13)

BB30
FALLTHROUGH

END (12)
BB02
RETURN

END (13)
BB29
THROW

@AndyAyersMS
Copy link
Member Author

The

FALLTHROUGH-inv
BR_IF-inv 1 (13)

above is a reminder we need to invert the branch condition (not yet done).

@AndyAyersMS
Copy link
Member Author

Think this is ready for review. There could still be bugs in the control flow, but hard to be sure until we can actually generate code.

Errors seem to be all known or timeouts.

@AndyAyersMS AndyAyersMS marked this pull request as ready for review November 6, 2025 23:48
Copilot AI review requested due to automatic review settings November 6, 2025 23:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new experimental Wasm control flow simulation phase to the JIT compiler. The phase analyzes the control flow graph and simulates how Wasm BLOCK/END and LOOP/END control structures would be emitted for the method being compiled.

  • Introduces a new fgWasmControlFlow phase that transforms the control flow graph into Wasm-style nested control structures
  • Adds a JitWasmControlFlow configuration option to enable/disable this experimental feature (DEBUG builds only)
  • Implements an interval-based algorithm to properly nest blocks and loops according to Wasm requirements

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/coreclr/jit/jitconfigvalues.h Adds the JitWasmControlFlow configuration integer to control the new phase
src/coreclr/jit/fgwasm.cpp New file implementing the Wasm control flow analysis and simulation algorithm
src/coreclr/jit/compphases.h Registers the PHASE_WASM_CONTROL_FLOW phase
src/coreclr/jit/compmemkind.h Adds CMK_Wasm memory kind for allocations
src/coreclr/jit/compiler.h Declares the fgWasmControlFlow method
src/coreclr/jit/compiler.cpp Invokes the new phase conditionally in DEBUG builds after loop alignment
src/coreclr/jit/CMakeLists.txt Adds fgwasm.cpp to the build sources

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@AndyAyersMS AndyAyersMS requested a review from a team November 7, 2025 15:31
@AndyAyersMS
Copy link
Member Author

@dotnet/jit-contrib PTAL

@am11 am11 added the arch-wasm WebAssembly architecture label Nov 7, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

@AndyAyersMS
Copy link
Member Author

@dotnet/jit-contrib ping

CompMemKindMacro(TryRegionClone)
CompMemKindMacro(Async)
CompMemKindMacro(RangeCheckCloning)
CompMemKindMacro(Wasm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CompMemKindMacro(Wasm)
CompMemKindMacro(WasmCfgLowering)

Wasm seems like too broad a category.

// b must remain fixed but we can increase a as needed to accomplish nesting.
// For switches we will create multiple [a,b0], [a, b1]...
//
// If a forward interval ends on a block that already has an interval, we can ignore it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If a forward interval ends on a block that already has an interval, we can ignore it.
// If a forward interval ends on a block that already has an interval ending at it, we can ignore it.

// tree for the main method and each funclet, and the layout below will properly transform them all into
// Wasm control flow.
//
unsigned numHotBlocks = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
unsigned numHotBlocks = 0;
unsigned numDfsBlocks = 0;

The blocks we're missing here are the throw helper ones which represent implicit flow. I think we usually don't usually refer to them as simply "cold" (as in hot/cold splitting, or zero weights).

It may prove easier to emit the branches to them 'inline' in codegen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point... let me make a note about properly handling throw helpers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets renamed in the sequel, so I'll hold off renaming it here.

// Because we're walking front to back, we will have already recorded an interval that starts
// earlier.
//
// We then scan the in non-decreasing start order, finding earlier intervals that contain the start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// We then scan the in non-decreasing start order, finding earlier intervals that contain the start
// We then scan the intervals in non-decreasing start order, finding earlier intervals that contain the start

//
// We then scan the in non-decreasing start order, finding earlier intervals that contain the start
// of the current interval but not the end. When we find one, the start of the current interval will
// need to increase so the earlier interval can nest inside. That is, if have a:[0, 4] and b:[2,6] we
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// need to increase so the earlier interval can nest inside. That is, if have a:[0, 4] and b:[2,6] we
// need to decrease so the earlier interval can nest inside. That is, if have a:[0, 4] and b:[2,6] we

?

// merged with (2) above.
//
auto resolve = [&intervals](WasmInterval* const iv) {
for (WasmInterval* ip : intervals)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this has quadratic complexity, does it need another TODO for some sort of cutoff, or some other way to avoid pathological cases (maybe the constant factor is small enough)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to find all the conflicts, so there's no safe way to bail out.

If we maintain a data structure with all intervals ordered by end we could perhaps efficiently search it to minimize where to check for conflicts but maintaining that ordering seems like it might be more expensive overall.

I suspect the loop-aware RPO helps us avoid pathological cases but don't have any actual proof.

{
// We only need to consider intervals that start at the same point or earlier.
//
if (ip == iv)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it is surprisingly difficult on the eye to distinguish between ip/iv/ic. They may benefit from longer names.

Comment on lines +288 to +290
// We ignore these and treat them as if they fall through to the tail (if there is a tail).
// Since the tail cannot be a join we don't need a block.
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the successor of these be the BBJ_CALLFINALLYRET to get the behavior above? I was a bit surprised to see that the logic below doesn't consider that block the successor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be handled that way (and is done so and more cleanly in the follow-up PR), but since the BBJ_CALLFINALYRET is guaranteed to also be Next() and is logically single-pred, there is no need for a block here, so ignoring it works out the same way.

Comment on lines +237 to +351
fgVisitBlocksInLoopAwareRPO(m_dfsTree, m_loops, addToSequence);

// Splice in a fake BB0
//
BasicBlock bb0;
INDEBUG(bb0.bbNum = 0;);
bb0.bbPreorderNum = numHotBlocks;
bb0.bbPostorderNum = m_dfsTree->GetPostOrderCount();
initialLayout[numHotBlocks] = &bb0;

// -----------------------------------------------
// (2) Build the intervals
//
// Allocate interval and scratch vectors. We'll use the scratch vector to keep track of
// block intervals that end at a certain point.
//
jitstd::vector<WasmInterval*> intervals(getAllocator(CMK_Wasm));
jitstd::vector<WasmInterval*> scratch(numHotBlocks, nullptr, getAllocator(CMK_Wasm));

for (unsigned int cursor = 0; cursor < numHotBlocks; cursor++)
{
BasicBlock* const block = initialLayout[cursor];

// See if we entered any loops
//
FlowGraphNaturalLoop* const loop = m_loops->GetLoopByHeader(block);

if (loop != nullptr)
{
// Find the loop's lexical extent given our ordering
// (maybe memoize this during loop finding...)
//
// Note that cursor may end up pointing at BB0
//
unsigned endCursor = cursor;
while ((endCursor < numHotBlocks) && loop->ContainsBlock(initialLayout[endCursor]))
{
endCursor++;
}

WasmInterval* const loopInterval = WasmInterval::NewLoop(this, block, initialLayout[endCursor]);

// We assume here that a block is only the header of one loop.
//
intervals.push_back(loopInterval);
}

// Now see where block branches to...
//
if (block->KindIs(BBJ_CALLFINALLY))
{
// We ignore these and treat them as if they fall through to the tail (if there is a tail).
// Since the tail cannot be a join we don't need a block.
//
continue;
}

for (BasicBlock* const succ : block->Succs())
{
unsigned const succNum = succ->bbPreorderNum;

// We ignore back edges; they don't inspire blocks.
//
if (succNum <= cursor)
{
JITDUMP("Backedge " FMT_BB "[%u] -> " FMT_BB "[%u]\n", block->bbNum, cursor, succ->bbNum, succNum);

// The backedge target should be a loop header.
// (TODO: scan loop stack to verify the loop is on the stack?)
//
// Note we currently bail out way above if there are any irreducible loops.
//
assert(m_loops->GetLoopByHeader(succ) != nullptr);
continue;
}

// Branch to next needs no block, unless this is a switch
// (eventually when we leave the default on the switch we can remove this).
//
if ((succNum == (cursor + 1)) && !block->KindIs(BBJ_SWITCH))
{
continue;
}

// Branch to cold block needs no block (presumably something EH related).
// Eventually we need to case these out and handle them better.
//
if (succNum >= numHotBlocks)
{
continue;
}

// See if we already have a block that ends at this point and starts before.
//
WasmInterval* const existingBlock = scratch[succNum];

if (existingBlock != nullptr)
{
// If so we don't need to track this branch.
//
JITDUMP("Subsumed " FMT_BB "[%u] -> " FMT_BB "[%u]\n", block->bbNum, cursor, succ->bbNum, succNum);
assert(existingBlock->Start() <= cursor);
continue;
}

// Non-contiguous, non-subsumed forward branch
//
WasmInterval* const branch = WasmInterval::NewBlock(this, block, initialLayout[succNum]);
intervals.push_back(branch);

// Remember an interval end here
//
scratch[succNum] = branch;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if merging step 1 and 2 wouldn't simplify things. Step 1 does a recursive visit in the loop structure to create the linear range, but then if I am understanding right, step 2 recovers the nesting relationship from the linear structure.
Can't we track the nesting structure directly during the recursive visit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There isn't true recursion here, just in inner loop walking over the same block ordering.

I'm planning on remembering the loop extent during the loop-aware RPO build so that loop body traversal code should go away, we will just know where the loop ends in the order.

We can still merge (1) and (2) since the only intervals we need to scan for overlap are the ones already created, and we create them in non-decreasing start order.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is recursion in fgVisitBlocksInLoopAwareRPO. That's the recursion I meant.
Doesn't the interval structure fall naturally out from that recursive visit of the loop tree?
Maybe that's what you meant.

Comment on lines +58 to +71
unsigned ChainEnd() const
{
return m_chainEnd;
}

WasmInterval* Chain()
{
if (m_chain == this)
{
return this;
}

WasmInterval* chain = m_chain->Chain();
m_chain = chain;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike that Start/End/ChainEnd are accessors but Chain mutates. It would be nice if Chain's name communicated that it mutates the instance, even though in this case it seems like the mutation is harmless.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestions? GetAndUpdateChain ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

@kg
Copy link
Member

kg commented Nov 11, 2025

LGTM but I'll leave the green check to someone who knows the JIT better

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arch-wasm WebAssembly architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants