-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: stop computing dom start nodes for morph RPO #94497
JIT: stop computing dom start nodes for morph RPO #94497
Conversation
We don't need quite this broad of a start node set, as the DFS will find unreachable blocks on its own. Also lay the groundwork for removing unreachable blocks, by tracking the postorder number of the last reachable block. Contributes to dotnet#93246.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsWe don't need quite this broad of a start node set, as the DFS will find unreachable blocks on its own. Also lay the groundwork for removing unreachable blocks, by tracking the postorder number of the last reachable block. Contributes to #93246.
|
@jakobbotsch PTAL No diffs, modest (~0.05%) TP win (enabling RPO cost about 0.25%, so this gets back a chunk of that). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine to me, but why not improve fgDfsReversePostorder
for everyone? It confuses me a little that morph wants different behavior from everyone else.
We use this for synthesis, reachability, and dominance. The first two probably can change over. For dominance it may make sense to try and pick additional start blocks that are clearly dom tree roots—but let me see what happens if we don't. |
Well, given that we already loop over all the basic blocks in To get the most possible benefits, I would rewrite the entire function like this: fgDfsReversePostorderHelper(fgFirstBB, ...);
for (EHblkDsc* clause : EHClauses(this))
{
// visit handler
}
fgDfsLastReachablePostorderNumber = postorderIndex;
if (preorderIndex != fgBBcount + 1)
{
// rare cases with unreachable blocks and isolated cycles
for (BasicBlock* bb : Blocks())
{
if (bb->bbRefs == 0)
// visit unreachable enter block
}
for (BasicBlock* bb : Blocks())
{
if (!visited(bb))
// visit isolated cycle
}
} IIUC it should have the same benefit as this PR, maintain the same behavior as previously for everyone, and avoid the full basic block scan in common cases. (Thinking about it a little more, I'm not even sure the separate loop checking |
Yeah, that makes sense. Let me see what not finding the "optimal" start point for the unvisited blocks does for dominance... hopefully it does not matter. It would also be nice if we could use bit vectors (or their complements) for worklists, then we wouldn't have to search the entire block list for unvisited blocks, but I don't think we're set up to do that quite yet. We'd also need a map from bbNum to Block*. |
@jakobbotsch looks like indeed none of it was needed, and now this more or less erases the TP cost from the RPO. Enter blocks is still used in a few places (reachability and dominance) so we now have a couple similar looking computations between that, the "inlined" version here in the DFS, and another one in the dead blocks code -- it might be nice to figure out how to have just one that can meet all needs. But I'll save that for another day. |
Still something broken with a few arm tests No Diffs otherwise, and a TP win. CI compiler seems to have been updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Seems like there's an arm32 failure.
Looks like there is a single diff in x86. Presumably that's because we may start the DFS from handler blocks in a different order than before.
Remove the up-front computation of enter blocks and dom
start nodes from the DFS computations used for RPO.
Also lay the groundwork for removing unreachable blocks, by tracking
the postorder number of the last reachable block.