JIT: Reconsider how to represent defs/uses of CPU flags (GTF_SET_FLAGS) #74867
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch

When `GTF_SET_FLAGS` is set on a node, it indicates that a future node may consume the CPU flags that were set by this node. However, this is outside the JIT's modelling of values in the IR: we do not track CPU flag dependencies at all. That means the interference checking we have today is not really sufficient, and it complicates things such as potentially implementing rematerialization, which could introduce flag-trashing nodes at arbitrary places.

Today this works out because we do very limited transformations on LIR and because we use the `GTF_SET_FLAGS` capability only in limited situations: decomposed longs on 32-bit platforms and conditional branching.

If we want rematerialization we are probably going to have to solve this in some way. Two solutions come to mind:

- Explicitly represent defs and uses of the CPU flags in LIR, so that flag dependencies are modelled like any other value.
- Handle flags via the containment model, so that the flag-producing node is contained in its consumer and the two are emitted as an inseparable unit.

One fairly simple step might be to move all non-decomposition uses of `GTF_SET_FLAGS` to the containment model, which should mean that rematerialization becomes possible in many contexts (e.g. always on 64-bit targets).

cc @dotnet/jit-contrib

category:proposal
theme:ir
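To make the hazard concrete, here is a minimal, self-contained C++ sketch. It is not RyuJIT code: `Node`, `setsFlags`, `readsFlags`, and `trashesFlags` are hypothetical stand-ins for the information that `GTF_SET_FLAGS` leaves implicit. A rematerialization-style pass inserts a node between the flags producer and consumer and silently breaks the dependency:

```cpp
#include <cstdio>
#include <iterator>
#include <list>
#include <string>

// Hypothetical miniature of a linear IR where flag dependencies are implicit.
struct Node {
    std::string name;
    bool setsFlags;    // like GTF_SET_FLAGS: a later node may read the flags
    bool readsFlags;   // e.g. a conditional branch
    bool trashesFlags; // most ALU instructions overwrite the flags register
};

int main() {
    // "cmp" sets the flags; "jcc" consumes them. Nothing in the IR records
    // that the pair must stay adjacent with respect to flag writes.
    std::list<Node> lir = {
        {"cmp x, 10",  /*sets*/ true,  /*reads*/ false, /*trashes*/ true},
        {"jcc LT, L1", /*sets*/ false, /*reads*/ true,  /*trashes*/ false},
    };

    // A rematerializing pass re-creates "add t, 1" right before the branch.
    // It is correct for ordinary values but clobbers the live flags.
    lir.insert(std::next(lir.begin()), {"add t, 1", false, false, true});

    // An after-the-fact checker that walks the IR and detects the break.
    bool flagsLive = false;
    for (const Node& n : lir) {
        if (n.readsFlags && !flagsLive)
            std::printf("BUG: '%s' reads flags that were overwritten\n", n.name.c_str());
        flagsLive = n.setsFlags || (flagsLive && !n.trashesFlags);
    }
    return 0;
}
```

Running this prints the BUG line: the inserted `add` killed the flags between the `cmp` and the `jcc`, which is exactly the interference that the IR does not model today.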
As part of this …
Containment currently means that the "containee" doesn't generate any code, and that its effect is completely represented in the instruction generated by the "container" node. This would no longer be true: the "container" would be generating multiple instructions. Also, no instructions could be placed between them (this probably doesn't matter much currently, and even if it were used, any intervening instructions couldn't be allowed to affect the flags).
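To illustrate the point about multiple instructions, here is a hedged C++ sketch of the containment shape (the types are hypothetical, not RyuJIT's actual `GenTree` containment machinery): the contained compare emits nothing on its own, and the consuming branch emits the `cmp`/`jcc` pair back to back, so no instruction can land between the flags def and its use:

```cpp
#include <cstdio>

// Hypothetical model of the containment approach to flags.
struct CompareNode {
    const char* lhs;
    const char* rhs;
    bool isContained = false; // a contained node emits no code of its own
};

struct BranchNode {
    CompareNode* cond;  // the contained flags producer
    const char* target;

    // The "container" generates both instructions adjacently, so the flags
    // def and its use can never be separated by another node.
    void genCode() const {
        std::printf("  cmp %s, %s\n", cond->lhs, cond->rhs);
        std::printf("  jl  %s\n", target);
    }
};

int main() {
    CompareNode cmp{"x", "10"};
    cmp.isContained = true; // codegen skips it as a standalone node

    BranchNode jlt{&cmp, "L1"};
    if (!cmp.isContained)
        std::printf("  cmp %s, %s\n", cmp.lhs, cmp.rhs); // standalone path

    jlt.genCode(); // emits the cmp + jl pair as a unit
}
```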
Do you see any issues with this? To me it is an arbitrary limitation to put on containment, also considering that many nodes today already need to generate more than one instruction.
Indeed, that's what I meant when I said this might not be as powerful as a fully fledged way to reason about CPU flags.
I think that's a reasonable way to think about it. Ideally, post-lower, an IR node generates exactly one machine instruction; it's a simple mental model. There are probably exceptions today, as you note. As long as LSRA's register needs are exposed, it works even if more than one instruction is generated, but it does make any more general "scheduling" (which we don't do) potentially harder.
I'm no longer so sure that doing this the containment way is the best approach, after having spent a while trying to plumb some of our more "exotic" optimizations through it. Containment for normal relops is not so bad, but the xarch backend can also optimize …

I had a go at implementing it, and it does seem doable in this particular case, but I'm not sure the approach is really scalable. For reference, this is what the parts ended up looking like: …
There is a prototype of the first approach, i.e. explicitly representing flags dependencies in LIR, in #82355. Here were my thoughts on the work: #82355 (comment). I think the explicit representation would be the way forward, but I don't think the trade-off is worth it today, so I will move this to Future.
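Without having read the prototype, the general shape of the explicit representation might look like the following hypothetical sketch (illustrative only, not the #82355 design): the flags become an ordinary value with a def and a use, so deciding whether a flag-trashing node may be inserted at a given point becomes a routine liveness query:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical LIR in which the flags register is an explicit value.
struct Node {
    std::string name;
    int  defsFlagsId  = 0;     // nonzero: this node defines flags value N
    int  usesFlagsId  = 0;     // nonzero: this node uses flags value N
    bool trashesFlags = false; // overwrites flags without defining a value
};

// With explicit def/use edges, "may I insert a flag-trashing node at 'pos'?"
// is an ordinary liveness question rather than out-of-band knowledge.
bool canInsertTrashingNodeAt(const std::vector<Node>& lir, size_t pos) {
    int live = 0;
    for (size_t i = 0; i < pos && i < lir.size(); i++) {
        if (lir[i].defsFlagsId != 0) live = lir[i].defsFlagsId;
        else if (lir[i].trashesFlags) live = 0;
        if (lir[i].usesFlagsId == live) live = 0; // last use kills the value
    }
    return live == 0; // legal only if no flags value is live across 'pos'
}

int main() {
    std::vector<Node> lir = {
        {"cmp x, 10",  /*defs*/ 1},
        {"jcc LT, L1", /*defs*/ 0, /*uses*/ 1},
    };
    std::printf("between cmp and jcc: %s\n",
                canInsertTrashingNodeAt(lir, 1) ? "ok" : "illegal");
    std::printf("after the jcc:       %s\n",
                canInsertTrashingNodeAt(lir, 2) ? "ok" : "illegal");
}
```

Here rematerializing between the `cmp` and the `jcc` is visibly illegal, while inserting after the last use is fine, which is the kind of reasoning an explicit representation buys.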
For …

This was just an example, …