Prevent hoisting nodes with order side effects #100160
If a node has an order side effect, we can't hoist it at all: we don't know what the order dependence actually is. For example, assertion prop might have determined a node can't throw an exception, and eliminated the `GTF_EXCEPT` flag, replacing it with `GTF_ORDER_SIDEEFF`. We can't hoist because we might then hoist above the expression that led assertion prop to make that decision. This can happen in JitOptRepeat, where hoisting can follow assertion prop.
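As a rough illustration of the rule described above, here is a minimal, self-contained sketch. It is not the JIT's actual code: the `Node` struct, the `FLAG_*` values, and the helpers `MarkProvenNonThrowing` and `IsHoistCandidate` are hypothetical names invented for this example, and the real hoisting logic checks many more conditions.

```cpp
#include <cstdint>

// Hypothetical, simplified flag values for illustration only; the real JIT
// defines its own flags (GTF_EXCEPT, GTF_ORDER_SIDEEFF) with different values.
enum NodeFlags : uint32_t
{
    FLAG_NONE          = 0,
    FLAG_EXCEPT        = 1u << 0, // node may throw
    FLAG_ORDER_SIDEEFF = 1u << 1, // node must not be reordered; the dependence is implicit
};

// Hypothetical stand-in for an IR node.
struct Node
{
    uint32_t flags;
    bool     loopInvariant;
};

// Models an optimization proving that a node cannot throw: the exception
// flag is dropped, and an ordering flag records that the proof relies on
// something that executes earlier.
inline void MarkProvenNonThrowing(Node& node)
{
    node.flags &= ~static_cast<uint32_t>(FLAG_EXCEPT);
    node.flags |= FLAG_ORDER_SIDEEFF;
}

// Conservative check: any ordering side effect blocks hoisting, because the
// node it depends on is not recorded anywhere.
inline bool IsHoistCandidate(const Node& node)
{
    if ((node.flags & FLAG_ORDER_SIDEEFF) != 0)
    {
        return false;
    }
    return node.loopInvariant;
}
```

In this toy model, a loop-invariant node that went through `MarkProvenNonThrowing` is no longer a hoist candidate, while a loop-invariant node with no flags still is.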
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
There were quite a few diffs locally, with, as expected, less hoisting out of loops.
@AndyAyersMS PTAL
This includes dotnet#100154, dotnet#100160, dotnet#100123
Can we generalize the logic that sets/checks |
I suppose not since there really is no node interfering with it. We had a similar problem around if-conversion that we solved by setting |
A discussion of the issue is here: #94250 (comment)
I think this is the right fix, but suspect we need to do some compensating fix to avoid being so pessimistic here.
I would suggest digging into a few of the diffs to see (in the default, non-opt-repeat case) how and when we are setting `GTF_ORDER_SIDEEFF`, and whether we need to think about deferring those transformations, since they may end up blocking a more valuable optimization.
Another angle would be to try to explicitly represent the dependence so we can enable some delimited reordering, but that seems hard; I don't know how we might even attempt this.
It's a modest code size improvement and a very small PerfScore regression. The diffs affect only a small percentage of method contexts: about 0.1% of contexts have diffs.
Maybe I'm over-reacting, but I am curious to understand why we have any diffs.
Here's one example of a diff that was, presumably, introduced because of #78698. Function: When we import an array indexing expression, it has a bounds check. The bounds check and the array address nodes get marked with We don't remove the Here's the IR after fgMorphIndexAddr (I altered gtDispFlags to always dump both
Thanks for digging in. I wonder if we should stop trying to optimize the cloned loop and just let the regular optimizations kick in instead. That is probably something of a TP hit, but it perhaps avoids prematurely doing one thing and inhibiting another, better thing. During cloning, we can perhaps mark the nodes as "expected to be optimized later" and then check that we indeed follow through. I think it's ok to take this now and sort that out later. @dotnet/jit-contrib any thoughts on this?
A little more investigation in the above example: Several hoistings now get blocked. The one above actually gets optimized away (in the baseline) so doesn't cause a diff. One that doesn't get hoisted or optimized away in the baseline is:
This is an interesting case because ideally we could hoist this since we're not reordering, with respect to each other, the two nodes that were marked This case is also interesting because it doesn't matter: it's in the slow path loop -- which we do bother to optimize, but which should never be executed. |
Another interesting case:
Here, I think the ORDER bit is set by #78698 on the null check and Before and after, we hoist the NULLCHECK:
Before, we also hoist the
this seems dangerous and only "works out" because the NULLCHECK was (luckily?) hoisted first, and earlier in the IR order. Note that we're hoisting the Once again, it seems like it would be ok to do the hoisting that was originally done if we actually removed the original nodes being hoisted, which would remove the ORDER bits that should be blocking. Well, that is if you are allowed to hoist past any non-ORDER, non-EXCEPT nodes. But the original problem is hoisting past a compare/branch of the variable with the ORDER bit. The result of this PR is the |
So, what are the semantics of Some code, like Some places seem to think
I'm ok with taking this fix as well.
The correct and conservative semantics is that
Yeah, this one refines the meaning of
Yeah, seems like a bug.
One other concern is that CSE may not honor `GTF_MAKE_CSE`, so we could see a situation where hoisting creates copies of two different trees in proper (implicit) dependence order, but CSE decides not to do the first one, so ends up effectively reordering. We may get lucky here because we tend to hoist the full comma tree for an array bounds + access, so perhaps the CSE ends up being "all or nothing", though I am a bit surprised that the random CSE mode hasn't tripped us up here. Perhaps one more small upvote for peeling instead of hoisting? At least peeling the head block of a loop, if not the full body.
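The reordering concern in the comment above can be sketched with a toy model. Everything here is hypothetical: `Candidate`, `EffectiveOrder`, and `PreservesDependenceOrder` are names invented for this example, and the model assumes only that an accepted candidate first executes in the preheader while a rejected one stays in the loop body. It is not how the JIT's hoisting or CSE is implemented.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical hoisting candidate, listed in its required dependence order.
struct Candidate
{
    std::string name;
    bool        cseAccepted; // assumption: true if CSE kept the hoisted copy
};

// Returns the effective first-execution order under this toy model:
// accepted candidates run in the preheader, rejected ones run later in the loop body.
inline std::vector<std::string> EffectiveOrder(const std::vector<Candidate>& inDependenceOrder)
{
    std::vector<std::string> order;
    for (const Candidate& c : inDependenceOrder)
    {
        if (c.cseAccepted)
        {
            order.push_back(c.name);
        }
    }
    for (const Candidate& c : inDependenceOrder)
    {
        if (!c.cseAccepted)
        {
            order.push_back(c.name);
        }
    }
    return order;
}

// True if the effective order matches the original dependence order.
inline bool PreservesDependenceOrder(const std::vector<Candidate>& inDependenceOrder)
{
    const std::vector<std::string> order = EffectiveOrder(inDependenceOrder);
    for (std::size_t i = 0; i < order.size(); i++)
    {
        if (order[i] != inDependenceOrder[i].name)
        {
            return false;
        }
    }
    return true;
}
```

In this model, the order is preserved when both candidates are accepted or both are rejected ("all or nothing"), and it breaks when only the second of two dependent candidates is accepted.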