-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CQ regression & correctness bug in morphing of long muls #53566
Fix CQ regression & correctness bug in morphing of long muls #53566
Conversation
7ef2138
to
1e7a2d8
Compare
To understand the new format for long muls.
Previously, morph was looking for the exact pattern of MUL(CAST(long <- int), CAST(long <- int)) when assessing candidacy of GT_MUL for being marked with "GTF_MUL_64RSLT" and emitted as "long mul". This worked fine, until the importer was changed to fold all casts with constant operands. This broke the pattern matching and thus all MULs in the form of (long)value * 10 started being emitted as helper calls. This change updates morph to understand the new folded casts and in general updates the "format" of long mul from "CAST * CAST" to "CAST * (CAST | CONST)". In the process, new helper functions have been introduced, to avoid bloating fgMorphSmpOp with the new sizeable logic. Recognition of overflowing cases has been upgraded, and a correctness bug, where "checked((long)uint.MaxValue * (long)uint.MaxValue)" was wrongly treated as non-overflowing, fixed. Additionally, the logic to emit intermediate NOPs has been changed to instead always skip morphing the casts themselves, even when remorphing.
1e7a2d8
to
efe3a33
Compare
Marking as ready for review, the failure is unrelated (no tests ran and python crashed at the end). cc @sandreenko - more morph changes... |
PTAL @dotnet/jit-contrib , I probably won't have time to review it this week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM. Thanks!
Would it make sense to check the script in alongside the test?
As a follow up perhaps we can scrub uses of GTF_MUL_64RST
from the other places in the jit?
Perhaps. I am in the process of experimenting with moving most of the "morph to helpers" logic to lowering, so that we don't have to play tricks like completely blocking long mul trees from assertion propagation for correctness reasons. If that fails (fairly likely given it is not so trivial to add new calls that late), I will of course update all the places that use the flag today to use the helpers. Speaking of flags, I will have to rebase so as not to fail the build now that we have to use
Sounds good to me, and there's precedent in the |
The test itself has been regenerated using it and there were no diffs, as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Previously, morph was looking for the exact pattern of
MUL(CAST(long <- int), CAST(long <- int))
when assessingcandidacy of
GT_MUL
for being marked withGTF_MUL_64RSLT
and emitted as "long mul".This worked fine, until the importer was changed to fold all casts with constant operands. This broke the pattern matching and thus all
MUL
s in the form of(long)value * 10
started being emitted as helper calls.This change updates morph to understand the new folded casts and in general updates the "format" of long mul from
CAST * CAST
toCAST * (CAST | CONST)
.In the process, new helper functions have been introduced, to avoid bloating
fgMorphSmpOp
with the new sizeable logic.Recognition of overflowing cases has been upgraded, and a correctness bug, where
checked((long)uint.MaxValue * (long)uint.MaxValue)
was wrongly treated as non-overflowing, fixed.Additionally, the logic to emit intermediate
NOP
s has been changed to instead always skip morphing the casts themselves,even when remorphing.
A new test has been added, generated via a script. The test is meant to be an outerloop one, but I am temporarily marking it Pri0 for it to run in CI.
Also, a few helpers for working with
gtFlags
have been added.Diffs for this change:
Windows x86
Linux ARM
Most regressions are caused by the fact that the new code doesn't try to look for
CAST(CONST)
, and that is impactful for cases when we emit debuggable code - it now goes through the helper. Because there are so few regressions, and they are for debuggable code only, I decided not to complicate the code by fixing it. The<ReadHeadersAsync>d__43:MoveNext()
regression is caused by different register allocation, even as the helper call was turned intoimul
. One of the regressions onARM
is caused by a lack of CSE.