-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[arm64] JIT: Fold "A * B + C" to MADD/MSUB #61037
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue Detailsstatic int Test1(int a, int b, int c) => a * b + c;
static int Test2(int a, int b, int c) => c + a * -b; Codegen diff: ; Method Tests:Test1(int,int,int):int
G_M46617_IG01:
stp fp, lr, [sp,#-16]!
mov fp, sp
G_M46617_IG02:
- mul w0, w0, w1
- add w0, w0, w2
+ madd w0, w0, w1, w2
G_M46617_IG03:
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code: 24
; Method Tests:Test2(int,int,int):int
G_M50938_IG01:
stp fp, lr, [sp,#-16]!
mov fp, sp
G_M50938_IG02:
- neg w1, w1
- mul w0, w1, w0
- add w0, w2, w0
+ msub w0, w1, w0, w2
G_M50938_IG03:
ldp fp, lr, [sp],#16
ret lr
; Total bytes of code: 28
|
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added few questions and suggestions. Overall, looks great!
@@ -13597,6 +13597,33 @@ regNumber emitter::emitInsTernary(instruction ins, emitAttr attr, GenTree* dst, | |||
// src2 can only be a reg | |||
assert(!src2->isContained()); | |||
} | |||
else if ((src1->OperIs(GT_MUL) && src1->isContained()) || (src2->OperIs(GT_MUL) && src2->isContained())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the src1->gtGetOp1()
or src1->gtGetOp2()
would be the one that is marked contained in lower and that should be checked here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
madd
fot GT_ADD
can only be emitted when GT_MUL is contained. So "isContaind" here basically means lower approved this tree to be optimized into madd. GT_MUL can be not contained e.g. if it has gtOverflow set
@dotnet/jit-contrib PTAL, should be ready to review/merge |
if (b->OperIs(GT_NEG) && b->isContained()) | ||
{ | ||
b = b->gtGetOp1(); | ||
msub = !msub; // it's either "a * -b" or "-a * -b" which is the same as "a * b" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// it's either "a * -b" or "-a * -b" which is the same as "a * b"
This comment is little misleading...I can see how -a * -b
is a * b
, but then we will still use msub
so probably we should not confuse by saying "same as a * b"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but then we will still use msub
actually for -a * -b
this code will emit madd
(msub = !msub will return false)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah...I misread this if
as else if
. Never mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The Fuzzlyn run at https://dev.azure.com/dnceng/public/_build/results?buildId=1452609 found this example: // Generated by Fuzzlyn v1.5 on 2021-11-03 12:55:21
// Run on Arm64 Windows
// Seed: 951014135056301943
// Reduced from 152.5 KiB to 0.3 KiB in 00:01:22
// Hits JIT assert in Release:
// Assertion failed 'ins == INS_add' in 'Program:Main(Fuzzlyn.ExecutionServer.IRuntime)' during 'Generate code' (IL size 28)
//
// File: D:\a\_work\3\s\src\coreclr\jit\emitarm64.cpp Line: 13602
//
public class C0
{
}
public class Program
{
public static void Main()
{
if (0 == (27452 + (-2147483647 * M1())))
{
var vr3 = new C0();
}
}
public static long M1()
{
var vr1 = new C0[, ]{{new C0()}};
return 0;
}
} Could it be related to this PR? |
Oops. yes. totally forgot about INS_adds which sets flags, will fix in a moment |
improvements on linux-arm64 dotnet/perf-autofiling-issues#2143 |
Closes #49283
This PR folds "A*B+C" into a single instruction for integers on arm64. It can be extended to handle floats too (for both x86 and arm) if we introduce a sort of a "unsafe math" mode.
Codegen diff:
coreclr_tests.pmi.Linux.arm64.checked.mch:
Detail diffs
libraries.crossgen2.Linux.arm64.checked.mch:
Detail diffs
libraries.pmi.Linux.arm64.checked.mch:
Detail diffs
libraries_tests.pmi.Linux.arm64.checked.mch:
Detail diffs