-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARM64-SVE: Implement ShiftLeftLogical
, ShiftRightArithmetic
, ShiftRightLogical
#104119
Conversation
Note regarding the
|
1 similar comment
Note regarding the
|
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics |
@@ -558,6 +558,28 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node) | |||
{ | |||
assert(instrIsRMW); | |||
|
|||
insScalableOpts sopt; | |||
|
|||
switch (intrinEmbMask.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is the best place to put this switch statement, but these intrinsics need to be wrapped with ConditionalSelect
, so implementing special codegen for them off this path doesn't seem feasible without some more invasive refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we do similar handling for ConvertTo
for numOperands == 1
case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. Added some questions around testing.
/// LSL Ztied1.S, Pg/M, Ztied1.S, Zop2.S | ||
/// svuint32_t svlsl[_u32]_x(svbool_t pg, svuint32_t op1, svuint32_t op2) | ||
/// LSL Ztied1.S, Pg/M, Ztied1.S, Zop2.S | ||
/// LSLR Ztied2.S, Pg/M, Ztied2.S, Zop1.S |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we are doing LSLR
. Can you please remove it from summary docs from this and file?
/// ASR Ztied1.H, Pg/M, Ztied1.H, Zop2.H | ||
/// svint16_t svasr[_s16]_x(svbool_t pg, svint16_t op1, svuint16_t op2) | ||
/// ASR Ztied1.H, Pg/M, Ztied1.H, Zop2.H | ||
/// ASRR Ztied2.H, Pg/M, Ztied2.H, Zop1.H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are actually bunch of places where we have also added "XXXR". please remove it.
("SveVecBinOpTest.template", new Dictionary<string, string> { ["TestName"] = "Sve_ShiftLeftLogical_ushort_ulong", ["Isa"] = "Sve", ["LoadIsa"] = "Sve", ["Method"] = "ShiftLeftLogical", ["RetVectorType"] = "Vector", ["RetBaseType"] = "UInt16", ["Op1VectorType"] = "Vector", ["Op1BaseType"] = "UInt16", ["Op2VectorType"] = "Vector", ["Op2BaseType"] = "UInt64", ["LargestVectorSize"] = "64", ["NextValueOp1"] = "TestLibrary.Generator.GetUInt16()", ["NextValueOp2"] = "TestLibrary.Generator.GetUInt64()", ["ValidateIterResult"] = "Helpers.ShiftLeft<ushort>(left[i], right[(i * sizeof(ushort)) / sizeof(ulong)]) != result[i]", ["GetIterResult"] = "Helpers.ShiftLeft<ushort>(left[i], right[(i * sizeof(ushort)) / sizeof(ulong)])"}), | ||
("SveVecBinOpTest.template", new Dictionary<string, string> { ["TestName"] = "Sve_ShiftLeftLogical_uint_ulong", ["Isa"] = "Sve", ["LoadIsa"] = "Sve", ["Method"] = "ShiftLeftLogical", ["RetVectorType"] = "Vector", ["RetBaseType"] = "UInt32", ["Op1VectorType"] = "Vector", ["Op1BaseType"] = "UInt32", ["Op2VectorType"] = "Vector", ["Op2BaseType"] = "UInt64", ["LargestVectorSize"] = "64", ["NextValueOp1"] = "TestLibrary.Generator.GetUInt32()", ["NextValueOp2"] = "TestLibrary.Generator.GetUInt64()", ["ValidateIterResult"] = "Helpers.ShiftLeft<uint>(left[i], right[(i * sizeof(uint)) / sizeof(ulong)]) != result[i]", ["GetIterResult"] = "Helpers.ShiftLeft<uint>(left[i], right[(i * sizeof(uint)) / sizeof(ulong)])"}), | ||
|
||
("SveVecBinOpTest.template", new Dictionary<string, string> { ["TestName"] = "Sve_ShiftRightArithmetic_sbyte_byte", ["Isa"] = "Sve", ["LoadIsa"] = "Sve", ["Method"] = "ShiftRightArithmetic", ["RetVectorType"] = "Vector", ["RetBaseType"] = "SByte", ["Op1VectorType"] = "Vector", ["Op1BaseType"] = "SByte", ["Op2VectorType"] = "Vector", ["Op2BaseType"] = "Byte", ["LargestVectorSize"] = "64", ["NextValueOp1"] = "TestLibrary.Generator.GetSByte()", ["NextValueOp2"] = "TestLibrary.Generator.GetByte()", ["ValidateIterResult"] = "Helpers.ShiftRight<sbyte>(left[i], (ulong)right[i]) != result[i]", ["GetIterResult"] = "Helpers.ShiftRight<sbyte>(left[i], (ulong)right[i])"}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping to see something like Helpers.ShiftRightArithmetic
for validation for these APIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The generic ShiftRight
helper will do either arithmetic or logical shift, depending on whether the first operand is signed or not since it's just doing division. I can write a ShiftRightArithmetic
wrapper for ShiftRight
that only takes signed types for the first operand to make the API surface clearer, though I was hoping to avoid having to write multiple overloads for sbyte
, short
, int
, etc -- which would you prefer?
@@ -1943,6 +1943,28 @@ private static int HighNarrowing(long op1, bool round) | |||
|
|||
public static long MultiplyWideningUpperAndSubtract(long[] op1, int[] op2, int[] op3, int i) => MultiplyWideningAndSubtract(op1[i], op2[i + op2.Length / 2], op3[i + op3.Length / 2]); | |||
|
|||
public static T ShiftLeft<T>(T op1, ulong op2) where T : INumber<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use the existing Helpers.ShiftLeftLogical()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think those match the functionality of these APIs: The existing UnsignedShift
seems to only consider the first byte of the shift value, regardless of the size of the input, whereas the SVE APIs consider the whole second operand (docs). I agree the code duplication isn't ideal -- I'm happy to rename the new helpers to something more discerning, if you'd like.
// If they aren't the same, the #undef won't do anything, | ||
// and the first operand type symbol will remain defined. | ||
// We can then conditionally compile on the condition #if !{Op1VectorType}{Op1BaseType}, etc. | ||
#define {Op1VectorType}{Op1BaseType} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the intent is right and I suggested something along these lines at one point, but looking at this, it feels little odd and might be hard to maintain, specially because we are not careful in setting Op1BaseType
and Op2BaseType
. Lets add a new template (if you did not find the one to reuse) and we will clean up them once we are done with .NET 9 implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it; I copied my changes to _SveBinaryOpTestTemplate
over to _SveBinaryOpDifferentTypesTestTemplate
(minus the conditional compilation stuff), and reverted my changes to the former.
@kunalspathak stress tests still pass (minus the known callee-save issues) with the template changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Part of #99957. As discussed offline with @kunalspathak, some of the newly-added tests are failing under
JitStress
scenarios due to known callee-save register issues. Other than those cases, all tests are passing under the other stress modes -- the output is large, but I'm happy to share it if it's of interest.These new APIs are unique in that their operands can differ in vector widths. To avoid introducing a new test template, I tweaked the existing vector binop test template to parse the second operand's type correctly, and added some preprocessor logic to conditionally compile certain tests that require the operand types to be the same. It looks a bit hacky, but it doesn't affect existing tests that have matching operand types.
@dotnet/arm64-contrib PTAL, thanks!