-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure GetElement are marked as having an immediate operand for Arm64 #104204
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
a606ec2
to
4ccdd43
Compare
CC. @EgorBo, this fixes the minor codegen issue we found |
Guess it's not so minor, since we weren't doing containment at all (for the address or for the index):
Most of these are simple cases such as: - mov w0, #2
dup s16, v16.s[2] -or- - mov w6, #1
umov x6, v16.d[1] But then we also get cases such as: - ldr q17, [fp, #0x30] // [V02 loc0]
- mov w0, #3
- dup s17, v17.s[3]
+ ldr s17, [fp, #0x3C] // [V02 loc0+0x0c] There's a lot of code that ends up extracting or moving elements, so this gets some decent payoffs overall. Edit: The below is part of a - mov w1, #2
mov v8.d[1], v9.d[0]
dup s0, v8.s[2]
+ mov w1, #2
ldr x0, [x20, #0x08] Going to see if I can identify where that is, but I wouldn't block this on that. |
@@ -155,7 +155,7 @@ HARDWARE_INTRINSIC(Vector128, EqualsAny, | |||
HARDWARE_INTRINSIC(Vector128, ExtractMostSignificantBits, 16, 1, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_Helper, HW_Flag_InvalidNodeId|HW_Flag_BaseTypeFromFirstArg) | |||
HARDWARE_INTRINSIC(Vector128, Floor, 16, 1, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_Helper, HW_Flag_InvalidNodeId) | |||
HARDWARE_INTRINSIC(Vector128, FusedMultiplyAdd, 16, 3, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_Helper, HW_Flag_InvalidNodeId) | |||
HARDWARE_INTRINSIC(Vector128, GetElement, 16, 2, true, {INS_smov, INS_umov, INS_smov, INS_umov, INS_smov, INS_umov, INS_umov, INS_umov, INS_dup, INS_dup}, HW_Category_Helper, HW_Flag_SpecialImport|HW_Flag_SpecialCodeGen|HW_Flag_BaseTypeFromFirstArg) | |||
HARDWARE_INTRINSIC(Vector128, GetElement, 16, 2, true, {INS_smov, INS_umov, INS_smov, INS_umov, INS_smov, INS_umov, INS_umov, INS_umov, INS_dup, INS_dup}, HW_Category_Helper, HW_Flag_SpecialImport|HW_Flag_SpecialCodeGen|HW_Flag_BaseTypeFromFirstArg|HW_Flag_SupportsContainment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one thing I noticed that GetElement is not marked as HW_Flag_SupportsContainment
for x64 but it still works there 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They’re functionally inverses of each other.
On Arm64 containment is a rare consideration and so is opt-in, while on x64 it’s a common thing for almost every operation and is instead opt-out
No description provided.