Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Shift and Inserts scalar and SIMD intrinsics. #36818

Conversation

TamarChristinaArm
Copy link
Contributor

This implements the remaining shift and insert intrinsics.

Fixes #31324

/cc @tannergooding @echesakovMSFT @CarolEidt

@Dotnet-GitSync-Bot
Copy link
Collaborator

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI new-api-needs-documentation labels May 21, 2020
@TamarChristinaArm
Copy link
Contributor Author

@echesakovMSFT @tannergooding I'm trying to figure out why my tests are throwing

ERROR!!!-System.PlatformNotSupportedException: Operation is not supported on this platform.

I know my constants aren't compile time constants, but I thought it would generate a lookup table then?

case NI_AdvSimd_ShiftLeftLogicalAndInsertScalar:
case NI_AdvSimd_ShiftRightAndInsert:
case NI_AdvSimd_ShiftRightLogicalAndInsertScalar:
immLowerBound = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

immLowerBound == 0 is a MOVI variant? (Trying to make sure I'm reading the architecture manual correctly here 😄)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah 0 would be a MOV or ORR (not MOVI since that's a bitmask).

@echesakov
Copy link
Contributor

@TamarChristinaArm You would also need something like this echesakov@0fe0cf9 to make sure that in non-const imm case we are checking against lower bound.

Would you mind waiting a little bit before I merge #36552 and changes in my private branch https://github.com/echesakovMSFT/runtime/compare/Arm64-ASIMD-Shift-Intrinsics that I plan to publish today.

#36552 is needed to fix an issue in emitter where INS_sri was not encoding shift amount properly (see #36552 (comment)) and the changes in private will add the check in addRangeCheckIfNeeded.

@TamarChristinaArm
Copy link
Contributor Author

@echesakovMSFT ahh thanks, No I don't mind waiting at all. I was just really confused why it wasn't working :)

@echesakov
Copy link
Contributor

I have one question - in #33398 we decided to explicitly specify what type of right shifts an intrinsic does - logical right shift or arithmetic right shift.

The operations in this PR are slightly different though - they are neither logical nor arithmetic since the bits "created" by shifting are filled with the corresponding bits in Vd.

Would it make sense to name them ShiftLeftAndInsert ShiftLeftAndInsertScalar ShiftRightAndInsert ShiftRightAndInsertScalar? @TamarChristinaArm @tannergooding

@TamarChristinaArm
Copy link
Contributor Author

Would it make sense to name them ShiftLeftAndInsert ShiftLeftAndInsertScalar ShiftRightAndInsert ShiftRightAndInsertScalar? @TamarChristinaArm @tannergooding

I do agree here. When implementing them I did find the naming a bit confusing and inconsistent. But since they were approved like that I went with it. I find your proposed name clearer since as you say regardless of whether the shift is logical or arithmetic the bits you shift in will be overwritten.

@tannergooding
Copy link
Member

That makes sense. This is also an element shift rather than a bitwise shift, so we might consider a different parameter name to make that clear.

That is, if you right shift by 1 then element 0 of the source is lost, element 1 of the source becomes element 0 of the destination, element 2 becomes element 1, etc. The last element of the destination is then preserved

@TamarChristinaArm
Copy link
Contributor Author

That makes sense. This is also an element shift rather than a bitwise shift, so we might consider a different parameter name to make that clear.

It's a bitwise shift though, but every element is shifted by the same amount. I have also just realized that this is a RMW and forgot to modify it as such..

@tannergooding
Copy link
Member

It's a bitwise shift though, but every element is shifted by the same amount. I have also just realized that this is a RMW and forgot to modify it as such..

Ah, I misread the diagram in the manual. I see now that the boxes are for the individual bits, my mistake.

In any case, we'd just need to take this to API review for discussion.

@echesakov
Copy link
Contributor

ahh thanks, No I don't mind waiting at all. I was just really confused why it wasn't working :)

@TamarChristinaArm I merged #36552 and #36830 last week, so sri and sli instructions should work as expected and all the infrastructure needed for non-zero lower bound intrinsics are in place. Please let me know if you need help with resolving the merge conflicts.

@TamarChristinaArm
Copy link
Contributor Author

ahh thanks, No I don't mind waiting at all. I was just really confused why it wasn't working :)

@TamarChristinaArm I merged #36552 and #36830 last week, so sri and sli instructions should work as expected and all the infrastructure needed for non-zero lower bound intrinsics are in place. Please let me know if you need help with resolving the merge conflicts.

@echesakovMSFT awesome thanks! I'm rebasing it now, will let you know if I run into any trouble.

@TamarChristinaArm TamarChristinaArm force-pushed the implement-arm64-shift-and-inserts-insn branch from cf57e2a to ce29c95 Compare June 8, 2020 07:37
@TamarChristinaArm TamarChristinaArm changed the title WIP: Implement Shift and Inserts scalar and SIMD intrinsics. Implement Shift and Inserts scalar and SIMD intrinsics. Jun 8, 2020
@TamarChristinaArm
Copy link
Contributor Author

That should do it @echesakovMSFT @tannergooding @CarolEidt

@TamarChristinaArm
Copy link
Contributor Author

aside from the naming issue @echesakovMSFT raised in #36818 (comment) but @tannergooding mentioned it needs API review, so I have left them as the last reviewed names.

@TamarChristinaArm TamarChristinaArm force-pushed the implement-arm64-shift-and-inserts-insn branch from ce29c95 to 3fcae9a Compare June 11, 2020 11:52
@TamarChristinaArm
Copy link
Contributor Author

rebased with changes from master

@@ -7442,6 +7442,230 @@ public new abstract class Arm64 : ArmBase.Arm64
/// </summary>
public static Vector128<float> ReciprocalStep(Vector128<float> left, Vector128<float> right) => ReciprocalStep(left, right);

/// <summary>
/// uint8x8_t vsli_n_u8(uint8x8_t a, uint8x8_t b, __builtin_constant_p(n))
Copy link
Contributor

@echesakov echesakov Jun 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TamarChristinaArm Everything looks good.

Just one question here - what is __builtin_constant_p(n) and
should we use uint8x8_t vsri_n_u8 (uint8x8_t a, uint8x8_t b, const int n) comment here instead as we did for the remaining intrinsics with an immediate operand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question here - what is __builtin_constant_p(n) and

It's a compiler builtin supported by clang and gcc to indicate that the value of n must be a compile time constant. the problem is that in C there's no real way to indicate this as const only means the value is read-only in the function.

should we use uint8x8_t vsri_n_u8 (uint8x8_t a, uint8x8_t b, const int n) comment here instead as we did for the remaining intrinsics with an immediate operand?

probably yes since in CLR it doesn't have to be since you generate that lookup table. I generate these signatures through a script that reads the ACLE sources so they got in automatically :)

@echesakov
Copy link
Contributor

I am going to merge this to avoid future conflicts (e.g. with #36916)

If we decide to update the summary comments/method names we can do this later.

@echesakov echesakov merged commit d2f9adc into dotnet:master Jun 11, 2020
@terrajobst
Copy link
Member

I have one question - in #33398 we decided to explicitly specify what type of right shifts an intrinsic does - logical right shift or arithmetic right shift.

The operations in this PR are slightly different though - they are neither logical nor arithmetic since the bits "created" by shifting are filled with the corresponding bits in Vd.

Would it make sense to name them ShiftLeftAndInsert ShiftLeftAndInsertScalar ShiftRightAndInsert ShiftRightAndInsertScalar?

Works for us.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API Proposal : Arm Shift and Permute intrinsics
5 participants