Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Arm64] ASIMD Polynomial Multiply Long #35143

Closed
echesakov opened this issue Apr 17, 2020 · 11 comments · Fixed by #36853
Closed

[Arm64] ASIMD Polynomial Multiply Long #35143

echesakov opened this issue Apr 17, 2020 · 11 comments · Fixed by #36853
Assignees
Labels
api-approved API was approved in API review, it can be implemented arch-arm64 area-System.Runtime.Intrinsics
Milestone

Comments

@echesakov
Copy link
Contributor

echesakov commented Apr 17, 2020

The following are omitted in #32512

class AdvSimd
{
        /// <summary>
        /// poly16x8_t vmull_p8 (poly8x8_t a, poly8x8_t b)
        ///   A32: VMULL.P8 Qd, Dn, Dm
        ///   A64: PMULL Vd.16B, Vn.8B, Vm.8B
        /// </summary>
        public static Vector128<sbyte> PolynomialMultiplyWideningLower(Vector64<sbyte> left, Vector64<sbyte> right);

        /// <summary>
        /// poly16x8_t vmull_p8 (poly8x8_t a, poly8x8_t b)
        ///   A32: VMULL.P8 Qd, Dn, Dm
        ///   A64: PMULL Vd.16B, Vn.8B, Vm.8B
        /// </summary>
        public static Vector128<byte> PolynomialMultiplyWideningLower(Vector64<byte> left, Vector64<byte> right);

        /// <summary>
        /// poly16x8_t vmull_high_p8 (poly8x16_t a, poly8x16_t b)
        ///   A32: VMULL.P8 Qd, Dn+1, Dm+1
        ///   A64: PMULL2 Vd.16B, Vn.16B, Vm.16B
        /// </summary>
        public static Vector128<sbyte> PolynomialMultiplyWideningUpper(Vector128<sbyte> left, Vector128<sbyte> right);

        /// <summary>
        /// poly16x8_t vmull_high_p8 (poly8x16_t a, poly8x16_t b)
        ///   A32: VMULL.P8 Qd, Dn+1, Dm+1
        ///   A64: PMULL2 Vd.16B, Vn.16B, Vm.16B
        /// </summary>
        public static Vector128<byte> PolynomialMultiplyWideningUpper(Vector128<byte> left, Vector128<byte> right);
}

class Aes
{
        /// <summary>
        /// poly128_t vmull_p64 (poly64_t a, poly64_t b)
        ///   A32: VMULL.P8 Qd, Dn, Dm
        ///   A64: PMULL Vd.1Q, Vn.1D, Vm.1D
        /// </summary>
        public static Vector128<long> PolynomialMultiplyWideningLower(Vector64<long> left, Vector64<long> right);

        /// <summary>
        /// poly128_t vmull_p64 (poly64_t a, poly64_t b)
        ///   A32: VMULL.P8 Qd, Dn, Dm
        ///   A64: PMULL Vd.1Q, Vn.1D, Vm.1D
        /// </summary>
        public static Vector128<ulong> PolynomialMultiplyWideningLower(Vector64<ulong> left, Vector64<ulong> right);

        /// <summary>
        /// poly128_t vmull_high_p64 (poly64x2_t a, poly64x2_t b)
        ///   A32: VMULL.P8 Qd, Dn+1, Dm+1
        ///   A64: PMULL2 Vd.1Q, Vn.2D, Vm.2D
        /// </summary>
        public static Vector128<long> PolynomialMultiplyWideningUpper(Vector128<long> left, Vector128<long> right);

        /// <summary>
        /// poly128_t vmull_high_p64 (poly64x2_t a, poly64x2_t b)
        ///   A32: VMULL.P8 Qd, Dn+1, Dm+1
        ///   A64: PMULL2 Vd.1Q, Vn.2D, Vm.2D
        /// </summary>
        public static Vector128<ulong> PolynomialMultiplyWideningUpper(Vector128<ulong> left, Vector128<ulong> right);
}

There is also encoding PMULL Vd.1Q, Vn.1D, Vm.1D and PMULL2 Vd.1Q, Vn.2D, Vm.2D which

is only allocated in an implementation that includes the Cryptographic
Extension, and is otherwise RESERVED.

I wonder whether we want to expose these at all in .NET 5.0 - since they don't belong to ASIMD ISA?

cc @CarolEidt @TamarChristinaArm @tannergooding

@ghost
Copy link

ghost commented Apr 17, 2020

Tagging subscribers to this area: @tannergooding
Notify danmosemsft if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Apr 17, 2020
@TamarChristinaArm
Copy link
Contributor

hmm did we get PMUL itself already?

@echesakov
Copy link
Contributor Author

hmm did we get PMUL itself already?

Yes, I implemented it in #33889

@TamarChristinaArm
Copy link
Contributor

I wonder whether we want to expose these at all in .NET 5.0 - since they don't belong to ASIMD ISA?

If you do, won't that require the rest of the AES instructions as well? Since they use the same feature bit.

@echesakov
Copy link
Contributor Author

If you do, won't that require the rest of the AES instructions as well? Since they use the same feature bit.

Good point - I think we have got all AES, SHA1 and SHA256 implemented.

Arm Architecture Reference Manual (A2.3 The Armv8 Cryptographic Extension) talks about Armv8 Cryptographic Extension and the features ARMv8.0-AES and ARMv8.0-SHA:

The Armv8 Cryptographic Extension provides instructions for the acceleration of encryption and decryption, and
includes the following features:
• ARMv8.0-AES, which includes AESD and AESE instructions.
• ARMv8.0-SHA, which includes the SHA1* and SHA256* instructions.
The presence of the Cryptographic Extension in an implementation is subject to export license controls. The
Cryptographic Extension is an extension of the SIMD support and operates on the vector register file.
The Cryptographic Extension also provides multiply instructions that operate on long polynomials

I wonder if this means that a full path, for example, to intrinsic Sha1_FixedRotate should include CryptographicExtension class somewhere in the middle (i.e. be System.Runtime.Intrinsics.Arm.CryptographicExtension.Sha1.FixedRotate) since Sha1 is a subset of instructions (a feature) provided by Cryptographic Extension.

Is my understanding correct, that PMULL instruction that operates on two 64-bit polynomials is going to be a part of the CryptographicExtension but neither Aes nor Sha features?

@TamarChristinaArm
Copy link
Contributor

Unfortunately the crypto landscape is quite complicated... And there seems to be a discrepancy between the descriptions of Armv8.0-A and Armv8.2-A+. I've asked internally for someclarification.

Is my understanding correct, that PMULL instruction that operates on two 64-bit polynomials is going to be a part of the CryptographicExtension but neither Aes nor Sha features?

No, CryptographicExtension is just a made up term, it's not an actual extension. In that it has no feature bits of it's own. CryptographicExtension for Armv8.0-A contains AES and SHA1 but from Armv8.2-A+ the class is much bigger.

If you look at the identification register in ID_AA64ISAR0_EL1 (section D13.2.53) you'll see that PMULL crypto instructions are defined as an optional part of AES in Armv8 (page D13-3099 of the E release of the Arm ARM).

So it belongs as part of AES. I'll reply more once I get a response on the ambiguity between Armv8.0-A and Arm8.2-A.

@echesakov
Copy link
Contributor Author

echesakov commented Apr 21, 2020

@TamarChristinaArm Thanks for the clarification - I will update the proposal to have PMULL on two 64-bit polynomials under Aes.

@TamarChristinaArm
Copy link
Contributor

@echesakovMSFT It's been confirmed to me that indeed we expect people to implement AES and PMULL together. At some point this will be updated in the Arm ARM, but this also matches with llvm and gcc do.

@echesakov
Copy link
Contributor Author

@TamarChristinaArm Thanks for following up on this!

@echesakov echesakov self-assigned this Apr 27, 2020
@echesakov
Copy link
Contributor Author

Updated the proposal to include PMULL Vd.1Q, Vn.1D, Vm.1D and PMULL2 Vd.1Q, Vn.2D, Vm.2D under Aes class

I could not think about a better way of mapping poly128_t to .NET type system other than Vector128<long> or Vector128<ulong>

@echesakov echesakov removed the untriaged New issue has not been triaged by the area owner label May 1, 2020
@JulieLeeMSFT JulieLeeMSFT added this to the 5.0 milestone May 18, 2020
@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review labels May 19, 2020
@terrajobst
Copy link
Member

  • Looks good as proposed
namespace System.Runtime.Intrinsics.Arm
{
    partial class AdvSimd
    {
        public static Vector128<sbyte> PolynomialMultiplyWideningLower(Vector64<sbyte>  left, Vector64<sbyte>  right);
        public static Vector128<byte>  PolynomialMultiplyWideningLower(Vector64<byte>   left, Vector64<byte>   right);
        public static Vector128<sbyte> PolynomialMultiplyWideningUpper(Vector128<sbyte> left, Vector128<sbyte> right);
        public static Vector128<byte>  PolynomialMultiplyWideningUpper(Vector128<byte>  left, Vector128<byte>  right);
    }
    partial class Aes
    {
        public static Vector128<long>  PolynomialMultiplyWideningLower(Vector64<long>   left, Vector64<long>   right);
        public static Vector128<ulong> PolynomialMultiplyWideningLower(Vector64<ulong>  left, Vector64<ulong>  right);
        public static Vector128<long>  PolynomialMultiplyWideningUpper(Vector128<long>  left, Vector128<long>  right);
        public static Vector128<ulong> PolynomialMultiplyWideningUpper(Vector128<ulong> left, Vector128<ulong> right);
    }
}

@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented arch-arm64 area-System.Runtime.Intrinsics
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants