
Incorrect and inconsistent AVX/AVX2 MaskStore APIs #9974

Closed
fiigii opened this issue Mar 20, 2018 · 9 comments · Fixed by dotnet/coreclr#17637
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), bug

@fiigii (Contributor) commented Mar 20, 2018

We have the following AVX MaskStore APIs:

// AVX
        /// <summary>
        /// void _mm_maskstore_ps (float * mem_addr, __m128i mask, __m128 a)
        ///   VMASKMOVPS m128, xmm, xmm
        /// </summary>
        public static unsafe void MaskStore(float* address, Vector128<float> mask, Vector128<uint> source) { throw new PlatformNotSupportedException(); }
        /// <summary>
        /// void _mm_maskstore_pd (double * mem_addr, __m128i mask, __m128d a)
        ///   VMASKMOVPD m128, xmm, xmm
        /// </summary>
        public static unsafe void MaskStore(double* address, Vector128<double> mask, Vector128<ulong> source) { throw new PlatformNotSupportedException(); }

        /// <summary>
        /// void _mm256_maskstore_ps (float * mem_addr, __m256i mask, __m256 a)
        ///   VMASKMOVPS m256, ymm, ymm
        /// </summary>
        public static unsafe void MaskStore(float* address, Vector256<float> mask, Vector256<uint> source) { throw new PlatformNotSupportedException(); }
        /// <summary>
        /// void _mm256_maskstore_pd (double * mem_addr, __m256i mask, __m256d a)
        ///   VMASKMOVPD m256, ymm, ymm
        /// </summary>
        public static unsafe void MaskStore(double* address, Vector256<double> mask, Vector256<ulong> source) { throw new PlatformNotSupportedException(); }

These have incorrect base types for the mask and source parameters and are inconsistent with their AVX2 counterparts:

// AVX2

        /// <summary>
        /// void _mm_maskstore_epi32 (int* mem_addr, __m128i mask, __m128i a)
        ///   VPMASKMOVD m128, xmm, xmm
        /// </summary>
        public static unsafe void MaskStore(int* address, Vector128<int> mask, Vector128<int> source) => MaskStore(address, mask, source);
        /// <summary>
        /// void _mm_maskstore_epi32 (int* mem_addr, __m128i mask, __m128i a)
        ///   VPMASKMOVD m128, xmm, xmm
        /// </summary>
        public static unsafe void MaskStore(uint* address, Vector128<uint> mask, Vector128<uint> source) => MaskStore(address, mask, source);
        /// <summary>
        /// void _mm_maskstore_epi64 (__int64* mem_addr, __m128i mask, __m128i a)
        ///   VPMASKMOVQ m128, xmm, xmm
        /// </summary>
        public static unsafe void MaskStore(long* address, Vector128<long> mask, Vector128<long> source) => MaskStore(address, mask, source);

@CarolEidt @tannergooding @eerhardt

@tannergooding (Member)

The type is "incorrect" according to the C/C++ intrinsics, but I'm not sure it is actually a problem.

The actual mask just checks the highest bit of each element, which corresponds to the sign bit for float/double. So nothing would prevent the user from using the API, or require them to do anything unusual to set the mask appropriately.

@tannergooding (Member)

In some cases this might even be beneficial, as it means the mask vector can be loaded more efficiently xmm to xmm, rather than needing to worry about moving reg to xmm.

@fiigii (Contributor, author) commented Mar 20, 2018

The type is "incorrect" according to the C/C++ intrinsics, but I'm not sure it is actually a problem.

I meant,

MaskStore(float* address, Vector128<float> mask, Vector128<uint> source)

should be

MaskStore(float* address, Vector128<uint> mask, Vector128<float> source)

The base type of source should be the same as the pointed-to type of the destination.

@fiigii (Contributor, author) commented Mar 20, 2018

MaskStore(float* address, Vector128<float> mask, Vector128<float> source)

also looks okay to me, since it is consistent with the AVX2 counterparts.

@RussKeldorph (Contributor)

@fiigii Are you going to fix this by the end of March?

@fiigii (Contributor, author) commented Mar 20, 2018

@RussKeldorph I think it can be done by the end of March, but I am not sure if the API change is allowed for 2.1.
cc @CarolEidt @eerhardt

@eerhardt (Member) commented Mar 20, 2018

Is it possible to still use this API without this change? Or does the mismatch in the argument types make it infeasible to use this API properly?

If the scenario is blocked without this change, I would be open to taking an API change. Better to change the API now than to ship an API that just won't work.

@fiigii (Contributor, author) commented Mar 20, 2018

Is it possible to still use this API without this change? Or does the mis-match in the argument types make it infeasible to use this API properly?

These incorrect APIs can be "used" with an additional StaticCast, but all existing call sites will break once we correct the APIs.

Actually, I think we can push this API change to post-2.1.

@CarolEidt (Contributor)

Actually, I think we can push this API change to post-2.1.

I think that's the right thing to do.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 17, 2020