Incorrect and inconsistent AVX/AVX2 MaskStore APIs #9974
Comments
The type is "incorrect" according to the C/C++ intrinsics, but I'm not sure it is actually a problem. The mask only checks the highest bit of each element, which is the sign bit for float/double. So nothing would prevent the user from using the API, or even require them to do weird things to set the mask appropriately.
In some cases, this might even be beneficial, as it means loading the vector can be done more efficiently from xmm to xmm, rather than needing to worry about reg to xmm.
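To make the sign-bit point concrete, here is a minimal scalar sketch in C of how a 4-lane masked store treats a float-typed mask. It is an illustration only: `mask_store_model` is a hypothetical helper, not a real intrinsic or the .NET API, and it assumes 32-bit IEEE 754 floats.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 4-lane masked store (not a real intrinsic).
 * A lane of `source` is written to `address` only when the highest bit of the
 * matching mask lane is set. For a float mask that bit is the sign bit.
 * Assumes float is a 32-bit IEEE 754 value. */
static void mask_store_model(float *address, const float mask[4],
                             const float source[4])
{
    for (int i = 0; i < 4; i++) {
        uint32_t bits;
        /* Reinterpret the float's bit pattern without violating aliasing rules. */
        memcpy(&bits, &mask[i], sizeof bits);
        if (bits >> 31) {
            address[i] = source[i];  /* highest bit set: store this lane */
        }
        /* highest bit clear: the destination lane is left untouched */
    }
}
```

Under this model, any negative float in the mask (including -0.0f) selects its lane and any non-negative float skips it, so the mask's element type does not change which lanes are written.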
I meant, MaskStore(float* address, Vector128<float> mask, Vector128<uint> source) should be MaskStore(float* address, Vector128<uint> mask, Vector128<float> source). The base-type of
MaskStore(float* address, Vector128<float> mask, Vector128<float> source) also looks okay to me, since it is consistent with the AVX2 counterparts.
@fiigii Are you going to fix this by the end of March?
@RussKeldorph I think it can be done by the end of March, but I am not sure if the API change is allowed for 2.1.
Is it possible to still use this API without this change? Or does the mismatch in the argument types make it infeasible to use this API properly? If the scenario is blocked without this change, I would be open to taking an API change. Better to change the API now than to ship an API that just won't work.
These wrong APIs can be "used" with additional Actually, I think we can push this API change to post-2.1.
I think that's the right thing to do.
We have the below AVX MaskStore
That has incorrect base-type of mask and source and is inconsistent with AVX2 counterparts.
@CarolEidt @tannergooding @eerhardt