Fix method names of hardware intrinsic APIs #25965
@@ -198,7 +197,7 @@ public static class Avx
public static Vector256<float> Set(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0) { throw null; }
public static Vector256<double> Set(double e3, double e2, double e1, double e0) { throw null; }
public static Vector256<T> Set1<T>(T value) where T : struct { throw null; }
Should this rather be called SetOne? For consistency with SetZero, Vector<T>.One, Vector<T>.Zero.
I thought that Set1 should not be consistent with SetZero or Vector<T>.One.
SetZero stands for "set all elements to the value zero";
Vector<T>.One stands for "set all elements to the value one";
Set1 stands for "set all elements to one value XX";
Set stands for "set elements to multiple values XX, YY, ZZ, ...".
But I know Set1 is not a good name; it just follows the C++ Intel intrinsic naming. Do you have suggestions for a better name?
Just Set or SetAll might make sense.
> Just Set or SetAll might make sense.
We have Set for multi-value initialization. SetAll makes sense to me. Thank you.
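The three meanings discussed above (zero everywhere, one caller-supplied value everywhere, a separate value per element) can be sketched as follows. This is only an illustration under stated assumptions: it uses Vector128<float>.Zero and Vector128.Create from the System.Runtime.Intrinsics API that later shipped in .NET Core 3.0, not the SetZero/Set1/Set methods in this diff, and the class and method names are hypothetical.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper names, chosen only to mirror the naming discussion.
public static class SetNamingSketch
{
    // "SetZero": every element is the constant zero.
    public static Vector128<float> Zero() => Vector128<float>.Zero;

    // "Set1" / proposed "SetAll": every element is the same caller-supplied value.
    public static Vector128<float> All(float value) => Vector128.Create(value);

    // "Set": each element gets its own value.
    // The Set in this diff lists the highest element first (e3..e0),
    // while Vector128.Create lists the lowest element first, so the
    // arguments are reversed here.
    public static Vector128<float> Each(float e3, float e2, float e1, float e0)
        => Vector128.Create(e0, e1, e2, e3);
}
```

With this shape, All(2.5f) yields 2.5 in all four lanes, while Each(4f, 3f, 2f, 1f) yields 1 in lane 0 and 4 in lane 3.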
@@ -169,8 +169,7 @@ public static class Avx
public static Vector128<double> Permute(Vector128<double> value, byte control) { throw null; }
public static Vector256<float> Permute(Vector256<float> value, byte control) { throw null; }
public static Vector256<double> Permute(Vector256<double> value, byte control) { throw null; }
public static Vector256<float> Permute2x128(Vector256<float> left, Vector256<float> right, byte control) { throw null; }
public static Vector256<double> Permute2x128(Vector256<double> left, Vector256<double> right, byte control) { throw null; }
public static Vector256<T> Permute2x128<T>(Vector256<T> left, Vector256<T> right, byte control) where T : struct { throw null; }
public static Vector128<float> PermuteVar(Vector128<float> left, Vector128<float> mask) { throw null; }
Should this be called PermuteVariable? For consistency with BlendVariable.
Or do these two methods need the Var/Variable suffix at all? Would an overload be sufficient?
We are using the Variable suffix only for v-suffixed instructions, e.g., BlendVariable -> vblendvp*, ShiftLeftLogicalVariable -> vpsllv*, etc.
PermuteVar and PermuteVar8x32 are a special case: they generate vpermilp* and vperm*, which breaks the above convention, so they follow the C++ Intel intrinsic naming.
We can change them to the Variable suffix; I have no strong preference here.
@@ -731,7 +730,7 @@ public static class Sse
public static Vector128<float> Multiply(Vector128<float> left, Vector128<float> right) { throw new NotImplementedException(); }
public static Vector128<float> Or(Vector128<float> left, Vector128<float> right) { throw new NotImplementedException(); }
public static Vector128<float> Reciprocal(Vector128<float> value) { throw new NotImplementedException(); }
public static Vector128<float> ReciprocalSquareRoot(Vector128<float> value) { throw new NotImplementedException(); }
public static Vector128<float> ReciprocalSqrt(Vector128<float> value) { throw new NotImplementedException(); }
public static Vector128<float> Set(float e3, float e2, float e1, float e0) { throw new NotImplementedException(); }
public static Vector128<float> Set1(float value) { throw new NotImplementedException(); }
Same here.
For my education, what was the rule used to make a method generic vs. non-generic? For example, I am wondering about these:
I am closing this because I have cherry-picked it into #25969. We can continue the discussion about the naming, though.
Because SSE only has |
Sometimes, |
Is this a common pattern? It does not sound right to be optimizing for case where folks have e.g. |
@jkotas Ok, I will fix this and |
@jkotas @fiigii |
@4creators, it is an optimization. It just avoids the consumer needing to insert a |
I believe these generic intrinsics are a "legacy" design from the long design process, and the original motivation has gone away due to other changes. I will fix this soon; thanks for pointing it out!
@tannergooding my point is that (E)(V)EXTRACTPS instruction does not check if operand is of xmm packed float type and it does not throw any type of floating point exception either, therefore, it can be treated as a general extraction of 32 bits from xmm vector. If we introduce any type of limitations which limit |
float ExtractSingle<T>(Vector128<T> value, byte index) where T : struct { throw null; }
Vector128<Int64> src = //... //
var f = Sse41.ExtractSingle<Int64>(src, 3);
This should be perfectly legal: it saves 1 CPU cycle by doing the extraction and the cast together (a strange binary cast, but still), and I may want to use Int64 for loading to get an atomic load of two adjacent 32-bit values and do some precalculation step with it (again at the binary level).
We are not providing a raw API, however. We are providing a managed abstraction over the underlying hardware instructions. Because it is an abstraction and not raw access to the underlying instructions, some helper functions are provided (such as StaticCast, Set1, etc.) and some limitations are imposed by design (such as no MMX instructions being exposed).
@4creators Thank you for explaining my original design proposal 😄. However, after I added
Vector128<Int64> src = //... //
Vector128<Single> srcFloat = Sse.StaticCast<Int64, Single>(src);
var f = Sse41.ExtractSingle(srcFloat, 3);
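For readers following along, the "reinterpret, then extract" pattern in the snippet above can be sketched with API names that later shipped in .NET Core 3.0. This is an assumption-laden illustration, not the API in this PR: AsSingle() stands in for Sse.StaticCast<Int64, Single>, GetElement stands in for Sse41.ExtractSingle, and the class and method names are hypothetical. The lane layout described in the comments assumes little-endian hardware such as x86.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical helper illustrating a bitwise reinterpretation followed by an extract.
public static class ReinterpretExtractSketch
{
    public static float ExtractAsSingle(Vector128<long> src, int index)
    {
        // Reinterpret the same 128 bits as four floats; no numeric conversion happens.
        Vector128<float> asFloat = src.AsSingle();

        // Read one 32-bit lane. On little-endian hardware, lane 3 is the
        // upper half of the second 64-bit element.
        return asFloat.GetElement(index);
    }
}
```

The cast is a separate, explicit step here, which matches the position taken in the thread that the managed API is an abstraction with a typed extract rather than a generic one.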
Updated the above code example.
Ahh my ... that was a good one on my side 😆
Matching the CoreCLR change dotnet/coreclr#15471
cc @jkotas @eerhardt @tannergooding