Marking Vector128<T>.Count and Vector256<T>.Count as [Intrinsic] #24991
Let me know if this misses the bar for 3.0, and I can close until after master opens back up. I opened it given the triviality of the fix.
This was found today when debugging some code with a customer who found a difference between some code using
…etting the simdSize and baseType
Does this include unrolling of for-loop with Just asking out of interest.
AFAIK, the appropriate flags are being set (this is the
What would be the need for such unrolling? In general you don't expect vectors to be accessed element by element a lot.
@mikedn: I have no need, just out of interest. (for reduction there are often better ways).
LGTM, other than some minor suggestions. I think this meets the bar for 3.0, because it's pretty straightforward, and developers will expect this to be recognized.
src/jit/compiler.h
-#define LPFLG_SIMD_LIMIT 0x0080 // iterator is compared with Vector<T>.Count (found in lpConstLimit)
+#define LPFLG_SIMD_LIMIT \
+    0x0080 // iterator is compared with Vector<T>, Vector64<T>, Vector128<T>, or Vector256<T>.Count (found in
+           // lpConstLimit)
nit: This formatting looks pretty weird. Did jit-format do this automatically? I think it would be OK if you aligned it as before, and manually split the constant to the next line.
Yes, this is how the format patch fixed it up. I might just fix this up to say "vector count" as well to keep the comment shorter.
src/jit/gentree.cpp
@@ -10196,7 +10196,7 @@ void Compiler::gtDispConst(GenTree* tree)
 #ifdef FEATURE_SIMD
     if ((tree->gtFlags & GTF_ICON_SIMD_COUNT) != 0)
     {
-        printf(" Vector<T>.Count");
+        printf(" Vector<T>, Vector64<T>, Vector128<T>, or Vector256<T>.Count");
Rather than making this so verbose, I think it would be perfectly reasonable to just print "Vector count".
Would you prefer "vector count" or "vector element count"?
The latter seems to be more explicit as to what the count is, but I don't feel particularly strongly about it 😄
Vector element count sounds good.
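As a rough illustration of the display change discussed in this thread, the sketch below appends a short "vector element count" annotation to a constant when a flag bit is set. Everything here is hypothetical and written for illustration only: `ConstNode`, `kIconSimdCountFlag`, and `describeConst` are stand-ins, not the JIT's `GenTree`, `GTF_ICON_SIMD_COUNT`, or `gtDispConst`, and the flag value is made up.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical flag bit for this sketch only; the real GTF_ICON_SIMD_COUNT
// value does not appear in the diff above.
constexpr std::uint32_t kIconSimdCountFlag = 0x1;

// Hypothetical stand-in for an integer constant node with flag bits.
struct ConstNode
{
    std::int64_t  value;
    std::uint32_t flags;
};

// Builds the display text for a constant, adding the short annotation
// when the constant is flagged as a vector element count.
inline std::string describeConst(const ConstNode& node)
{
    std::string text = std::to_string(node.value);
    if ((node.flags & kIconSimdCountFlag) != 0)
    {
        // A vector element count is always a positive number.
        assert(node.value > 0);
        text += " vector element count";
    }
    return text;
}
```

The shorter annotation keeps the dump readable regardless of which vector type produced the constant.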
…ctor256_Count don't return nullptr
    baseType = getBaseTypeAndSizeOfSIMDType(sig->retTypeClass, &simdSize);
    retType = getSIMDTypeForSize(simdSize);
}
else
{
    baseType = getBaseTypeAndSizeOfSIMDType(clsHnd, &simdSize);
Had to change this to be here as the check right below, if (!varTypeIsArithmetic(baseType)), is what handles unsupported T, and we were returning nullptr.
Validated that we now return the integer constant node, that the loop unrolling functionality works (cc. @gfoidl), and the codegen for the known cases is now "efficient".
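To illustrate why the count can be represented as an integer constant: for a fixed vector size and element type, the element count is just the vector size in bytes divided by the element size. The sketch below is a standalone C++ analogue under that assumption; `vectorElementCount` is a hypothetical helper, not a function in the JIT.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the JIT): number of elements of type T
// that fit in a SIMD vector of SimdSizeInBytes bytes, known at compile time.
template <typename T, std::size_t SimdSizeInBytes>
constexpr std::size_t vectorElementCount()
{
    static_assert(SimdSizeInBytes % sizeof(T) == 0, "element size must divide the vector size");
    return SimdSizeInBytes / sizeof(T);
}

// Runtime variant for when the sizes are only known dynamically.
inline std::size_t vectorElementCount(std::size_t simdSizeInBytes, std::size_t elementSizeInBytes)
{
    assert(elementSizeInBytes != 0 && simdSizeInBytes % elementSizeInBytes == 0);
    return simdSizeInBytes / elementSizeInBytes;
}
```

For example, a 16-byte vector of 32-bit integers holds 4 elements and a 32-byte vector holds 8, so a loop bound written in terms of the count is a known constant.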
Maybe for 5.0, it would be nice to make the loop unrolling work for "real world" scenarios, such as for (int i = 0; i < data.Length; i += Vector128<T>.Count) ...
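The loop shape mentioned here, striding over the data by the vector element count, can be sketched as a plain C++ analogue. This is not .NET or JIT code: `kVectorElementCount` and `sumInBlocks` are hypothetical names, and the count of 4 assumes a 128-bit vector of 32-bit integers. The inner per-lane loop has a constant trip count, which is the kind of loop a compiler can unroll fully.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed for this sketch: a 128-bit vector of 32-bit ints has 4 elements.
constexpr std::size_t kVectorElementCount = 4;

// Sums the data in blocks of kVectorElementCount, then handles the tail.
inline long long sumInBlocks(const std::vector<int>& data)
{
    long long   lanes[kVectorElementCount] = {}; // one accumulator per lane
    std::size_t i                          = 0;

    // Main loop: advance by the vector element count while a full block remains.
    for (; i + kVectorElementCount <= data.size(); i += kVectorElementCount)
    {
        // Constant trip count, so this loop is a candidate for full unrolling.
        for (std::size_t lane = 0; lane < kVectorElementCount; ++lane)
        {
            lanes[lane] += data[i + lane];
        }
    }
    assert(i <= data.size());

    long long total = 0;
    for (std::size_t lane = 0; lane < kVectorElementCount; ++lane)
    {
        total += lanes[lane];
    }

    // Scalar tail: fewer than kVectorElementCount elements remain.
    for (; i < data.size(); ++i)
    {
        total += data[i];
    }
    return total;
}
```

The upper bound here depends on the data length rather than on a constant, which is what distinguishes this case from a loop bounded directly by the element count.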
Perhaps you could file an issue?
Will file one before merging.
Looks like this is largely covered by both https://github.com/dotnet/coreclr/issues/11606 and https://github.com/dotnet/coreclr/issues/20486.
I've added comments to both of these instead.
…net/coreclr#24991)
* Marking Vector128<T>.Count and Vector256<T>.Count as [Intrinsic]
* Fixing NI_Vector128_Count and NI_Vector256_Count to use clsHnd when getting the simdSize and baseType
* Applying the formatting patch.
* Changing some comments to just be "vector element count".
* Fixing impBaseIntrinsic to set the baseType so Vector128_Count and Vector256_Count don't return nullptr
Commit migrated from dotnet/coreclr@9321692
CC. @CarolEidt.
The Vector128<T>.Count and Vector256<T>.Count methods weren't marked as intrinsic so they:
This PR marks them as [Intrinsic] and hooks them up to the same handling as Vector<T>.Count.