-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert RuntimeHelpers intrinsics to JIT ones #88543
Changes from 2 commits
2bfc816
4a43af0
c6db768
2b34801
5d01074
24f2cf9
fcdf181
13bea8f
5c8b92b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2551,6 +2551,9 @@ GenTree* Compiler::impIntrinsic(GenTree* newobjThis, | |
// This one is just `return true/false` | ||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_IsKnownConstant: | ||
|
||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_EnumEquals: | ||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_EnumCompareTo: | ||
|
||
// We need these to be able to fold "typeof(...) == typeof(...)" | ||
case NI_System_Type_GetTypeFromHandle: | ||
case NI_System_Type_op_Equality: | ||
|
@@ -2759,6 +2762,81 @@ GenTree* Compiler::impIntrinsic(GenTree* newobjThis, | |
break; | ||
} | ||
|
||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_IsReference: | ||
{ | ||
assert(sig->sigInst.methInstCount == 1); | ||
CORINFO_CLASS_HANDLE hClass = sig->sigInst.methInst[0]; | ||
retNode = gtNewIconNode(eeIsValueClass(hClass) ? 0 : 1); | ||
break; | ||
} | ||
|
||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_IsReferenceOrContainsReferences: | ||
{ | ||
assert(sig->sigInst.methInstCount == 1); | ||
CORINFO_CLASS_HANDLE hClass = sig->sigInst.methInst[0]; | ||
retNode = gtNewIconNode(!eeIsValueClass(hClass) || ((info.compCompHnd->getClassAttribs(hClass) & CORINFO_FLG_CONTAINS_GC_PTR) != 0) ? 1 : 0); | ||
break; | ||
} | ||
|
||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_IsBitwiseEquatable: | ||
{ | ||
assert(sig->sigInst.methInstCount == 1); | ||
CORINFO_CLASS_HANDLE hClass = sig->sigInst.methInst[0]; | ||
retNode = gtNewIconNode(info.compCompHnd->isBitwiseEquatable(hClass) ? 1 : 0); | ||
break; | ||
} | ||
|
||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_GetMethodTable: | ||
{ | ||
retNode = gtNewMethodTableLookup(impPopStack().val); | ||
jkotas marked this conversation as resolved.
Show resolved
Hide resolved
|
||
break; | ||
} | ||
|
||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_EnumEquals: | ||
{ | ||
assert(sig->sigInst.methInstCount == 1); | ||
CORINFO_CLASS_HANDLE hClass = sig->sigInst.methInst[0]; | ||
CorInfoType jitType = info.compCompHnd->asCorInfoType(hClass); | ||
|
||
if (varTypeIsFloating(JitType2PreciseVarType(jitType))) | ||
{ | ||
return nullptr; | ||
} | ||
|
||
GenTree* op2 = impPopStack().val; | ||
GenTree* op1 = impPopStack().val; | ||
retNode = impImportCompare(op1, op2, GT_EQ, false); | ||
break; | ||
} | ||
|
||
case NI_System_Runtime_CompilerServices_RuntimeHelpers_EnumCompareTo: | ||
{ | ||
assert(sig->sigInst.methInstCount == 1); | ||
|
||
CORINFO_CLASS_HANDLE hClass = sig->sigInst.methInst[0]; | ||
CorInfoType jitType = info.compCompHnd->asCorInfoType(hClass); | ||
var_types type = JitType2PreciseVarType(jitType); | ||
|
||
if (varTypeIsFloating(type)) | ||
{ | ||
return nullptr; | ||
} | ||
|
||
GenTree* op2 = impPopStack().val; | ||
GenTree* op1 = impPopStack().val; | ||
bool uns = varTypeIsUnsigned(type); | ||
|
||
GenTree* op1Clone; | ||
GenTree* op2Clone; | ||
op1 = impCloneExpr(op1, &op1Clone, CHECK_SPILL_ALL, nullptr DEBUGARG("EnumCompareTo arg1")); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is less efficient that the existing implementation for 8-bit and 16-bit enums. Also, this may be better marked as There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The existing implementation that just called CompareTo on the underlying type feels more robust, less duplicative. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Fixed.
It's marked as
But it also relies on that hardcoded table of tables for ILs for selected types there and it relies more on the inliner. This logic resembles the Mono one where it's expanded directly in the intrinsic too. We could potentially change this to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It means that the debug code will allocate vs. release code will not. The performance difference between debug and release is going to be order of magnitude more than what it is today. Performance regressions like this are not acceptable for debug code. It is not unusual to see issues filled on large debug code performance regressions like this by our customers.
It is not hard to fix the code so that it works for all enum underlying types.
The method being inlined is very small. There are thousands of places in this repo where we have small methods to make the code easier to maintain and depend on them being inlined without an issue. Why should this one be different? Enum comparers are not that common for inliner avoidance to make difference.
Right, this approach would not work well. |
||
op2 = impCloneExpr(op2, &op2Clone, CHECK_SPILL_ALL, nullptr DEBUGARG("EnumCompareTo arg2")); | ||
|
||
retNode = gtFoldExpr(gtNewOperNode(GT_SUB, TYP_INT, | ||
impImportCompare(op1, op2, GT_GT, uns), | ||
impImportCompare(op1Clone, op2Clone, GT_LT, uns))); | ||
break; | ||
} | ||
|
||
case NI_System_Runtime_InteropService_MemoryMarshal_GetArrayDataReference: | ||
{ | ||
assert(sig->numArgs == 1); | ||
|
@@ -8948,6 +9026,30 @@ NamedIntrinsic Compiler::lookupNamedIntrinsic(CORINFO_METHOD_HANDLE method) | |
{ | ||
result = NI_System_Runtime_CompilerServices_RuntimeHelpers_IsKnownConstant; | ||
} | ||
else if (strcmp(methodName, "IsReference") == 0) | ||
{ | ||
result = NI_System_Runtime_CompilerServices_RuntimeHelpers_IsReference; | ||
} | ||
else if (strcmp(methodName, "IsReferenceOrContainsReferences") == 0) | ||
{ | ||
result = NI_System_Runtime_CompilerServices_RuntimeHelpers_IsReferenceOrContainsReferences; | ||
} | ||
else if (strcmp(methodName, "IsBitwiseEquatable") == 0) | ||
{ | ||
result = NI_System_Runtime_CompilerServices_RuntimeHelpers_IsBitwiseEquatable; | ||
} | ||
else if (strcmp(methodName, "EnumEquals") == 0) | ||
{ | ||
result = NI_System_Runtime_CompilerServices_RuntimeHelpers_EnumEquals; | ||
} | ||
else if (strcmp(methodName, "EnumCompareTo") == 0) | ||
{ | ||
result = NI_System_Runtime_CompilerServices_RuntimeHelpers_EnumCompareTo; | ||
} | ||
else if (strcmp(methodName, "GetMethodTable") == 0) | ||
{ | ||
result = NI_System_Runtime_CompilerServices_RuntimeHelpers_GetMethodTable; | ||
} | ||
} | ||
else if (strcmp(className, "Unsafe") == 0) | ||
{ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mixed feeling about introducing a new JIT/EE interface method just to support an expansion of one-off BCL method in the JIT. Providing the expansion as IL on the VM is less spread through the system.
This direction makes sense only if we believe that 100% of the IL intrinsics are going to move to the JIT eventually. We would need to have a number of new one-off methods like this to pull it off.
@dotnet/jit-contrib Opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that the this proper recognition of this intrinsic as a constant in the JIT is important since it makes sure it can properly skip importing branches guarded by it, similar to how
IsKnownConstant
andIsReferenceOrContainsReferences
are used. It's also tracked for exposal in #75642.As far as I could see, the only IL intrinsics left in the CoreCLR VM are the Unsafe ones (tracked for removal in #69220),
Interlocked.CompareExchange(T)
which can be removed with no impact today since it already has an equivalent managed implementation withUnsafe.As
andActivator.CreateInstance<T>
which seems to be the only problematic one.Since "bitwise equality" is a somewhat fundamental type property (and could potentially be useful for other optimizations in the JIT), I've considered making this a flag in
getClassAttribs
, but I've refrained from doing so since it's I think relatively expensive to fetch and wouldn't be used that often.However, citing @SingleAccretion from Discord:
maybe it'd make sense to introduce
getExtendedClassAttribs(uint32_t mask)
that'd only fetch more rarely used/expensive flags (with a new separate enum for them) like this or #88136 based on the mask?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a list of the APIs that would still require moving and how complex they are to handle? That is, are many of them effectively simple flags like this one which are often trivial to lookup and which may get used on generic hot paths or do we have a number of substantially more complex APIs where the benefits of special-casing them in the JIT are more questionable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, this is a workaround for a deficiency in the current JIT inlining strategy. It would be better to fix it in more general way for everybody out there instead of using it to justify implementing more methods as JIT intrinsics.
I expect that we would see performance regressions in shared generic code callsites. It is a workaround for the current shared generic code inlining limitation.
getExtendedClassAttribs(uint32_t mask)
would make sense if the typically caller needs most of the flags and when there is an efficiency benefit from computing and returning multiple flags together. If the typically caller needs just one flag as is the case for expanding one-off Intrinsics, it is better to have an API that returns the one property directly and avoids the extra level of abstraction.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current CoreCLR and native AOT intrinsics that would required introducing new special new JIT-EE interface APIs. You can generally assume that it is one new JIT/EE interface method for each:
Activaror.CreateInstanceOf<T>
EqualityCompaparer/Comparer.Create<T>
EqualityComparerHelpers.GetComparerForReferenceTypesOnly/StructOnlyEquals
Interlocked.CompareExchange<T>(ref)
RuntimeAugments.GetCanonType
MethodTable.SupportsRelativePointers
System.IO.Stream.HasOverriddenBeginEndRead/HasOverriddenBeginEndReadHasOverriddenBeginEndWrite
We like introducing intrinsics to do various tricks, so this list would grow over time. Also, some of these would require special-handling in the native AOT scanner if they are not IL intrinsics anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even then, having just a temporary workaround is better than not having one.
I don't really think that's the case, I think the only difference is the smaller IL size due to no Unsafe.As calls and the inlining boost from being an intrinsic.
runtime/src/coreclr/System.Private.CoreLib/src/System/Threading/Interlocked.CoreCLR.cs
Lines 120 to 128 in 3110693
I think that it only makes sense to move apis that are: performance sensitive (like the three
Is
APIs here), more valid (like theGetMethodTable
one) or ones that have big implementations in both runtimes (like the enum ones). I don't think it makes much sense to have runtime specific, rarely used intrinsics likeHasOverriddenBeginEndRead
that'd just call into the VM.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you come up with a real-world looking example where
isBitwiseEquatable
expansion in the JIT makes a difference to show that it is worth doing?Yes, the GetMethodTable one looks good.
The implementations in each runtime are about 30 lines of code each. I would not call that big. The inlining expansion in the JIT is not reducing runtime complexity much overall.
The expansion in the JIT makes the code less maintainable by creating a copy of CompareTo implementation from 10 different .cs files into the JIT. To make this explicit, you would want to add a comment "This logic is duplicated in the JIT in EnumCompare intrinsic expansion" to CompareTo implementation in all Byte.cs, UByte.cs, ...., etc. I do not think it would be a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have done a few quick tests and I agree that the IL intrinsic for Interlocked.CompareExchange does not seem to be making a difference. Feel free to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the motivation for these changes. It seems like a lot of code churn late in a release cycle with a decent risk of introducing correctness or perf issues.
If we think further unification / simplification here is sufficient motivation, perhaps it can wait until .NET 9?