Consider devirtualizing & inlining Comparer<T>.Default.CompareTo as well. #39873
Thanks, will take a look. Suspect this is similar to #39519: the JIT devirtualizes late and so can't inline.
For
Late devirt here requires class init, so doesn't happen either.
When the Comparer class has been initialized I get the SharpLab code, and indeed here we've devirtualized late:
; Assembly listing for method C:Compare2(int,int):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; fully interruptible
; Final local variable assignments
;
; V00 arg0 [V00,T00] ( 3, 3 ) int -> r8
; V01 arg1 [V01,T01] ( 3, 3 ) int -> rax
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
;
; Lcl frame size = 0
G_M274_IG01:
mov r8d, ecx
mov eax, edx
;; bbWeight=1 PerfScore 0.50
G_M274_IG02:
mov rcx, 0xD1FFAB1E
mov rcx, gword ptr [rcx]
mov edx, r8d
mov r8d, eax
mov rax, 0xD1FFAB1E
mov rax, qword ptr [rax]
;; bbWeight=1 PerfScore 5.00
G_M274_IG03:
rex.jmp rax
;; bbWeight=1 PerfScore 2.00
as seen in the jit dump
Codegen for some other types is interesting too. For short types we use subtract, which is good, but somehow we also need to spill and reload one arg:
; Assembly listing for method C:Compare(short,short):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
; V00 arg0 [V00,T00] ( 3, 3 ) short -> [rsp+0x08] do-not-enreg[F] ld-addr-op
; V01 arg1 [V01,T01] ( 3, 3 ) short -> rdx
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
;
; Lcl frame size = 0
G_M29376_IG01:
mov dword ptr [rsp+08H], ecx
;; bbWeight=1 PerfScore 1.00
G_M29376_IG02:
movsx rax, word ptr [rsp+08H]
movsx rdx, dx
sub eax, edx
;; bbWeight=1 PerfScore 1.50
G_M29376_IG03:
ret
;; bbWeight=1 PerfScore 1.00
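The `sub eax, edx` in the listing is consistent with comparing 16-bit values by widening and subtracting. As a rough sketch of why that works for short but not for int (the class and method names here are hypothetical, chosen only for illustration, and are not runtime APIs):

```csharp
using System;

public static class ShortCompareSketch
{
    // Both shorts are promoted to int before the subtraction, so the result
    // lies in [-65535, 65535]: it cannot overflow, and its sign gives the ordering.
    public static int CompareBySubtract(short x, short y) => x - y;

    // The same trick is not safe for int: int.MinValue - 1 wraps around in
    // unchecked arithmetic, so the sign could be wrong. Explicit comparisons
    // are used instead.
    public static int CompareInts(int x, int y) => x < y ? -1 : (x > y ? 1 : 0);
}
```

Only the sign of the subtraction result is meaningful; callers should not rely on the magnitude.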
For the
Wonder if the mixture of types here gets in the way. Seems a bit odd that we widen the type in
cc @dotnet/jit-contrib
Seems to be the case, with
@@ -11536,8 +11536,8 @@ void Compiler::impImportBlockCode(BasicBlock* block)
goto ADRVAR;
ADRVAR:
- op1 = gtNewLclvNode(lclNum, lvaGetActualType(lclNum) DEBUGARG(opcodeOffs + sz + 1));
+ op1 = gtNewLclvNode(lclNum, lvaGetRealType(lclNum) DEBUGARG(opcodeOffs + sz + 1));
we now generate
; Assembly listing for method C:Compare(short,short):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
; V00 arg0 [V00,T00] ( 3, 3 ) short -> rcx ld-addr-op
; V01 arg1 [V01,T01] ( 3, 3 ) short -> rdx
;# V02 OutArgs [V02 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
;
; Lcl frame size = 0
G_M29376_IG01:
;; bbWeight=1 PerfScore 0.00
G_M29376_IG02:
movsx rax, cx
movsx rdx, dx
sub eax, edx
;; bbWeight=1 PerfScore 0.75
G_M29376_IG03:
ret
;; bbWeight=1 PerfScore 1.00
; Total bytes of code 11, prolog size 0, PerfScore 2.85, (MethodHash=b17e8d3f) for method C:Compare(short,short):int
Quick gloss of the above importer change over SPC looks promising:
Overall support should parallel what we did for the generic comparer, something like the following:
Will consider this for .NET 6.
This was fixed by #48160, correct?
Starting in .NET Core 2.1, calling EqualityComparer<T>.Default.Equals is specially optimized by the JIT, which devirtualizes and inlines the equality check. However, as seen in SharpLab, the similar Comparer<T>.Default.CompareTo did not have the same fate. As seen in the generated assembly code, comparing integers via IComparable<int> is optimized, but calling Comparer<int>.CompareTo(x1, x2) results in a function call. Making the two examples generate identical code would be a good optimization opportunity.
category:cq
theme:devirtualization
skill-level:intermediate
cost:small
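For readers without the SharpLab link, the two call shapes under discussion probably look something like the sketch below. The class name C and the method name Compare2 are taken from the JIT listings in this thread; Compare1 is a hypothetical name for the IComparable<int> variant, and the exact bodies are an assumption. Note that the instance method on Comparer<T> is Compare (the issue text writes CompareTo).

```csharp
using System;
using System.Collections.Generic;

public static class C
{
    // Hypothetical name: a direct call through IComparable<int>, which the
    // issue reports as already being optimized.
    public static int Compare1(int x, int y) => x.CompareTo(y);

    // Name taken from the listing in this thread: goes through
    // Comparer<int>.Default, which the issue reports ends up as a call
    // rather than inlined code.
    public static int Compare2(int x, int y) => Comparer<int>.Default.Compare(x, y);
}
```

Both methods return the same ordering for any pair of ints; the request in this issue is only about making the generated code for the second match the first.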