DictionarySlim backport improvements, retaining more entropy #22832
src/System.Private.CoreLib/shared/System/Collections/Generic/Dictionary.cs
/azp run |
@dotnet-bot test this please |
@safern should the new command also work here? |
Yeah, it should work. Weird that it didn't work for you. Let me give it a try. |
/azp run |
Azure Pipelines successfully started running 3 pipeline(s). |
Damn, it seems like coreclr now has outerloop builds already. |
/cc @safern this PR is in your area; I don't know who should review it. |
@MarcoRossignoli thanks for looking at this. I am trying to absorb the information above. When gathering perf data, it is ideal to include the before and after in the same Benchmark.NET run so that it can show before and after in the same table, and calculate the base/candidate ratio (I see at least one of your tables has this) and also to be super explicit about what commit is base and what commit is candidate. That way it's easier for the rest of us to keep up 😃 Of the various changes suggested in the issue, did any produce a clear perf improvement, and no significant regressions? Can we limit this PR to just that/those, and show the before/after perf data for just that/those? |
I used the new comparer tool to compare before/after because, AFAIK, it's not possible to merge the results of two different benchmark runs together (@adamsitnik, could you confirm?). Dictionary lives in coreclr, so to test it I need to build my local coreclr+corefx and then run the performance repo tests against that local build. I cannot run and compare the old and new versions together, so my strategy was to run the old code (coreclr+corefx with no dictionary update), then the new code (updated dictionary on coreclr+corefx), and compare the two with the comparer tool. BTW, I'll now try to produce the comparison in a single report by cloning 4 repos, current coreclr+corefx vs. coreclr (dictionary updated)+corefx, using two CoreRun.exe. I think that's the only way to compare correctly. What do you think, @adamsitnik? Is there a better way, or a BenchmarkDotNet feature to merge two "cold" results?
The issue asked to test the 2 remaining points:
Check new
for (int i = 1; i < Items; i++)
{
    dict.Add(i, i);
    dict.Add(int.MinValue + i, int.MinValue + i);
}
Finally, I think that using uint for the hash code could be an improvement when there are collisions, with no measurable regression. Eliminating _freeCount removes one variable but regresses all other operations because enumerating _entries becomes more complex (I can show results for this second one after this PR if needed). This PR shows results only for the one reasonable improvement, "Use uint hash code for better entropy". /cc @danmosemsft |
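To make the collision case above concrete, here is a minimal sketch of why keeping the sign bit helps. It is not the actual Dictionary code: the names `HashEntropyDemo`, `BucketMasked`, and `BucketUnsigned` and the bucket count of 17 are made up for illustration. It only assumes that the old scheme clears the sign bit (`& 0x7FFFFFFF`) before a signed modulo, and that the new scheme takes an unsigned modulo over all 32 bits.

```csharp
using System;

public static class HashEntropyDemo
{
    // Old-style bucket selection: clear the sign bit, then take a signed modulo.
    // i and int.MinValue + i differ only in the sign bit, so they always collide.
    public static int BucketMasked(int hashCode, int bucketCount)
    {
        return (hashCode & 0x7FFFFFFF) % bucketCount;
    }

    // New-style bucket selection: keep all 32 bits and take an unsigned modulo.
    public static int BucketUnsigned(int hashCode, int bucketCount)
    {
        return (int)((uint)hashCode % (uint)bucketCount);
    }

    public static void Main()
    {
        const int bucketCount = 17; // hypothetical prime bucket count
        int collisionsMasked = 0;
        int collisionsUnsigned = 0;

        // Same key pattern as the benchmark loop: i and int.MinValue + i.
        for (int i = 1; i < 1000; i++)
        {
            int a = i;
            int b = int.MinValue + i;
            if (BucketMasked(a, bucketCount) == BucketMasked(b, bucketCount)) collisionsMasked++;
            if (BucketUnsigned(a, bucketCount) == BucketUnsigned(b, bucketCount)) collisionsUnsigned++;
        }

        Console.WriteLine("masked collisions:   " + collisionsMasked);
        Console.WriteLine("unsigned collisions: " + collisionsUnsigned);
    }
}
```

With the masked scheme every pair lands in the same bucket and also stores the same hash code, so each lookup has to fall back to the key comparison. With the unsigned scheme the stored hash codes differ, so a mismatch can be rejected on the hash code alone.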
rebased for better comparison with coreclr+corefx upstream |
Thanks @MarcoRossignoli that makes things clearer. For your perf results immediately above, is it possible to include the error? For example I am not sure how CtorDefaultSize could possibly change, but perhaps that 13% is within the error since it is a very short time. Likewise I wonder how the improvements compare to the noise level. |
@danmosemsft I merged the results (updated corefx+coreclr vs. upstream corefx+coreclr); the numbers are slightly better on the custom collision tests.
BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.17134.590 (1803/April2018Update/Redstone4)
Intel Core i7 CPU 860 2.80GHz (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2727538 Hz, Resolution=366.6310 ns, Timer=TSC
.NET Core SDK=3.0.100-preview3-010431
[Host] : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT
Job-TYTTNG : .NET Core fb2d4c2d-5f0e-4d7c-bb86-1e82a8054ebf (CoreCLR 4.6.27603.0, CoreFX 4.7.19.15701), 64bit RyuJIT
Job-HUJGQG : .NET Core fc4c2829-86c1-4f64-abed-1da2cd29ddb8 (CoreCLR 4.6.27602.0, CoreFX 4.7.19.15501), 64bit RyuJIT
Job-TLMUNW : .NET Core fb2d4c2d-5f0e-4d7c-bb86-1e82a8054ebf (CoreCLR 4.6.27603.0, CoreFX 4.7.19.15701), 64bit RyuJIT
Job-CFGYCY : .NET Core fc4c2829-86c1-4f64-abed-1da2cd29ddb8 (CoreCLR 4.6.27602.0, CoreFX 4.7.19.15501), 64bit RyuJIT
Runtime=Core IterationTime=250.0000 ms MaxIterationCount=20
MinIterationCount=15 WarmupCount=1
High error on
BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.17134.590 (1803/April2018Update/Redstone4)
Intel Core i7 CPU 860 2.80GHz (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2727538 Hz, Resolution=366.6310 ns, Timer=TSC
.NET Core SDK=3.0.100-preview3-010431
[Host] : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 64bit RyuJIT
Job-YLVJAB : .NET Core 7b2a2d6f-d4e3-42b1-9a16-d56d311cd3a3 (CoreCLR 4.6.27603.0, CoreFX 4.7.19.15701), 64bit RyuJIT
Job-SMKTJS : .NET Core cdce2a1f-95b7-448d-8f4b-eff5d026825c (CoreCLR 4.6.27602.0, CoreFX 4.7.19.15501), 64bit RyuJIT
Runtime=Core IterationTime=250.0000 ms MaxIterationCount=20
MinIterationCount=15 WarmupCount=1
FYI @jkotas during perf tests I saw a large perf decrease (~20%) on simple I did some tests with a dump, and there is a difference in the emitted code that leads to more "dereferencing":
public class D2<K, V>
{
    private S[] _entries;
    public struct S
    {
        public int next;
        public uint hashcode;
        public K key;
        public V val;
    }
    public void Test()
    {
        for (int i = 0; i < _entries.Length; i++)
        {
            if (_entries[i].next > -1) Console.WriteLine(_entries[i].next);
        }
    }
}
...
; Tier-1 compilation
...
G_M54470_IG03:
488B4E08 mov rcx, gword ptr [rsi+8]
3B7908 cmp edi, dword ptr [rcx+8]
7326 jae SHORT G_M54470_IG06
4863C7 movsxd rax, edi
48C1E004 shl rax, 4
8B4C0110 mov ecx, dword ptr [rcx+rax+16]
85C9 test ecx, ecx
7C05 jl SHORT G_M54470_IG04
E866FEFFFF call Console:WriteLine(int)
uint before
public class D<K, V>
{
    private S[] _entries;
    public struct S
    {
        public uint hashcode;
        public int next;
        public K key;
        public V val;
    }
    public void Test()
    {
        for (int i = 0; i < _entries.Length; i++)
        {
            if (_entries[i].next > -1) Console.WriteLine(_entries[i].next);
        }
    }
}
...
; Tier-1 compilation
...
G_M340_IG03:
488B4E08 mov rcx, gword ptr [rsi+8]
3B7908 cmp edi, dword ptr [rcx+8]
732C jae SHORT G_M340_IG06
4863C7 movsxd rax, edi
48C1E004 shl rax, 4
488D4C0110 lea rcx, bword ptr [rcx+rax+16]
83790400 cmp dword ptr [rcx+4], 0
7C08 jl SHORT G_M340_IG04
8B4904 mov ecx, dword ptr [rcx+4]
E800FFFFFF call Console:WriteLine(int)
I'm not so fluent in codegen yet, so maybe this is expected, but it's better to ask and clear up my doubt. |
In this particular case, the JIT could have optimized this into the same code in both cases. However, accessing fields at offset zero (vs. a non-zero offset) tends to generate a tiny bit better code on x86/x64 when everything else is equal. The improvement will be non-measurable in most cases, but there are rare cases where the improvement gets amplified due to processor micro-architecture and you can get a measurable improvement like in this case. Improvements like these tend to come and go with unrelated changes. You need to take them with a grain of salt. |
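For anyone who wants to check which field ends up at offset zero, a minimal sketch follows. It uses non-generic stand-ins for the two `S` layouts above, with `int` key and value; that choice and the names `EntryLayoutDemo`, `EntryNextFirst`, `EntryHashFirst`, and `OffsetOf` are assumptions for illustration only. `Marshal.OffsetOf` reports the marshaled layout, which for blittable sequential structs like these is expected to match the managed layout, but treat that as an assumption too.

```csharp
using System;
using System.Runtime.InteropServices;

public static class EntryLayoutDemo
{
    // Stand-in for the layout with `next` first (offset 0).
    [StructLayout(LayoutKind.Sequential)]
    public struct EntryNextFirst
    {
        public int next;
        public uint hashCode;
        public int key;
        public int value;
    }

    // Stand-in for the layout with the hash code first (offset 0).
    [StructLayout(LayoutKind.Sequential)]
    public struct EntryHashFirst
    {
        public uint hashCode;
        public int next;
        public int key;
        public int value;
    }

    // Byte offset of a named field inside struct T.
    public static int OffsetOf<T>(string field) where T : struct
    {
        return (int)Marshal.OffsetOf<T>(field);
    }

    public static void Main()
    {
        Console.WriteLine("EntryNextFirst.next:     " + OffsetOf<EntryNextFirst>("next"));
        Console.WriteLine("EntryNextFirst.hashCode: " + OffsetOf<EntryNextFirst>("hashCode"));
        Console.WriteLine("EntryHashFirst.next:     " + OffsetOf<EntryHashFirst>("next"));
        Console.WriteLine("EntryHashFirst.hashCode: " + OffsetOf<EntryHashFirst>("hashCode"));
        Console.WriteLine("entry size:              " + Marshal.SizeOf<EntryNextFirst>());
    }
}
```

This only shows where the fields sit; whether the field at offset zero actually yields faster code depends on the JIT and the processor, as noted above.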
Actually, before, the hashcode we compared was at offset 0; maybe this "improvement" was already in place then, or it's simply "unrelated" to that past choice, as you said. |
@safern @ViktorHofer is there something I can do for failing CI here? |
My personal workflow is the following:
Now, depending on how many benchmarks I want to run:
If some of the nano-benchmarks seem to be unstable, I run them affinitized to one CPU. Example: |
@adamsitnik thanks for the info! |
The CI logs are gone for some reason. |
Was there a reason to change? You could try using the original order (does not sound important, so entirely up to you). I suppose |
There is a perf decrease of roughly 20% on the Contains test; you can read the explanation in this comment #22832 (comment), after the grid of results. If it's not a concern to you, I'll restore the old layout.
@danmosemsft let me try to explain better: during performance tests I found that the offset of the fields in the Entry struct changes the emitted code.
Now we could revert the order and move on. One doubt is that I cannot see a difference in perf with I could try to dump the Add/Remove methods and check the emitted code to understand. What are your thoughts? /cc @jkotas @stephentoub If you agree with the revert, I'll do it. |
I do not think it matters a whole lot, for the reasons in #22832 (comment). Ideally, we would fix the JIT to not generate the extra instruction that seems to be impacting the Contains micro-benchmark measurably. |
OK @MarcoRossignoli what you did makes sense to me. Sounds like you might consider opening a CoreCLR issue for the JIT. They will want a small repro if possible. Certainly no need to wait on that. |
OK, I'll revert the order and open an issue on CoreCLR with the sample above, thanks. |
@MarcoRossignoli I assumed you would keep whatever order is fastest in your measurements (since aside from that we do not care about ordering). The bug is just to help other people. |
I apologize @danmosemsft, I misunderstood the intention. I confirm that I'll keep the code as is; it's ready for review, and the outcome of the perf tests is above! |
@jkotas I don't have an x86 machine. Is there a way to run x86 on x64 without distorting the outcome? Or do you mean compile for x86 and compare before/after on x64? |
It is fine to just build and run x86 build on x64 machine. |
@MarcoRossignoli you can use the Python script from the dotnet/performance repo and tell it to download the x86 CLI and run the benchmarks using it:
py .\scripts\benchmarks_ci.py --architecture x86
If you are using CoreRun to run the benchmarks, you need to build the repo for x86 Release as well:
build -c Release -arch x86 |
Thanks @adamsitnik I'm using |
@jkotas @danmosemsft x86 tests are "in line"
BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.17134.648 (1803/April2018Update/Redstone4)
Intel Core i7 CPU 860 2.80GHz (Nehalem), 1 CPU, 8 logical and 4 physical cores
Frequency=2727540 Hz, Resolution=366.6307 ns, Timer=TSC
.NET Core SDK=3.0.100-preview3-010431
[Host] : .NET Core 3.0.0-preview3-27503-5 (CoreCLR 4.6.27422.72, CoreFX 4.7.19.12807), 32bit RyuJIT
Job-HNRQQM : .NET Core ae988845-d656-4730-890f-0a72e2fa9bd5 (CoreCLR 4.6.27623.0, CoreFX 4.700.19.17601), 32bit RyuJIT
Job-CHVHEB : .NET Core 57e7ed6b-68a2-4caf-a251-80b88523826f (CoreCLR 4.6.27623.0, CoreFX 4.700.19.17601), 32bit RyuJIT
Job-VBGKTO : .NET Core ae988845-d656-4730-890f-0a72e2fa9bd5 (CoreCLR 4.6.27623.0, CoreFX 4.700.19.17601), 32bit RyuJIT
Job-NSZOKV : .NET Core 57e7ed6b-68a2-4caf-a251-80b88523826f (CoreCLR 4.6.27623.0, CoreFX 4.700.19.17601), 32bit RyuJIT
Runtime=Core IterationTime=250.0000 ms MaxIterationCount=20
MinIterationCount=15 WarmupCount=1
Now on |
@adamsitnik maybe you can save me some "search" time: is there a way to re-order/specify columns using a console command? I'd like to have |
there is no way to do it from console command, you would have to modify the code |
thanks for the quick response! |
I don't see any issue on x64; the principal differences are as expected:
diff --git a/x64coreclrupstream.txt b/x64coreclr.txt
index bbac7b2..4aa19ef 100644
--- a/x64coreclrupstream.txt
+++ b/x64coreclr.txt
@@ -6,56 +6,57 @@
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T03] ( 34, 21.50) ref -> rsi this class-hnd
+; V00 this [V00,T03] ( 35, 22 ) ref -> rsi this class-hnd
; V01 arg1 [V01,T08] ( 11, 9 ) ref -> rdi ld-addr-op class-hnd
; V02 arg2 [V02,T16] ( 5, 3.50) ref -> rbp class-hnd
; V03 arg3 [V03,T13] ( 6, 4 ) ubyte -> rbx
-; V04 loc0 [V04,T04] ( 11, 23 ) ref -> registers class-hnd
+; V04 loc0 [V04,T04] ( 13, 24 ) ref -> registers class-hnd
; V05 loc1 [V05,T12] ( 6, 6 ) ref -> r15 class-hnd
; V06 loc2 [V06,T09] ( 6, 11 ) int -> r12
; V07 loc3 [V07,T01] ( 8, 25.50) int -> r13
; V08 loc4 [V08,T20] ( 5, 3.50) byref -> [rsp+0x50]
; V09 loc5 [V09,T00] ( 9, 29 ) int -> [rsp+0x5C]
-; V10 loc6 [V10,T30] ( 3, 1.50) bool -> registers
-; V11 loc7 [V11,T23] ( 6, 3 ) int -> registers
-; V12 loc8 [V12,T22] ( 6, 3 ) byref -> [rsp+0x48]
+; V10 loc6 [V10,T31] ( 3, 1.50) bool -> registers
+; V11 loc7 [V11,T22] ( 6, 3 ) int -> registers
+; V12 loc8 [V12,T26] ( 5, 2.50) byref -> [rsp+0x48]
;* V13 loc9 [V13 ] ( 0, 0 ) ref -> zero-ref ld-addr-op class-hnd
; V14 loc10 [V14,T18] ( 3, 4.50) ref -> [rsp+0x40] class-hnd
-; V15 loc11 [V15,T31] ( 3, 1.50) int -> r14
+; V15 loc11 [V15,T32] ( 3, 1.50) int -> r14
; V16 OutArgs [V16 ] ( 1, 1 ) lclBlk (32) [rsp+0x00] "OutgoingArgSpace"
-; V17 tmp1 [V17,T28] ( 3, 2 ) int -> rax
+; V17 tmp1 [V17,T29] ( 3, 2 ) int -> r12
; V18 tmp2 [V18,T19] ( 5, 3.74) ref -> r15 class-hnd "spilling QMark2"
; V19 tmp3 [V19,T10] ( 3, 10 ) long -> rcx "impRuntimeLookup slot"
; V20 tmp4 [V20,T11] ( 2, 8 ) ref -> [rsp+0x38] class-hnd "impAppendStmt"
;* V21 tmp5 [V21 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
; V22 tmp6 [V22,T05] ( 5, 18 ) long -> r11 "impRuntimeLookup typehandle"
;* V23 tmp7 [V23 ] ( 0, 0 ) long -> zero-ref "VirtualCall with runtime lookup"
-; V24 tmp8 [V24,T34] ( 3, 0 ) long -> rcx "impRuntimeLookup slot"
+; V24 tmp8 [V24,T35] ( 3, 0 ) long -> rcx "impRuntimeLookup slot"
;* V25 tmp9 [V25 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
-; V26 tmp10 [V26,T32] ( 4, 0 ) long -> rax "impRuntimeLookup typehandle"
-; V27 tmp11 [V27,T26] ( 3, 2.50) long -> rcx "impRuntimeLookup slot"
+; V26 tmp10 [V26,T33] ( 4, 0 ) long -> rax "impRuntimeLookup typehandle"
+; V27 tmp11 [V27,T27] ( 3, 2.50) long -> rcx "impRuntimeLookup slot"
; V28 tmp12 [V28,T21] ( 4, 3.50) long -> r9 "impRuntimeLookup typehandle"
-; V29 tmp13 [V29,T35] ( 3, 0 ) long -> rcx "impRuntimeLookup slot"
+; V29 tmp13 [V29,T36] ( 3, 0 ) long -> rcx "impRuntimeLookup slot"
;* V30 tmp14 [V30 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
-; V31 tmp15 [V31,T33] ( 4, 0 ) long -> rax "impRuntimeLookup typehandle"
+; V31 tmp15 [V31,T34] ( 4, 0 ) long -> rax "impRuntimeLookup typehandle"
;* V32 tmp16 [V32 ] ( 0, 0 ) long -> zero-ref "impRuntimeLookup slot"
;* V33 tmp17 [V33 ] ( 0, 0 ) long -> zero-ref "impRuntimeLookup typehandle"
;* V34 tmp18 [V34 ] ( 0, 0 ) long -> zero-ref "impRuntimeLookup slot"
;* V35 tmp19 [V35 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
;* V36 tmp20 [V36 ] ( 0, 0 ) long -> zero-ref "impRuntimeLookup typehandle"
-; V37 tmp21 [V37,T27] ( 3, 2.50) long -> rcx "impRuntimeLookup slot"
+; V37 tmp21 [V37,T28] ( 3, 2.50) long -> rcx "impRuntimeLookup slot"
;* V38 tmp22 [V38 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
; V39 tmp23 [V39,T17] ( 5, 4.50) long -> r11 "impRuntimeLookup typehandle"
;* V40 tmp24 [V40 ] ( 0, 0 ) long -> zero-ref "VirtualCall with runtime lookup"
; V41 tmp25 [V41,T14] ( 3, 6 ) ref -> rcx "arr expr"
; V42 tmp26 [V42,T15] ( 3, 6 ) int -> rdx "arr expr"
;* V43 tmp27 [V43 ] ( 0, 0 ) ref -> zero-ref "argument with side effect"
-; V44 tmp28 [V44,T29] ( 2, 2 ) int -> rdx "argument with side effect"
-; V45 tmp29 [V45,T24] ( 3, 3 ) ref -> rcx "arr expr"
-; V46 tmp30 [V46,T25] ( 3, 3 ) int -> rdx "arr expr"
-; V47 cse0 [V47,T06] ( 4, 12.50) byref -> [rsp+0x30] "ValNumCSE"
-; V48 cse1 [V48,T07] ( 4, 12.50) byref -> [rsp+0x28] "ValNumCSE"
-; V49 cse2 [V49,T02] ( 7, 24.50) int -> [rsp+0x58] "ValNumCSE"
+; V44 tmp28 [V44,T30] ( 2, 2 ) int -> rdx "argument with side effect"
+; V45 tmp29 [V45,T23] ( 3, 3 ) ref -> rcx "arr expr"
+; V46 tmp30 [V46,T24] ( 3, 3 ) int -> rdx "arr expr"
+; V47 tmp31 [V47,T25] ( 3, 3 ) int -> rdx "arr expr"
+; V48 cse0 [V48,T06] ( 4, 12.50) byref -> [rsp+0x30] "ValNumCSE"
+; V49 cse1 [V49,T07] ( 4, 12.50) byref -> [rsp+0x28] "ValNumCSE"
+; V50 cse2 [V50,T02] ( 7, 24.50) int -> [rsp+0x58] "ValNumCSE"
;
; Lcl frame size = 104
@@ -106,6 +107,7 @@ G_M9942_IG05:
mov rdx, rdi
cmp dword ptr [rcx], ecx
call qword ptr [r11]
+ mov r12d, eax
jmp SHORT G_M9942_IG07
G_M9942_IG06:
@@ -113,16 +115,15 @@ G_M9942_IG06:
mov rax, qword ptr [rdi]
mov rax, qword ptr [rax+64]
call qword ptr [rax+24]Object:GetHashCode():int:this
+ mov r12d, eax
G_M9942_IG07:
- mov r12d, eax
- and r12d, 0xD1FFAB1E
xor r13d, r13d
mov rcx, gword ptr [rsi+8]
mov r8, gword ptr [rsi+8]
mov eax, r12d
- cdq
- idiv edx:eax, dword ptr [r8+8]
+ xor rdx, rdx
+ div edx:eax, dword ptr [r8+8]
cmp edx, dword ptr [rcx+8]
jae G_M9942_IG38
movsxd rdx, edx
@@ -175,7 +176,7 @@ G_M9942_IG12:
lea rdx, [rdx+2*rdx]
lea r11, bword ptr [r14+8*rdx+16]
mov bword ptr [rsp+30H], r11
- cmp dword ptr [r11+16], r12d
+ cmp dword ptr [r11+20], r12d
jne SHORT G_M9942_IG13
movsxd rdx, r10d
lea rdx, [rdx+2*rdx]
@@ -192,7 +193,7 @@ G_M9942_IG12:
G_M9942_IG13:
mov r11, bword ptr [rsp+30H]
- mov r10d, dword ptr [r11+20]
+ mov r10d, dword ptr [r11+16]
mov r8d, r10d
mov r9d, dword ptr [rsp+58H]
cmp r9d, r13d
@@ -228,7 +229,7 @@ G_M9942_IG17:
lea rcx, [rcx+2*rcx]
lea r10, bword ptr [r14+8*rcx+16]
mov bword ptr [rsp+28H], r10
- cmp dword ptr [r10+16], r12d
+ cmp dword ptr [r10+20], r12d
jne G_M9942_IG21
movsxd rcx, r8d
lea rcx, [rcx+2*rcx]
@@ -273,7 +274,7 @@ G_M9942_IG20:
G_M9942_IG21:
mov r10, bword ptr [rsp+28H]
- mov r8d, dword ptr [r10+20]
+ mov r8d, dword ptr [r10+16]
mov ecx, r8d
mov r9d, dword ptr [rsp+58H]
cmp r9d, r13d
@@ -307,8 +308,8 @@ G_M9942_IG24:
mov rcx, gword ptr [rsi+8]
mov r8, gword ptr [rsi+8]
mov eax, r12d
- cdq
- idiv edx:eax, dword ptr [r8+8]
+ xor rdx, rdx
+ div edx:eax, dword ptr [r8+8]
cmp edx, dword ptr [rcx+8]
jae G_M9942_IG38
movsxd rdx, edx
@@ -332,15 +333,22 @@ G_M9942_IG26:
lea r8, bword ptr [r14+8*rcx+16]
test edx, edx
je SHORT G_M9942_IG27
- mov edx, dword ptr [r8+20]
+ mov edx, dword ptr [rsi+52]
+ cmp edx, dword ptr [r14+8]
+ jae G_M9942_IG38
+ movsxd rdx, edx
+ lea rdx, [rdx+2*rdx]
+ mov edx, dword ptr [r14+8*rdx+32]
+ neg edx
+ add edx, -3
mov dword ptr [rsi+52], edx
G_M9942_IG27:
- mov dword ptr [r8+16], r12d
+ mov dword ptr [r8+20], r12d
mov rax, bword ptr [rsp+50H]
mov edx, dword ptr [rax]
dec edx
- mov dword ptr [r8+20], edx
+ mov dword ptr [r8+16], edx
mov bword ptr [rsp+48H], r8
mov rcx, r8
mov rdx, rdi
@@ -436,7 +444,7 @@ G_M9942_IG38:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 1069, prolog size 33 for method Dictionary`2:TryInsert(ref,ref,ubyte):bool:this
+; Total bytes of code 1093, prolog size 33 for method Dictionary`2:TryInsert(ref,ref,ubyte):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:Resize(int,bool):this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
@@ -446,28 +454,27 @@ G_M9942_IG38:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T05] ( 8, 8 ) ref -> rsi this class-hnd
-; V01 arg1 [V01,T12] ( 5, 6 ) int -> rdi
-; V02 arg2 [V02,T16] ( 3, 3 ) bool -> rbx
-; V03 loc0 [V03,T11] ( 5, 8 ) ref -> rbp class-hnd
+; V00 this [V00,T06] ( 8, 8 ) ref -> rsi this class-hnd
+; V01 arg1 [V01,T11] ( 5, 6 ) int -> rdi
+; V02 arg2 [V02,T15] ( 3, 3 ) bool -> rbx
+; V03 loc0 [V03,T10] ( 5, 8 ) ref -> rbp class-hnd
; V04 loc1 [V04,T02] ( 8, 14.50) ref -> r14 class-hnd
-; V05 loc2 [V05,T04] ( 6, 11.50) int -> r15
+; V05 loc2 [V05,T05] ( 6, 11.50) int -> r15
;* V06 loc3 [V06 ] ( 0, 0 ) ref -> zero-ref ld-addr-op class-hnd
; V07 loc4 [V07,T01] ( 6, 20.50) int -> rbx
; V08 loc5 [V08,T00] ( 7, 23 ) int -> r12
-; V09 loc6 [V09,T13] ( 4, 8 ) int -> rdx
+; V09 loc6 [V09,T12] ( 4, 8 ) int -> rdx
; V10 OutArgs [V10 ] ( 1, 1 ) lclBlk (48) [rsp+0x00] "OutgoingArgSpace"
-; V11 tmp1 [V11,T17] ( 3, 4.50) long -> rcx "impRuntimeLookup slot"
-; V12 tmp2 [V12,T15] ( 4, 6.50) long -> rax "impRuntimeLookup typehandle"
-; V13 tmp3 [V13,T14] ( 2, 8 ) byref -> r12 "non-inline candidate call"
-; V14 tmp4 [V14,T18] ( 2, 4 ) ref -> rcx class-hnd "Inlining Arg"
-; V15 cse0 [V15,T08] ( 3, 10 ) int -> rax "ValNumCSE"
-; V16 cse1 [V16,T06] ( 3, 10 ) byref -> rcx "ValNumCSE"
-; V17 cse2 [V17,T07] ( 3, 10 ) byref -> r12 "ValNumCSE"
-; V18 cse3 [V18,T09] ( 3, 10 ) long -> rcx "ValNumCSE"
-; V19 cse4 [V19,T10] ( 4, 9.50) int -> r13 "ValNumCSE"
-; V20 cse5 [V20,T19] ( 2, 4 ) int -> rax "ValNumCSE"
-; V21 rat0 [V21,T03] ( 3, 12 ) ref -> rcx "virtual vtable call"
+; V11 tmp1 [V11,T16] ( 3, 4.50) long -> rcx "impRuntimeLookup slot"
+; V12 tmp2 [V12,T14] ( 4, 6.50) long -> rax "impRuntimeLookup typehandle"
+; V13 tmp3 [V13,T13] ( 2, 8 ) byref -> r12 "non-inline candidate call"
+; V14 tmp4 [V14,T17] ( 2, 4 ) ref -> rcx class-hnd "Inlining Arg"
+; V15 cse0 [V15,T03] ( 4, 12 ) byref -> rcx "ValNumCSE"
+; V16 cse1 [V16,T07] ( 3, 10 ) byref -> r12 "ValNumCSE"
+; V17 cse2 [V17,T08] ( 3, 10 ) long -> rcx "ValNumCSE"
+; V18 cse3 [V18,T09] ( 4, 9.50) int -> r13 "ValNumCSE"
+; V19 cse4 [V19,T18] ( 2, 4 ) int -> rax "ValNumCSE"
+; V20 rat0 [V20,T04] ( 3, 12 ) ref -> rcx "virtual vtable call"
;
; Lcl frame size = 56
@@ -528,14 +535,13 @@ G_M29783_IG04:
movsxd rcx, ebx
lea rcx, [rcx+2*rcx]
lea r12, bword ptr [r14+8*rcx+16]
- cmp dword ptr [r12+16], 0
+ cmp dword ptr [r12+16], -1
jl SHORT G_M29783_IG05
mov rcx, gword ptr [r14+8*rcx+16]
mov rax, qword ptr [rcx]
mov rax, qword ptr [rax+64]
call qword ptr [rax+24]Object:GetHashCode():int:this
- and eax, 0xD1FFAB1E
- mov dword ptr [r12+16], eax
+ mov dword ptr [r12+20], eax
G_M29783_IG05:
inc ebx
@@ -554,18 +560,18 @@ G_M29783_IG07:
movsxd rax, r12d
lea rax, [rax+2*rax]
lea rcx, bword ptr [r14+8*rax+16]
- mov eax, dword ptr [rcx+16]
- test eax, eax
+ cmp dword ptr [rcx+16], -1
jl SHORT G_M29783_IG08
- cdq
- idiv edx:eax, edi
+ mov eax, dword ptr [rcx+20]
+ xor rdx, rdx
+ div edx:eax, edi
mov eax, dword ptr [rbp+8]
cmp edx, eax
jae SHORT G_M29783_IG11
movsxd rax, edx
mov eax, dword ptr [rbp+4*rax+16]
dec eax
- mov dword ptr [rcx+20], eax
+ mov dword ptr [rcx+16], eax
lea eax, [r12+1]
movsxd rdx, edx
mov dword ptr [rbp+4*rdx+16], eax
@@ -600,7 +606,7 @@ G_M29783_IG11:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 338, prolog size 29 for method Dictionary`2:Resize(int,bool):this
+; Total bytes of code 336, prolog size 29 for method Dictionary`2:Resize(int,bool):this
; ============================================================
; Assembly listing for method Dictionary`2:FindEntry(ref):int:this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
@@ -682,11 +688,10 @@ G_M16827_IG03:
mov rax, qword ptr [rax+64]
call qword ptr [rax+24]Object:GetHashCode():int:this
mov r12d, eax
- and r12d, 0xD1FFAB1E
mov r13d, dword ptr [rbp+8]
mov eax, r12d
- cdq
- idiv edx:eax, r13d
+ xor rdx, rdx
+ div edx:eax, r13d
cmp edx, r13d
jae G_M16827_IG22
movsxd rcx, edx
@@ -715,7 +720,7 @@ G_M16827_IG05:
lea rdx, [rdx+2*rdx]
lea rax, bword ptr [r14+8*rdx+16]
mov bword ptr [rsp+30H], rax
- cmp dword ptr [rax+16], r12d
+ cmp dword ptr [rax+20], r12d
jne SHORT G_M16827_IG06
mov rdx, gword ptr [r14+8*rdx+16]
mov rcx, r13
@@ -728,7 +733,7 @@ G_M16827_IG05:
G_M16827_IG06:
mov rax, bword ptr [rsp+30H]
- mov ebx, dword ptr [rax+20]
+ mov ebx, dword ptr [rax+16]
cmp ebp, r15d
jle G_M16827_IG20
@@ -755,13 +760,12 @@ G_M16827_IG09:
cmp dword ptr [rcx], ecx
call qword ptr [r11]
mov r8d, eax
- and r8d, 0xD1FFAB1E
mov eax, dword ptr [rbp+8]
mov dword ptr [rsp+44H], eax
mov ecx, dword ptr [rsp+44H]
mov eax, r8d
- cdq
- idiv edx:eax, ecx
+ xor rdx, rdx
+ div edx:eax, ecx
cmp edx, ecx
jae G_M16827_IG22
movsxd rcx, edx
@@ -780,7 +784,7 @@ G_M16827_IG10:
lea r9, bword ptr [r14+8*rcx+16]
mov bword ptr [rsp+28H], r9
mov dword ptr [rsp+4CH], r8d
- cmp dword ptr [r9+16], r8d
+ cmp dword ptr [r9+20], r8d
jne SHORT G_M16827_IG12
mov r10, gword ptr [r14+8*rcx+16]
mov gword ptr [rsp+38H], r10
@@ -804,7 +808,7 @@ G_M16827_IG11:
G_M16827_IG12:
mov r9, bword ptr [rsp+28H]
- mov ebp, dword ptr [r9+20]
+ mov ebp, dword ptr [r9+16]
mov eax, dword ptr [rsp+48H]
cmp eax, r15d
jle SHORT G_M16827_IG21
@@ -858,7 +862,7 @@ G_M16827_IG22:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 565, prolog size 27 for method Dictionary`2:FindEntry(ref):int:this
+; Total bytes of code 553, prolog size 27 for method Dictionary`2:FindEntry(ref):int:this
; ============================================================
; Assembly listing for method Dictionary`2:Remove(int):bool:this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
@@ -877,7 +881,7 @@ G_M16827_IG22:
; V06 loc4 [V06,T19] ( 3, 1.50) int -> rdx
; V07 loc5 [V07,T09] ( 5, 6 ) int -> r12
; V08 loc6 [V08,T00] ( 8, 21.50) int -> [rsp+0x34]
-; V09 loc7 [V09,T01] ( 9, 18 ) byref -> [rsp+0x28]
+; V09 loc7 [V09,T01] ( 8, 17.50) byref -> [rsp+0x28]
; V10 OutArgs [V10 ] ( 1, 1 ) lclBlk (32) [rsp+0x00] "OutgoingArgSpace"
;* V11 tmp1 [V11 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
; V12 tmp2 [V12,T16] ( 2, 2 ) ref -> rcx class-hnd "dup spill"
@@ -931,11 +935,10 @@ G_M18089_IG03:
mov r15d, eax
G_M18089_IG04:
- and r15d, 0xD1FFAB1E
mov ecx, dword ptr [rbx+8]
mov eax, r15d
- cdq
- idiv edx:eax, ecx
+ xor rdx, rdx
+ div edx:eax, ecx
mov r12d, -1
cmp edx, ecx
jae G_M18089_IG17
@@ -954,7 +957,7 @@ G_M18089_IG05:
movsxd rdx, eax
shl rdx, 4
lea r10, bword ptr [rbp+rdx+16]
- cmp dword ptr [r10], r15d
+ cmp dword ptr [r10+4], r15d
jne SHORT G_M18089_IG08
mov rcx, gword ptr [rdi+24]
test rcx, rcx
@@ -982,7 +985,7 @@ G_M18089_IG07:
G_M18089_IG08:
mov eax, dword ptr [rsp+34H]
mov r12d, eax
- mov eax, dword ptr [r10+4]
+ mov eax, dword ptr [r10]
mov r9d, dword ptr [rsp+30H]
cmp r9d, r14d
jle SHORT G_M18089_IG16
@@ -1010,7 +1013,7 @@ G_M18089_IG11:
G_M18089_IG12:
test r12d, r12d
jge SHORT G_M18089_IG13
- mov r9d, dword ptr [r10+4]
+ mov r9d, dword ptr [r10]
inc r9d
mov dword ptr [rbx+4*r13+16], r9d
jmp SHORT G_M18089_IG14
@@ -1021,13 +1024,14 @@ G_M18089_IG13:
jae SHORT G_M18089_IG17
movsxd rdx, r12d
shl rdx, 4
- mov ecx, dword ptr [r10+4]
- mov dword ptr [rbp+rdx+20], ecx
+ mov ecx, dword ptr [r10]
+ mov dword ptr [rbp+rdx+16], ecx
G_M18089_IG14:
- mov dword ptr [r10], -1
mov edx, dword ptr [rdi+52]
- mov dword ptr [r10+4], edx
+ neg edx
+ add edx, -3
+ mov dword ptr [r10], edx
mov eax, dword ptr [rsp+34H]
mov dword ptr [rdi+52], eax
inc dword ptr [rdi+56]
@@ -1053,7 +1057,7 @@ G_M18089_IG17:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 386, prolog size 21 for method Dictionary`2:Remove(int):bool:this
+; Total bytes of code 375, prolog size 21 for method Dictionary`2:Remove(int):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:Remove(int,byref):bool:this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
@@ -1073,7 +1077,7 @@ G_M18089_IG17:
; V07 loc4 [V07,T20] ( 3, 1.50) int -> rdx
; V08 loc5 [V08,T09] ( 5, 6 ) int -> r13
; V09 loc6 [V09,T00] ( 8, 21.50) int -> [rsp+0x44]
-; V10 loc7 [V10,T01] ( 10, 18.50) byref -> [rsp+0x28]
+; V10 loc7 [V10,T01] ( 9, 18 ) byref -> [rsp+0x28]
; V11 OutArgs [V11 ] ( 1, 1 ) lclBlk (32) [rsp+0x00] "OutgoingArgSpace"
;* V12 tmp1 [V12 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
; V13 tmp2 [V13,T17] ( 2, 2 ) ref -> rcx class-hnd "dup spill"
@@ -1128,11 +1132,10 @@ G_M18089_IG03:
mov r12d, eax
G_M18089_IG04:
- and r12d, 0xD1FFAB1E
mov ecx, dword ptr [rbp+8]
mov eax, r12d
- cdq
- idiv edx:eax, ecx
+ xor rdx, rdx
+ div edx:eax, ecx
mov r13d, -1
cmp edx, ecx
jae G_M18089_IG17
@@ -1152,7 +1155,7 @@ G_M18089_IG05:
movsxd rdx, r9d
shl rdx, 4
lea r11, bword ptr [r14+rdx+16]
- cmp dword ptr [r11], r12d
+ cmp dword ptr [r11+4], r12d
jne SHORT G_M18089_IG08
mov rcx, gword ptr [rdi+24]
test rcx, rcx
@@ -1180,7 +1183,7 @@ G_M18089_IG07:
G_M18089_IG08:
mov r9d, dword ptr [rsp+44H]
mov r13d, r9d
- mov r9d, dword ptr [r11+4]
+ mov r9d, dword ptr [r11]
mov r10d, dword ptr [rsp+34H]
cmp r10d, r15d
jle G_M18089_IG16
@@ -1209,7 +1212,7 @@ G_M18089_IG11:
G_M18089_IG12:
test r13d, r13d
jge SHORT G_M18089_IG13
- mov r10d, dword ptr [r11+4]
+ mov r10d, dword ptr [r11]
inc r10d
mov rax, qword ptr [rsp+38H]
mov dword ptr [rbp+4*rax+16], r10d
@@ -1221,15 +1224,16 @@ G_M18089_IG13:
jae SHORT G_M18089_IG17
movsxd rax, r13d
shl rax, 4
- mov edx, dword ptr [r11+4]
- mov dword ptr [r14+rax+20], edx
+ mov edx, dword ptr [r11]
+ mov dword ptr [r14+rax+16], edx
G_M18089_IG14:
mov eax, dword ptr [r11+12]
mov dword ptr [rbx], eax
- mov dword ptr [r11], -1
mov eax, dword ptr [rdi+52]
- mov dword ptr [r11+4], eax
+ neg eax
+ add eax, -3
+ mov dword ptr [r11], eax
mov r9d, dword ptr [rsp+44H]
mov dword ptr [rdi+52], r9d
inc dword ptr [rdi+56]
@@ -1255,7 +1259,7 @@ G_M18089_IG17:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 419, prolog size 24 for method Dictionary`2:Remove(int,byref):bool:this
+; Total bytes of code 408, prolog size 24 for method Dictionary`2:Remove(int,byref):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:TryInsert(int,int,ubyte):bool:this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
@@ -1265,25 +1269,25 @@ G_M18089_IG17:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T04] ( 22, 14 ) ref -> rsi this class-hnd
-; V01 arg1 [V01,T09] ( 9, 7.50) int -> rdi ld-addr-op
+; V00 this [V00,T04] ( 23, 14.50) ref -> rsi this class-hnd
+; V01 arg1 [V01,T07] ( 9, 7.50) int -> rdi ld-addr-op
; V02 arg2 [V02,T15] ( 5, 3.50) int -> rbx
; V03 arg3 [V03,T16] ( 4, 3 ) ubyte -> rbp
-; V04 loc0 [V04,T00] ( 12, 27 ) ref -> r14 class-hnd
+; V04 loc0 [V04,T02] ( 11, 23.50) ref -> r14 class-hnd
; V05 loc1 [V05,T14] ( 5, 5.50) ref -> r15 class-hnd
-; V06 loc2 [V06,T10] ( 6, 11 ) int -> r12
-; V07 loc3 [V07,T01] ( 7, 25 ) int -> r13
+; V06 loc2 [V06,T08] ( 6, 11 ) int -> r12
+; V07 loc3 [V07,T00] ( 7, 25 ) int -> r13
; V08 loc4 [V08,T19] ( 5, 3.50) byref -> [rsp+0x30]
-; V09 loc5 [V09,T02] ( 7, 25 ) int -> r10
-; V10 loc6 [V10,T29] ( 3, 1.50) bool -> rbp
-; V11 loc7 [V11,T21] ( 6, 3 ) int -> r13
-; V12 loc8 [V12,T20] ( 6, 3 ) byref -> rax
+; V09 loc5 [V09,T01] ( 7, 25 ) int -> r10
+; V10 loc6 [V10,T30] ( 3, 1.50) bool -> rbp
+; V11 loc7 [V11,T20] ( 6, 3 ) int -> r13
+; V12 loc8 [V12,T26] ( 5, 2.50) byref -> rcx
;* V13 loc9 [V13 ] ( 0, 0 ) int -> zero-ref ld-addr-op
;* V14 loc10 [V14 ] ( 0, 0 ) ref -> zero-ref class-hnd
-; V15 loc11 [V15,T30] ( 3, 1.50) int -> r13
+; V15 loc11 [V15,T31] ( 3, 1.50) int -> r13
; V16 OutArgs [V16 ] ( 1, 1 ) lclBlk (32) [rsp+0x00] "OutgoingArgSpace"
;* V17 tmp1 [V17 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
-; V18 tmp2 [V18,T26] ( 3, 2 ) int -> rax
+; V18 tmp2 [V18,T27] ( 3, 2 ) int -> r12
;* V19 tmp3 [V19 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
;* V20 tmp4 [V20 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
; V21 tmp5 [V21,T17] ( 2, 4 ) bool -> rdx "Inline return value spill temp"
@@ -1293,19 +1297,21 @@ G_M18089_IG17:
;* V25 tmp9 [V25 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
; V26 tmp10 [V26,T12] ( 3, 6 ) ref -> r8 "arr expr"
; V27 tmp11 [V27,T13] ( 3, 6 ) int -> rdx "arr expr"
-; V28 tmp12 [V28,T27] ( 2, 2 ) int -> rdx "argument with side effect"
-; V29 tmp13 [V29,T23] ( 3, 3 ) ref -> r8 "arr expr"
-; V30 tmp14 [V30,T25] ( 3, 3 ) int -> rdx "arr expr"
-; V31 cse0 [V31,T07] ( 5, 12.50) byref -> rax "ValNumCSE"
-; V32 cse1 [V32,T08] ( 5, 12.50) byref -> [rsp+0x28] "ValNumCSE"
-; V33 cse2 [V33,T05] ( 4, 14 ) long -> r10 "ValNumCSE"
-; V34 cse3 [V34,T06] ( 4, 14 ) long -> [rsp+0x40] "ValNumCSE"
-; V35 cse4 [V35,T24] ( 3, 3 ) ref -> rcx "ValNumCSE"
-; V36 cse5 [V36,T31] ( 3, 1.50) int -> rcx "ValNumCSE"
-; V37 cse6 [V37,T32] ( 3, 1.50) int -> rcx "ValNumCSE"
-; V38 cse7 [V38,T28] ( 3, 1.50) ref -> rcx "ValNumCSE"
-; V39 cse8 [V39,T03] ( 7, 21 ) int -> [rsp+0x3C] "ValNumCSE"
-; V40 cse9 [V40,T22] ( 6, 3 ) int -> r13 "ValNumCSE"
+; V28 tmp12 [V28,T28] ( 2, 2 ) int -> rdx "argument with side effect"
+; V29 tmp13 [V29,T22] ( 3, 3 ) ref -> r8 "arr expr"
+; V30 tmp14 [V30,T24] ( 3, 3 ) int -> rdx "arr expr"
+; V31 tmp15 [V31,T25] ( 3, 3 ) int -> rdx "arr expr"
+; V32 cse0 [V32,T09] ( 4, 10.50) byref -> rax "ValNumCSE"
+; V33 cse1 [V33,T10] ( 4, 10.50) byref -> [rsp+0x28] "ValNumCSE"
+; V34 cse2 [V34,T05] ( 3, 12 ) long -> r10 "ValNumCSE"
+; V35 cse3 [V35,T06] ( 3, 12 ) long -> [rsp+0x40] "ValNumCSE"
+; V36 cse4 [V36,T23] ( 3, 3 ) ref -> rcx "ValNumCSE"
+; V37 cse5 [V37,T32] ( 3, 1.50) int -> rcx "ValNumCSE"
+; V38 cse6 [V38,T33] ( 3, 1.50) int -> rcx "ValNumCSE"
+; V39 cse7 [V39,T29] ( 3, 1.50) ref -> rcx "ValNumCSE"
+; V40 cse8 [V40,T03] ( 7, 21 ) int -> [rsp+0x3C] "ValNumCSE"
+; V41 cse9 [V41,T21] ( 6, 3 ) int -> r13 "ValNumCSE"
+; V42 cse10 [V42,T34] ( 3, 1.50) int -> rax "ValNumCSE"
;
; Lcl frame size = 72
@@ -1341,20 +1347,19 @@ G_M59125_IG03:
mov r11, 0xD1FFAB1E
cmp dword ptr [rcx], ecx
call [IEqualityComparer`1:GetHashCode(int):int:this]
+ mov r12d, eax
jmp SHORT G_M59125_IG05
G_M59125_IG04:
- mov eax, edi
+ mov r12d, edi
G_M59125_IG05:
- mov r12d, eax
- and r12d, 0xD1FFAB1E
xor r13d, r13d
mov rcx, gword ptr [rsi+8]
mov r8, rcx
mov eax, r12d
- cdq
- idiv edx:eax, dword ptr [rcx+8]
+ xor rdx, rdx
+ div edx:eax, dword ptr [rcx+8]
cmp edx, dword ptr [r8+8]
jae G_M59125_IG29
movsxd rax, edx
@@ -1370,9 +1375,9 @@ G_M59125_IG06:
jbe G_M59125_IG18
movsxd r10, r10d
shl r10, 4
- cmp dword ptr [r14+r10+16], r12d
- jne SHORT G_M59125_IG07
lea rax, bword ptr [r14+r10+16]
+ cmp dword ptr [rax+4], r12d
+ jne SHORT G_M59125_IG07
mov edx, dword ptr [rax+8]
cmp edx, edi
sete dl
@@ -1381,8 +1386,7 @@ G_M59125_IG06:
jne SHORT G_M59125_IG09
G_M59125_IG07:
- lea rax, bword ptr [r14+r10+16]
- mov r10d, dword ptr [rax+4]
+ mov r10d, dword ptr [r14+r10+16]
cmp r15d, r13d
jle G_M59125_IG26
@@ -1425,11 +1429,11 @@ G_M59125_IG13:
jbe SHORT G_M59125_IG14
movsxd r10, r10d
shl r10, 4
- cmp dword ptr [r14+r10+16], r12d
- jne SHORT G_M59125_IG16
- mov bword ptr [rsp+30H], r9
mov qword ptr [rsp+40H], r10
lea r11, bword ptr [r14+r10+16]
+ cmp dword ptr [r11+4], r12d
+ jne SHORT G_M59125_IG16
+ mov bword ptr [rsp+30H], r9
mov bword ptr [rsp+28H], r11
mov edx, dword ptr [r11+8]
mov rcx, r15
@@ -1439,13 +1443,12 @@ G_M59125_IG13:
call [IEqualityComparer`1:Equals(int,int):bool:this]
test eax, eax
mov r9, bword ptr [rsp+30H]
- mov r10, qword ptr [rsp+40H]
je SHORT G_M59125_IG16
movzx r13, bpl
cmp r13d, 1
jne SHORT G_M59125_IG15
- mov rbp, bword ptr [rsp+28H]
- mov dword ptr [rbp+12], ebx
+ mov r13, bword ptr [rsp+28H]
+ mov dword ptr [r13+12], ebx
inc dword ptr [rsi+60]
jmp G_M59125_IG23
@@ -1459,8 +1462,8 @@ G_M59125_IG15:
jmp G_M59125_IG11
G_M59125_IG16:
- lea rcx, bword ptr [r14+r10+16]
- mov r10d, dword ptr [rcx+4]
+ mov r10, qword ptr [rsp+40H]
+ mov r10d, dword ptr [r14+r10+16]
mov eax, dword ptr [rsp+3CH]
cmp eax, r13d
jle G_M59125_IG28
@@ -1493,8 +1496,8 @@ G_M59125_IG19:
mov rcx, gword ptr [rsi+8]
mov r8, rcx
mov eax, r12d
- cdq
- idiv edx:eax, dword ptr [rcx+8]
+ xor rdx, rdx
+ div edx:eax, dword ptr [rcx+8]
cmp edx, dword ptr [r8+8]
jae G_M59125_IG29
movsxd rax, edx
@@ -1508,23 +1511,31 @@ G_M59125_IG20:
mov r14, gword ptr [rsi+16]
G_M59125_IG21:
- cmp r13d, dword ptr [r14+8]
+ mov eax, dword ptr [r14+8]
+ cmp r13d, eax
jae SHORT G_M59125_IG29
- movsxd rax, r13d
- shl rax, 4
- lea rax, bword ptr [r14+rax+16]
+ movsxd rcx, r13d
+ shl rcx, 4
+ lea rcx, bword ptr [r14+rcx+16]
test ebp, ebp
je SHORT G_M59125_IG22
- mov ecx, dword ptr [rax+4]
- mov dword ptr [rsi+52], ecx
+ mov edx, dword ptr [rsi+52]
+ cmp edx, eax
+ jae SHORT G_M59125_IG29
+ movsxd rax, edx
+ shl rax, 4
+ mov eax, dword ptr [r14+rax+16]
+ neg eax
+ add eax, -3
+ mov dword ptr [rsi+52], eax
G_M59125_IG22:
- mov dword ptr [rax], r12d
- mov ecx, dword ptr [r9]
- dec ecx
- mov dword ptr [rax+4], ecx
- mov dword ptr [rax+8], edi
- mov dword ptr [rax+12], ebx
+ mov dword ptr [rcx+4], r12d
+ mov eax, dword ptr [r9]
+ dec eax
+ mov dword ptr [rcx], eax
+ mov dword ptr [rcx+8], edi
+ mov dword ptr [rcx+12], ebx
inc r13d
mov dword ptr [r9], r13d
inc dword ptr [rsi+60]
@@ -1566,7 +1577,7 @@ G_M59125_IG29:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 642, prolog size 27 for method Dictionary`2:TryInsert(int,int,ubyte):bool:this
+; Total bytes of code 653, prolog size 27 for method Dictionary`2:TryInsert(int,int,ubyte):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:Resize(int,bool):this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
@@ -1576,27 +1587,26 @@ G_M59125_IG29:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T04] ( 6, 6 ) ref -> rsi this class-hnd
-; V01 arg1 [V01,T06] ( 4, 5 ) int -> rdi
+; V00 this [V00,T03] ( 6, 6 ) ref -> rsi this class-hnd
+; V01 arg1 [V01,T05] ( 4, 5 ) int -> rdi
;* V02 arg2 [V02 ] ( 0, 0 ) bool -> zero-ref
-; V03 loc0 [V03,T05] ( 5, 8 ) ref -> rbp class-hnd
-; V04 loc1 [V04,T01] ( 6, 10 ) ref -> rbx class-hnd
-; V05 loc2 [V05,T07] ( 4, 7 ) int -> r14
+; V03 loc0 [V03,T04] ( 5, 8 ) ref -> rbp class-hnd
+; V04 loc1 [V04,T01] ( 7, 12 ) ref -> rbx class-hnd
+; V05 loc2 [V05,T06] ( 4, 7 ) int -> r14
;* V06 loc3 [V06 ] ( 0, 0 ) int -> zero-ref ld-addr-op
;* V07 loc4 [V07 ] ( 0, 0 ) int -> zero-ref
; V08 loc5 [V08,T00] ( 7, 23 ) int -> r15
-; V09 loc6 [V09,T08] ( 3, 6 ) int -> rdx
+; V09 loc6 [V09,T07] ( 3, 6 ) int -> rdx
; V10 OutArgs [V10 ] ( 1, 1 ) lclBlk (48) [rsp+0x00] "OutgoingArgSpace"
;* V11 tmp1 [V11 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
;* V12 tmp2 [V12 ] ( 0, 0 ) byref -> zero-ref "impAppendStmt"
-; V13 tmp3 [V13,T11] ( 2, 4 ) ref -> rcx class-hnd "Inlining Arg"
+; V13 tmp3 [V13,T10] ( 2, 4 ) ref -> rcx class-hnd "Inlining Arg"
;* V14 tmp4 [V14 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg"
-; V15 cse0 [V15,T02] ( 3, 10 ) int -> rax "ValNumCSE"
-; V16 cse1 [V16,T03] ( 3, 10 ) long -> r8 "ValNumCSE"
-; V17 cse2 [V17,T10] ( 2, 5 ) int -> rcx "ValNumCSE"
-; V18 cse3 [V18,T12] ( 2, 4 ) int -> rax "ValNumCSE"
-; V19 cse4 [V19,T09] ( 3, 6 ) long -> rax "ValNumCSE"
-; V20 cse5 [V20,T13] ( 3, 3 ) long -> rbx "ValNumCSE"
+; V15 cse0 [V15,T02] ( 4, 12 ) long -> r8 "ValNumCSE"
+; V16 cse1 [V16,T09] ( 2, 5 ) int -> rcx "ValNumCSE"
+; V17 cse2 [V17,T11] ( 2, 4 ) int -> rax "ValNumCSE"
+; V18 cse3 [V18,T08] ( 3, 6 ) long -> rax "ValNumCSE"
+; V19 cse4 [V19,T12] ( 3, 3 ) long -> rbx "ValNumCSE"
;
; Lcl frame size = 56
@@ -1640,18 +1650,18 @@ G_M14072_IG03:
jae SHORT G_M14072_IG07
movsxd r8, r15d
shl r8, 4
- mov eax, dword ptr [rbx+r8+16]
- test eax, eax
+ cmp dword ptr [rbx+r8+16], -1
jl SHORT G_M14072_IG04
- cdq
- idiv edx:eax, edi
+ mov eax, dword ptr [rbx+r8+20]
+ xor rdx, rdx
+ div edx:eax, edi
mov eax, dword ptr [rbp+8]
cmp edx, eax
jae SHORT G_M14072_IG07
movsxd rax, edx
mov edx, dword ptr [rbp+4*rax+16]
dec edx
- mov dword ptr [rbx+r8+20], edx
+ mov dword ptr [rbx+r8+16], edx
lea edx, [r15+1]
mov dword ptr [rbp+4*rax+16], edx
@@ -1683,7 +1693,7 @@ G_M14072_IG07:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 212, prolog size 17 for method Dictionary`2:Resize(int,bool):this
+; Total bytes of code 217, prolog size 17 for method Dictionary`2:Resize(int,bool):this
; ============================================================
; Assembly listing for method Dictionary`2:TrimExcess(int):this
; Emitting BLENDED_CODE for X64 CPU with SSE2 - Windows
@@ -1693,35 +1703,37 @@ G_M14072_IG07:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T01] ( 13, 9 ) ref -> rsi this class-hnd
+; V00 this [V00,T02] ( 13, 9 ) ref -> rsi this class-hnd
; V01 arg1 [V01,T08] ( 4, 4 ) int -> rdx
-; V02 loc0 [V02,T13] ( 4, 4.50) int -> rdi
-; V03 loc1 [V03,T04] ( 6, 9 ) ref -> rbx class-hnd
+; V02 loc0 [V02,T14] ( 4, 4.50) int -> rdi
+; V03 loc1 [V03,T03] ( 6, 11 ) ref -> rbx class-hnd
;* V04 loc2 [V04 ] ( 0, 0 ) int -> zero-ref
-; V05 loc3 [V05,T12] ( 3, 5 ) int -> rbp
-; V06 loc4 [V06,T14] ( 3, 4.50) ref -> rcx class-hnd
+; V05 loc3 [V05,T13] ( 3, 5 ) int -> rbp
+; V06 loc4 [V06,T15] ( 3, 4.50) ref -> rcx class-hnd
; V07 loc5 [V07,T07] ( 4, 6.50) ref -> r8 class-hnd
; V08 loc6 [V08,T05] ( 6, 9 ) int -> r9
; V09 loc7 [V09,T00] ( 6, 20.50) int -> r10
-; V10 loc8 [V10,T02] ( 3, 10 ) int -> r11
-; V11 loc9 [V11,T09] ( 3, 6 ) byref -> r14
+; V10 loc8 [V10,T11] ( 2, 6 ) int -> r14
+; V11 loc9 [V11,T09] ( 3, 6 ) byref -> r15
; V12 loc10 [V12,T06] ( 4, 8 ) int -> rdx
; V13 OutArgs [V13 ] ( 1, 1 ) lclBlk (32) [rsp+0x00] "OutgoingArgSpace"
-; V14 tmp1 [V14,T16] ( 3, 2 ) int -> rbp
-; V15 cse0 [V15,T03] ( 3, 10 ) long -> rdx "ValNumCSE"
-; V16 cse1 [V16,T11] ( 4, 5.50) int -> [rsp+0x2C] "ValNumCSE"
-; V17 cse2 [V17,T15] ( 2, 4 ) int -> rax "ValNumCSE"
-; V18 cse3 [V18,T10] ( 3, 6 ) int -> r9 "ValNumCSE"
+; V14 tmp1 [V14,T17] ( 3, 2 ) int -> rbp
+; V15 cse0 [V15,T04] ( 3, 10 ) byref -> r11 "ValNumCSE"
+; V16 cse1 [V16,T01] ( 3, 12 ) long -> rdx "ValNumCSE"
+; V17 cse2 [V17,T12] ( 4, 5.50) int -> [rsp+0x24] "ValNumCSE"
+; V18 cse3 [V18,T16] ( 2, 4 ) int -> rax "ValNumCSE"
+; V19 cse4 [V19,T10] ( 3, 6 ) int -> r9 "ValNumCSE"
;
-; Lcl frame size = 48
+; Lcl frame size = 40
G_M47871_IG01:
+ push r15
push r14
push rdi
push rsi
push rbp
push rbx
- sub rsp, 48
+ sub rsp, 40
mov rsi, rcx
G_M47871_IG02:
@@ -1748,12 +1760,13 @@ G_M47871_IG05:
jl SHORT G_M47871_IG07
G_M47871_IG06:
- add rsp, 48
+ add rsp, 40
pop rbx
pop rbp
pop rsi
pop rdi
pop r14
+ pop r15
ret
G_M47871_IG07:
@@ -1771,32 +1784,32 @@ G_M47871_IG07:
mov eax, dword ptr [rbx+8]
G_M47871_IG08:
- mov dword ptr [rsp+2CH], eax
+ mov dword ptr [rsp+24H], eax
cmp r10d, eax
jae G_M47871_IG13
movsxd rdx, r10d
shl rdx, 4
- mov r11d, dword ptr [rbx+rdx+16]
- test r11d, r11d
+ lea r11, bword ptr [rbx+rdx+16]
+ mov r14d, dword ptr [r11+4]
+ cmp dword ptr [rbx+rdx+16], -1
jl SHORT G_M47871_IG09
cmp r9d, dword ptr [rcx+8]
jae SHORT G_M47871_IG13
- movsxd r14, r9d
- shl r14, 4
- lea r14, bword ptr [rcx+r14+16]
- lea rdx, bword ptr [rbx+rdx+16]
- movdqu xmm0, qword ptr [rdx]
- movdqu qword ptr [r14], xmm0
- mov eax, r11d
- cdq
- idiv edx:eax, edi
+ movsxd rdx, r9d
+ shl rdx, 4
+ lea r15, bword ptr [rcx+rdx+16]
+ movdqu xmm0, qword ptr [r11]
+ movdqu qword ptr [r15], xmm0
+ mov eax, r14d
+ xor rdx, rdx
+ div edx:eax, edi
mov eax, dword ptr [r8+8]
cmp edx, eax
jae SHORT G_M47871_IG13
movsxd rax, edx
mov eax, dword ptr [r8+4*rax+16]
dec eax
- mov dword ptr [r14+4], eax
+ mov dword ptr [r15], eax
movsxd rax, edx
inc r9d
mov dword ptr [r8+4*rax+16], r9d
@@ -1804,7 +1817,7 @@ G_M47871_IG08:
G_M47871_IG09:
inc r10d
cmp r10d, ebp
- mov eax, dword ptr [rsp+2CH]
+ mov eax, dword ptr [rsp+24H]
jl SHORT G_M47871_IG08
G_M47871_IG10:
@@ -1813,12 +1826,13 @@ G_M47871_IG10:
mov dword ptr [rsi+56], r10d
G_M47871_IG11:
- add rsp, 48
+ add rsp, 40
pop rbx
pop rbp
pop rsi
pop rdi
pop r14
+ pop r15
ret
G_M47871_IG12:
@@ -1830,5 +1844,5 @@ G_M47871_IG13:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 256, prolog size 13 for method Dictionary`2:TrimExcess(int):this
+; Total bytes of code 264, prolog size 15 for method Dictionary`2:TrimExcess(int):this
; ============================================================
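Reading the x64 listings above, three patterns recur: the `and reg, 0xD1FFAB1E` after `GetHashCode` disappears, `cdq` + `idiv` becomes `xor edx, edx` + `div`, and the free-list link is written as `neg` + `add -3`. The sketch below models those in C purely for illustration. It assumes the masked constant is the `0x7FFFFFFF` sign mask (the disassembly shows only a placeholder value), and every name in it (`bucket_old`, `bucket_new`, `START_OF_FREE_LIST`, and so on) is hypothetical rather than taken from `Dictionary.cs`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant, inferred from the "neg; add -3" sequence in the diff. */
#define START_OF_FREE_LIST (-3)

/* Before: clear the sign bit, then take a signed remainder (cdq + idiv).
   Assumes the masked constant is 0x7FFFFFFF; one bit of the hash is lost. */
static int32_t bucket_old(int32_t hash, int32_t bucket_count)
{
    assert(bucket_count > 0);
    int32_t masked = (int32_t)((uint32_t)hash & 0x7FFFFFFFu);
    return masked % bucket_count;
}

/* After: keep all 32 bits and take an unsigned remainder (xor edx,edx + div). */
static uint32_t bucket_new(int32_t hash, uint32_t bucket_count)
{
    assert(bucket_count > 0);
    return (uint32_t)hash % bucket_count;
}

/* Free entries are no longer flagged with hashCode == -1. The link to the
   next free slot is stored in `next` as -3 - index, so it is always < -1. */
static int32_t encode_free_next(int32_t free_list)
{
    return START_OF_FREE_LIST - free_list;
}

/* The same subtraction decodes the link again. */
static int32_t decode_free_next(int32_t next)
{
    return START_OF_FREE_LIST - next;
}

/* Matches the "cmp dword ptr [...], -1; jl" test in Resize and TrimExcess:
   next >= -1 means the entry is live (-1 marks the end of a bucket chain). */
static int entry_in_use(int32_t next)
{
    return next >= -1;
}
```

With this encoding a negative hash code such as `-7` lands in a different bucket than before, because the sign bit now takes part in the remainder. The free-list link, an empty free list (`-1`), and an end-of-chain marker all fit in the single `next` field without colliding.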
|
A lot of tests are failing with:
|
The diffs look pretty good so far. Could you please also look at |
Thanks, I'll fix it. I was confused by the "no more runnable" failed tests 😞 |
/azp run |
x86 diff: I see similar differences, with some different registers used (I think due to the constant removal).
diff --git a/x86coreclrupstream.txt b/x86coreclr.txt
index 17de20b..5c1cdec 100644
--- a/x86coreclrupstream.txt
+++ b/x86coreclr.txt
@@ -6,43 +6,43 @@
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T03] ( 34, 21.50) ref -> esi this class-hnd
+; V00 this [V00,T03] ( 35, 22 ) ref -> esi this class-hnd
; V01 arg1 [V01,T08] ( 11, 9 ) ref -> [ebp-0x2C] ld-addr-op class-hnd
-; V02 arg2 [V02,T31] ( 3, 1.50) ref -> [ebp+0x0C] class-hnd
-; V03 arg3 [V03,T27] ( 4, 2 ) ubyte -> [ebp+0x08]
-; V04 loc0 [V04,T04] ( 11, 23 ) ref -> [ebp-0x30] class-hnd
+; V02 arg2 [V02,T32] ( 3, 1.50) ref -> [ebp+0x0C] class-hnd
+; V03 arg3 [V03,T28] ( 4, 2 ) ubyte -> [ebp+0x08]
+; V04 loc0 [V04,T04] ( 13, 24 ) ref -> [ebp-0x30] class-hnd
; V05 loc1 [V05,T13] ( 6, 6 ) ref -> [ebp-0x34] class-hnd
; V06 loc2 [V06,T09] ( 6, 11 ) int -> [ebp-0x10]
; V07 loc3 [V07,T01] ( 8, 25.50) int -> [ebp-0x14]
; V08 loc4 [V08,T18] ( 5, 3.50) byref -> [ebp-0x38]
; V09 loc5 [V09,T00] ( 9, 29 ) int -> [ebp-0x18]
-; V10 loc6 [V10,T32] ( 3, 1.50) bool -> [ebp-0x1C]
-; V11 loc7 [V11,T22] ( 6, 3 ) int -> ecx
-; V12 loc8 [V12,T21] ( 6, 3 ) byref -> [ebp-0x3C]
+; V10 loc6 [V10,T33] ( 3, 1.50) bool -> [ebp-0x1C]
+; V11 loc7 [V11,T21] ( 6, 3 ) int -> ecx
+; V12 loc8 [V12,T25] ( 5, 2.50) byref -> [ebp-0x3C]
;* V13 loc9 [V13 ] ( 0, 0 ) ref -> zero-ref ld-addr-op class-hnd
; V14 loc10 [V14,T16] ( 3, 4.50) ref -> [ebp-0x40] class-hnd
-; V15 loc11 [V15,T33] ( 3, 1.50) int -> [ebp-0x20]
-; V16 tmp0 [V16,T28] ( 3, 2 ) int -> eax
+; V15 loc11 [V15,T34] ( 3, 1.50) int -> [ebp-0x20]
+; V16 tmp0 [V16,T29] ( 3, 2 ) int -> ebx
; V17 tmp1 [V17,T17] ( 5, 3.74) ref -> edi class-hnd "spilling QMark2"
; V18 tmp2 [V18,T10] ( 3, 10 ) int -> eax "impRuntimeLookup slot"
; V19 tmp3 [V19,T11] ( 2, 8 ) ref -> [ebp-0x44] class-hnd "impAppendStmt"
;* V20 tmp4 [V20 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
; V21 tmp5 [V21,T05] ( 4, 14 ) int -> ebx "impRuntimeLookup typehandle"
;* V22 tmp6 [V22 ] ( 0, 0 ) int -> zero-ref "VirtualCall with runtime lookup"
-; V23 tmp7 [V23,T36] ( 3, 0 ) int -> ecx "impRuntimeLookup slot"
+; V23 tmp7 [V23,T37] ( 3, 0 ) int -> ecx "impRuntimeLookup slot"
;* V24 tmp8 [V24 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
-; V25 tmp9 [V25,T34] ( 4, 0 ) int -> edx "impRuntimeLookup typehandle"
-; V26 tmp10 [V26,T25] ( 3, 2.50) int -> [ebp-0x24] "impRuntimeLookup slot"
+; V25 tmp9 [V25,T35] ( 4, 0 ) int -> edx "impRuntimeLookup typehandle"
+; V26 tmp10 [V26,T26] ( 3, 2.50) int -> [ebp-0x24] "impRuntimeLookup slot"
; V27 tmp11 [V27,T19] ( 4, 3.50) int -> edi "impRuntimeLookup typehandle"
-; V28 tmp12 [V28,T37] ( 3, 0 ) int -> ecx "impRuntimeLookup slot"
+; V28 tmp12 [V28,T38] ( 3, 0 ) int -> ecx "impRuntimeLookup slot"
;* V29 tmp13 [V29 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
-; V30 tmp14 [V30,T35] ( 4, 0 ) int -> edx "impRuntimeLookup typehandle"
+; V30 tmp14 [V30,T36] ( 4, 0 ) int -> edx "impRuntimeLookup typehandle"
;* V31 tmp15 [V31 ] ( 0, 0 ) int -> zero-ref "impRuntimeLookup slot"
;* V32 tmp16 [V32 ] ( 0, 0 ) int -> zero-ref "impRuntimeLookup typehandle"
;* V33 tmp17 [V33 ] ( 0, 0 ) int -> zero-ref "impRuntimeLookup slot"
;* V34 tmp18 [V34 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
;* V35 tmp19 [V35 ] ( 0, 0 ) int -> zero-ref "impRuntimeLookup typehandle"
-; V36 tmp20 [V36,T26] ( 3, 2.50) int -> ebx "impRuntimeLookup slot"
+; V36 tmp20 [V36,T27] ( 3, 2.50) int -> ebx "impRuntimeLookup slot"
;* V37 tmp21 [V37 ] ( 0, 0 ) ref -> zero-ref class-hnd "bubbling QMark1"
; V38 tmp22 [V38,T20] ( 4, 3.50) int -> eax "impRuntimeLookup typehandle"
;* V39 tmp23 [V39 ] ( 0, 0 ) int -> zero-ref "VirtualCall with runtime lookup"
@@ -51,13 +51,14 @@
;* V42 tmp26 [V42 ] ( 0, 0 ) ref -> zero-ref "argument with side effect"
;* V43 tmp27 [V43 ] ( 0, 0 ) ref -> zero-ref "argument with side effect"
; V44 tmp28 [V44,T12] ( 2, 8 ) ref -> [ebp-0x4C] "argument with side effect"
-; V45 tmp29 [V45,T29] ( 2, 2 ) int -> edx "argument with side effect"
-; V46 tmp30 [V46,T23] ( 3, 3 ) ref -> ecx "arr expr"
-; V47 tmp31 [V47,T24] ( 3, 3 ) int -> edx "arr expr"
-; V48 tmp32 [V48,T30] ( 2, 2 ) int -> edx "argument with side effect"
-; V49 cse0 [V49,T06] ( 4, 12.50) byref -> edi "ValNumCSE"
-; V50 cse1 [V50,T07] ( 4, 12.50) byref -> [ebp-0x50] "ValNumCSE"
-; V51 cse2 [V51,T02] ( 7, 24.50) int -> [ebp-0x28] "ValNumCSE"
+; V45 tmp29 [V45,T30] ( 2, 2 ) int -> edx "argument with side effect"
+; V46 tmp30 [V46,T22] ( 3, 3 ) ref -> ecx "arr expr"
+; V47 tmp31 [V47,T23] ( 3, 3 ) int -> edx "arr expr"
+; V48 tmp32 [V48,T24] ( 3, 3 ) int -> edi "arr expr"
+; V49 tmp33 [V49,T31] ( 2, 2 ) int -> edx "argument with side effect"
+; V50 cse0 [V50,T06] ( 4, 12.50) byref -> edi "ValNumCSE"
+; V51 cse1 [V51,T07] ( 4, 12.50) byref -> [ebp-0x50] "ValNumCSE"
+; V52 cse2 [V52,T02] ( 7, 24.50) int -> [ebp-0x28] "ValNumCSE"
; TEMP_02 ref -> [ebp-0x54]
; TEMP_01 int -> [ebp-0x58]
;
@@ -108,6 +109,7 @@ G_M9942_IG05:
mov edx, edi
nop
call dword ptr [eax]
+ mov ebx, eax
mov gword ptr [ebp-2CH], edi
jmp SHORT G_M9942_IG07
@@ -117,18 +119,17 @@ G_M9942_IG06:
mov ebx, dword ptr [edi]
mov ebx, dword ptr [ebx+40]
call dword ptr [ebx+12]Object:GetHashCode():int:this
+ mov ebx, eax
G_M9942_IG07:
- mov ebx, eax
- and ebx, 0xD1FFAB1E
xor ecx, ecx
mov dword ptr [ebp-14H], ecx
mov edx, gword ptr [esi+4]
mov gword ptr [ebp-48H], edx
mov edi, gword ptr [esi+4]
mov eax, ebx
- cdq
- idiv edx:eax, dword ptr [edi+4]
+ xor edx, edx
+ div edx:eax, dword ptr [edi+4]
mov edi, gword ptr [ebp-48H]
cmp edx, dword ptr [edi+4]
jae G_M9942_IG39
@@ -181,7 +182,7 @@ G_M9942_IG12:
shl edi, 4
lea edi, bword ptr [eax+edi+8]
mov dword ptr [ebp-10H], ebx
- cmp dword ptr [edi+8], ebx
+ cmp dword ptr [edi+12], ebx
jne SHORT G_M9942_IG13
shl edx, 4
mov gword ptr [ebp-30H], eax
@@ -204,7 +205,7 @@ G_M9942_IG12:
jne SHORT G_M9942_IG15
G_M9942_IG13:
- mov edx, dword ptr [edi+12]
+ mov edx, dword ptr [edi+8]
mov edi, edx
mov ecx, dword ptr [ebp-28H]
mov edx, dword ptr [ebp-14H]
@@ -246,7 +247,7 @@ G_M9942_IG17:
lea ecx, bword ptr [edi+ecx+8]
mov bword ptr [ebp-50H], ecx
mov dword ptr [ebp-10H], ebx
- cmp dword ptr [ecx+8], ebx
+ cmp dword ptr [ecx+12], ebx
jne SHORT G_M9942_IG19
shl eax, 4
mov eax, gword ptr [edi+eax+8]
@@ -298,7 +299,7 @@ G_M9942_IG21:
G_M9942_IG22:
mov ecx, bword ptr [ebp-50H]
- mov eax, dword ptr [ecx+12]
+ mov eax, dword ptr [ecx+8]
mov ecx, eax
mov edx, dword ptr [ebp-28H]
mov eax, dword ptr [ebp-14H]
@@ -337,8 +338,8 @@ G_M9942_IG25:
mov ecx, gword ptr [esi+4]
mov edi, gword ptr [esi+4]
mov eax, ebx
- cdq
- idiv edx:eax, dword ptr [edi+4]
+ xor edx, edx
+ div edx:eax, dword ptr [edi+4]
cmp edx, dword ptr [ecx+4]
jae G_M9942_IG39
lea edi, bword ptr [ecx+4*edx+8]
@@ -362,15 +363,21 @@ G_M9942_IG27:
mov edi, dword ptr [ebp-1CH]
test edi, edi
je SHORT G_M9942_IG28
- mov edi, dword ptr [edx+12]
+ mov edi, dword ptr [esi+28]
+ cmp edi, dword ptr [eax+4]
+ jae G_M9942_IG39
+ shl edi, 4
+ mov edi, dword ptr [eax+edi+16]
+ neg edi
+ add edi, -3
mov dword ptr [esi+28], edi
G_M9942_IG28:
- mov dword ptr [edx+8], ebx
+ mov dword ptr [edx+12], ebx
mov edi, bword ptr [ebp-38H]
mov ebx, dword ptr [edi]
dec ebx
- mov dword ptr [edx+12], ebx
+ mov dword ptr [edx+8], ebx
mov bword ptr [ebp-3CH], edx
mov ebx, gword ptr [ebp-2CH]
call CORINFO_HELP_CHECKED_ASSIGN_REF_EBX
@@ -460,7 +467,7 @@ G_M9942_IG39:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 923, prolog size 18 for method Dictionary`2:TryInsert(ref,ref,ubyte):bool:this
+; Total bytes of code 942, prolog size 18 for method Dictionary`2:TryInsert(ref,ref,ubyte):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:Resize(int,bool):this
; Emitting BLENDED_CODE for generic X86 CPU - Windows
@@ -470,25 +477,24 @@ G_M9942_IG39:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T05] ( 8, 8 ) ref -> [ebp-0x1C] this class-hnd
-; V01 arg1 [V01,T10] ( 5, 6 ) int -> [ebp-0x10]
-; V02 arg2 [V02,T17] ( 1, 1 ) bool -> [ebp+0x08]
-; V03 loc0 [V03,T09] ( 5, 8 ) ref -> ebx class-hnd
+; V00 this [V00,T06] ( 8, 8 ) ref -> [ebp-0x1C] this class-hnd
+; V01 arg1 [V01,T09] ( 5, 6 ) int -> [ebp-0x10]
+; V02 arg2 [V02,T16] ( 1, 1 ) bool -> [ebp+0x08]
+; V03 loc0 [V03,T08] ( 5, 8 ) ref -> ebx class-hnd
; V04 loc1 [V04,T02] ( 8, 21 ) ref -> [ebp-0x20] class-hnd
-; V05 loc2 [V05,T04] ( 6, 11.50) int -> [ebp-0x14]
+; V05 loc2 [V05,T05] ( 6, 11.50) int -> [ebp-0x14]
;* V06 loc3 [V06 ] ( 0, 0 ) ref -> zero-ref ld-addr-op class-hnd
; V07 loc4 [V07,T01] ( 7, 22.50) int -> [ebp-0x18]
; V08 loc5 [V08,T00] ( 7, 23 ) int -> edi
-; V09 loc6 [V09,T11] ( 4, 8 ) int -> edx
-; V10 tmp0 [V10,T14] ( 3, 4.50) int -> ecx "impRuntimeLookup slot"
-; V11 tmp1 [V11,T13] ( 4, 6.50) int -> eax "impRuntimeLookup typehandle"
-; V12 tmp2 [V12,T12] ( 2, 8 ) byref -> edi "non-inline candidate call"
-; V13 tmp3 [V13,T15] ( 2, 4 ) ref -> ecx class-hnd "Inlining Arg"
-; V14 cse0 [V14,T08] ( 3, 10 ) int -> esi "ValNumCSE"
-; V15 cse1 [V15,T06] ( 3, 10 ) byref -> [ebp-0x24] "ValNumCSE"
-; V16 cse2 [V16,T07] ( 3, 10 ) byref -> edi "ValNumCSE"
-; V17 cse3 [V17,T16] ( 2, 4 ) int -> eax "ValNumCSE"
-; V18 rat0 [V18,T03] ( 3, 12 ) ref -> esi "virtual vtable call"
+; V09 loc6 [V09,T10] ( 4, 8 ) int -> edx
+; V10 tmp0 [V10,T13] ( 3, 4.50) int -> ecx "impRuntimeLookup slot"
+; V11 tmp1 [V11,T12] ( 4, 6.50) int -> eax "impRuntimeLookup typehandle"
+; V12 tmp2 [V12,T11] ( 2, 8 ) byref -> edi "non-inline candidate call"
+; V13 tmp3 [V13,T14] ( 2, 4 ) ref -> ecx class-hnd "Inlining Arg"
+; V14 cse0 [V14,T03] ( 4, 12 ) byref -> [ebp-0x24] "ValNumCSE"
+; V15 cse1 [V15,T07] ( 3, 10 ) byref -> edi "ValNumCSE"
+; V16 cse2 [V16,T15] ( 2, 4 ) int -> eax "ValNumCSE"
+; V17 rat0 [V17,T04] ( 3, 12 ) ref -> esi "virtual vtable call"
;
; Lcl frame size = 24
@@ -547,7 +553,7 @@ G_M29783_IG04:
mov edi, eax
shl edi, 4
lea edi, bword ptr [ecx+edi+8]
- cmp dword ptr [edi+8], 0
+ cmp dword ptr [edi+8], -1
jl SHORT G_M29783_IG05
mov dword ptr [ebp-18H], eax
mov esi, eax
@@ -558,8 +564,7 @@ G_M29783_IG04:
mov esi, dword ptr [esi]
mov esi, dword ptr [esi+40]
call dword ptr [esi+12]Object:GetHashCode():int:this
- and eax, 0xD1FFAB1E
- mov dword ptr [edi+8], eax
+ mov dword ptr [edi+12], eax
mov eax, dword ptr [ebp-18H]
mov ecx, gword ptr [ebp-20H]
@@ -581,20 +586,19 @@ G_M29783_IG07:
mov eax, edi
shl eax, 4
lea eax, bword ptr [ecx+eax+8]
- mov bword ptr [ebp-24H], eax
- mov esi, dword ptr [eax+8]
- test esi, esi
+ cmp dword ptr [eax+8], -1
jl SHORT G_M29783_IG08
- mov eax, esi
- cdq
- idiv edx:eax, dword ptr [ebp-10H]
+ mov bword ptr [ebp-24H], eax
+ mov eax, dword ptr [eax+12]
+ xor edx, edx
+ div edx:eax, dword ptr [ebp-10H]
mov eax, dword ptr [ebx+4]
cmp edx, eax
jae SHORT G_M29783_IG14
mov eax, dword ptr [ebx+4*edx+8]
dec eax
mov esi, bword ptr [ebp-24H]
- mov dword ptr [esi+12], eax
+ mov dword ptr [esi+8], eax
lea eax, [edi+1]
mov dword ptr [ebx+4*edx+8], eax
@@ -639,7 +643,7 @@ G_M29783_IG14:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 332, prolog size 13 for method Dictionary`2:Resize(int,bool):this
+; Total bytes of code 328, prolog size 13 for method Dictionary`2:Resize(int,bool):this
; ============================================================
; Assembly listing for method Dictionary`2:FindEntry(ref):int:this
; Emitting BLENDED_CODE for generic X86 CPU - Windows
@@ -720,12 +724,11 @@ G_M16827_IG03:
mov ebx, dword ptr [ebx+40]
call dword ptr [ebx+12]Object:GetHashCode():int:this
mov ebx, eax
- and ebx, 0xD1FFAB1E
mov dword ptr [ebp-18H], ebx
mov ecx, gword ptr [ebp-38H]
mov eax, ebx
- cdq
- idiv edx:eax, dword ptr [ecx+4]
+ xor edx, edx
+ div edx:eax, dword ptr [ecx+4]
cmp edx, dword ptr [ecx+4]
jae G_M16827_IG19
mov ecx, dword ptr [ecx+4*edx+8]
@@ -759,7 +762,7 @@ G_M16827_IG05:
shl ebx, 4
lea ebx, bword ptr [ecx+ebx+8]
mov edi, dword ptr [ebp-18H]
- cmp dword ptr [ebx+8], edi
+ cmp dword ptr [ebx+12], edi
jne SHORT G_M16827_IG06
mov dword ptr [ebp-10H], eax
mov edi, eax
@@ -779,7 +782,7 @@ G_M16827_IG05:
jne G_M16827_IG15
G_M16827_IG06:
- mov eax, dword ptr [ebx+12]
+ mov eax, dword ptr [ebx+8]
mov ebx, eax
mov edx, dword ptr [ebp-2CH]
mov eax, dword ptr [ebp-14H]
@@ -820,12 +823,11 @@ G_M16827_IG10:
nop
call dword ptr [eax]
mov ebx, eax
- and ebx, 0xD1FFAB1E
mov dword ptr [ebp-1CH], ebx
mov ecx, gword ptr [ebp-38H]
mov eax, ebx
- cdq
- idiv edx:eax, dword ptr [ecx+4]
+ xor edx, edx
+ div edx:eax, dword ptr [ecx+4]
cmp edx, dword ptr [ecx+4]
jae G_M16827_IG19
mov ecx, dword ptr [ecx+4*edx+8]
@@ -842,7 +844,7 @@ G_M16827_IG11:
shl ebx, 4
lea ebx, bword ptr [edx+ebx+8]
mov edi, dword ptr [ebp-1CH]
- cmp dword ptr [ebx+8], edi
+ cmp dword ptr [ebx+12], edi
jne SHORT G_M16827_IG13
mov dword ptr [ebp-10H], eax
mov edi, eax
@@ -874,7 +876,7 @@ G_M16827_IG12:
jne SHORT G_M16827_IG15
G_M16827_IG13:
- mov eax, dword ptr [ebx+12]
+ mov eax, dword ptr [ebx+8]
mov ebx, eax
mov ecx, dword ptr [ebp-2CH]
mov eax, dword ptr [ebp-14H]
@@ -913,7 +915,7 @@ G_M16827_IG19:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 535, prolog size 13 for method Dictionary`2:FindEntry(ref):int:this
+; Total bytes of code 525, prolog size 13 for method Dictionary`2:FindEntry(ref):int:this
; ============================================================
; Assembly listing for method Dictionary`2:Remove(int):bool:this
; Emitting BLENDED_CODE for generic X86 CPU - Windows
@@ -932,7 +934,7 @@ G_M16827_IG19:
; V06 loc4 [V06,T17] ( 4, 2 ) int -> [ebp-0x1C]
; V07 loc5 [V07,T10] ( 5, 6 ) int -> [ebp-0x20]
; V08 loc6 [V08,T00] ( 8, 21.50) int -> [ebp-0x24]
-; V09 loc7 [V09,T01] ( 9, 18 ) byref -> [ebp-0x38]
+; V09 loc7 [V09,T01] ( 8, 17.50) byref -> [ebp-0x38]
;* V10 tmp0 [V10 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
; V11 tmp1 [V11,T18] ( 2, 2 ) ref -> ecx class-hnd "dup spill"
; V12 tmp2 [V12,T20] ( 3, 1.50) ref -> ecx
@@ -984,13 +986,12 @@ G_M18089_IG03:
mov esi, dword ptr [ebp-10H]
G_M18089_IG04:
- and ecx, 0xD1FFAB1E
mov gword ptr [ebp-30H], ebx
mov ebx, dword ptr [ebx+4]
mov dword ptr [ebp-18H], ecx
mov eax, ecx
- cdq
- idiv edx:eax, ebx
+ xor edx, edx
+ div edx:eax, ebx
mov eax, edx
mov dword ptr [ebp-20H], -1
cmp eax, ebx
@@ -1015,7 +1016,7 @@ G_M18089_IG05:
mov gword ptr [ebp-34H], ebx
lea edx, bword ptr [ebx+edx+8]
mov esi, dword ptr [ebp-18H]
- cmp dword ptr [edx], esi
+ cmp dword ptr [edx+4], esi
jne SHORT G_M18089_IG08
mov esi, gword ptr [edi+12]
mov gword ptr [ebp-3CH], esi
@@ -1046,7 +1047,7 @@ G_M18089_IG07:
G_M18089_IG08:
mov eax, dword ptr [ebp-24H]
- mov edx, dword ptr [edx+4]
+ mov edx, dword ptr [edx]
mov ecx, dword ptr [ebp-2CH]
mov esi, dword ptr [ebp-14H]
cmp ecx, esi
@@ -1078,7 +1079,7 @@ G_M18089_IG13:
mov esi, dword ptr [ebp-20H]
test esi, esi
jge SHORT G_M18089_IG14
- mov ecx, dword ptr [edx+4]
+ mov ecx, dword ptr [edx]
inc ecx
mov ebx, gword ptr [ebp-30H]
mov esi, dword ptr [ebp-1CH]
@@ -1090,14 +1091,15 @@ G_M18089_IG14:
cmp esi, ecx
jae SHORT G_M18089_IG18
shl esi, 4
- mov ecx, dword ptr [edx+4]
+ mov ecx, dword ptr [edx]
mov ebx, gword ptr [ebp-34H]
- mov dword ptr [ebx+esi+12], ecx
+ mov dword ptr [ebx+esi+8], ecx
G_M18089_IG15:
- mov dword ptr [edx], -1
mov ecx, dword ptr [edi+28]
- mov dword ptr [edx+4], ecx
+ neg ecx
+ add ecx, -3
+ mov dword ptr [edx], ecx
mov eax, dword ptr [ebp-24H]
mov dword ptr [edi+28], eax
inc dword ptr [edi+32]
@@ -1119,7 +1121,7 @@ G_M18089_IG18:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 354, prolog size 13 for method Dictionary`2:Remove(int):bool:this
+; Total bytes of code 345, prolog size 13 for method Dictionary`2:Remove(int):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:Remove(int,byref):bool:this
; Emitting BLENDED_CODE for generic X86 CPU - Windows
@@ -1139,7 +1141,7 @@ G_M18089_IG18:
; V07 loc4 [V07,T17] ( 4, 2 ) int -> [ebp-0x1C]
; V08 loc5 [V08,T10] ( 5, 6 ) int -> [ebp-0x20]
; V09 loc6 [V09,T00] ( 8, 21.50) int -> [ebp-0x24]
-; V10 loc7 [V10,T01] ( 10, 18.50) byref -> [ebp-0x38]
+; V10 loc7 [V10,T01] ( 9, 18 ) byref -> [ebp-0x38]
;* V11 tmp0 [V11 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
; V12 tmp1 [V12,T18] ( 2, 2 ) ref -> ecx class-hnd "dup spill"
; V13 tmp2 [V13,T20] ( 3, 1.50) ref -> ecx
@@ -1191,13 +1193,12 @@ G_M18089_IG03:
mov esi, dword ptr [ebp-10H]
G_M18089_IG04:
- and ecx, 0xD1FFAB1E
mov gword ptr [ebp-30H], ebx
mov ebx, dword ptr [ebx+4]
mov dword ptr [ebp-18H], ecx
mov eax, ecx
- cdq
- idiv edx:eax, ebx
+ xor edx, edx
+ div edx:eax, ebx
mov eax, edx
mov dword ptr [ebp-20H], -1
cmp eax, ebx
@@ -1222,7 +1223,7 @@ G_M18089_IG05:
mov gword ptr [ebp-34H], ebx
lea edx, bword ptr [ebx+edx+8]
mov esi, dword ptr [ebp-18H]
- cmp dword ptr [edx], esi
+ cmp dword ptr [edx+4], esi
jne SHORT G_M18089_IG08
mov esi, gword ptr [edi+12]
mov gword ptr [ebp-3CH], esi
@@ -1253,7 +1254,7 @@ G_M18089_IG07:
G_M18089_IG08:
mov eax, dword ptr [ebp-24H]
- mov edx, dword ptr [edx+4]
+ mov edx, dword ptr [edx]
mov ecx, dword ptr [ebp-2CH]
mov esi, dword ptr [ebp-14H]
cmp ecx, esi
@@ -1287,7 +1288,7 @@ G_M18089_IG13:
mov esi, dword ptr [ebp-20H]
test esi, esi
jge SHORT G_M18089_IG14
- mov ecx, dword ptr [edx+4]
+ mov ecx, dword ptr [edx]
inc ecx
mov ebx, gword ptr [ebp-30H]
mov esi, dword ptr [ebp-1CH]
@@ -1299,17 +1300,18 @@ G_M18089_IG14:
cmp esi, ecx
jae SHORT G_M18089_IG18
shl esi, 4
- mov ecx, dword ptr [edx+4]
+ mov ecx, dword ptr [edx]
mov ebx, gword ptr [ebp-34H]
- mov dword ptr [ebx+esi+12], ecx
+ mov dword ptr [ebx+esi+8], ecx
G_M18089_IG15:
mov ecx, dword ptr [edx+12]
mov esi, bword ptr [ebp+08H]
mov dword ptr [esi], ecx
- mov dword ptr [edx], -1
mov ecx, dword ptr [edi+28]
- mov dword ptr [edx+4], ecx
+ neg ecx
+ add ecx, -3
+ mov dword ptr [edx], ecx
mov eax, dword ptr [ebp-24H]
mov dword ptr [edi+28], eax
inc dword ptr [edi+32]
@@ -1331,7 +1333,7 @@ G_M18089_IG18:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 371, prolog size 13 for method Dictionary`2:Remove(int,byref):bool:this
+; Total bytes of code 362, prolog size 13 for method Dictionary`2:Remove(int,byref):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:TryInsert(int,int,ubyte):bool:this
; Emitting BLENDED_CODE for generic X86 CPU - Windows
@@ -1341,49 +1343,50 @@ G_M18089_IG18:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T04] ( 22, 14 ) ref -> esi this class-hnd
-; V01 arg1 [V01,T09] ( 9, 7.50) int -> [ebp-0x10] ld-addr-op
-; V02 arg2 [V02,T28] ( 3, 1.50) int -> [ebp+0x0C]
-; V03 arg3 [V03,T33] ( 2, 1 ) ubyte -> [ebp+0x08]
-; V04 loc0 [V04,T00] ( 12, 30.50) ref -> [ebp-0x34] class-hnd
-; V05 loc1 [V05,T15] ( 5, 5.50) ref -> [ebp-0x38] class-hnd
-; V06 loc2 [V06,T10] ( 6, 11 ) int -> [ebp-0x14]
+; V00 this [V00,T04] ( 23, 14.50) ref -> esi this class-hnd
+; V01 arg1 [V01,T07] ( 9, 7.50) int -> [ebp-0x10] ld-addr-op
+; V02 arg2 [V02,T29] ( 3, 1.50) int -> [ebp+0x0C]
+; V03 arg3 [V03,T34] ( 2, 1 ) ubyte -> [ebp+0x08]
+; V04 loc0 [V04,T00] ( 12, 27.50) ref -> [ebp-0x2C] class-hnd
+; V05 loc1 [V05,T15] ( 5, 5.50) ref -> [ebp-0x30] class-hnd
+; V06 loc2 [V06,T08] ( 6, 11 ) int -> [ebp-0x14]
; V07 loc3 [V07,T01] ( 7, 25 ) int -> [ebp-0x18]
-; V08 loc4 [V08,T18] ( 5, 3.50) byref -> [ebp-0x3C]
-; V09 loc5 [V09,T02] ( 7, 25 ) int -> eax
-; V10 loc6 [V10,T29] ( 3, 1.50) bool -> [ebp-0x1C]
-; V11 loc7 [V11,T20] ( 6, 3 ) int -> registers
-; V12 loc8 [V12,T19] ( 6, 3 ) byref -> eax
+; V08 loc4 [V08,T18] ( 5, 3.50) byref -> [ebp-0x34]
+; V09 loc5 [V09,T02] ( 7, 25 ) int -> registers
+; V10 loc6 [V10,T30] ( 3, 1.50) bool -> ebx
+; V11 loc7 [V11,T19] ( 6, 3 ) int -> registers
+; V12 loc8 [V12,T25] ( 5, 2.50) byref -> eax
;* V13 loc9 [V13 ] ( 0, 0 ) int -> zero-ref ld-addr-op
;* V14 loc10 [V14 ] ( 0, 0 ) ref -> zero-ref class-hnd
-; V15 loc11 [V15,T30] ( 3, 1.50) int -> [ebp-0x20]
+; V15 loc11 [V15,T31] ( 3, 1.50) int -> [ebp-0x1C]
;* V16 tmp0 [V16 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
-; V17 tmp1 [V17,T25] ( 3, 2 ) int -> registers
+; V17 tmp1 [V17,T26] ( 3, 2 ) int -> ecx
;* V18 tmp2 [V18 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
;* V19 tmp3 [V19 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
-; V20 tmp4 [V20,T16] ( 2, 4 ) bool -> ecx "Inline return value spill temp"
-; V21 tmp5 [V21,T11] ( 2, 8 ) int -> ecx ld-addr-op "Inlining Arg"
+; V20 tmp4 [V20,T16] ( 2, 4 ) bool -> eax "Inline return value spill temp"
+; V21 tmp5 [V21,T11] ( 2, 8 ) int -> eax ld-addr-op "Inlining Arg"
;* V22 tmp6 [V22 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
;* V23 tmp7 [V23,T17] ( 0, 0 ) int -> zero-ref "Inlining Arg"
;* V24 tmp8 [V24 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
-; V25 tmp9 [V25,T13] ( 3, 6 ) ref -> [ebp-0x40] "arr expr"
+; V25 tmp9 [V25,T13] ( 3, 6 ) ref -> [ebp-0x38] "arr expr"
; V26 tmp10 [V26,T14] ( 3, 6 ) int -> edx "arr expr"
-; V27 tmp11 [V27,T12] ( 2, 8 ) int -> [ebp-0x24] "argument with side effect"
-; V28 tmp12 [V28,T26] ( 2, 2 ) int -> edx "argument with side effect"
-; V29 tmp13 [V29,T22] ( 3, 3 ) ref -> ebx "arr expr"
-; V30 tmp14 [V30,T24] ( 3, 3 ) int -> edx "arr expr"
-; V31 cse0 [V31,T07] ( 5, 12.50) byref -> registers "ValNumCSE"
-; V32 cse1 [V32,T08] ( 5, 12.50) byref -> registers "ValNumCSE"
-; V33 cse2 [V33,T23] ( 3, 3 ) ref -> [ebp-0x44] "ValNumCSE"
-; V34 cse3 [V34,T21] ( 6, 3 ) int -> ecx "ValNumCSE"
-; V35 cse4 [V35,T31] ( 3, 1.50) int -> [ebp-0x28] "ValNumCSE"
-; V36 cse5 [V36,T32] ( 3, 1.50) int -> eax "ValNumCSE"
-; V37 cse6 [V37,T27] ( 3, 1.50) ref -> ecx "ValNumCSE"
-; V38 cse7 [V38,T03] ( 7, 24.50) int -> [ebp-0x2C] "ValNumCSE"
-; V39 cse8 [V39,T05] ( 4, 14 ) int -> eax "ValNumCSE"
-; V40 cse9 [V40,T06] ( 4, 14 ) int -> [ebp-0x30] "ValNumCSE"
+; V27 tmp11 [V27,T12] ( 2, 8 ) int -> [ebp-0x20] "argument with side effect"
+; V28 tmp12 [V28,T27] ( 2, 2 ) int -> edx "argument with side effect"
+; V29 tmp13 [V29,T21] ( 3, 3 ) ref -> edi "arr expr"
+; V30 tmp14 [V30,T23] ( 3, 3 ) int -> edx "arr expr"
+; V31 tmp15 [V31,T24] ( 3, 3 ) int -> ebx "arr expr"
+; V32 cse0 [V32,T09] ( 4, 10.50) byref -> ecx "ValNumCSE"
+; V33 cse1 [V33,T10] ( 4, 10.50) byref -> [ebp-0x3C] "ValNumCSE"
+; V34 cse2 [V34,T22] ( 3, 3 ) ref -> [ebp-0x40] "ValNumCSE"
+; V35 cse3 [V35,T20] ( 6, 3 ) int -> edx "ValNumCSE"
+; V36 cse4 [V36,T32] ( 3, 1.50) int -> eax "ValNumCSE"
+; V37 cse5 [V37,T33] ( 3, 1.50) int -> eax "ValNumCSE"
+; V38 cse6 [V38,T28] ( 3, 1.50) ref -> ecx "ValNumCSE"
+; V39 cse7 [V39,T03] ( 7, 24.50) int -> registers "ValNumCSE"
+; V40 cse8 [V40,T05] ( 3, 12 ) int -> [ebp-0x24] "ValNumCSE"
+; V41 cse9 [V41,T06] ( 3, 12 ) int -> [ebp-0x28] "ValNumCSE"
;
-; Lcl frame size = 56
+; Lcl frame size = 52
G_M59125_IG01:
push ebp
@@ -1391,7 +1394,7 @@ G_M59125_IG01:
push edi
push esi
push ebx
- sub esp, 56
+ sub esp, 52
mov esi, ecx
mov edi, edx
@@ -1404,96 +1407,89 @@ G_M59125_IG02:
G_M59125_IG03:
mov eax, gword ptr [esi+8]
- mov gword ptr [ebp-34H], eax
+ mov gword ptr [ebp-2CH], eax
mov edx, gword ptr [esi+12]
test edx, edx
je SHORT G_M59125_IG04
- mov gword ptr [ebp-38H], edx
+ mov gword ptr [ebp-30H], edx
mov ecx, edx
mov edx, edi
call [IEqualityComparer`1:GetHashCode(int):int:this]
+ mov ecx, eax
+ mov dword ptr [ebp-10H], edi
jmp SHORT G_M59125_IG05
G_M59125_IG04:
mov dword ptr [ebp-10H], edi
mov ecx, edi
- mov gword ptr [ebp-38H], edx
- mov eax, ecx
- mov edi, dword ptr [ebp-10H]
+ mov gword ptr [ebp-30H], edx
G_M59125_IG05:
- mov ecx, eax
- and ecx, 0xD1FFAB1E
- xor eax, eax
- mov dword ptr [ebp-18H], eax
- mov ebx, gword ptr [esi+4]
- mov gword ptr [ebp-44H], ebx
- mov gword ptr [ebp-40H], ebx
- mov ebx, gword ptr [ebp-44H]
+ xor ebx, ebx
+ mov edx, gword ptr [esi+4]
+ mov gword ptr [ebp-40H], edx
+ mov gword ptr [ebp-38H], edx
+ mov dword ptr [ebp-14H], ecx
+ mov edi, gword ptr [ebp-40H]
mov eax, ecx
- cdq
- idiv edx:eax, dword ptr [ebx+4]
- mov ebx, gword ptr [ebp-40H]
- cmp edx, dword ptr [ebx+4]
+ xor edx, edx
+ div edx:eax, dword ptr [edi+4]
+ mov edi, gword ptr [ebp-38H]
+ cmp edx, dword ptr [edi+4]
jae G_M59125_IG30
- lea ebx, bword ptr [ebx+4*edx+8]
- mov bword ptr [ebp-3CH], ebx
- mov eax, dword ptr [ebx]
+ lea edi, bword ptr [edi+4*edx+8]
+ mov bword ptr [ebp-34H], edi
+ mov eax, dword ptr [edi]
dec eax
- cmp gword ptr [ebp-38H], 0
+ cmp gword ptr [ebp-30H], 0
jne SHORT G_M59125_IG09
G_M59125_IG06:
- mov edx, gword ptr [ebp-34H]
- mov ebx, dword ptr [edx+4]
- cmp ebx, eax
+ mov edx, gword ptr [ebp-2CH]
+ mov edi, dword ptr [edx+4]
+ cmp edi, eax
jbe G_M59125_IG19
shl eax, 4
- mov dword ptr [ebp-14H], ecx
- cmp dword ptr [edx+eax+8], ecx
- mov dword ptr [ebp-10H], edi
+ mov dword ptr [ebp-24H], eax
+ lea ecx, bword ptr [edx+eax+8]
+ mov eax, dword ptr [ebp-14H]
+ cmp dword ptr [ecx+4], eax
jne SHORT G_M59125_IG07
- lea edi, bword ptr [edx+eax+8]
- mov ecx, dword ptr [edi+8]
- cmp ecx, dword ptr [ebp-10H]
- sete cl
- movzx ecx, cl
- test ecx, ecx
+ mov eax, dword ptr [ecx+8]
+ cmp eax, dword ptr [ebp-10H]
+ sete al
+ movzx eax, al
+ test eax, eax
jne SHORT G_M59125_IG10
G_M59125_IG07:
- lea eax, bword ptr [edx+eax+8]
- mov eax, dword ptr [eax+4]
- mov edi, dword ptr [ebp-18H]
- cmp ebx, edi
+ mov ecx, dword ptr [ebp-24H]
+ mov ecx, dword ptr [edx+ecx+8]
+ cmp edi, ebx
jle G_M59125_IG27
G_M59125_IG08:
- inc edi
- mov gword ptr [ebp-34H], edx
- mov dword ptr [ebp-18H], edi
- mov ecx, dword ptr [ebp-14H]
- mov edi, dword ptr [ebp-10H]
+ inc ebx
+ mov gword ptr [ebp-2CH], edx
+ mov eax, ecx
jmp SHORT G_M59125_IG06
G_M59125_IG09:
- mov dword ptr [ebp-14H], ecx
+ mov dword ptr [ebp-18H], ebx
jmp SHORT G_M59125_IG14
G_M59125_IG10:
- mov eax, edi
- mov edi, dword ptr [ebp-10H]
mov ebx, dword ptr [ebp+08H]
- movzx ecx, bl
- cmp ecx, 1
+ movzx edx, bl
+ cmp edx, 1
jne SHORT G_M59125_IG11
mov ebx, dword ptr [ebp+0CH]
- mov dword ptr [eax+12], ebx
+ mov dword ptr [ecx+12], ebx
inc dword ptr [esi+36]
jmp G_M59125_IG24
G_M59125_IG11:
- cmp ecx, 2
+ cmp edx, 2
je G_M59125_IG26
G_M59125_IG12:
@@ -1508,73 +1504,66 @@ G_M59125_IG13:
ret 8
G_M59125_IG14:
- mov ebx, gword ptr [ebp-34H]
- mov ecx, dword ptr [ebx+4]
- mov dword ptr [ebp-2CH], ecx
- cmp ecx, eax
+ mov edi, gword ptr [ebp-2CH]
+ mov ebx, dword ptr [edi+4]
+ cmp ebx, eax
jbe SHORT G_M59125_IG15
shl eax, 4
+ mov dword ptr [ebp-28H], eax
+ lea ecx, bword ptr [edi+eax+8]
mov edx, dword ptr [ebp-14H]
- cmp dword ptr [ebx+eax+8], edx
- mov dword ptr [ebp-10H], edi
+ cmp dword ptr [ecx+4], edx
jne SHORT G_M59125_IG17
- mov dword ptr [ebp-30H], eax
- lea edi, bword ptr [ebx+eax+8]
- mov edx, dword ptr [edi+8]
- mov dword ptr [ebp-24H], edx
+ mov bword ptr [ebp-3CH], ecx
+ mov edx, dword ptr [ecx+8]
+ mov dword ptr [ebp-20H], edx
mov edx, dword ptr [ebp-10H]
push edx
- mov edx, dword ptr [ebp-24H]
- mov ecx, gword ptr [ebp-38H]
+ mov edx, dword ptr [ebp-20H]
+ mov ecx, gword ptr [ebp-30H]
call [IEqualityComparer`1:Equals(int,int):bool:this]
test eax, eax
- mov eax, dword ptr [ebp-30H]
je SHORT G_M59125_IG17
mov ebx, dword ptr [ebp+08H]
- movzx ecx, bl
- cmp ecx, 1
+ movzx edx, bl
+ cmp edx, 1
jne SHORT G_M59125_IG16
+ mov edi, bword ptr [ebp-3CH]
mov ebx, dword ptr [ebp+0CH]
mov dword ptr [edi+12], ebx
inc dword ptr [esi+36]
jmp G_M59125_IG24
G_M59125_IG15:
- mov edx, ebx
- mov ecx, dword ptr [ebp-14H]
- mov ebx, dword ptr [ebp-2CH]
+ mov edx, edi
+ mov edi, ebx
jmp SHORT G_M59125_IG19
G_M59125_IG16:
- cmp ecx, 2
+ cmp edx, 2
je G_M59125_IG28
jmp SHORT G_M59125_IG12
G_M59125_IG17:
- lea eax, bword ptr [ebx+eax+8]
- mov eax, dword ptr [eax+4]
- mov ecx, dword ptr [ebp-2CH]
- mov edi, dword ptr [ebp-18H]
- cmp ecx, edi
+ mov eax, dword ptr [ebp-28H]
+ mov eax, dword ptr [edi+eax+8]
+ mov ecx, dword ptr [ebp-18H]
+ cmp ebx, ecx
jle G_M59125_IG29
G_M59125_IG18:
- inc edi
- mov gword ptr [ebp-34H], ebx
- mov dword ptr [ebp-18H], edi
- mov edi, dword ptr [ebp-10H]
+ inc ecx
+ mov gword ptr [ebp-2CH], edi
+ mov dword ptr [ebp-18H], ecx
jmp G_M59125_IG14
G_M59125_IG19:
- xor eax, eax
- mov dword ptr [ebp-1CH], eax
+ xor ebx, ebx
mov eax, dword ptr [esi+32]
- mov dword ptr [ebp-28H], eax
test eax, eax
jle SHORT G_M59125_IG20
- mov ebx, dword ptr [esi+28]
- mov dword ptr [ebp-1CH], 1
- mov eax, dword ptr [ebp-28H]
+ mov edi, dword ptr [esi+28]
+ mov ebx, 1
dec eax
mov dword ptr [esi+32], eax
jmp SHORT G_M59125_IG22
@@ -1582,10 +1571,9 @@ G_M59125_IG19:
G_M59125_IG20:
mov eax, dword ptr [esi+24]
mov edx, eax
- mov dword ptr [ebp-20H], edx
- cmp ebx, edx
+ mov dword ptr [ebp-1CH], edx
+ cmp edi, edx
jne SHORT G_M59125_IG21
- mov dword ptr [ebp-14H], ecx
mov ecx, eax
call HashHelpers:ExpandPrime(int):int
mov edx, eax
@@ -1593,46 +1581,53 @@ G_M59125_IG20:
mov ecx, esi
call Dictionary`2:Resize(int,bool):this
mov ecx, gword ptr [esi+4]
- mov ebx, ecx
+ mov edi, ecx
mov eax, dword ptr [ebp-14H]
- cdq
- idiv edx:eax, dword ptr [ecx+4]
- cmp edx, dword ptr [ebx+4]
+ xor edx, edx
+ div edx:eax, dword ptr [ecx+4]
+ cmp edx, dword ptr [edi+4]
jae G_M59125_IG30
- lea ebx, bword ptr [ebx+4*edx+8]
- mov bword ptr [ebp-3CH], ebx
- mov ecx, dword ptr [ebp-14H]
+ lea edi, bword ptr [edi+4*edx+8]
+ mov bword ptr [ebp-34H], edi
G_M59125_IG21:
- mov edx, dword ptr [ebp-20H]
+ mov edx, dword ptr [ebp-1CH]
lea eax, [edx+1]
mov dword ptr [esi+24], eax
mov eax, gword ptr [esi+8]
- mov ebx, edx
+ mov edi, edx
mov edx, eax
G_M59125_IG22:
- cmp ebx, dword ptr [edx+4]
+ cmp edi, dword ptr [edx+4]
jae SHORT G_M59125_IG30
- mov eax, ebx
+ mov eax, edi
shl eax, 4
lea eax, bword ptr [edx+eax+8]
- cmp dword ptr [ebp-1CH], 0
+ test ebx, ebx
je SHORT G_M59125_IG23
- mov edx, dword ptr [eax+4]
+ mov ebx, dword ptr [esi+28]
+ cmp ebx, dword ptr [edx+4]
+ jae SHORT G_M59125_IG30
+ shl ebx, 4
+ mov edx, dword ptr [edx+ebx+8]
+ neg edx
+ add edx, -3
mov dword ptr [esi+28], edx
G_M59125_IG23:
+ mov ecx, dword ptr [ebp-14H]
+ mov dword ptr [eax+4], ecx
+ mov ebx, bword ptr [ebp-34H]
+ mov ecx, dword ptr [ebx]
+ dec ecx
mov dword ptr [eax], ecx
- mov ecx, bword ptr [ebp-3CH]
- mov edx, dword ptr [ecx]
- dec edx
- mov dword ptr [eax+4], edx
- mov dword ptr [eax+8], edi
- mov edi, dword ptr [ebp+0CH]
- mov dword ptr [eax+12], edi
- inc ebx
- mov dword ptr [ecx], ebx
+ mov edx, dword ptr [ebp-10H]
+ mov dword ptr [eax+8], edx
+ mov edx, dword ptr [ebp+0CH]
+ mov dword ptr [eax+12], edx
+ inc edi
+ mov dword ptr [ebx], edi
inc dword ptr [esi+36]
G_M59125_IG24:
@@ -1647,7 +1642,7 @@ G_M59125_IG25:
ret 8
G_M59125_IG26:
- mov ecx, edi
+ mov ecx, dword ptr [ebp-10H]
call ThrowHelper:ThrowAddingDuplicateWithKeyArgumentException(int)
int3
@@ -1656,8 +1651,7 @@ G_M59125_IG27:
int3
G_M59125_IG28:
- mov edi, dword ptr [ebp-10H]
- mov ecx, edi
+ mov ecx, dword ptr [ebp-10H]
call ThrowHelper:ThrowAddingDuplicateWithKeyArgumentException(int)
int3
@@ -1669,7 +1663,7 @@ G_M59125_IG30:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 630, prolog size 13 for method Dictionary`2:TryInsert(int,int,ubyte):bool:this
+; Total bytes of code 597, prolog size 13 for method Dictionary`2:TryInsert(int,int,ubyte):bool:this
; ============================================================
; Assembly listing for method Dictionary`2:Resize(int,bool):this
; Emitting BLENDED_CODE for generic X86 CPU - Windows
@@ -1679,24 +1673,23 @@ G_M59125_IG30:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T04] ( 6, 6 ) ref -> [ebp-0x18] this class-hnd
-; V01 arg1 [V01,T06] ( 5, 6 ) int -> [ebp-0x10]
+; V00 this [V00,T03] ( 6, 6 ) ref -> [ebp-0x18] this class-hnd
+; V01 arg1 [V01,T05] ( 5, 6 ) int -> [ebp-0x10]
;* V02 arg2 [V02 ] ( 0, 0 ) bool -> zero-ref
-; V03 loc0 [V03,T05] ( 5, 8 ) ref -> ebx class-hnd
-; V04 loc1 [V04,T01] ( 6, 13 ) ref -> [ebp-0x1C] class-hnd
-; V05 loc2 [V05,T09] ( 4, 7 ) int -> [ebp-0x14]
+; V03 loc0 [V03,T04] ( 5, 8 ) ref -> ebx class-hnd
+; V04 loc1 [V04,T01] ( 7, 15 ) ref -> [ebp-0x1C] class-hnd
+; V05 loc2 [V05,T08] ( 4, 7 ) int -> [ebp-0x14]
;* V06 loc3 [V06 ] ( 0, 0 ) int -> zero-ref ld-addr-op
;* V07 loc4 [V07 ] ( 0, 0 ) int -> zero-ref
; V08 loc5 [V08,T00] ( 7, 23 ) int -> ecx
-; V09 loc6 [V09,T07] ( 4, 8 ) int -> edx
+; V09 loc6 [V09,T06] ( 4, 8 ) int -> edx
;* V10 tmp0 [V10 ] ( 0, 0 ) ref -> zero-ref class-hnd exact "Single-def Box Helper"
;* V11 tmp1 [V11 ] ( 0, 0 ) byref -> zero-ref "impAppendStmt"
-; V12 tmp2 [V12,T10] ( 2, 4 ) ref -> ecx class-hnd "Inlining Arg"
+; V12 tmp2 [V12,T09] ( 2, 4 ) ref -> ecx class-hnd "Inlining Arg"
;* V13 tmp3 [V13 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg"
-; V14 cse0 [V14,T02] ( 3, 10 ) int -> edx "ValNumCSE"
-; V15 cse1 [V15,T08] ( 2, 8 ) int -> esi "ValNumCSE"
-; V16 cse2 [V16,T03] ( 3, 10 ) int -> esi "ValNumCSE"
-; V17 cse3 [V17,T11] ( 2, 4 ) int -> eax "ValNumCSE"
+; V14 cse0 [V14,T07] ( 2, 8 ) int -> esi "ValNumCSE"
+; V15 cse1 [V15,T02] ( 4, 12 ) int -> esi "ValNumCSE"
+; V16 cse2 [V16,T10] ( 2, 4 ) int -> eax "ValNumCSE"
;
; Lcl frame size = 16
@@ -1731,48 +1724,46 @@ G_M14072_IG02:
call Array:Copy(ref,int,ref,int,int,bool)
xor ecx, ecx
cmp dword ptr [ebp-14H], 0
- jle SHORT G_M14072_IG05
+ jle SHORT G_M14072_IG08
G_M14072_IG03:
mov eax, gword ptr [ebp-1CH]
mov esi, dword ptr [eax+4]
cmp ecx, esi
- jae SHORT G_M14072_IG07
+ jae SHORT G_M14072_IG09
mov esi, ecx
shl esi, 4
- mov gword ptr [ebp-1CH], eax
- mov edx, dword ptr [eax+esi+8]
- test edx, edx
+ cmp dword ptr [eax+esi+8], -1
jl SHORT G_M14072_IG04
+ mov gword ptr [ebp-1CH], eax
+ mov eax, dword ptr [eax+esi+12]
mov dword ptr [ebp-10H], edi
- mov eax, edx
- cdq
- idiv edx:eax, edi
+ xor edx, edx
+ div edx:eax, edi
mov eax, dword ptr [ebx+4]
cmp edx, eax
- jae SHORT G_M14072_IG07
+ jae SHORT G_M14072_IG09
mov eax, dword ptr [ebx+4*edx+8]
dec eax
mov edi, gword ptr [ebp-1CH]
- mov dword ptr [edi+esi+12], eax
+ mov dword ptr [edi+esi+8], eax
lea eax, [ecx+1]
mov dword ptr [ebx+4*edx+8], eax
- mov gword ptr [ebp-1CH], edi
+ mov eax, edi
mov edi, dword ptr [ebp-10H]
G_M14072_IG04:
inc ecx
- mov esi, dword ptr [ebp-14H]
- cmp ecx, esi
- mov dword ptr [ebp-14H], esi
- jl SHORT G_M14072_IG03
+ mov edx, dword ptr [ebp-14H]
+ cmp ecx, edx
+ mov dword ptr [ebp-14H], edx
+ jl SHORT G_M14072_IG07
G_M14072_IG05:
mov esi, gword ptr [ebp-18H]
lea edx, bword ptr [esi+4]
call CORINFO_HELP_ASSIGN_REF_EBX
lea edx, bword ptr [esi+8]
- mov eax, gword ptr [ebp-1CH]
call CORINFO_HELP_ASSIGN_REF_EAX
G_M14072_IG06:
@@ -1784,10 +1775,18 @@ G_M14072_IG06:
ret 4
G_M14072_IG07:
+ mov gword ptr [ebp-1CH], eax
+ jmp SHORT G_M14072_IG03
+
+G_M14072_IG08:
+ mov eax, gword ptr [ebp-1CH]
+ jmp SHORT G_M14072_IG05
+
+G_M14072_IG09:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 190, prolog size 13 for method Dictionary`2:Resize(int,bool):this
+; Total bytes of code 198, prolog size 13 for method Dictionary`2:Resize(int,bool):this
; ============================================================
; Assembly listing for method Dictionary`2:TrimExcess(int):this
; Emitting BLENDED_CODE for generic X86 CPU - Windows
@@ -1797,22 +1796,25 @@ G_M14072_IG07:
; fully interruptible
; Final local variable assignments
;
-; V00 this [V00,T02] ( 13, 9 ) ref -> [ebp-0x1C] this class-hnd
-; V01 arg1 [V01,T07] ( 4, 4 ) int -> edx
-; V02 loc0 [V02,T11] ( 4, 4.50) int -> [ebp-0x10]
-; V03 loc1 [V03,T01] ( 6, 12.50) ref -> ebx class-hnd
+; V00 this [V00,T03] ( 13, 9 ) ref -> [ebp-0x20] this class-hnd
+; V01 arg1 [V01,T09] ( 4, 4 ) int -> edx
+; V02 loc0 [V02,T14] ( 4, 4.50) int -> [ebp-0x10]
+; V03 loc1 [V03,T01] ( 6, 14.50) ref -> ebx class-hnd
;* V04 loc2 [V04 ] ( 0, 0 ) int -> zero-ref
-; V05 loc3 [V05,T10] ( 3, 5 ) int -> [ebp-0x14]
-; V06 loc4 [V06,T12] ( 3, 4.50) ref -> [ebp-0x20] class-hnd
-; V07 loc5 [V07,T06] ( 4, 6.50) ref -> [ebp-0x24] class-hnd
-; V08 loc6 [V08,T04] ( 6, 9 ) int -> [ebp-0x18]
-; V09 loc7 [V09,T00] ( 7, 22.50) int -> edi
-; V10 loc8 [V10,T03] ( 3, 10 ) int -> ecx
-; V11 loc9 [V11,T08] ( 3, 6 ) byref -> [ebp-0x28]
-; V12 loc10 [V12,T05] ( 4, 8 ) int -> edx
-; V13 tmp0 [V13,T14] ( 3, 2 ) int -> ecx
-; V14 cse0 [V14,T13] ( 2, 4 ) int -> eax "ValNumCSE"
-; V15 cse1 [V15,T09] ( 3, 6 ) int -> esi "ValNumCSE"
+; V05 loc3 [V05,T13] ( 3, 5 ) int -> [ebp-0x14]
+; V06 loc4 [V06,T15] ( 3, 4.50) ref -> [ebp-0x24] class-hnd
+; V07 loc5 [V07,T08] ( 4, 6.50) ref -> [ebp-0x28] class-hnd
+; V08 loc6 [V08,T05] ( 6, 9 ) int -> [ebp-0x18]
+; V09 loc7 [V09,T00] ( 6, 20.50) int -> edi
+; V10 loc8 [V10,T12] ( 2, 6 ) int -> [ebp-0x1C]
+; V11 loc9 [V11,T10] ( 3, 6 ) byref -> ecx
+; V12 loc10 [V12,T07] ( 4, 8 ) int -> edx
+; V13 tmp0 [V13,T17] ( 3, 2 ) int -> ecx
+; V14 cse0 [V14,T04] ( 3, 10 ) byref -> edx "ValNumCSE"
+; V15 cse1 [V15,T02] ( 3, 12 ) int -> ecx "ValNumCSE"
+; V16 cse2 [V16,T06] ( 4, 9 ) int -> ecx "ValNumCSE"
+; V17 cse3 [V17,T16] ( 2, 4 ) int -> esi "ValNumCSE"
+; V18 cse4 [V18,T11] ( 3, 6 ) int -> ecx "ValNumCSE"
;
; Lcl frame size = 28
@@ -1865,63 +1867,61 @@ G_M47871_IG07:
mov edx, edi
call Dictionary`2:Initialize(int):int:this
mov ecx, gword ptr [esi+8]
- mov gword ptr [ebp-20H], ecx
- mov gword ptr [ebp-1CH], esi
+ mov gword ptr [ebp-24H], ecx
+ mov gword ptr [ebp-20H], esi
mov edx, gword ptr [esi+4]
- mov gword ptr [ebp-24H], edx
+ mov gword ptr [ebp-28H], edx
xor eax, eax
xor edi, edi
cmp dword ptr [ebp-14H], 0
jle SHORT G_M47871_IG10
G_M47871_IG08:
- cmp edi, dword ptr [ebx+4]
+ mov ecx, dword ptr [ebx+4]
+ cmp edi, ecx
jae G_M47871_IG13
mov ecx, edi
shl ecx, 4
- mov ecx, dword ptr [ebx+ecx+8]
- test ecx, ecx
+ lea edx, bword ptr [ebx+ecx+8]
+ mov esi, dword ptr [edx+4]
+ mov dword ptr [ebp-1CH], esi
+ cmp dword ptr [ebx+ecx+8], -1
jl SHORT G_M47871_IG09
- mov edx, gword ptr [ebp-20H]
- cmp eax, dword ptr [edx+4]
+ mov ecx, gword ptr [ebp-24H]
+ cmp eax, dword ptr [ecx+4]
jae SHORT G_M47871_IG13
mov dword ptr [ebp-18H], eax
- mov edx, eax
- shl edx, 4
- mov esi, gword ptr [ebp-20H]
- lea edx, bword ptr [esi+edx+8]
- mov esi, edi
- shl esi, 4
- lea esi, bword ptr [ebx+esi+8]
- mov bword ptr [ebp-28H], edx
- movdqu xmm0, qword ptr [esi]
- movdqu qword ptr [edx], xmm0
- mov eax, ecx
- cdq
- idiv edx:eax, dword ptr [ebp-10H]
- mov ecx, gword ptr [ebp-24H]
- mov eax, dword ptr [ecx+4]
- cmp edx, eax
+ mov ecx, eax
+ shl ecx, 4
+ mov esi, gword ptr [ebp-24H]
+ lea ecx, bword ptr [esi+ecx+8]
+ movdqu xmm0, qword ptr [edx]
+ movdqu qword ptr [ecx], xmm0
+ mov eax, dword ptr [ebp-1CH]
+ xor edx, edx
+ div edx:eax, dword ptr [ebp-10H]
+ mov eax, gword ptr [ebp-28H]
+ mov esi, dword ptr [eax+4]
+ cmp edx, esi
jae SHORT G_M47871_IG13
- mov eax, dword ptr [ecx+4*edx+8]
- dec eax
- mov esi, bword ptr [ebp-28H]
- mov dword ptr [esi+4], eax
- mov esi, dword ptr [ebp-18H]
- inc esi
- mov dword ptr [ecx+4*edx+8], esi
- mov gword ptr [ebp-24H], ecx
- mov eax, esi
+ mov esi, dword ptr [eax+4*edx+8]
+ dec esi
+ mov dword ptr [ecx], esi
+ mov ecx, dword ptr [ebp-18H]
+ inc ecx
+ mov dword ptr [eax+4*edx+8], ecx
+ mov gword ptr [ebp-28H], eax
+ mov eax, ecx
G_M47871_IG09:
inc edi
- mov ecx, dword ptr [ebp-14H]
- cmp edi, ecx
- mov dword ptr [ebp-14H], ecx
+ mov esi, dword ptr [ebp-14H]
+ cmp edi, esi
+ mov dword ptr [ebp-14H], esi
jl SHORT G_M47871_IG08
G_M47871_IG10:
- mov esi, gword ptr [ebp-1CH]
+ mov esi, gword ptr [ebp-20H]
mov dword ptr [esi+24], eax
xor ecx, ecx
mov dword ptr [esi+32], ecx
@@ -1943,5 +1943,5 @@ G_M47871_IG13:
call CORINFO_HELP_RNGCHKFAIL
int3
-; Total bytes of code 258, prolog size 11 for method Dictionary`2:TrimExcess(int):this
+; Total bytes of code 255, prolog size 11 for method Dictionary`2:TrimExcess(int):this
; ============================================================
|
@jkotas PTAL |
/azp run |
@RussKeldorph are the jenkins based legs like
|
@danmosemsft They are not (yet) obsolete. Something seems to have gone terribly wrong with the tools restore. I'm assuming due to infra stuff yesterday. Later PRs seem to be ok w.r.t. this job. |
/azp run |
@MarcoRossignoli I haven't looked at the failures, but if you think there are infrastructure issues unique to this PR, you could open a nice fresh clean one and close this instead. |
I think so, will do! |
Replaced by #23591 because of the CI issue |
Contributes to https://github.com/dotnet/corefx/issues/33392
Use uint hashcode to retain more entropy
Before
After
Comparer
The more interesting test is a custom benchmark for entropy: dotnet/performance@master...MarcoRossignoli:newbenchdic. The ctor diff looks like an outlier, since no code changed there.
My thoughts: the mapping function retains more entropy and the code is not very different. The difference is also small when bucket collisions are frequent (my test is not "perfect": I only tested chains of at most 2 items for every inserted item).
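To make the entropy point concrete, here is a minimal sketch of the two mapping styles that the listing above seems to reflect (the `and` with a mask followed by `cdq`/`idiv` on the removed side, `xor edx, edx`/`div` on the added side). The class and method names are hypothetical and are not taken from Dictionary.cs.

```csharp
using System;

// Hypothetical helper, for illustration only; not the actual Dictionary code.
static class SignBitDemo
{
    // Older style (assumed): clear the sign bit, then take a signed remainder.
    // Hash codes that differ only in the top bit collapse to the same value.
    public static int MaskedBucket(int hashCode, int bucketCount)
        => (hashCode & 0x7FFFFFFF) % bucketCount;

    // Newer style (assumed): reinterpret all 32 bits as unsigned,
    // then take an unsigned remainder, so the top bit still contributes.
    public static int UnsignedBucket(int hashCode, int bucketCount)
        => (int)(unchecked((uint)hashCode) % (uint)bucketCount);
}
```

With 7 buckets, the hash codes `1` and `int.MinValue + 1` both map to bucket 1 under `MaskedBucket`, while `UnsignedBucket` maps them to buckets 1 and 3.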
I also ran some tests with different mapping functions, and "and" performs better than "div", as expected (I don't know whether this test has already been done in the past). Maybe in another PR we could try to measure this better inside the actual dictionary, if it makes sense.
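For reference, a small sketch of the two kinds of mapping functions being compared. It assumes that "and" means masking with a power-of-two bucket count and that "div" means taking a remainder; the names are hypothetical.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class MappingFunctions
{
    // "div" flavour: works for any positive bucket count (for example a prime),
    // at the cost of a division instruction.
    public static int ByRemainder(int hashCode, int bucketCount)
        => (int)(unchecked((uint)hashCode) % (uint)bucketCount);

    // "and" flavour: only valid when bucketCount is a power of two.
    // It keeps just the low bits of the hash code.
    public static int ByMask(int hashCode, int bucketCount)
        => hashCode & (bucketCount - 1);
}
```

The mask is cheaper than a division, but because it only looks at the low bits it relies more heavily on the quality of the hash codes, which is one reason to measure it inside the real dictionary before drawing conclusions.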
I also worked on the last point of the list (remove the `_freeCount` field and use `_freeList == -1` as a sentinel instead), and the results are not great. We remove the local var freeCount, but after that we use count as a "real item count" and it is no longer aligned with the entries count. This leads to changing every piece of code that iterates through entries to use a local `count = _count` (decremented after every "valid item" found, next > -1). Another issue is that DictionarySlim doesn't support "versioning", so I also had to change every enumerator to support the current dictionary behaviour (support for remove during enumeration). We also need to change the code for features like Trim() and EnsureCapacity() that the slim one doesn't have. I'll show the results if you want, but to me the change is not worth the complexity.
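As a rough illustration of the sentinel idea, the sketch below keeps a free list inside the `next` field and uses `_freeList == -1` to mean "no free slot". The `-3 - next` encoding is an assumption inferred from the `neg edx` / `add edx, -3` sequence in the listing above; the type and its members are hypothetical and much simpler than the real dictionary.

```csharp
using System;

// Hypothetical, simplified entry pool; not the actual Dictionary code.
sealed class EntryPool
{
    private const int StartOfFreeList = -3; // assumed constant, inferred from the listing

    private readonly int[] _next; // >= -1: live entry (chain link); < -1: free entry (encoded link)
    private int _count;           // high-water mark of slots handed out, not the live item count
    private int _freeList = -1;   // -1 is the "no free slot" sentinel

    public EntryPool(int capacity) => _next = new int[capacity];

    public int Allocate()
    {
        int index;
        if (_freeList != -1)
        {
            index = _freeList;
            // Decode the link to the next free slot (or back to -1).
            _freeList = StartOfFreeList - _next[index];
        }
        else
        {
            if (_count == _next.Length) throw new InvalidOperationException("pool is full");
            index = _count++;
        }
        _next[index] = -1; // live entry, end of chain
        return index;
    }

    public void Free(int index)
    {
        // Encode the previous head so the stored value is always below -1.
        _next[index] = StartOfFreeList - _freeList;
        _freeList = index;
    }

    public bool IsLive(int index) => index < _count && _next[index] >= -1;
}
```

Note that without a separate free count, the number of live entries is no longer `_count` minus a counter; code that walks the entries has to test each slot, which is the kind of ripple effect described above.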
Some preview code(not optimized but to understand the complexity) https://github.com/MarcoRossignoli/marcorossignoli.github.io/blob/dicbackporting/src/DicBackportingBenchmark/Dic/Dic/Dictionary.cs#L107 https://github.com/MarcoRossignoli/marcorossignoli.github.io/blob/dicbackporting/src/DicBackportingBenchmark/Dic/Dic/Dictionary.cs#L1610
Thanks for the chance to explore the dictionary so deeply; it was very interesting and fun!
I hope I was helpful.
/cc @danmosemsft