Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: look into some small CQ issues in ValueTuple.GetHashCode #9165

Open
AndyAyersMS opened this issue Oct 19, 2017 · 3 comments
Open

JIT: look into some small CQ issues in ValueTuple.GetHashCode #9165

AndyAyersMS opened this issue Oct 19, 2017 · 3 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Milestone

Comments

@AndyAyersMS
Copy link
Member

Couple of things to investigate here:

  • why an explicit null check remains after we've dereferenced rcx
  • what causes the shuffling back and forth between esi and edx
;  System.ValueTuple`2[Int32,Int32][System.Int32,System.Int32]:GetHashCode():int:this

G_M51052_IG01:
       57                   push     rdi
       56                   push     rsi
       4883EC28             sub      rsp, 40

G_M51052_IG02:
       8B11                 mov      edx, dword ptr [rcx]
       8BF2                 mov      esi, edx
       3909                 cmp      dword ptr [rcx], ecx   // explicit null check?
       4883C104             add      rcx, 4
       8BD6                 mov      edx, esi               // unnecessary?
       8BF2                 mov      esi, edx               // unnecessary?
       8B39                 mov      edi, dword ptr [rcx]
       48B928307235FD7F0000 mov      rcx, 0x7FFD35723028
       BA50010000           mov      edx, 336
       E8B678805F           call     CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
       8B80F0070000         mov      eax, dword ptr [rax+07F0H]
       8BD0                 mov      edx, eax
       C1C205               rol      edx, 5
       03D0                 add      edx, eax
       8BC2                 mov      eax, edx
       33C6                 xor      eax, esi
       8BD0                 mov      edx, eax
       C1C205               rol      edx, 5
       03D0                 add      edx, eax
       8BC2                 mov      eax, edx
       33C7                 xor      eax, edi

G_M51052_IG03:
       4883C428             add      rsp, 40
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret

test source

using System;
using System.Runtime.CompilerServices;

class P
{
    [MethodImpl(MethodImplOptions.NoInlining)] 
    static int VTHash(int i1, int i2) 
    { 
        ValueTuple<int, int> vt = ValueTuple.Create(i1, i2); 

        return vt.GetHashCode(); 
    } 

    public static void Main()
    {
        VTHash(1,1);
    }
}

category:cq
theme:basic-cq
skill-level:expert
cost:small

@mikedn
Copy link
Contributor

mikedn commented Oct 28, 2017

The cost of the static readonly RandomSeed is likely far higher than those extra moves and the null check. Without it the generated code looks like:

G_M4164_IG01:
G_M4164_IG02:
       8B01                 mov      eax, dword ptr [rcx]
       3909                 cmp      dword ptr [rcx], ecx
       488D5104             lea      rdx, bword ptr [rcx+4]
       8B12                 mov      edx, dword ptr [rdx]
       35EFFB8960           xor      eax, 0x6089FBEF ; "inlined" RandomSeed
       8BC8                 mov      ecx, eax
       C1C105               rol      ecx, 5
       03C8                 add      ecx, eax
       8BC1                 mov      eax, ecx
       33C2                 xor      eax, edx
G_M4164_IG03:
       C3                   ret

I wonder if this warrants turning RandomSeed into an intrinsic.

The extra moves are gone but the nullcheck is still present. There's also the pesky LEA that can't be removed due to the lack of some kind of forward propagation. Otherwise the code looks pretty good.

@mikedn
Copy link
Contributor

mikedn commented Oct 28, 2017

And what do you know - the extra moves are caused by a well known importer problem:

After register allocation:

N017 (  3,  3) [000168] ------------       t168 = *  ADD       byref  REG rcx $2c0
                                                 /--*  t168   byref  
N019 (  5,  5) [000056] DA-XG-------              *  STORE_LCL_VAR byref  V07 tmp4         d:3 rcx REG rcx
N021 (  3,  2) [000044] ------------        t44 =    LCL_VAR   int    V06 tmp3         u:3 rsi (last use) REG rsi <l:$1c0, c:$200>
                                                 /--*  t44    int    
N023 (  7,  5) [000077] DA----------              *  STORE_LCL_VAR int    V09 tmp6         d:3 rdx REG rdx
N025 (  1,  1) [000058] ------------        t58 =    LCL_VAR   byref  V07 tmp4         u:3 rcx (last use) REG rcx $2c0
                                                 /--*  t58    byref  
N027 (  5,  4) [000081] DA----------              *  STORE_LCL_VAR byref  V10 tmp7         d:3 rcx REG rcx
N029 (  3,  2) [000084] ------------        t84 =    LCL_VAR   int    V09 tmp6         u:3 rdx (last use) REG rdx <l:$1c0, c:$200>
                                                 /--*  t84    int    
N031 (  7,  5) [000091] DA----------              *  STORE_LCL_VAR int    V11 tmp8         d:3 rsi REG rsi
N033 (  3,  2) [000085] ------------        t85 =    LCL_VAR   byref  V10 tmp7         u:3 rcx (last use) REG rcx $2c0
                                                 /--*  t85    byref  
N035 (  6,  4) [000112] *--XG-------       t112 = *  IND       int    REG rdi <l:$1c1, c:$204>

Node 77 and 91 are originally created in the importer:

lvaGrabTemps(2) returning 9..10 (long lifetime temps) called for IL Stack Entries
Spilling stack entries into temps
               [000078] ------------              *  STMT      void  (IL   ???...  ???)
               [000044] ------------              |  /--*  LCL_VAR   int    V06 tmp3         
               [000077] -A----------              \--*  ASG       int   
               [000076] D------N----                 \--*  LCL_VAR   int    V09 tmp6         
...

In some cases the RA manages to allocate the same register to both variables involved in such an assignment and then no code is generated. But in some case different registers are allocated and we get those moves. Not sure if RA is to blame for this, seems more like a "where's copy propagation when you need it" problem...

@mikedn
Copy link
Contributor

mikedn commented Oct 28, 2017

The redundant NULLCHECK is generated by morph for a ldflda Item2:

fgMorphTree BB03, stmt 7 (before)
               [000048] ---XG-------              /--*  ADDR      byref 
               [000047] ---XG-------              |  \--*  FIELD     int    Item2
               [000046] ------------              |     \--*  LCL_VAR   byref  V00 this         
               [000056] -A-XG-------              *  ASG       byref 
               [000055] D------N----              \--*  LCL_VAR   byref  V07 tmp4         
Before explicit null check morphing:
               [000047] ---XG--N----              *  FIELD     int    Item2
               [000046] ------------              \--*  LCL_VAR   byref  V00 this         
After adding explicit null check:
               [000047] *--XG--N----              *  IND       int   
               [000167] ------------              |     /--*  CNS_INT   long   4 field offset Fseq[Item2]
               [000168] ------------              |  /--*  ADD       byref 
               [000166] ------------              |  |  \--*  LCL_VAR   byref  V00 this         
               [000169] ---X--------              \--*  COMMA     byref 
               [000165] ---X---N----                 \--*  NULLCHECK byte  
               [000164] ------------                    \--*  LCL_VAR   byref  V00 this         

It's not clear why assertionprop doesn't handle this null check. It refuses to generate non-null assertions for byrefs that cannot be tracked back to an object reference but it's not readily obvious why:
https://github.com/dotnet/coreclr/blob/dff79fceb970f9b93f038edaa0016e4a18007b0d/src/jit/assertionprop.cpp#L933-L974

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@BruceForstall BruceForstall added the JitUntriaged CLR JIT issues needing additional triage label Oct 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions JitUntriaged CLR JIT issues needing additional triage optimization tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

4 participants