Update array length compare value numbers on CSE #9615

JosephTremoulet · 2017-02-15T19:29:28Z

Modify CSE to identify compares which are functions of CSE candidate array
length nodes; when those length nodes get CSEd and consequently have their
value numbers updated, also update the value number on the compare
consuming the array length; this way the assertions generated in assertion
prop for the different compares on the CSEd array will use the same value
numbers, and range check elimination becomes more effective.

Resolves #5371

JosephTremoulet · 2017-02-15T19:36:30Z

@briansull @dotnet/jit-contrib PTAL.

This gets the motivating case from #5371.

All diffs are bounds checks removed from loops from which we'd already hoisted the array length, as in this example (System.Dictionary.Clear):

--- ".\\DictionaryClear-before.asm"	2017-02-15 11:21:33.158935800 -0800
+++ ".\\DictionaryClear-after.asm"	2017-02-15 11:22:02.755138400 -0800
@@ -1,54 +1,49 @@
 ; Assembly listing for method Dictionary`2:Clear():this
 ; Emitting BLENDED_CODE for X64 CPU with SSE2
 ; optimized code
 ; rsp based frame
 ; fully interruptible
 ; Final local variable assignments
 ;
 ;  V00 this         [V00,T03] ( 23,  16.5)     ref  ->  rsi         this
-;  V01 loc0         [V01,T01] ( 13,  41.5)     int  ->  rcx        
+;  V01 loc0         [V01,T01] ( 12,  40.5)     int  ->  rcx        
 ;  V02 tmp0         [V02,T00] (  6,  48  )     ref  ->  rax        
 ;  V03 OutArgs      [V03    ] (  1,   1  )  lclBlk (32) [rsp+0x00]  
-;  V04 cse0         [V04,T02] ( 15,  25  )     int  ->  rdx        
+;  V04 cse0         [V04,T02] ( 14,  24  )     int  ->  rdx        
 ;  V05 cse1         [V05,T04] ( 11,  12.5)     ref  ->   r8        
 ;
 ; Lcl frame size = 32
 G_M52189_IG01:
        push     rsi
        sub      rsp, 32
        mov      rsi, rcx
 G_M52189_IG02:
        cmp      dword ptr [rsi+56], 0
        jle      SHORT G_M52189_IG05
        xor      ecx, ecx
        mov      r8, gword ptr [rsi+8]
        mov      edx, dword ptr [r8+8]
        test     edx, edx
        jle      SHORT G_M52189_IG04
 G_M52189_IG03:
        mov      rax, r8
-       cmp      ecx, edx
-       jae      SHORT G_M52189_IG06
        movsxd   r9, ecx
        mov      dword ptr [rax+4*r9+16], -1
        inc      ecx
        cmp      edx, ecx
        jg       SHORT G_M52189_IG03
 G_M52189_IG04:
        mov      rcx, gword ptr [rsi+16]
        mov      r8d, dword ptr [rsi+56]
        xor      edx, edx
        call     Array:Clear(ref,int,int)
        mov      dword ptr [rsi+64], -1
        xor      eax, eax
        mov      dword ptr [rsi+56], eax
        mov      dword ptr [rsi+68], eax
        inc      dword ptr [rsi+60]
 G_M52189_IG05:
        add      rsp, 32
        pop      rsi
        ret      
-G_M52189_IG06:
-       call     CORINFO_HELP_RNGCHKFAIL
-       int3     
-; Total bytes of code 98, prolog size 8 for method Dictionary`2:Clear():this
+; Total bytes of code 88, prolog size 8 for method Dictionary`2:Clear():this

SPMI diffs report 4.06% codesize improvement across 1777 affected methods.

jit-diff results (checked jit against release full diff set):

Summary:
(Note: Lower is better)

Total bytes of diff: -1265 (-0.01 % of base)
    diff is an improvement.

Total byte diff includes 0 bytes from reconciling methods
        Base had    0 unique methods,        0 unique bytes
        Diff had    0 unique methods,        0 unique bytes

Top file improvements by size (bytes):
       -1114 : System.Private.CoreLib.dasm (-0.03 % of base)
         -43 : Microsoft.CSharp.dasm (-0.01 % of base)
         -27 : System.Text.RegularExpressions.dasm (-0.03 % of base)
         -21 : System.Security.Principal.Windows.dasm (-0.06 % of base)
         -16 : System.Security.Cryptography.Cng.dasm (-0.03 % of base)

9 total files with size differences (9 improved, 0 regressed).

Top method improvements by size (bytes):
        -406 : System.Private.CoreLib.dasm - Dictionary`2:OnDeserialization(ref):this (29 methods)
        -290 : System.Private.CoreLib.dasm - Dictionary`2:Clear():this (29 methods)
        -290 : System.Private.CoreLib.dasm - Dictionary`2:Initialize(int):this (29 methods)
         -18 : System.Text.RegularExpressions.dasm - Match:Reset(ref,ref,int,int,int):this
         -15 : System.Private.CoreLib.dasm - GregorianCalendarHelper:GetEra(struct):int:this

31 total methods with size differences (31 improved, 0 regressed).

Throughput results within noise levels. cqNgenTP lab run reports:

Benchmark	delta (positive => improvement)
mscorlib-amd64	0.77%
System.Data.Entity	3.40%
System.Windows.Forms	-0.71%
System.Design	-0.89%
System.Web-amd64	-1.27%
System.ServiceModel	-1.17%
geomean	0.009%

Modify CSE to identify compares which are functions of CSE candidate array length nodes; when those length nodes get CSEd and consequently have their value numbers updated, also update the value number on the compare consuming the array length; this way the assertions generated in assertion prop for the different compares on the CSEd array will use the same value numbers, and range check elimination becomes more effective. Resolves #5371

mikedn · 2017-02-15T21:09:07Z

src/jit/optcse.cpp

+// Arguments:
+//    compare - The compare node to check
+
+void Compiler::updateCseArrLenMap(GenTreePtr compare)


Should this be named optCSEUpdateArrLenMap or something similar?

I named the field cseArrLenMap. I suppose I could rename it to optCseArrLenMap and this method as you suggest, but I don't really understand what the benefit is... as a general rule, are we continuing this prefix naming stuff with new code?

I'm not a fan of such prefixes but considering that Compiler has a zillion members maybe it's better to prefix stuff. The whole thing looks like a ton of C code dumped into a C++ class...

mikedn · 2017-02-15T21:14:23Z

src/jit/optcse.cpp

+    VNFuncApp cmpVNFuncApp;
+
+    if (!vnStore->GetVNFunc(compareVN, &cmpVNFuncApp) ||
+        (cmpVNFuncApp.m_func != GetVNFuncForOper(compare->OperGet(), (compare->gtFlags & GTF_UNSIGNED) != 0)))


(compare->gtFlags & GTF_UNSIGNED) != 0)

Side note: maybe we should add bool gtUnsigned() to GenTree, this stuff is nightmarish. Too bad we don't have unsigned relops.

mikedn · 2017-02-15T21:29:35Z

I'm not familiar enough with JIT VN to comment much on this but it looks like a rather involved change for little benefit. I wonder if it's truly worth it...

briansull · 2017-02-15T21:46:18Z

src/jit/compiler.h

+                                 // re-numbered with the array length to improve range check elimination
+
+    // Given a compare, look for a cse candidate arrlen feeding it and add a map entry if found.
+    void updateCseArrLenMap(GenTreePtr compare);


All of the other CSE related methods defined in compiler.h do start with an opt prefix, so it probably best to continue to do so here.

briansull · 2017-02-16T00:28:00Z

LGTM

This lines up better with other names in code that optimizes common sub- expressions.

JosephTremoulet · 2017-02-16T21:59:06Z

Updated with the suggested renames and factoring out signedness testing into helper IsUnsigned() (looking through gentree.h, it looked to me like IsFoo() was a more common pattern for naming flag test helpers than gtFoo()).

it looks like a rather involved change for little benefit. I wonder if it's truly worth it...

I'm inclined to check it in; the paranoia checks make the code a bit verbose, but that's mostly confined to a helper, and doesn't seem conceptually very complex. Throughput-wise, it fires sparsely enough (statements with comparisons and CSE candidate array lengths) to not seem worrisome, and loops walking through arrays are right at the top of our list of things that are worth spending compile-time to optimize...

mikedn · 2017-02-17T20:07:24Z

and loops walking through arrays are right at the top of our list of things that are worth spending compile-time to optimize...

Of course but sometimes I'm getting the impression that range checks are unfairly targeted. But I've seen cases where the lack of induction variable widening optimizations can be quite costly. Or the lack of proper BB layout in loops. And sometimes range checks can be quite bad but not because of the cost of the actual instructions they generate but because of the impact they have on the surrounding IR/code (e.g. they introduce additional lclvars).

…pareVN Update array length compare value numbers on CSE Commit migrated from dotnet/coreclr@5c7dcce

dnfclas added the cla-already-signed label Feb 15, 2017

JosephTremoulet force-pushed the UpdateCompareVN branch from 10e0f1c to 04e0c4c Compare February 15, 2017 20:46

mikedn reviewed Feb 15, 2017

View reviewed changes

briansull reviewed Feb 15, 2017

View reviewed changes

JosephTremoulet added 2 commits February 16, 2017 11:45

Factor signedness test into helper

98492cf

Rename cseArrLenMap to optCseArrLenMap

1dbbc0f

This lines up better with other names in code that optimizes common sub- expressions.

JosephTremoulet merged commit 5c7dcce into dotnet:master Feb 17, 2017

JosephTremoulet deleted the UpdateCompareVN branch February 17, 2017 19:40

karelz modified the milestone: 2.0.0 Aug 28, 2017

mikedn mentioned this pull request Dec 29, 2017

Stack<T> optimization of (Try)Peek, (Try)Pop and Push dotnet/corefx#26086

Merged

JosephTremoulet mentioned this pull request Jan 31, 2020

JIT optimization: value re-numbering in CSE doesn't help span loops as much as array loops dotnet/runtime#8177

Open

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022

Merge pull request dotnet/coreclr#9615 from JosephTremoulet/UpdateCom…

384babb

…pareVN Update array length compare value numbers on CSE Commit migrated from dotnet/coreclr@5c7dcce

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Update array length compare value numbers on CSE #9615

Update array length compare value numbers on CSE #9615

Uh oh!

JosephTremoulet commented Feb 15, 2017

Uh oh!

JosephTremoulet commented Feb 15, 2017

Uh oh!

mikedn Feb 15, 2017

Uh oh!

JosephTremoulet Feb 15, 2017

Uh oh!

mikedn Feb 15, 2017

Uh oh!

mikedn Feb 15, 2017

Uh oh!

mikedn commented Feb 15, 2017

Uh oh!

briansull Feb 15, 2017

Uh oh!

briansull commented Feb 16, 2017

Uh oh!

JosephTremoulet commented Feb 16, 2017

Uh oh!

mikedn commented Feb 17, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Update array length compare value numbers on CSE #9615

Update array length compare value numbers on CSE #9615

Uh oh!

Conversation

JosephTremoulet commented Feb 15, 2017

Uh oh!

JosephTremoulet commented Feb 15, 2017

Uh oh!

mikedn Feb 15, 2017

Choose a reason for hiding this comment

Uh oh!

JosephTremoulet Feb 15, 2017

Choose a reason for hiding this comment

Uh oh!

mikedn Feb 15, 2017

Choose a reason for hiding this comment

Uh oh!

mikedn Feb 15, 2017

Choose a reason for hiding this comment

Uh oh!

mikedn commented Feb 15, 2017

Uh oh!

briansull Feb 15, 2017

Choose a reason for hiding this comment

Uh oh!

briansull commented Feb 16, 2017

Uh oh!

JosephTremoulet commented Feb 16, 2017

Uh oh!

mikedn commented Feb 17, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants