JIT: jump threading #46257

Merged
merged 11 commits into from
Jan 13, 2021

AndyAyersMS
Member

Optimize branches where a branch outcome is fully determined by a dominating
branch, and both true and false values can reach the dominated branch.
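
For readers unfamiliar with the transformation, here is a minimal C++ sketch of the idea on a toy flow graph. It is not the JIT's implementation: the `Graph` type, the convention that a branch block lists its true successor first, and the names `reaches`, `classifyPred`, and `threadJumps` are all assumptions made for illustration. It also assumes both branches test the same value, so the dominated branch must go the same way as the edge last taken out of the dominating branch.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Toy flow graph (hypothetical): g[b] lists the successors of block b.
// For a two-way branch block, g[b][0] is the true successor and g[b][1] the false one.
using Graph = std::vector<std::vector<int>>;

// True if block `to` can be reached from block `from` (plain breadth-first search).
bool reaches(const Graph& g, int from, int to) {
    std::vector<bool> seen(g.size(), false);
    std::queue<int> work;
    seen[from] = true;
    work.push(from);
    while (!work.empty()) {
        int b = work.front();
        work.pop();
        if (b == to) return true;
        for (int s : g[b]) {
            if (!seen[s]) {
                seen[s] = true;
                work.push(s);
            }
        }
    }
    return false;
}

enum class PredKind { TrueOnly, FalseOnly, Ambiguous };

// A pred is unambiguous when exactly one successor of the dominating branch reaches it.
PredKind classifyPred(const Graph& g, int dom, int pred) {
    bool viaTrue  = reaches(g, g[dom][0], pred);
    bool viaFalse = reaches(g, g[dom][1], pred);
    if (viaTrue && !viaFalse) return PredKind::TrueOnly;
    if (viaFalse && !viaTrue) return PredKind::FalseOnly;
    return PredKind::Ambiguous;
}

// Retarget each unambiguous pred of `dominated` to the successor the dominated
// branch would pick. Returns the number of edges threaded.
int threadJumps(Graph& g, int dom, int dominated) {
    const Graph original = g;  // classify against the unmodified graph
    int threaded = 0;
    for (std::size_t p = 0; p < g.size(); ++p) {
        if (static_cast<int>(p) == dom) continue;  // direct edges from dom are left alone here
        for (int& s : g[p]) {
            if (s != dominated) continue;
            PredKind kind = classifyPred(original, dom, static_cast<int>(p));
            if (kind == PredKind::TrueOnly) {
                s = original[dominated][0];
                ++threaded;
            } else if (kind == PredKind::FalseOnly) {
                s = original[dominated][1];
                ++threaded;
            }
        }
    }
    return threaded;
}
```

On a diamond such as `{{1, 2}, {3}, {3}, {4, 5}, {6}, {6}, {}}` with block 0 dominating block 3, both preds of block 3 are unambiguous, so block 1 is sent straight to block 4 and block 2 straight to block 5. A pred reachable from both sides stays untouched.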

@Dotnet-GitSync-Bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) on Dec 19, 2020
@AndyAyersMS
Member Author

cc @dotnet/jit-contrib

Note we expect this pattern to come up more frequently once we enable guarded devirtualization via PGO.

Jit diffs via PMI. Most regressions are extra CSEs. A few are places where LSRA does some odd spills.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 50410511
Total bytes of diff: 50391339
Total bytes of delta: -19172 (-0.04% of base)
    diff is an improvement.

Top file regressions (bytes):
         484 : Microsoft.VisualBasic.Core.dasm (0.10% of base)
         132 : System.Collections.Concurrent.dasm (0.04% of base)
          16 : System.Collections.dasm (0.00% of base)
          12 : System.IO.Pipelines.dasm (0.02% of base)
           5 : System.IO.FileSystem.AccessControl.dasm (0.02% of base)
           3 : System.IO.Packaging.dasm (0.00% of base)
           3 : xunit.console.dasm (0.00% of base)
           3 : Microsoft.Extensions.Logging.Abstractions.dasm (0.01% of base)
           2 : System.Net.NameResolution.dasm (0.01% of base)

Top file improvements (bytes):
       -2352 : FSharp.Core.dasm (-0.07% of base)
       -1796 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.03% of base)
       -1615 : System.Linq.Expressions.dasm (-0.21% of base)
       -1512 : System.Data.Common.dasm (-0.10% of base)
       -1417 : System.Private.CoreLib.dasm (-0.03% of base)
       -1047 : System.Private.DataContractSerialization.dasm (-0.14% of base)
        -862 : System.Private.Xml.dasm (-0.02% of base)
        -717 : Microsoft.CodeAnalysis.CSharp.dasm (-0.02% of base)
        -702 : System.Security.Cryptography.Pkcs.dasm (-0.18% of base)
        -662 : System.Collections.Immutable.dasm (-0.06% of base)
        -351 : Newtonsoft.Json.dasm (-0.04% of base)
        -341 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.01% of base)
        -333 : Microsoft.CSharp.dasm (-0.09% of base)
        -315 : System.Threading.Tasks.Dataflow.dasm (-0.04% of base)
        -306 : System.Text.Json.dasm (-0.05% of base)
        -303 : System.Net.Http.dasm (-0.04% of base)
        -292 : System.ComponentModel.TypeConverter.dasm (-0.11% of base)
        -272 : System.Text.Encoding.CodePages.dasm (-0.39% of base)
        -272 : System.Security.Cryptography.Algorithms.dasm (-0.08% of base)
        -260 : System.Data.Odbc.dasm (-0.13% of base)

100 total files with Code Size differences (91 improved, 9 regressed), 169 unchanged.

Top method regressions (bytes):
          63 ( 2.43% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - LocalRewriter:RewriteForEachArrayOrString(BoundForEachStatement,ArrayBuilder`1,ArrayBuilder`1,bool,BoundExpression):this
          63 ( 3.60% of base) : System.Private.CoreLib.dasm - Comparer`1:System.Collections.IComparer.Compare(Object,Object):int:this (7 methods)
          55 ( 1.43% of base) : System.Collections.dasm - SortedList`2:System.Collections.IDictionary.Add(Object,Object):this (7 methods)
          54 ( 2.71% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:System.Collections.IDictionary.set_Item(Object,Object):this (7 methods)
          48 ( 2.35% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:System.Collections.IDictionary.Add(Object,Object):this (7 methods)
          46 ( 8.49% of base) : Microsoft.CodeAnalysis.dasm - DecimalFloatingPointString:FromSource(String):DecimalFloatingPointString
          43 ( 3.25% of base) : FSharp.Core.dasm - HashCompare:GenericEqualityObj(bool,IEqualityComparer,Object,Object):bool
          39 ( 2.41% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToBoolean(Object):bool
          39 ( 2.21% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToByte(Object):ubyte
          39 ( 2.16% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToSByte(Object):byte
          39 ( 2.15% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToShort(Object):short
          39 ( 2.25% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToUShort(Object):ushort
          39 ( 2.38% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToInteger(Object):int
          39 ( 2.36% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToUInteger(Object):int
          39 ( 2.45% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToLong(Object):long
          39 ( 2.38% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToULong(Object):long
          36 ( 2.21% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToSingle(Object,NumberFormatInfo):float
          36 ( 2.24% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToDouble(Object,NumberFormatInfo):double
          35 ( 2.51% of base) : System.Net.Http.dasm - Http3RequestStream:BufferHeaders(HttpRequestMessage):this
          33 ( 2.09% of base) : Microsoft.CodeAnalysis.CSharp.dasm - CSharpSemanticModel:GetTypeInfoForNode(BoundNode,BoundNode,BoundNode):CSharpTypeInfo:this

Top method improvements (bytes):
        -553 (-11.19% of base) : FSharp.Core.dasm - Parallel@1178-1:Invoke(AsyncActivation`1):AsyncReturn:this (7 methods)
        -317 (-25.50% of base) : System.Private.DataContractSerialization.dasm - XmlBaseWriter:WritePrimitiveValue(Object):this
        -317 (-24.90% of base) : System.Private.DataContractSerialization.dasm - XmlJsonWriter:WritePrimitiveValue(Object):this
        -222 (-3.91% of base) : System.Text.Json.dasm - JsonConverter`1:TryRead(byref,Type,JsonSerializerOptions,byref,byref):bool:this (7 methods)
        -216 (-10.41% of base) : System.Collections.Immutable.dasm - ImmutableExtensions:TryCopyTo(IEnumerable`1,ref,int):bool (7 methods)
        -169 (-27.80% of base) : System.Data.Common.dasm - XPathNodePointer:MoveToPreviousSibling():bool:this
        -166 (-4.98% of base) : System.Text.Json.dasm - JsonConverter`1:TryWrite(Utf8JsonWriter,byref,JsonSerializerOptions,byref):bool:this (7 methods)
        -139 (-21.92% of base) : System.Data.Common.dasm - LinqDataView:FindByKey(Object):int:this
        -139 (-19.97% of base) : System.Data.Common.dasm - LinqDataView:FindByKey(ref):int:this
        -126 (-7.17% of base) : System.Collections.Immutable.dasm - Enumerator:ResetStack():this (7 methods)
        -123 (-1.80% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Binder:ReportOverloadResolutionFailureAndProduceBoundNode(VisualBasicSyntaxNode,int,ArrayBuilder`1,ImmutableArray`1,TypeSymbol,ImmutableArray`1,ImmutableArray`1,DiagnosticBag,VisualBasicSyntaxNode,BoundMethodOrPropertyGroup,Symbol,bool,BoundTypeExpression,Symbol,Location):BoundExpression:this
        -114 (-6.70% of base) : System.Security.Cryptography.Xml.dasm - EncryptedXml:GetDecryptionKey(EncryptedData,String):SymmetricAlgorithm:this
        -110 (-21.91% of base) : FSharp.Core.dasm - NullableOperators:op_QmarkEqualsQmark(Nullable`1,Nullable`1):bool (6 methods)
        -108 (-24.88% of base) : System.Private.Xml.dasm - Ucs4Decoder:Convert(ref,int,int,ref,int,int,bool,byref,byref,byref):this
        -103 (-6.41% of base) : System.Private.Xml.dasm - XmlSerializationReader:ParseArrayType(String):SoapArrayInfo:this
        -102 (-1.50% of base) : System.Private.Xml.Linq.dasm - <EvaluateIterator>d__1`1:MoveNext():bool:this (7 methods)
         -91 (-14.44% of base) : CommandLine.dasm - <>c__1`1:<FormatCommandLine>b__1_5(<>f__AnonymousType7`2):bool:this (7 methods)
         -90 (-10.23% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - SyntaxTreeSemanticModel:GetDeclaredSymbol(TypeParameterSyntax,CancellationToken):ITypeParameterSymbol:this
         -84 (-1.46% of base) : System.Threading.Tasks.Dataflow.dasm - BatchBlockTargetCore:ConsumeReservedMessagesGreedyBounded():this (7 methods)
         -84 (-6.46% of base) : Microsoft.Extensions.Caching.Abstractions.dasm - CacheExtensions:TryGetValue(IMemoryCache,Object,byref):bool (7 methods)

Top method regressions (percentages):
           9 ( 9.68% of base) : Microsoft.CSharp.dasm - ComEventsSink:Remove(ComEventsSink,ComEventsSink):ComEventsSink
           9 ( 9.68% of base) : System.Private.CoreLib.dasm - ComEventsSink:Remove(ComEventsSink,ComEventsSink):ComEventsSink
          46 ( 8.49% of base) : Microsoft.CodeAnalysis.dasm - DecimalFloatingPointString:FromSource(String):DecimalFloatingPointString
           7 ( 4.67% of base) : System.CodeDom.dasm - VBCodeGenerator:get_IsCurrentModule():bool:this
          23 ( 3.73% of base) : System.ComponentModel.TypeConverter.dasm - RectangleConverter:CreateInstance(ITypeDescriptorContext,IDictionary):Object:this
          63 ( 3.60% of base) : System.Private.CoreLib.dasm - Comparer`1:System.Collections.IComparer.Compare(Object,Object):int:this (7 methods)
          17 ( 3.48% of base) : System.Private.CoreLib.dasm - AppContextConfigHelper:GetInt32Config(String,int,bool):int
           7 ( 3.26% of base) : System.Private.CoreLib.dasm - Double:CompareTo(Object):int:this
           7 ( 3.26% of base) : System.Private.CoreLib.dasm - Single:CompareTo(Object):int:this
          43 ( 3.25% of base) : FSharp.Core.dasm - HashCompare:GenericEqualityObj(bool,IEqualityComparer,Object,Object):bool
           6 ( 3.17% of base) : System.Security.Cryptography.Cng.dasm - CngKey:SetProperty(CngProperty):this
          12 ( 3.14% of base) : System.ComponentModel.TypeConverter.dasm - PointConverter:CreateInstance(ITypeDescriptorContext,IDictionary):Object:this
          12 ( 3.14% of base) : System.ComponentModel.TypeConverter.dasm - SizeConverter:CreateInstance(ITypeDescriptorContext,IDictionary):Object:this
          19 ( 3.01% of base) : System.Drawing.Common.dasm - MarginsConverter:CreateInstance(ITypeDescriptorContext,IDictionary):Object:this
          11 ( 2.73% of base) : Microsoft.CodeAnalysis.CSharp.dasm - TypeSymbol:IsExplicitlyImplementedViaAccessors(Symbol,TypeSymbol,byref):bool
          54 ( 2.71% of base) : System.Collections.Concurrent.dasm - ConcurrentDictionary`2:System.Collections.IDictionary.set_Item(Object,Object):this (7 methods)
           3 ( 2.65% of base) : Microsoft.Extensions.Logging.Abstractions.dasm - EventId:Equals(Object):bool:this
           6 ( 2.58% of base) : Microsoft.CodeAnalysis.CSharp.dasm - MemberSemanticModel:GetLambdaParameterSymbol(ParameterSyntax,ExpressionSyntax,CancellationToken):ParameterSymbol:this
          35 ( 2.51% of base) : System.Net.Http.dasm - Http3RequestStream:BufferHeaders(HttpRequestMessage):this
          39 ( 2.45% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToLong(Object):long

Top method improvements (percentages):
         -12 (-46.15% of base) : System.IO.Compression.dasm - DeflateManagedStream:PurgeBuffers(bool):this
         -82 (-32.93% of base) : System.Private.CoreLib.dasm - MethodBuilder:Equals(Object):bool:this
         -52 (-32.50% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - LambdaUtilities:IsNonUserCodeQueryLambda(SyntaxNode):bool
         -27 (-31.76% of base) : System.Data.Common.dasm - DataPointer:IsFoliated(XmlNode):bool
         -25 (-31.25% of base) : System.Data.Common.dasm - XmlDataDocument:Foliate(XmlElement):this
         -25 (-31.25% of base) : System.Data.Common.dasm - XmlDataDocument:IsFoliated(XmlElement):bool:this
         -47 (-30.52% of base) : System.IO.MemoryMappedFiles.dasm - MemoryMappedViewStream:Flush():this
         -28 (-30.43% of base) : System.ComponentModel.TypeConverter.dasm - PropertyDescriptorCollection:System.Collections.IDictionary.get_Item(Object):Object:this
         -42 (-30.22% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - NameSyntax:get_Arity():int:this
         -25 (-28.09% of base) : System.ComponentModel.TypeConverter.dasm - PasswordPropertyTextAttribute:Equals(Object):bool:this
        -169 (-27.80% of base) : System.Data.Common.dasm - XPathNodePointer:MoveToPreviousSibling():bool:this
         -46 (-27.22% of base) : System.Private.Xml.dasm - ReflectionXmlSerializationWriter:IsDefaultValue(TypeMapping,Object,Object,bool):bool:this
         -25 (-26.88% of base) : System.Data.Common.dasm - DataColumnPropertyDescriptor:Equals(Object):bool:this
         -25 (-26.88% of base) : System.Data.Common.dasm - DataRelationPropertyDescriptor:Equals(Object):bool:this
         -25 (-26.88% of base) : System.Data.Common.dasm - DataTablePropertyDescriptor:Equals(Object):bool:this
         -59 (-26.70% of base) : System.Data.Common.dasm - ExpressionParser:ScanName(ushort,ushort,String):this
         -20 (-26.67% of base) : System.Data.Common.dasm - XPathNodePointer:IsFoliated(XmlNode):bool:this
        -317 (-25.50% of base) : System.Private.DataContractSerialization.dasm - XmlBaseWriter:WritePrimitiveValue(Object):this
         -27 (-25.23% of base) : System.ComponentModel.TypeConverter.dasm - StringConverter:ConvertFrom(ITypeDescriptorContext,CultureInfo,Object):Object:this
         -20 (-25.00% of base) : System.Data.Common.dasm - DataColumn:ToString():String:this

1772 total methods with Code Size differences (1600 improved, 172 regressed), 257078 unchanged.
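
As a sanity check on the summary figures above, the delta is simply diff minus base, and the percentage is that delta relative to base. The small C++ helper below is only an illustrative sketch (the `SizeDelta` type and `codeSizeDelta` name are hypothetical, not part of jit-diff): for base 50410511 and diff 50391339 it gives -19172 bytes, roughly -0.038%, which the report rounds to -0.04%.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the arithmetic in the summary lines.
struct SizeDelta {
    std::int64_t bytes;    // diff - base; negative means the diff is smaller
    double percentOfBase;  // bytes expressed as a percentage of base
};

SizeDelta codeSizeDelta(std::int64_t baseBytes, std::int64_t diffBytes) {
    std::int64_t bytes = diffBytes - baseBytes;
    double percent = 100.0 * static_cast<double>(bytes) / static_cast<double>(baseBytes);
    return {bytes, percent};
}
```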

@AndyAyersMS
Member Author

Sample diff.

public static int Y(object o)
{
    int result = 0;
    if (o.GetType() == typeof(string)) result++;
    if (o.GetType() == typeof(string)) result++;
    if (o.GetType() == typeof(string)) result++;
    if (o.GetType() == typeof(string)) result++;
    return result;	
}
; BEFORE
;
; Assembly listing for method X:Y(Object):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T02] (  3,  3   )     ref  ->  rcx         class-hnd
;  V01 loc0         [V01,T00] (  9,  5.50)     int  ->  rax        
;# V02 OutArgs      [V02    ] (  1,  1   )  lclBlk ( 0) [rsp+0x00]   "OutgoingArgSpace"
;  V03 cse0         [V03,T01] (  5,  5   )    long  ->  rdx         "CSE - aggressive"
;
; Lcl frame size = 0

G_M10236_IG01:
						;; bbWeight=1    PerfScore 0.00
G_M10236_IG02:
       xor      eax, eax
       mov      rdx, qword ptr [rcx]
       mov      rcx, 0xD1FFAB1E
       cmp      rdx, rcx
       jne      SHORT G_M10236_IG04
						;; bbWeight=1    PerfScore 3.75
G_M10236_IG03:
       mov      eax, 1
						;; bbWeight=0.50 PerfScore 0.12
G_M10236_IG04:
       mov      rcx, 0xD1FFAB1E
       cmp      rdx, rcx
       jne      SHORT G_M10236_IG06
						;; bbWeight=1    PerfScore 1.50
G_M10236_IG05:
       inc      eax
						;; bbWeight=0.50 PerfScore 0.12
G_M10236_IG06:
       mov      rcx, 0xD1FFAB1E
       cmp      rdx, rcx
       jne      SHORT G_M10236_IG08
						;; bbWeight=1    PerfScore 1.50
G_M10236_IG07:
       inc      eax
						;; bbWeight=0.50 PerfScore 0.12
G_M10236_IG08:
       mov      rcx, 0xD1FFAB1E
       cmp      rdx, rcx
       jne      SHORT G_M10236_IG10
						;; bbWeight=1    PerfScore 1.50
G_M10236_IG09:
       inc      eax
						;; bbWeight=0.50 PerfScore 0.12
G_M10236_IG10:
       ret      

; Total bytes of code 77, prolog size 0, PerfScore 17.45, instruction count 19
; AFTER
;
; Assembly listing for method X:Y(Object):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T01] (  3,  3   )     ref  ->  rcx         class-hnd
;  V01 loc0         [V01,T00] (  9,  9   )     int  ->  rax        
;# V02 OutArgs      [V02    ] (  1,  1   )  lclBlk ( 0) [rsp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M10236_IG01:
						;; bbWeight=1    PerfScore 0.00
G_M10236_IG02:
       xor      eax, eax
       mov      rdx, 0xD1FFAB1E
       cmp      qword ptr [rcx], rdx
       jne      SHORT G_M10236_IG03
       mov      eax, 1
       inc      eax
       inc      eax
       inc      eax
						;; bbWeight=1    PerfScore 4.50
G_M10236_IG03:
       ret      

; Total bytes of code 29, prolog size 0, PerfScore 8.40, instruction count 9

Note the lack of downstream opts; forward-sub or similar would help clean this up. But the control flow is simplified.
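
To make the before/after listings easier to read, here is a rough source-level view of the three shapes involved, written in C++ as a sketch. The `isString` parameter is a hypothetical stand-in for `o.GetType() == typeof(string)`, and `afterFolding` shows what a later cleanup could produce; it is not something this change does.

```cpp
// BEFORE: four separate tests of the same condition.
int before(bool isString) {
    int result = 0;
    if (isString) result++;
    if (isString) result++;
    if (isString) result++;
    if (isString) result++;
    return result;
}

// AFTER jump threading: one test, with the increments left as separate statements
// (this matches the mov/inc/inc/inc sequence in the AFTER listing).
int afterThreading(bool isString) {
    int result = 0;
    if (isString) {
        result = 1;
        result++;
        result++;
        result++;
    }
    return result;
}

// What forward substitution or a similar downstream opt could leave behind.
int afterFolding(bool isString) {
    return isString ? 4 : 0;
}
```

All three functions return the same value for either input; only the amount of control flow and redundant arithmetic differs.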

@AndyAyersMS
Member Author

Updated diffs now that we're correctly rejecting cases with both an ambiguous pred and a fall through.

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for x64 default jit

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 50409072
Total bytes of diff: 50392478
Total bytes of delta: -16594 (-0.03% of base)
    diff is an improvement.

Top file regressions (bytes):
         494 : Microsoft.VisualBasic.Core.dasm (0.10% of base)
           3 : xunit.console.dasm (0.00% of base)
           3 : Microsoft.Extensions.Logging.Abstractions.dasm (0.01% of base)
           2 : System.Net.NameResolution.dasm (0.01% of base)

Top file improvements (bytes):
       -1785 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.03% of base)
       -1632 : FSharp.Core.dasm (-0.05% of base)
       -1613 : System.Linq.Expressions.dasm (-0.21% of base)
       -1327 : System.Private.CoreLib.dasm (-0.03% of base)
        -939 : System.Data.Common.dasm (-0.06% of base)
        -704 : System.Security.Cryptography.Pkcs.dasm (-0.18% of base)
        -653 : Microsoft.CodeAnalysis.CSharp.dasm (-0.01% of base)
        -536 : System.Collections.Immutable.dasm (-0.05% of base)
        -515 : System.Private.Xml.dasm (-0.01% of base)
        -442 : System.Text.Json.dasm (-0.08% of base)
        -413 : System.Private.DataContractSerialization.dasm (-0.05% of base)
        -337 : Microsoft.CSharp.dasm (-0.09% of base)
        -323 : System.ComponentModel.TypeConverter.dasm (-0.12% of base)
        -323 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.01% of base)
        -315 : System.Threading.Tasks.Dataflow.dasm (-0.04% of base)
        -312 : Newtonsoft.Json.dasm (-0.04% of base)
        -295 : System.Net.Http.dasm (-0.04% of base)
        -286 : Microsoft.CodeAnalysis.dasm (-0.02% of base)
        -270 : System.Security.Cryptography.Algorithms.dasm (-0.08% of base)
        -252 : System.Text.Encoding.CodePages.dasm (-0.36% of base)

94 total files with Code Size differences (90 improved, 4 regressed), 175 unchanged.

Top method regressions (bytes):
          63 ( 2.43% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - LocalRewriter:RewriteForEachArrayOrString(BoundForEachStatement,ArrayBuilder`1,ArrayBuilder`1,bool,BoundExpression):this
          63 ( 3.60% of base) : System.Private.CoreLib.dasm - Comparer`1:System.Collections.IComparer.Compare(Object,Object):int:this (7 methods)
          39 ( 2.41% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToBoolean(Object):bool
          39 ( 2.21% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToByte(Object):ubyte
          39 ( 2.16% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToSByte(Object):byte
          39 ( 2.15% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToShort(Object):short
          39 ( 2.25% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToUShort(Object):ushort
          39 ( 2.38% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToInteger(Object):int
          39 ( 2.36% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToUInteger(Object):int
          39 ( 2.45% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToLong(Object):long
          39 ( 2.38% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToULong(Object):long
          36 ( 2.21% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToSingle(Object,NumberFormatInfo):float
          36 ( 2.24% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToDouble(Object,NumberFormatInfo):double
          35 ( 2.51% of base) : System.Net.Http.dasm - Http3RequestStream:BufferHeaders(HttpRequestMessage):this
          33 ( 1.70% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToDecimal(Object,NumberFormatInfo):Decimal
          33 ( 2.09% of base) : Microsoft.CodeAnalysis.CSharp.dasm - CSharpSemanticModel:GetTypeInfoForNode(BoundNode,BoundNode,BoundNode):CSharpTypeInfo:this
          31 ( 1.19% of base) : Newtonsoft.Json.dasm - JValue:Compare(int,Object,Object):int
          24 ( 2.09% of base) : Microsoft.VisualBasic.Core.dasm - BooleanType:FromObject(Object):bool
          24 ( 0.52% of base) : System.DirectoryServices.AccountManagement.dasm - ADStoreCtx:GetGroupsMemberOf(Principal):ResultSet:this
          24 ( 1.21% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Parser:ParseProcDeclareStatement(SyntaxList`1,SyntaxList`1):DeclareStatementSyntax:this

Top method improvements (bytes):
        -553 (-11.19% of base) : FSharp.Core.dasm - Parallel@1178-1:Invoke(AsyncActivation`1):AsyncReturn:this (7 methods)
        -222 (-3.91% of base) : System.Text.Json.dasm - JsonConverter`1:TryRead(byref,Type,JsonSerializerOptions,byref,byref):bool:this (7 methods)
        -216 (-10.41% of base) : System.Collections.Immutable.dasm - ImmutableExtensions:TryCopyTo(IEnumerable`1,ref,int):bool (7 methods)
        -166 (-4.98% of base) : System.Text.Json.dasm - JsonConverter`1:TryWrite(Utf8JsonWriter,byref,JsonSerializerOptions,byref):bool:this (7 methods)
        -139 (-21.92% of base) : System.Data.Common.dasm - LinqDataView:FindByKey(Object):int:this
        -139 (-19.97% of base) : System.Data.Common.dasm - LinqDataView:FindByKey(ref):int:this
        -123 (-1.80% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Binder:ReportOverloadResolutionFailureAndProduceBoundNode(VisualBasicSyntaxNode,int,ArrayBuilder`1,ImmutableArray`1,TypeSymbol,ImmutableArray`1,ImmutableArray`1,DiagnosticBag,VisualBasicSyntaxNode,BoundMethodOrPropertyGroup,Symbol,bool,BoundTypeExpression,Symbol,Location):BoundExpression:this
        -114 (-6.70% of base) : System.Security.Cryptography.Xml.dasm - EncryptedXml:GetDecryptionKey(EncryptedData,String):SymmetricAlgorithm:this
        -102 (-1.50% of base) : System.Private.Xml.Linq.dasm - <EvaluateIterator>d__1`1:MoveNext():bool:this (7 methods)
         -91 (-14.44% of base) : CommandLine.dasm - <>c__1`1:<FormatCommandLine>b__1_5(<>f__AnonymousType7`2):bool:this (7 methods)
         -90 (-10.23% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - SyntaxTreeSemanticModel:GetDeclaredSymbol(TypeParameterSyntax,CancellationToken):ITypeParameterSymbol:this
         -84 (-1.46% of base) : System.Threading.Tasks.Dataflow.dasm - BatchBlockTargetCore:ConsumeReservedMessagesGreedyBounded():this (7 methods)
         -84 (-6.46% of base) : Microsoft.Extensions.Caching.Abstractions.dasm - CacheExtensions:TryGetValue(IMemoryCache,Object,byref):bool (7 methods)
         -83 (-17.47% of base) : FSharp.Core.dasm - NullableOperators:op_QmarkEqualsQmark(Nullable`1,Nullable`1):bool (6 methods)
         -83 (-3.98% of base) : System.DirectoryServices.AccountManagement.dasm - PrincipalValueCollection`1:RemoveAt(int):this (7 methods)
         -83 (-1.18% of base) : System.Threading.Tasks.Dataflow.dasm - BatchBlockTargetCore:ConsumeReservedMessagesNonGreedy():this (7 methods)
         -83 (-7.27% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - TraceGC:GetFreeListEfficiency(List`1,TraceGC):FreeListEfficiency
         -82 (-32.93% of base) : System.Private.CoreLib.dasm - MethodBuilder:Equals(Object):bool:this
         -80 (-0.42% of base) : System.Data.Common.dasm - BinaryNode:EvalBinaryOp(int,ExpressionNode,ExpressionNode,DataRow,int,ref):Object:this
         -78 (-11.00% of base) : FSharp.Core.dasm - IntrinsicFunctions:TypeTestGeneric(Object):bool (7 methods)

Top method regressions (percentages):
          63 ( 3.60% of base) : System.Private.CoreLib.dasm - Comparer`1:System.Collections.IComparer.Compare(Object,Object):int:this (7 methods)
          17 ( 3.48% of base) : System.Private.CoreLib.dasm - AppContextConfigHelper:GetInt32Config(String,int,bool):int
           3 ( 2.65% of base) : Microsoft.Extensions.Logging.Abstractions.dasm - EventId:Equals(Object):bool:this
          35 ( 2.51% of base) : System.Net.Http.dasm - Http3RequestStream:BufferHeaders(HttpRequestMessage):this
          39 ( 2.45% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToLong(Object):long
          63 ( 2.43% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - LocalRewriter:RewriteForEachArrayOrString(BoundForEachStatement,ArrayBuilder`1,ArrayBuilder`1,bool,BoundExpression):this
          39 ( 2.41% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToBoolean(Object):bool
          39 ( 2.38% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToULong(Object):long
          39 ( 2.38% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToInteger(Object):int
          39 ( 2.36% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToUInteger(Object):int
           3 ( 2.27% of base) : xunit.console.dasm - Dependency:Equals(Object):bool:this
           3 ( 2.27% of base) : Microsoft.Extensions.DependencyModel.dasm - Dependency:Equals(Object):bool:this
          17 ( 2.25% of base) : Newtonsoft.Json.dasm - BooleanQueryExpression:EqualsWithStringCoercion(JValue,JValue):bool
          39 ( 2.25% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToUShort(Object):ushort
          36 ( 2.24% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToDouble(Object,NumberFormatInfo):double
          39 ( 2.21% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToByte(Object):ubyte
          36 ( 2.21% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToSingle(Object,NumberFormatInfo):float
          18 ( 2.18% of base) : System.Private.CoreLib.dasm - EqualityComparer`1:System.Collections.IEqualityComparer.GetHashCode(Object):int:this (7 methods)
          39 ( 2.16% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToSByte(Object):byte
          39 ( 2.15% of base) : Microsoft.VisualBasic.Core.dasm - Conversions:ToShort(Object):short

Top method improvements (percentages):
         -12 (-46.15% of base) : System.IO.Compression.dasm - DeflateManagedStream:PurgeBuffers(bool):this
         -82 (-32.93% of base) : System.Private.CoreLib.dasm - MethodBuilder:Equals(Object):bool:this
         -52 (-32.50% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - LambdaUtilities:IsNonUserCodeQueryLambda(SyntaxNode):bool
         -25 (-31.25% of base) : System.Data.Common.dasm - XmlDataDocument:Foliate(XmlElement):this
         -25 (-31.25% of base) : System.Data.Common.dasm - XmlDataDocument:IsFoliated(XmlElement):bool:this
         -47 (-30.52% of base) : System.IO.MemoryMappedFiles.dasm - MemoryMappedViewStream:Flush():this
         -28 (-30.43% of base) : System.ComponentModel.TypeConverter.dasm - PropertyDescriptorCollection:System.Collections.IDictionary.get_Item(Object):Object:this
         -42 (-30.22% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - NameSyntax:get_Arity():int:this
         -25 (-28.09% of base) : System.ComponentModel.TypeConverter.dasm - PasswordPropertyTextAttribute:Equals(Object):bool:this
         -46 (-27.22% of base) : System.Private.Xml.dasm - ReflectionXmlSerializationWriter:IsDefaultValue(TypeMapping,Object,Object,bool):bool:this
         -25 (-26.88% of base) : System.Data.Common.dasm - DataColumnPropertyDescriptor:Equals(Object):bool:this
         -25 (-26.88% of base) : System.Data.Common.dasm - DataRelationPropertyDescriptor:Equals(Object):bool:this
         -25 (-26.88% of base) : System.Data.Common.dasm - DataTablePropertyDescriptor:Equals(Object):bool:this
         -27 (-25.23% of base) : System.ComponentModel.TypeConverter.dasm - StringConverter:ConvertFrom(ITypeDescriptorContext,CultureInfo,Object):Object:this
         -20 (-25.00% of base) : System.Data.Common.dasm - DataColumn:ToString():String:this
         -36 (-24.83% of base) : System.Private.DataContractSerialization.dasm - XPathQueryGenerator:ProcessDataContract(DataContract,ExportContext,MemberInfo):DataContract
         -52 (-22.22% of base) : System.Security.Cryptography.Xml.dasm - XmlDecryptionTransform:get_EncryptedXml():EncryptedXml:this
        -139 (-21.92% of base) : System.Data.Common.dasm - LinqDataView:FindByKey(Object):int:this
         -10 (-21.28% of base) : System.Private.CoreLib.dasm - Index:ToString():String:this
         -10 (-20.00% of base) : System.Security.Cryptography.Csp.dasm - SHA1CryptoServiceProvider:Dispose(bool):this

1461 total methods with Code Size differences (1371 improved, 90 regressed), 257390 unchanged.

@EgorBo
Member

EgorBo commented Dec 21, 2020

Nice! Btw, do we have a tracking issue to fold this (from your sample):

       mov      eax, 1
       inc      eax
       inc      eax
       inc      eax

into

       mov      eax, 4

@EgorBo
Member

EgorBo commented Dec 21, 2020

And from my understanding it only works when conditions have the same VN? e.g. the following won't work?

public static int Test(int o)
{
    int result = 0;
    if (o > 100)
    {
        if (o > 10) result++;
    }
    return result;
}

or e.g.:

public static int Test(object o)
{
    int result = 0;
    if (o.GetType() != typeof(string))
    {
        if (o.GetType() == typeof(string)) result++; // is never taken
    }
    return result;
}

Not sure it will increase the jit-diff though.

@AndyAyersMS
Member Author

> do we have a tracking issue to fold this

No, not specifically this. We have #6973 but that would be happening upstream.

There is no general framework in the jit for cross-statement expression optimization. Ideally perhaps SSA should be leveraged to make it look like we have built cross-tree edges and could operate on the factored-out expression opts in morph, but we're quite a ways from anything like that.

> it only works when conditions have the same VN?

Correct.

I have a more general version in the works that can handle some cases where the VN differs.

@AndyAyersMS
Member Author

Small revision to account for all preds, and only note the fall through pred when it's not an ambiguous pred.

Only one new diff from the results above because EH preds are now properly counted as ambiguous preds. I suspect the EH flow checks are a bit too stringent, but will leave as is for now.

@AndyAyersMS
Member Author

@EgorBo preview of the more general version, that tries to handle cases where knowing the value of one relop implies you know the value of another: AndyAyersMS/runtime@AndyAyersMS:JumpThreading...AndyAyersMS:RelopImpliesRelop

Not yet sure if the extra complexity is worth the trouble... also probably not yet entirely bug free.
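
To illustrate what "one relop implies another" means, here is a small C++ sketch. It is not the code in the linked branch; the `Relop`, `Range`, and `impliedValue` names are hypothetical, and it assumes both relops compare the same 32-bit integer against constants. Each relop is turned into the inclusive range of values that satisfy it: if the known-true range sits inside the query's range, the query is true; if the two ranges are disjoint, the query is false; otherwise nothing is known.

```cpp
#include <cstdint>
#include <limits>

enum class Op { LT, LE, GT, GE, EQ };

// Models `x <op> k` for a single 32-bit variable x (hypothetical representation).
struct Relop {
    Op op;
    std::int32_t k;
};

// Inclusive range of x values satisfying a relop; 64-bit bounds avoid overflow at k +/- 1.
struct Range {
    std::int64_t lo;
    std::int64_t hi;
};

Range rangeOf(Relop r) {
    const std::int64_t mn = std::numeric_limits<std::int32_t>::min();
    const std::int64_t mx = std::numeric_limits<std::int32_t>::max();
    const std::int64_t k = r.k;
    switch (r.op) {
        case Op::LT: return {mn, k - 1};
        case Op::LE: return {mn, k};
        case Op::GT: return {k + 1, mx};
        case Op::GE: return {k, mx};
        case Op::EQ: return {k, k};
    }
    return {mn, mx};
}

enum class Implied { True, False, Unknown };

// Given that `knownTrue` holds, what (if anything) do we know about `query`?
Implied impliedValue(Relop knownTrue, Relop query) {
    Range a = rangeOf(knownTrue);
    Range b = rangeOf(query);
    if (a.lo >= b.lo && a.hi <= b.hi) return Implied::True;   // known range lies inside query range
    if (a.hi < b.lo || b.hi < a.lo) return Implied::False;    // ranges are disjoint
    return Implied::Unknown;
}
```

With this sketch, knowing `o > 100` answers `o > 10` as true, which is the first case raised earlier in the thread; knowing only `o > 10` says nothing about `o > 100`.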

@EgorBo
Member

EgorBo commented Dec 23, 2020

> @EgorBo preview of the more general version, that tries to handle cases where knowing the value of one relop implies you know the value of another: AndyAyersMS/runtime@AndyAyersMS:JumpThreading...AndyAyersMS:RelopImpliesRelop
>
> Not yet sure if the extra complexity is worth the trouble... also probably not yet entirely bug free.

Nice! Will try to play with it locally 🙂

Contributor

@sandreenko left a comment

A few questions/comments.

src/coreclr/jit/redundantbranchopts.cpp
// However, be conservative if block is in a try as we might not
// have a full picture of EH flow.
//
if (!block->hasTryIndex())
Contributor

question: isn't it enough to check that all blocks have the same try index?

Member Author

Good question.

I need to take a deeper look at how the jit models EH flow. It seems like some aspects are incomplete or approximate and so it might be dangerous to make assumptions when fgReachable returns false. I think we are safeguarded here and in the previous work I did because dominance information in the flow graph is also approximate in a way. But I am not 100% sure.

For example, in a case like:

    public static int Main(string[] args)
    {
        int x = args.Length;
        int y = 0;

        try 
        {
            if (x == 0)
            {
                throw new Exception();
            }
        }
        catch (Exception e)
        {
        }

        if (x == 0)
        {
            y = 100;
        }

        return y;
    }

We possibly could see that the second (x == 0) is dominated by the first (x == 0), and if we did, fgReachable might only see the false path between the two, because the true path involves a throw/catch. If this happened we might mistakenly believe the second compare is always false. However, when we build dominators the second compare is not dominated by the first, so we never consider jump threading at all.

After computing reachability sets:
------------------------------------------------
BBnum  Reachable by 
------------------------------------------------
BB01 : BB01 
BB02 : BB01 BB02 
BB03 : BB01 BB02 BB03 
BB04 : BB01 BB02 BB04 BB07 
BB05 : BB01 BB02 BB04 BB05 BB07 
BB06 : BB01 BB02 BB04 BB05 BB06 BB07 
BB07 : BB07 

After computing reachability:

------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]      
------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [000..006)                                   
BB02 [0001]  1  0    BB01                  1       [006..009)-> BB04 ( cond ) T0      try {     
BB03 [0002]  1  0    BB02                  0       [009..00F)        (throw ) T0      }         
BB04 [0005]  2       BB02,BB07             1       [014..017)-> BB06 ( cond )                   
BB05 [0006]  1       BB04                  1       [017..01A)                                   
BB06 [0007]  2       BB04,BB05             1       [01A..01C)        (return)                   
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++        
BB07 [0004]  1     0                       1       [011..014)-> BB04 ( cret )    H0 F catch { } 
------------------------------------------------------------------------------------------------

*************** In fgComputeDoms

Dominator computation start blocks (those blocks with no incoming edges):
BB01 BB07 
------------------------------------------------
BBnum  Dominated by
------------------------------------------------
BB07:  BB07 
BB01:  BB01 
BB02:  BB02 BB01 
BB03:  BB03 BB02 BB01 
BB04:  BB04 
BB05:  BB05 BB04 
BB06:  BB06 BB04 

Inside fgBuildDomTree

After computing the Dominance Tree:
BB01 : BB02 
BB02 : BB03 
BB04 : BB06 BB05 

Note in the above that BB04 is not dominated by BB02, because its pred BB07 is an "eh entry". But what's odd is that neither BB04 nor BB07 is dominated by BB01, yet reachability says they are reachable from BB01.

So mixing information from fgComputeDoms and fgReachable may be risky.

I can't come up with an example with the current opts where we'd make a bad assumption, but if/when we extend things to handle "related" compares then perhaps we'll run into trouble.

What's also interesting is that SSA has its own dominance computation, and does it differently and seemingly a bit more accurately; here BB01 properly dominates BB04 and BB07. Seems like we ought not to have two different notions of dominance in the jit.

*************** In SsaBuilder::Build()
[SsaBuilder] Max block count is 8.

------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    lp [IL range]     [jump]      [EH region]      
------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       [000..006)                                   
BB02 [0001]  1  0    BB01                  1       [006..009)-> BB04 ( cond ) T0      try {     
BB03 [0002]  1  0    BB02                  0       [009..00F)        (throw ) T0      }         
BB04 [0005]  2       BB02,BB07             1       [014..017)-> BB06 ( cond )                   
BB05 [0006]  1       BB04                  1       [017..01A)                                   
BB06 [0007]  2       BB04,BB05             1       [01A..01C)        (return)                   
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
BB07 [0004]  1     0                       0       [011..014)-> BB04 ( cret )    H0 F catch { } 
------------------------------------------------------------------------------------------------

***************  Exception Handling table
index  eTry, eHnd
  0  ::            - Try at BB02..BB03 [006..011), Handler at BB07..BB07 [011..014)
[SsaBuilder] Topologically sorted the graph.
[SsaBuilder::ComputeImmediateDom]

Inside fgBuildDomTree

After computing the Dominance Tree:
BB01 : BB07 BB04 BB02 
BB02 : BB03 
BB04 : BB06 BB05 
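
The difference between the two dominance results can be reproduced with a toy model. What follows is a minimal sketch, not how `fgComputeDoms` or `SsaBuilder` are implemented: it runs a textbook iterative dominator-set computation over the pred lists from the block table, once treating BB07 as a second entry with no preds, and once with hypothetical EH edges into the handler from BB01, BB02, and BB03. That edge choice is an assumption made only so the toy matches the SSA dominator tree shown above.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kNumBlocks = 8; // index 0 unused so indices match BB numbers
using BlockSet  = std::bitset<kNumBlocks>;
using PredLists = std::vector<std::vector<int>>;

// Iterative dominator sets:
//   Dom(entry) = {entry}
//   Dom(b)     = {b} union (intersection of Dom(p) over all preds p of b)
// Any block with no preds is treated as an entry.
static std::vector<BlockSet> ComputeDominators(const PredLists& preds)
{
    assert(preds.size() == kNumBlocks);

    BlockSet all;
    for (std::size_t b = 1; b < kNumBlocks; b++) all.set(b);

    std::vector<BlockSet> dom(kNumBlocks);
    for (std::size_t b = 1; b < kNumBlocks; b++)
    {
        if (preds[b].empty()) dom[b].set(b); // entry block
        else dom[b] = all;                   // optimistic initial value
    }

    bool changed = true;
    while (changed)
    {
        changed = false;
        for (std::size_t b = 1; b < kNumBlocks; b++)
        {
            if (preds[b].empty()) continue;
            BlockSet updated = all;
            for (int p : preds[b]) updated &= dom[p];
            updated.set(b);
            if (updated != dom[b])
            {
                dom[b]  = updated;
                changed = true;
            }
        }
    }
    return dom;
}

// Pred lists as printed in the block table: BB07 (the catch) has no preds.
static PredLists PredsWithoutEH()
{
    return {{}, {}, {1}, {2}, {2, 7}, {4}, {4, 5}, {}};
}

// Assumption for illustration: model EH flow with edges into the handler
// from BB01 (the pred of the try entry) and from try blocks BB02 and BB03.
static PredLists PredsWithEH()
{
    PredLists preds = PredsWithoutEH();
    preds[7]        = {1, 2, 3};
    return preds;
}
```

Without the EH edges, the dominator set of BB04 contains only BB04 itself, as in the `fgComputeDoms` dump. With them, BB01 dominates both BB04 and BB07 while BB02 still does not dominate BB04, which lines up with the SSA tree.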

I'm not going to have time to unravel this before the holidays, so will come back to it in the new year.

Member Author

Am going to add a bespoke reachability computation (optReachable) for this.

src/coreclr/jit/redundantbranchopts.cpp
BasicBlock* const predBlock = pred->flBlock;

const bool isTruePred =
((predBlock == domBlock) && (trueSuccessor == block)) || fgReachable(trueSuccessor, predBlock);
Contributor

Doesn't fgReachable have linear complexity in the worst case?

Member Author

It's pretty cheap, basically a bit vector check.
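
A minimal sketch of why such a query can be cheap, assuming reachability sets are computed once up front and stored as bit vectors. The `ReachabilitySets` type and its members are hypothetical names for this example, not the jit's data structures. Building the sets is an iterative fixpoint over the pred lists; after that, each query is a single bit test.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxBlocks = 8;
using BlockSet = std::bitset<kMaxBlocks>;

struct ReachabilitySets
{
    // reachableBy[b] has bit a set when block a can reach block b.
    std::vector<BlockSet> reachableBy;

    explicit ReachabilitySets(const std::vector<std::vector<int>>& preds) : reachableBy(preds.size())
    {
        assert(preds.size() <= kMaxBlocks);

        // Every block reaches itself.
        for (std::size_t b = 0; b < preds.size(); b++) reachableBy[b].set(b);

        // Propagate along pred edges until nothing changes.
        bool changed = true;
        while (changed)
        {
            changed = false;
            for (std::size_t b = 0; b < preds.size(); b++)
            {
                BlockSet updated = reachableBy[b];
                for (int p : preds[b]) updated |= reachableBy[p];
                if (updated != reachableBy[b])
                {
                    reachableBy[b] = updated;
                    changed        = true;
                }
            }
        }
    }

    // Constant-time query: one bit test.
    bool Reachable(int from, int to) const
    {
        return reachableBy[to].test(from);
    }
};
```

Fed the pred lists from the block table earlier in this thread (index 0 unused, BB07 with no preds), the set for BB04 comes out as BB01, BB02, BB04, BB07, matching the "Reachable by" dump.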

// Since flow is going to bypass block, make sure there
// is nothing in block that can cause a side effect.
//
// Note we neglect PHI assignments. This reflects a general lack of
Contributor

question: What does it cause? Is it a correctness issue or CQ?

Member Author

Good question. I believe it's just CQ, causing us to miss opportunities downstream (say range prop, which chases back through PHIs, will see more viable paths than actually exist). But hard to be 100% confident.

We really should be more explicit about the ways the optimizer relies on stale analysis data and potentially stale IR, and what (if any) update strategies we employ.

BasicBlock* const predBlock = pred->flBlock;
numPreds++;

// We don't do switch updates, yet.
Contributor

question: why can't we leave the switches flowing into the original blocks as we do with other AmbiguousPreds?

Member Author

I should just add handling for switches; it's not that hard, just a different kind of update.

Contributor

@sandreenko sandreenko left a comment

LGTM (after JITDUMP fixes).

@AndyAyersMS AndyAyersMS mentioned this pull request Jan 7, 2021
54 tasks
@AndyAyersMS
Member Author

With the new reachability computation, the diffs are similar to what I posted above (note I've taken a few merges, so the input assemblies have changed):

;;; old
1461 total methods with Code Size differences (1371 improved, 90 regressed), 257390 unchanged.
;;; new
1482 total methods with Code Size differences (1391 improved, 91 regressed), 260022 unchanged.

@AndyAyersMS
Member Author

Still want to make a few more changes: handle switches, and perhaps assert that if x dominates y, at least one of x's successors must reach y.

@AndyAyersMS
Member Author

Took a long look at asserting that there now must be a path from dominator to dominated block, but that doesn't hold up as we optimize, as we can create unreachable blocks and subgraphs. So just added some comments, and we'll just tolerate those rare cases where we can't find paths.

@AndyAyersMS AndyAyersMS merged commit a568dcc into dotnet:master Jan 13, 2021
@AndyAyersMS AndyAyersMS deleted the JumpThreading branch January 13, 2021 01:28
@ghost ghost locked as resolved and limited conversation to collaborators Feb 12, 2021