Implement forward substitution utility to create larger trees for morph #6973

Closed
russellhadley opened this issue Nov 14, 2016 · 7 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue

@russellhadley
Contributor

russellhadley commented Nov 14, 2016

Extend RyuJIT to use forward substitution to expand the impact of morph optimizations.

category:cq
theme:optimization
skill-level:expert
cost:extra-large
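
To illustrate the request, here is a minimal sketch of why forward substitution can widen the reach of a tree-local optimization. It assumes a toy tuple-based expression IR and a single hypothetical folding rule; none of the names below come from RyuJIT, and the real morph phase works on a very different representation.

```python
# Toy expression trees: ("const", n), ("local", name), or (op, lhs, rhs).
# All names here are hypothetical; this is not RyuJIT's IR.

def fold(tree):
    """Tree-local simplification: rewrites (x * c1) * c2 into x * (c1 * c2)."""
    if tree[0] in ("const", "local"):
        return tree
    op, lhs, rhs = tree[0], fold(tree[1]), fold(tree[2])
    if (op == "mul" and rhs[0] == "const"
            and lhs[0] == "mul" and lhs[2][0] == "const"):
        return ("mul", lhs[1], ("const", lhs[2][1] * rhs[1]))
    return (op, lhs, rhs)

def substitute(tree, name, value):
    """Replace every use of local `name` in `tree` with the tree `value`."""
    if tree[0] == "local":
        return value if tree[1] == name else tree
    if tree[0] == "const":
        return tree
    return (tree[0],
            substitute(tree[1], name, value),
            substitute(tree[2], name, value))

# Source-level shape: t = x * 2; y = t * 4
def_t = ("mul", ("local", "x"), ("const", 2))
use_t = ("mul", ("local", "t"), ("const", 4))

# Without substitution the fold only sees `t * 4` and cannot combine constants.
separate = fold(use_t)

# With the definition forwarded into its use, the fold sees (x * 2) * 4.
combined = fold(substitute(use_t, "t", def_t))

print(separate)  # ('mul', ('local', 't'), ('const', 4))
print(combined)  # ('mul', ('local', 'x'), ('const', 8))
```

The folding rule itself is unchanged between the two calls; only the size of the tree it is given differs.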

@russellhadley
Contributor Author

@erozenfeld for context.

@AndyAyersMS
Member

Also see #4655

@fiigii
Contributor

fiigii commented Apr 25, 2018

@AndyAyersMS Do you have any plans to add forward substitution? That would be very helpful for optimizing slow-LEA cases on x86/x64 architectures.

@AndyAyersMS
Member

We are starting to form up plans for 2.2. I am trying to get a handle on all the issues already open, and as part of this have been trying to cross-link similar things.

If you have examples in mind where we should generate different code, please make sure they are added to the relevant issues.

@EgorBo
Member

EgorBo commented Mar 27, 2021

A very simple implementation: EgorBo@1b57e2d

jit-diff:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for  default jit

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 52988913
Total bytes of diff: 52980100
Total bytes of delta: -8813 (-0.02% of base)
    diff is an improvement.


Top file regressions (bytes):
        1849 : System.Memory.dasm (0.68% of base)
          33 : System.Diagnostics.FileVersionInfo.dasm (0.86% of base)
          15 : System.Net.HttpListener.dasm (0.01% of base)
          10 : System.IO.Compression.dasm (0.01% of base)
           8 : System.Runtime.Numerics.dasm (0.01% of base)
           6 : System.Console.dasm (0.01% of base)
           3 : System.Threading.Tasks.Parallel.dasm (0.00% of base)

Top file improvements (bytes):
       -4388 : System.Management.dasm (-1.15% of base)
       -1196 : System.Collections.Immutable.dasm (-0.09% of base)
       -1186 : System.Private.CoreLib.dasm (-0.03% of base)
        -381 : Microsoft.VisualBasic.Core.dasm (-0.08% of base)
        -305 : System.Threading.Tasks.Dataflow.dasm (-0.03% of base)
        -278 : System.Collections.dasm (-0.05% of base)
        -244 : System.Collections.Concurrent.dasm (-0.06% of base)
        -238 : System.Linq.dasm (-0.02% of base)
        -184 : System.Threading.Channels.dasm (-0.10% of base)
        -169 : Microsoft.CodeAnalysis.CSharp.dasm (-0.00% of base)
        -158 : Newtonsoft.Json.dasm (-0.02% of base)
        -146 : System.Linq.Parallel.dasm (-0.01% of base)
        -145 : System.Linq.Expressions.dasm (-0.02% of base)
        -107 : System.DirectoryServices.dasm (-0.02% of base)
        -107 : Microsoft.CodeAnalysis.dasm (-0.01% of base)
        -103 : System.Data.Common.dasm (-0.01% of base)
        -101 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.00% of base)
         -91 : System.Private.Xml.dasm (-0.00% of base)
         -89 : System.Net.Http.dasm (-0.01% of base)
         -85 : Microsoft.Diagnostics.FastSerialization.dasm (-0.08% of base)

67 total files with Code Size differences (60 improved, 7 regressed), 204 unchanged.

Top method regressions (bytes):
         317 ( 5.46% of base) : System.Memory.dasm - ReadOnlySequence`1:Slice(long,SequencePosition):ReadOnlySequence`1:this (8 methods)
         313 ( 5.60% of base) : System.Memory.dasm - ReadOnlySequence`1:Slice(long,long):ReadOnlySequence`1:this (8 methods)
         265 ( 4.66% of base) : System.Memory.dasm - ReadOnlySequence`1:Slice(SequencePosition,long):ReadOnlySequence`1:this (8 methods)
         184 ( 9.17% of base) : System.Memory.dasm - ReadOnlySequence`1:BoundsCheck(byref,bool):this (8 methods)
         184 (13.99% of base) : System.Memory.dasm - ReadOnlySequence`1:TryGetArray(byref):bool:this (8 methods)
         181 ( 9.49% of base) : System.Memory.dasm - ReadOnlySequence`1:Seek(long,int):SequencePosition:this (8 methods)
         163 ( 5.67% of base) : System.Private.Xml.dasm - XmlSerializationReaderILGen:WriteMemberBegin(ref):this
         120 ( 9.66% of base) : System.Memory.dasm - ReadOnlySequence`1:GetLength():long:this (8 methods)
         103 (41.70% of base) : System.Private.CoreLib.dasm - Vector:ConvertToDouble(Vector`1):Vector`1 (2 methods)
          96 ( 2.23% of base) : System.Memory.dasm - ReadOnlySequence`1:TryGetBuffer(byref,byref,byref):bool:this (8 methods)
          96 ( 3.85% of base) : System.Memory.dasm - ReadOnlySequence`1:BoundsCheck(int,Object,int,Object):this (8 methods)
          93 ( 5.02% of base) : System.Memory.dasm - ReadOnlySequence`1:Seek(byref,long):SequencePosition:this (8 methods)
          41 ( 3.89% of base) : Microsoft.CodeAnalysis.CSharp.dasm - CodeGenerator:EmitBinaryCondOperator(BoundBinaryOperator,bool):this
          40 ( 1.63% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - LocalRewriter:RewriteLiftedBooleanBinaryOperator(BoundBinaryOperator,BoundExpression,BoundExpression,bool,bool,bool,bool):BoundExpression:this
          33 ( 1.77% of base) : System.Diagnostics.FileVersionInfo.dasm - FileVersionInfo:GetVersionInfoForCodePage(long,String):bool:this
          31 ( 0.77% of base) : System.Management.dasm - ManagementClassGenerator:GenerateInitializeObject():this
          29 ( 2.98% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LocalRewriter:MakeBuiltInIncrementOperator(BoundIncrementOperator,BoundExpression):BoundExpression:this
          26 ( 2.77% of base) : System.Speech.dasm - RecognizedPhrase:get_Words():ReadOnlyCollection`1:this
          24 ( 0.44% of base) : System.Management.dasm - ManagementClassGenerator:GenerateCollectionClass():this
          21 (11.41% of base) : Microsoft.CodeAnalysis.dasm - BitVector:set_Item(int,bool):this

Top method improvements (bytes):
       -2556 (-15.94% of base) : System.Management.dasm - ManagementClassGenerator:GenerateTypeConverterClass():CodeTypeDeclaration:this
        -539 (-4.84% of base) : System.Management.dasm - ManagementClassGenerator:AddToDMTFTimeIntervalFunction():this
        -388 (-3.93% of base) : System.Management.dasm - ManagementClassGenerator:AddToTimeSpanFunction():this
        -342 (-11.67% of base) : System.Private.CoreLib.dasm - Vector64`1:ToString():String:this (6 methods)
        -259 (-18.99% of base) : System.Private.CoreLib.dasm - Vector:Widen(Vector`1,byref,byref) (7 methods)
        -222 (-1.44% of base) : System.Management.dasm - ManagementClassGenerator:AddToDateTimeFunction():this
        -187 (-3.65% of base) : System.Collections.Immutable.dasm - ImmutableHashSet`1:IsProperSupersetOf(IEnumerable`1,MutationInput):bool (8 methods)
        -167 (-5.81% of base) : System.Management.dasm - ManagementClassGenerator:AddPropertySet(CodeIndexerExpression,bool,CodeStatementCollection,String,CodeVariableReferenceExpression):this
        -152 (-6.22% of base) : System.Management.dasm - ManagementClassGenerator:AddGetStatementsForEnumArray(CodeIndexerExpression,CodeMemberProperty):this
        -132 (-4.80% of base) : Microsoft.VisualBasic.Core.dasm - StringType:StrLikeText(String,String):bool
        -132 (-2.55% of base) : System.Management.dasm - ManagementClassGenerator:GenerateGetInstancesWithScope():this
        -132 (-2.97% of base) : System.Collections.Immutable.dasm - ImmutableDictionary`2:ContainsValue(Nullable`1):bool:this (8 methods)
        -132 (-3.25% of base) : System.Collections.Immutable.dasm - Builder:ContainsValue(Nullable`1):bool:this (16 methods)
        -127 (-4.20% of base) : System.Collections.Immutable.dasm - Node:TrueForAll(Predicate`1):bool:this (8 methods)
        -123 (-2.28% of base) : System.Collections.dasm - SortedSet`1:Overlaps(IEnumerable`1):bool:this (8 methods)
        -121 (-2.78% of base) : System.Collections.Immutable.dasm - ImmutableSortedSet`1:IsProperSupersetOf(IEnumerable`1):bool:this (8 methods)
        -121 (-3.08% of base) : System.Collections.Immutable.dasm - ImmutableSortedSet`1:IsSupersetOf(IEnumerable`1):bool:this (8 methods)
        -119 (-3.95% of base) : System.Collections.Immutable.dasm - Node:Exists(Predicate`1):bool:this (8 methods)
        -114 (-4.03% of base) : System.Collections.Immutable.dasm - Node:ContainsValue(Nullable`1,IEqualityComparer`1):bool:this (8 methods)
        -113 (-2.76% of base) : System.Collections.Immutable.dasm - ImmutableSortedSet`1:Overlaps(IEnumerable`1):bool:this (8 methods)

Top method regressions (percentages):
         103 (41.70% of base) : System.Private.CoreLib.dasm - Vector:ConvertToDouble(Vector`1):Vector`1 (2 methods)
          20 (14.71% of base) : System.Private.CoreLib.dasm - Vector:ConvertToUInt64(Vector`1):Vector`1
         184 (13.99% of base) : System.Memory.dasm - ReadOnlySequence`1:TryGetArray(byref):bool:this (8 methods)
          16 (13.68% of base) : System.Private.CoreLib.dasm - Vector:ConvertToInt64(Vector`1):Vector`1
          21 (11.41% of base) : Microsoft.CodeAnalysis.dasm - BitVector:set_Item(int,bool):this
         120 ( 9.66% of base) : System.Memory.dasm - ReadOnlySequence`1:GetLength():long:this (8 methods)
         181 ( 9.49% of base) : System.Memory.dasm - ReadOnlySequence`1:Seek(long,int):SequencePosition:this (8 methods)
         184 ( 9.17% of base) : System.Memory.dasm - ReadOnlySequence`1:BoundsCheck(byref,bool):this (8 methods)
           3 ( 6.67% of base) : System.Threading.Tasks.Parallel.dasm - TaskReplicator:GenerateCooperativeMultitaskingTaskTimeout():int
         163 ( 5.67% of base) : System.Private.Xml.dasm - XmlSerializationReaderILGen:WriteMemberBegin(ref):this
         313 ( 5.60% of base) : System.Memory.dasm - ReadOnlySequence`1:Slice(long,long):ReadOnlySequence`1:this (8 methods)
         317 ( 5.46% of base) : System.Memory.dasm - ReadOnlySequence`1:Slice(long,SequencePosition):ReadOnlySequence`1:this (8 methods)
          93 ( 5.02% of base) : System.Memory.dasm - ReadOnlySequence`1:Seek(byref,long):SequencePosition:this (8 methods)
         265 ( 4.66% of base) : System.Memory.dasm - ReadOnlySequence`1:Slice(SequencePosition,long):ReadOnlySequence`1:this (8 methods)
          10 ( 4.08% of base) : System.Private.CoreLib.dasm - Half:op_Explicit(float):Half
          10 ( 3.98% of base) : System.Private.CoreLib.dasm - Half:op_Explicit(double):Half
          41 ( 3.89% of base) : Microsoft.CodeAnalysis.CSharp.dasm - CodeGenerator:EmitBinaryCondOperator(BoundBinaryOperator,bool):this
          96 ( 3.85% of base) : System.Memory.dasm - ReadOnlySequence`1:BoundsCheck(int,Object,int,Object):this (8 methods)
          16 ( 3.38% of base) : System.Private.CoreLib.dasm - Utf8Formatter:TryFormatDecimalF(byref,Span`1,byref,ubyte):bool
          18 ( 3.18% of base) : Microsoft.CodeAnalysis.CSharp.dasm - CSharpSemanticModel:GetSymbolsAndResultKind(BoundBinaryOperator,byref,byref,byref)

Top method improvements (percentages):
        -259 (-18.99% of base) : System.Private.CoreLib.dasm - Vector:Widen(Vector`1,byref,byref) (7 methods)
         -15 (-17.65% of base) : Microsoft.CodeAnalysis.dasm - BitVector:get_Item(int):bool:this
         -40 (-16.00% of base) : System.Private.CoreLib.dasm - Span`1:Fill(double):this
       -2556 (-15.94% of base) : System.Management.dasm - ManagementClassGenerator:GenerateTypeConverterClass():CodeTypeDeclaration:this
         -34 (-15.74% of base) : System.Private.CoreLib.dasm - Span`1:Fill(int):this
         -34 (-15.74% of base) : System.Private.CoreLib.dasm - Span`1:Fill(long):this
         -34 (-15.45% of base) : System.Private.CoreLib.dasm - Span`1:Fill(short):this
         -85 (-14.66% of base) : Microsoft.Diagnostics.FastSerialization.dasm - Deserializer:.ctor(IStreamReader,String):this
         -42 (-14.33% of base) : System.Private.CoreLib.dasm - Span`1:Fill(Nullable`1):this
          -4 (-12.12% of base) : ILCompiler.Reflection.ReadyToRun.dasm - NibbleReader:ReadInt():int:this
        -342 (-11.67% of base) : System.Private.CoreLib.dasm - Vector64`1:ToString():String:this (6 methods)
        -100 (-11.15% of base) : Microsoft.VisualBasic.Core.dasm - Interaction:Partition(long,long,long,long):String
         -20 (-8.89% of base) : Microsoft.VisualBasic.Core.dasm - VB6File:GetDecimal(long):Decimal:this
         -23 (-8.78% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - FileVersionInfo:.ctor(long,int):this
         -25 (-7.94% of base) : Microsoft.VisualBasic.Core.dasm - VB6File:PutDecimal(long,Decimal,bool):this
         -11 (-7.14% of base) : System.Reflection.Metadata.dasm - PortablePdbBuilder:.ctor(MetadataBuilder,ImmutableArray`1,MethodDefinitionHandle,Func`2):this
         -35 (-6.63% of base) : Microsoft.CodeAnalysis.CSharp.dasm - LanguageParser:IsQueryExpressionAfterFrom(bool,bool):bool:this
          -7 (-6.25% of base) : System.Private.Xml.dasm - Base64Decoder:ConstructMapBase64():ref
        -152 (-6.22% of base) : System.Management.dasm - ManagementClassGenerator:AddGetStatementsForEnumArray(CodeIndexerExpression,CodeMemberProperty):this
         -27 (-6.21% of base) : System.Private.CoreLib.dasm - UTF7Encoding:MakeTables():this

401 total methods with Code Size differences (331 improved, 70 regressed), 268047 unchanged.
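
For readers checking the summary figures, the arithmetic can be reproduced with a few lines. This is only a sanity check on the numbers printed above; the variable names are made up for the example and are not part of the jit-diff tool.

```python
# Totals copied from the jit-diff summary above.
base_bytes = 52988913
diff_bytes = 52980100

delta = diff_bytes - base_bytes          # negative means the diff is smaller
percent = 100.0 * delta / base_bytes

# File and method counts copied from the report's summary lines.
files_improved, files_regressed = 60, 7
methods_improved, methods_regressed = 331, 70

print(delta)                                  # -8813
print(f"{percent:.2f}% of base")              # -0.02% of base
print(files_improved + files_regressed)       # 67
print(methods_improved + methods_regressed)   # 401
```

The unrounded delta is about -0.017% of base, which the report rounds to -0.02%.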

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this issue Jan 13, 2022
Extend ref counting done by local morph so that we can determine
single-def single-use locals.

Add a phase that runs just after local morph that will attempt to
forward single-def single-use local defs to uses when they are in
adjacent statements.

Fix or work around issues uncovered elsewhere:
* `gtFoldExprCompare` might fold "identical" volatile subtrees
* `fgGetStubAddrArg` cannot handle complex trees
* some simd/hw operations can lose struct handles
* some calls cannot handle struct local args

Addresses dotnet#6973 and related issues. Still sorting through exactly
which ones are fixed, so list below may need revising.

Fixes dotnet#48605.
Fixes dotnet#51599.
Fixes dotnet#55472.

Improves some but not all cases in dotnet#12280 and dotnet#62064.

Does not fix dotnet#33002, dotnet#47082, or dotnet#63116; these require handling multiple
uses or bypassing statements.
AndyAyersMS added a commit that referenced this issue Feb 4, 2022
Extend ref counting done by local morph so that we can determine
single-def single-use locals.

Add a phase that runs just after local morph that will attempt to
forward single-def single-use local defs to uses when they are in
adjacent statements.

Fix or work around issues uncovered elsewhere:
* `gtFoldExprCompare` might fold "identical" volatile subtrees
* `fgGetStubAddrArg` cannot handle complex trees
* some simd/hw operations can lose struct handles
* some calls cannot handle struct local args
* morph expects args not to interfere
* fix arm; don't forward sub no return calls
* update debuginfo test (we may want to revisit this)
* handle subbing past normalize on store assignment
* clean up nullcheck of new helper

Addresses #6973 and related issues. Still sorting through exactly
which ones are fixed, so list below may need revising.

Fixes #48605.
Fixes #51599.
Fixes #55472.

Improves some but not all cases in #12280 and #62064.

Does not fix #33002, #47082, or #63116; these require handling multiple
uses or bypassing statements.
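
The idea in the commit message (count defs and uses, then forward a single-def single-use local into the adjacent statement) can be sketched as follows. This is a toy model, not the JIT's code: it assumes a single basic block, a tuple-based IR, and expressions with no side effects, calls, or volatile reads, so ordering concerns of the kind listed in the bullets above do not arise. All function names are hypothetical.

```python
from collections import Counter

# Trees: ("const", n), ("local", name), or (op, child, ...).
# A statement is (target, tree); target is a local name, or None for a
# statement such as a return that only consumes a value.

def count_uses(tree, uses):
    """Add the number of reads of each local in `tree` to `uses`."""
    if tree[0] == "local":
        uses[tree[1]] += 1
    elif tree[0] != "const":
        for child in tree[1:]:
            count_uses(child, uses)

def substitute(tree, name, value):
    """Replace uses of local `name` in `tree` with the tree `value`."""
    if tree[0] == "local":
        return value if tree[1] == name else tree
    if tree[0] == "const":
        return tree
    return (tree[0],) + tuple(substitute(c, name, value) for c in tree[1:])

def forward_sub(stmts):
    """Forward single-def single-use locals into the next statement."""
    defs, uses = Counter(), Counter()
    for target, tree in stmts:
        if target is not None:
            defs[target] += 1
        count_uses(tree, uses)

    out = []
    for target, tree in stmts:
        if out:
            prev_target, prev_tree = out[-1]
            here = Counter()
            count_uses(tree, here)
            if (prev_target is not None
                    and defs[prev_target] == 1    # single def
                    and uses[prev_target] == 1    # single use overall
                    and here[prev_target] == 1):  # and it is in this statement
                out.pop()                         # the def is now dead
                tree = substitute(tree, prev_target, prev_tree)
        out.append((target, tree))
    return out

x = ("local", "x")

# a = x * 2; b = a * 4; return b + 1
chain = forward_sub([
    ("a", ("mul", x, ("const", 2))),
    ("b", ("mul", ("local", "a"), ("const", 4))),
    (None, ("add", ("local", "b"), ("const", 1))),
])

# t = x * 2; return t + t   (two uses, so nothing is forwarded)
multi = forward_sub([
    ("t", ("mul", x, ("const", 2))),
    (None, ("add", ("local", "t"), ("local", "t"))),
])

print(len(chain), len(multi))  # 1 2
```

In the first example the three statements collapse into a single tree, `((x * 2) * 4) + 1`. The second is left alone because `t` has two uses, which mirrors the multiple-use limitation noted in the commit message.
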
@AndyAyersMS
Member

This should be largely addressed by #63720, though there are still a number of cases where we can improve.

@AndyAyersMS
Member

There's now both a phase and a utility method, so I think this is done.
