Allow inlining in some over-budget cases #78874

Merged
merged 10 commits into from
Dec 1, 2022

EgorBo
Member

@EgorBo EgorBo commented Nov 26, 2022

This PR reduces the number of methods marked with [AggressiveInlining] that fail to inline, in cases where inlining is profitable.

Closes #78648
Closes #43761
Closes #51587
Closes #41692
(I've verified that this PR fixes the codegen issues in the snippets from all of the issues above.)

The current inliner is quite conservative about the time budget it uses for a call graph; here is what it looks like.
The following sample demonstrates why that is not ideal:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool LessThan<TKey>(TKey left, TKey right)
{
    if (typeof(TKey) == typeof(byte)) return (byte)(object)left < (byte)(object)right;
    if (typeof(TKey) == typeof(sbyte)) return (sbyte)(object)left < (sbyte)(object)right;
    if (typeof(TKey) == typeof(ushort)) return (ushort)(object)left < (ushort)(object)right;
    if (typeof(TKey) == typeof(short)) return (short)(object)left < (short)(object)right;
    if (typeof(TKey) == typeof(uint)) return (uint)(object)left < (uint)(object)right;
    if (typeof(TKey) == typeof(int)) return (int)(object)left < (int)(object)right;
    if (typeof(TKey) == typeof(ulong)) return (ulong)(object)left < (ulong)(object)right;
    if (typeof(TKey) == typeof(long)) return (long)(object)left < (long)(object)right;
    return false;
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
static bool LessThanWrapper<TKey>(TKey left, TKey right) =>
    LessThan(left, right);

public static void Main()
{
    LessThanWrapper(1, 2);
}

Codegen for Main:

; Method Program:Main()
G_M000_IG01:                
       4883EC28             sub      rsp, 40
G_M000_IG02:                
       B901000000           mov      ecx, 1
       BA02000000           mov      edx, 2
       FF15BCF82C00         call     [Program:LessThan[int](int,int):bool]
       90                   nop      
G_M000_IG03:               
       4883C428             add      rsp, 40
       C3                   ret      
; Total bytes of code: 26

As you can see, LessThan is not inlined: it is rejected with the "exceeds the time budget" reason despite being marked as AggressiveInlining. In reality the method gives the JIT very little work to do, since about 95% of it is stripped during import.
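To make the "stripped during import" point concrete, here is a minimal sketch. It assumes the JIT folds each `typeof(TKey) == typeof(X)` check to a constant once `TKey` is known. `LessThanInt` is a hypothetical hand-written equivalent of the `int` instantiation, not something the JIT emits under that name, and the generic method is shortened to three checks for brevity.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class FoldingSketch
{
    // Same shape as the sample above, shortened to three type checks.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool LessThan<TKey>(TKey left, TKey right)
    {
        if (typeof(TKey) == typeof(byte)) return (byte)(object)left < (byte)(object)right;
        if (typeof(TKey) == typeof(int)) return (int)(object)left < (int)(object)right;
        if (typeof(TKey) == typeof(long)) return (long)(object)left < (long)(object)right;
        return false;
    }

    // Hypothetical equivalent of LessThan<int> after folding:
    // only the typeof(int) branch survives, and the boxing casts disappear.
    public static bool LessThanInt(int left, int right) => left < right;

    public static void Main()
    {
        // Both calls agree for either argument order.
        Console.WriteLine(LessThan(1, 2) == LessThanInt(1, 2)); // True
        Console.WriteLine(LessThan(2, 1) == LessThanInt(2, 1)); // True
    }
}
```

The generic body looks large before import, which is presumably what pushes it over the budget estimate, while the surviving code for any single instantiation is one comparison.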

So it seems to me that we should allow going over budget for force-inline callees that contain foldable branches.
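As a rough illustration of that proposal, the following toy model expresses the rule in C#. It is not the JIT's actual code (the JIT is written in C++); the `InlineCandidate` type, its fields, and the cost numbers are all hypothetical and exist only to show the shape of the decision.

```csharp
using System;

// Hypothetical description of a call site the inliner is considering.
public sealed class InlineCandidate
{
    public InlineCandidate(bool isForceInline, int foldableBranches, int estimatedCost)
    {
        IsForceInline = isForceInline;
        FoldableBranches = foldableBranches;
        EstimatedCost = estimatedCost;
    }

    public bool IsForceInline { get; }   // marked AggressiveInlining
    public int FoldableBranches { get; } // branches expected to fold during import
    public int EstimatedCost { get; }    // made-up unit of JIT time
}

public static class BudgetSketch
{
    // Toy rule: within budget is always fine; over budget is allowed only
    // for force-inline callees that contain at least one foldable branch.
    public static bool ShouldInline(InlineCandidate c, int remainingBudget)
    {
        if (c.EstimatedCost <= remainingBudget) return true;
        return c.IsForceInline && c.FoldableBranches > 0;
    }

    public static void Main()
    {
        var foldable = new InlineCandidate(true, 8, 500);
        var plain = new InlineCandidate(true, 0, 500);
        Console.WriteLine(ShouldInline(foldable, 100)); // True
        Console.WriteLine(ShouldInline(plain, 100));    // False
    }
}
```

A rule like this depends entirely on how accurately foldable branches are counted, so an overestimate can admit inlines that do not shrink after import.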

jit-diff (-f -pmi) from this change:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies [invoking .cctors] for  default jit

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 54786092
Total bytes of diff: 54783678
Total bytes of delta: -2414 (-0.00 % of base)
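
The summary rounds the relative delta to two decimals, which hides its real size. A small sketch recomputing it from the two totals above (the `DiffSummary` helper is hypothetical and not part of jit-diff):

```csharp
using System;
using System.Globalization;

public static class DiffSummary
{
    // Delta in bytes: negative means the diff build is smaller.
    public static long Delta(long baseBytes, long diffBytes) => diffBytes - baseBytes;

    // Delta as a percentage of the base size.
    public static double PercentOfBase(long baseBytes, long diffBytes) =>
        100.0 * Delta(baseBytes, diffBytes) / baseBytes;

    public static void Main()
    {
        long baseBytes = 54786092, diffBytes = 54783678;
        Console.WriteLine(Delta(baseBytes, diffBytes)); // -2414
        Console.WriteLine(PercentOfBase(baseBytes, diffBytes)
            .ToString("F4", CultureInfo.InvariantCulture)); // -0.0044
    }
}
```

So the overall change is about -0.0044% of base; the per-method percentages below are where the effect is visible.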

Top method regressions (percentages):
         197 (160.16% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.ErrorFacts:CreateCategoriesMap():System.Collections.Immutable.ImmutableDictionary`2[int,System.String]
         197 (160.16% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.ErrorFacts:CreateHelpLinks():System.Collections.Immutable.ImmutableDictionary`2[int,System.String]
         197 (160.16% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.ErrorFactory:CreateCategoriesMap():System.Collections.Immutable.ImmutableDictionary`2[int,System.String]
         197 (160.16% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.ErrorFactory:CreateHelpLinks():System.Collections.Immutable.ImmutableDictionary`2[int,System.String]
         180 (28.80% of base) : Microsoft.CodeAnalysis.dasm - Microsoft.CodeAnalysis.Diagnostics.AnalyzerFileReference+<>c:<GetAnalyzerTypeNameMap>b__30_9(System.Linq.IGrouping`2[System.String,System.String]):System.Collections.Immutable.ImmutableHashSet`1[System.String]:this
          81 (18.93% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:ReceiveAsync(System.ArraySegment`1[ubyte]):System.Threading.Tasks.Task`1[int]:this
          81 (18.93% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:SendAsync(System.ArraySegment`1[ubyte]):System.Threading.Tasks.Task`1[int]:this
          81 (18.79% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:ReceiveAsync(System.ArraySegment`1[ubyte],int):System.Threading.Tasks.Task`1[int]:this
          81 (18.79% of base) : System.Net.Sockets.dasm - System.Net.Sockets.SocketTaskExtensions:ReceiveAsync(System.Net.Sockets.Socket,System.ArraySegment`1[ubyte],int):System.Threading.Tasks.Task`1[int]
          85 (18.01% of base) : System.Net.Sockets.dasm - System.Net.Sockets.UdpClient:BeginSend(ubyte[],int,System.AsyncCallback,System.Object):System.IAsyncResult:this
          81 (17.46% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:SendToAsync(System.ArraySegment`1[ubyte],System.Net.EndPoint):System.Threading.Tasks.Task`1[int]:this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[double]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[int]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[long]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[short]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[System.__Canon]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[System.Nullable`1[int]]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[System.Numerics.Vector`1[float]]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          61 (16.90% of base) : System.Memory.dasm - System.Buffers.ReadOnlySequence`1+<>c[ubyte]:<ToString>b__33_0(System.Span`1[ushort],System.Buffers.ReadOnlySequence`1[ushort]):this
          60 (14.05% of base) : System.Security.Cryptography.Xml.dasm - System.Security.Cryptography.Xml.RSAPKCS1SHA1SignatureDescription:.ctor():this
          60 (14.05% of base) : System.Security.Cryptography.Xml.dasm - System.Security.Cryptography.Xml.RSAPKCS1SHA256SignatureDescription:.ctor():this
          60 (14.05% of base) : System.Security.Cryptography.Xml.dasm - System.Security.Cryptography.Xml.RSAPKCS1SHA384SignatureDescription:.ctor():this
          60 (14.05% of base) : System.Security.Cryptography.Xml.dasm - System.Security.Cryptography.Xml.RSAPKCS1SHA512SignatureDescription:.ctor():this
          45 ( 8.64% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:ReceiveMessageFromAsync(System.ArraySegment`1[ubyte],System.Net.EndPoint):System.Threading.Tasks.Task`1[System.Net.Sockets.SocketReceiveMessageFromResult]:this
          76 ( 7.87% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonWriter:WritePropertyName(System.Decimal):this
          70 ( 7.28% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonWriter:WritePropertyName(System.Guid):this
          70 ( 6.90% of base) : System.Text.Json.dasm - System.Text.Json.Utf8JsonWriter:WritePropertyName(double):this
          28 ( 5.67% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:ReceiveFromAsync(System.ArraySegment`1[ubyte],System.Net.EndPoint):System.Threading.Tasks.Task`1[System.Net.Sockets.SocketReceiveFromResult]:this
         227 ( 5.14% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.SyntaxAndDeclarationManager:CreateState(System.Collections.Immutable.ImmutableArray`1[Microsoft.CodeAnalysis.SyntaxTree],System.String,Microsoft.CodeAnalysis.SourceReferenceResolver,Microsoft.CodeAnalysis.CommonMessageProvider,bool):Microsoft.CodeAnalysis.CSharp.SyntaxAndDeclarationManager+State
          44 ( 5.03% of base) : Microsoft.CodeAnalysis.dasm - Microsoft.CodeAnalysis.MetadataReaderExtensions:IsTheObjectClass(System.Reflection.Metadata.MetadataReader,System.Reflection.Metadata.TypeDefinition):bool
          38 ( 4.48% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.CodeGen.CodeGenerator:EmitStaticFieldLoad(Microsoft.CodeAnalysis.VisualBasic.Symbols.FieldSymbol,bool,Microsoft.CodeAnalysis.VisualBasic.VisualBasicSyntaxNode):this
          38 ( 4.24% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToDate(System.String):System.DateTime
          38 ( 4.24% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToDateTime(System.String):System.DateTime
          38 ( 4.24% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGYearMonth(System.String):System.DateTime
          38 ( 4.24% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToTime(System.String):System.DateTime
          38 ( 4.19% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToDateOffset(System.String):System.DateTimeOffset
          38 ( 4.19% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToDateTimeOffset(System.String):System.DateTimeOffset
          38 ( 4.19% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGYearMonthOffset(System.String):System.DateTimeOffset
          38 ( 4.19% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToTimeOffset(System.String):System.DateTimeOffset
          38 ( 4.07% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGDay(System.String):System.DateTime
          38 ( 4.07% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGMonthDay(System.String):System.DateTime
          38 ( 4.07% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGYear(System.String):System.DateTime
          38 ( 4.03% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGDayOffset(System.String):System.DateTimeOffset
          38 ( 4.03% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGMonthDayOffset(System.String):System.DateTimeOffset
          38 ( 4.03% of base) : System.Private.Xml.dasm - System.Xml.Schema.XmlBaseConverter:StringToGYearOffset(System.String):System.DateTimeOffset
          20 ( 2.64% of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.AesCng:CreateDecryptor(ubyte[],ubyte[]):System.Security.Cryptography.ICryptoTransform:this
          20 ( 2.64% of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.TripleDESCng:CreateDecryptor(ubyte[],ubyte[]):System.Security.Cryptography.ICryptoTransform:this
          17 ( 2.24% of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.AesCng:CreateEncryptor(ubyte[],ubyte[]):System.Security.Cryptography.ICryptoTransform:this
          17 ( 2.24% of base) : System.Security.Cryptography.dasm - System.Security.Cryptography.TripleDESCng:CreateEncryptor(ubyte[],ubyte[]):System.Security.Cryptography.ICryptoTransform:this
          16 ( 2.10% of base) : System.Private.CoreLib.dasm - System.Int128:TryFormat(System.Span`1[ushort],byref,System.ReadOnlySpan`1[ushort],System.IFormatProvider):bool:this
          25 ( 1.70% of base) : Microsoft.CodeAnalysis.dasm - Microsoft.CodeAnalysis.Diagnostics.AnalyzerFileReference:GetFullyQualifiedTypeName(System.Reflection.Metadata.TypeDefinition,Microsoft.CodeAnalysis.PEModule):System.String
           1 ( 0.07% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.Binder:BindXmlnsAttribute(Microsoft.CodeAnalysis.VisualBasic.Syntax.XmlNodeSyntax,System.String,System.String,Microsoft.CodeAnalysis.DiagnosticBag):Microsoft.CodeAnalysis.VisualBasic.BoundXmlAttribute:this

Top method improvements (percentages):
        -103 (-100.00% of base) : Microsoft.CodeAnalysis.dasm - Microsoft.CodeAnalysis.SmallDictionary`2[System.__Canon,int]:RightComplex(Microsoft.CodeAnalysis.SmallDictionary`2+AvlNode[System.__Canon,int]):Microsoft.CodeAnalysis.SmallDictionary`2+AvlNode[System.__Canon,int] (1 base, 0 diff methods)
        -138 (-82.63% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float]):this
        -138 (-82.63% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float]):this
        -138 (-82.63% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float]):this
        -128 (-71.11% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -128 (-71.11% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -128 (-71.11% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[int](System.Collections.Generic.IEnumerable`1[int]):System.Linq.IOrderedEnumerable`1[int]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[long](System.Collections.Generic.IEnumerable`1[long]):System.Linq.IOrderedEnumerable`1[long]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[short](System.Collections.Generic.IEnumerable`1[short]):System.Linq.IOrderedEnumerable`1[short]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[ubyte](System.Collections.Generic.IEnumerable`1[ubyte]):System.Linq.IOrderedEnumerable`1[ubyte]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[int](System.Collections.Generic.IEnumerable`1[int]):System.Linq.IOrderedEnumerable`1[int]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[long](System.Collections.Generic.IEnumerable`1[long]):System.Linq.IOrderedEnumerable`1[long]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[short](System.Collections.Generic.IEnumerable`1[short]):System.Linq.IOrderedEnumerable`1[short]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[ubyte](System.Collections.Generic.IEnumerable`1[ubyte]):System.Linq.IOrderedEnumerable`1[ubyte]
        -194 (-68.07% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],byref):this
        -194 (-68.07% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],byref):this
        -194 (-68.07% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],byref):this
        -137 (-59.57% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -137 (-59.57% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -137 (-59.57% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -165 (-57.69% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -165 (-57.69% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -165 (-57.69% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -190 (-47.98% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:LastIndexOfAnyExcept[System.Numerics.Vector`1[float]](System.Span`1[System.Numerics.Vector`1[float]],System.Numerics.Vector`1[float]):int
        -192 (-47.76% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:IndexOfAnyExcept[System.Numerics.Vector`1[float]](System.Span`1[System.Numerics.Vector`1[float]],System.Numerics.Vector`1[float]):int
        -306 (-45.81% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:LastIndexOfAnyExcept[System.Numerics.Vector`1[float]](System.Span`1[System.Numerics.Vector`1[float]],System.Numerics.Vector`1[float],System.Numerics.Vector`1[float]):int
        -307 (-45.41% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:IndexOfAnyExcept[System.Numerics.Vector`1[float]](System.Span`1[System.Numerics.Vector`1[float]],System.Numerics.Vector`1[float],System.Numerics.Vector`1[float]):int
        -147 (-38.38% of base) : Microsoft.CodeAnalysis.dasm - Roslyn.Utilities.ValueTuple`2[System.Numerics.Vector`1[float],System.Nullable`1[int]]:op_Equality(Roslyn.Utilities.ValueTuple`2[System.Numerics.Vector`1[float],System.Nullable`1[int]],Roslyn.Utilities.ValueTuple`2[System.Numerics.Vector`1[float],System.Nullable`1[int]]):bool
        -147 (-37.60% of base) : Microsoft.CodeAnalysis.dasm - Roslyn.Utilities.ValueTuple`2[System.Numerics.Vector`1[float],System.Nullable`1[int]]:op_Inequality(Roslyn.Utilities.ValueTuple`2[System.Numerics.Vector`1[float],System.Nullable`1[int]],Roslyn.Utilities.ValueTuple`2[System.Numerics.Vector`1[float],System.Nullable`1[int]]):bool
        -134 (-27.07% of base) : CommandLine.dasm - CSharpx.MaybeExtensions:ToMaybe[System.Numerics.Vector`1[float]](System.Numerics.Vector`1[float]):CSharpx.Maybe`1[System.Numerics.Vector`1[float]]
        -134 (-25.57% of base) : System.Collections.Immutable.dasm - System.Collections.Frozen.ValueTypeDefaultComparerFrozenSet`1+GSW[System.Numerics.Vector`1[float]]:FindItemIndex(System.Numerics.Vector`1[float]):int:this
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[double](System.Collections.Generic.IEnumerable`1[double]):System.Linq.IOrderedEnumerable`1[double]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[System.Nullable`1[int]](System.Collections.Generic.IEnumerable`1[System.Nullable`1[int]]):System.Linq.IOrderedEnumerable`1[System.Nullable`1[int]]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[System.Numerics.Vector`1[float]](System.Collections.Generic.IEnumerable`1[System.Numerics.Vector`1[float]]):System.Linq.IOrderedEnumerable`1[System.Numerics.Vector`1[float]]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[double](System.Collections.Generic.IEnumerable`1[double]):System.Linq.IOrderedEnumerable`1[double]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[System.Nullable`1[int]](System.Collections.Generic.IEnumerable`1[System.Nullable`1[int]]):System.Linq.IOrderedEnumerable`1[System.Nullable`1[int]]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[System.Numerics.Vector`1[float]](System.Collections.Generic.IEnumerable`1[System.Numerics.Vector`1[float]]):System.Linq.IOrderedEnumerable`1[System.Numerics.Vector`1[float]]
         -36 (-22.64% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[long](System.Numerics.Vector`1[long],long):System.Numerics.Vector`1[long]
         -16 (-17.98% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[int](System.Numerics.Vector`1[int],int):System.Numerics.Vector`1[int]
         -17 (-14.78% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[short](System.Numerics.Vector`1[short],System.Numerics.Vector`1[short]):System.Numerics.Vector`1[short]
         -13 (-13.98% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[short](System.Numerics.Vector`1[short],short):System.Numerics.Vector`1[short]
         -27 (-13.78% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[long](System.Numerics.Vector`1[long],System.Numerics.Vector`1[long]):System.Numerics.Vector`1[long]
         -13 (-12.26% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[int](System.Numerics.Vector`1[int],System.Numerics.Vector`1[int]):System.Numerics.Vector`1[int]
          -9 (-9.89% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[ubyte](System.Numerics.Vector`1[ubyte],ubyte):System.Numerics.Vector`1[ubyte]
          -9 (-8.11% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[ubyte](System.Numerics.Vector`1[ubyte],System.Numerics.Vector`1[ubyte]):System.Numerics.Vector`1[ubyte]
         -14 (-1.54% of base) : System.Net.Sockets.dasm - System.Net.Sockets.UdpClient:BeginSend(ubyte[],int,System.String,int,System.AsyncCallback,System.Object):System.IAsyncResult:this
         -24 (-1.09% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.VisualBasicSyntaxTree+ConditionalSymbolsMap+ConditionalSymbolsMapBuilder:ProcessCommandLinePreprocessorSymbols(System.Collections.Immutable.ImmutableDictionary`2[System.String,Microsoft.CodeAnalysis.VisualBasic.Syntax.InternalSyntax.CConst]):this

100 total methods with Code Size differences (48 improved, 52 regressed), 341155 unchanged.

Some examples:

  1. System.Linq.Enumerable:OrderDescending[int] - TypeIsImplicitlyStable was not inlined previously.
  2. System.Numerics.Vector:Divide[int] - Vector<T>(Vector, T) used to be extremely slow because it was not inlined; see "System.Numerics.Vector: Recognize division by a constant and inline" #43761
  3. CSharpx.MaybeExtensions:ToMaybe[System.Numerics.Vector`1[float]] - ouch, it turns out Vector<T>.IsSupported is not a JIT intrinsic - see here.

Size regressions (these look like perf improvements):

  1. System.Net.Sockets.Socket:ReceiveAsync - Task.FromResult was inlined (it's marked as AggressiveInlining).
  2. System.Xml.Schema.XmlBaseConverter:StringToDate(System.String) - DateTime::Kind is marked as AggressiveInlining
  3. System.Int128:TryFormat(...) - BitOperations.Log2 was not inlined previously.
  4. System.Security.Cryptography.Xml.RSAPKCS1SHA1SignatureDescription:.ctor():this
  5. Five Microsoft.CodeAnalysis.* regressions are questionable inline decisions (e.g. this): the inliner recognized 2 foldable branches where in fact there are none. Hopefully we'll have a more precise analysis in the future.
  6. System.Text.Json.Utf8JsonWriter:WritePropertyName(double):this - Memory.get_Span is inlined (it's marked with AggressiveInlining)

Might also help @stephentoub in #78580

@ghost ghost assigned EgorBo Nov 26, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 26, 2022
@ghost

ghost commented Nov 26, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

jit-diff from this change:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies [invoking .cctors] for  default jit

Total bytes of delta: -3832 (-0.01 % of base)
Total relative delta: NaN
    diff is an improvement.
    relative diff is a regression.


Top file regressions (bytes):
         316 : System.Net.Sockets.dasm (0.15% of base)

Top file improvements (bytes):
       -2286 : System.Diagnostics.DiagnosticSource.dasm (-1.16% of base)
       -1722 : System.Linq.dasm (-0.16% of base)
        -140 : System.Private.CoreLib.dasm (-0.01% of base)

4 total files with Code Size differences (3 improved, 1 regressed), 270 unchanged.

Top method regressions (bytes):
          81 (18.79% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:ReceiveAsync(System.ArraySegment`1[ubyte],int):System.Threading.Tasks.Task`1[int]:this
          81 (18.93% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:SendAsync(System.ArraySegment`1[ubyte]):System.Threading.Tasks.Task`1[int]:this
          81 (17.46% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:SendToAsync(System.ArraySegment`1[ubyte],System.Net.EndPoint):System.Threading.Tasks.Task`1[int]:this
          45 ( 8.64% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:ReceiveMessageFromAsync(System.ArraySegment`1[ubyte],System.Net.EndPoint):System.Threading.Tasks.Task`1[System.Net.Sockets.SocketReceiveMessageFromResult]:this
          28 ( 5.67% of base) : System.Net.Sockets.dasm - System.Net.Sockets.Socket:ReceiveFromAsync(System.ArraySegment`1[ubyte],System.Net.EndPoint):System.Threading.Tasks.Task`1[System.Net.Sockets.SocketReceiveFromResult]:this

Top method improvements (bytes):
        -194 (-68.07% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],byref):this
        -194 (-68.07% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],byref):this
        -194 (-68.07% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],byref):this
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[int](System.Collections.Generic.IEnumerable`1[int]):System.Linq.IOrderedEnumerable`1[int]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[long](System.Collections.Generic.IEnumerable`1[long]):System.Linq.IOrderedEnumerable`1[long]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[short](System.Collections.Generic.IEnumerable`1[short]):System.Linq.IOrderedEnumerable`1[short]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[ubyte](System.Collections.Generic.IEnumerable`1[ubyte]):System.Linq.IOrderedEnumerable`1[ubyte]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[int](System.Collections.Generic.IEnumerable`1[int]):System.Linq.IOrderedEnumerable`1[int]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[long](System.Collections.Generic.IEnumerable`1[long]):System.Linq.IOrderedEnumerable`1[long]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[short](System.Collections.Generic.IEnumerable`1[short]):System.Linq.IOrderedEnumerable`1[short]
        -171 (-70.95% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[ubyte](System.Collections.Generic.IEnumerable`1[ubyte]):System.Linq.IOrderedEnumerable`1[ubyte]
        -165 (-57.69% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -165 (-57.69% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -165 (-57.69% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -138 (-82.63% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float]):this
        -138 (-82.63% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float]):this
        -138 (-82.63% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float]):this
        -137 (-59.57% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -137 (-59.57% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -137 (-59.57% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -128 (-71.11% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Counter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -128 (-71.11% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.Histogram`1[System.Numerics.Vector`1[float]]:Record(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
        -128 (-71.11% of base) : System.Diagnostics.DiagnosticSource.dasm - System.Diagnostics.Metrics.UpDownCounter`1[System.Numerics.Vector`1[float]]:Add(System.Numerics.Vector`1[float],System.Collections.Generic.KeyValuePair`2[System.String,System.Object]):this
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[double](System.Collections.Generic.IEnumerable`1[double]):System.Linq.IOrderedEnumerable`1[double]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[System.Nullable`1[int]](System.Collections.Generic.IEnumerable`1[System.Nullable`1[int]]):System.Linq.IOrderedEnumerable`1[System.Nullable`1[int]]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:Order[System.Numerics.Vector`1[float]](System.Collections.Generic.IEnumerable`1[System.Numerics.Vector`1[float]]):System.Linq.IOrderedEnumerable`1[System.Numerics.Vector`1[float]]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[double](System.Collections.Generic.IEnumerable`1[double]):System.Linq.IOrderedEnumerable`1[double]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[System.Nullable`1[int]](System.Collections.Generic.IEnumerable`1[System.Nullable`1[int]]):System.Linq.IOrderedEnumerable`1[System.Nullable`1[int]]
         -59 (-24.48% of base) : System.Linq.dasm - System.Linq.Enumerable:OrderDescending[System.Numerics.Vector`1[float]](System.Collections.Generic.IEnumerable`1[System.Numerics.Vector`1[float]]):System.Linq.IOrderedEnumerable`1[System.Numerics.Vector`1[float]]
         -36 (-22.64% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[long](System.Numerics.Vector`1[long],long):System.Numerics.Vector`1[long]
         -27 (-13.78% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[long](System.Numerics.Vector`1[long],System.Numerics.Vector`1[long]):System.Numerics.Vector`1[long]
         -17 (-14.78% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[short](System.Numerics.Vector`1[short],System.Numerics.Vector`1[short]):System.Numerics.Vector`1[short]
         -16 (-17.98% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[int](System.Numerics.Vector`1[int],int):System.Numerics.Vector`1[int]
         -13 (-12.26% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[int](System.Numerics.Vector`1[int],System.Numerics.Vector`1[int]):System.Numerics.Vector`1[int]
         -13 (-13.98% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[short](System.Numerics.Vector`1[short],short):System.Numerics.Vector`1[short]
          -9 (-8.11% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[ubyte](System.Numerics.Vector`1[ubyte],System.Numerics.Vector`1[ubyte]):System.Numerics.Vector`1[ubyte]
          -9 (-9.89% of base) : System.Private.CoreLib.dasm - System.Numerics.Vector:Divide[ubyte](System.Numerics.Vector`1[ubyte],ubyte):System.Numerics.Vector`1[ubyte]

Some examples:

  1. System.Linq.Enumerable:OrderDescending[int]
  2. System.Numerics.Vector:Divide[int](System.Numerics.Vector`1[int],int):System.Numerics.Vector`1[int] (diff: https://www.diffchecker.com/8f0ARhUw) - `Vector(Vector, T)` used to be extremely slow because of that.
  3. System.Net.Sockets.Socket:ReceiveAsync - Task.FromResult was inlined (it's marked as AggressiveInlining).
Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo
Member Author

EgorBo commented Nov 27, 2022

@dotnet/jit-contrib @jkotas PTAL - do you agree with the logic behind this change? I also checked a few apps I have locally and it didn't lead to "catastrophic" inline decisions, mostly because of the "friction" part.

jit-diff is negative despite the fact we inline more now 🙂 (this PR doesn't prevent inlining for something that was inlined previously)

return true;
}

if (m_CallsiteDepth == 1)
Member

Is there any way we can better track m_CallsiteDepth for trivial cases?

That is, there are many scenarios where you might have a code pattern like the following:

public void PublicApi(...)
{
    // ArgumentValidation
    InternalApi(...);
}

void InternalApi(...)
{
    if (Vector256.IsHardwareAccelerated)
    {
        InternalApiV256();
    }
    else if (Vector128.IsHardwareAccelerated)
    {
        InternalApiV128();
    }
    else
    {
        InternalApiScalar();
    }
}

Such scenarios involve a couple of small, almost "stub" methods or methods that just forward to another (another example is int.LeadingZeroCount, which just forwards to BitOperations.LeadingZeroCount).

In such scenarios, we will almost always be at m_CallsiteDepth >= 2 and not have gotten to the actual interesting "core" yet.

If we had a way to factor in such methods that are effectively just "forward" or "validate + forward" then we could more meaningfully do inlining for such cases.

Member Author

I've changed the current impl: it now tries to guess the final imported IL size. So for your case InternalApi won't affect the time budget, since the majority of its code will be folded anyway. E.g. InternalApiV256 will have plenty of time budget left to be inlined (especially if the inliner can detect foldable branches in it).
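
One way to picture the new estimate, as a minimal sketch and not the actual JIT code: model the callee as a list of IL regions, each guarded by a condition the inliner either can or cannot resolve at the call site, and count only the regions that would survive import. The `IlRegion` type, the `Guard` enum, the helper names, and the byte sizes are all hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: how a region's guarding branch looks to the inliner.
enum class Guard
{
    None,       // unconditional code, always imported
    FoldsTrue,  // condition is a known constant 'true' at this call site
    FoldsFalse, // condition is a known constant 'false': the region is dead
    Unknown     // cannot be resolved before import, assume it is imported
};

struct IlRegion
{
    unsigned ilBytes; // raw IL size of the region
    Guard    guard;
};

// Raw IL size: what a size-only estimate would charge against the budget.
unsigned RawIlSize(const std::vector<IlRegion>& regions)
{
    unsigned total = 0;
    for (const IlRegion& r : regions)
        total += r.ilBytes;
    return total;
}

// Estimated imported size: regions known to be dead are not counted.
unsigned EstimatedImportedIlSize(const std::vector<IlRegion>& regions)
{
    unsigned total = 0;
    for (const IlRegion& r : regions)
    {
        if (r.guard != Guard::FoldsFalse)
            total += r.ilBytes;
    }
    assert(total <= RawIlSize(regions)); // the estimate can only shrink
    return total;
}

// A dispatcher shaped like InternalApi on hardware where only the
// first path is taken (sizes are made up for illustration).
std::vector<IlRegion> MakeDispatcher()
{
    return {
        {10, Guard::FoldsTrue},  // Vector256 path
        {10, Guard::FoldsFalse}, // Vector128 path
        {10, Guard::FoldsFalse}, // scalar path
        {1, Guard::None},        // ret
    };
}
```

With these made-up sizes the dispatcher is charged for 11 bytes instead of 31, which leaves more of the budget for the method it forwards to.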

@EgorBo
Member Author

EgorBo commented Nov 29, 2022

@dotnet/jit-contrib anyone brave enough to approve? 🙂

@markples
Member

Given that this fixes the known examples and you've done the general analysis, I think I would be brave enough to approve. However, I don't feel very comfortable jumping into inliner strategy, in particular evaluating how it fits with the other heuristics. Things like - should other uses of m_CodeSize use this improved size estimate? Is it reasonable to just change this one use and possibly look at others later, or does the new use of two different sizes make things more difficult to maintain/update in the future?

@EgorBo
Member Author

EgorBo commented Nov 30, 2022

> Given that this fixes the known examples and you've done the general analysis, I think I would be brave enough to approve. However, I don't feel very comfortable jumping into inliner strategy, in particular evaluating how it fits with the other heuristics. Things like - should other uses of m_CodeSize use this improved size estimate? Is it reasonable to just change this one use and possibly look at others later, or does the new use of two different sizes make things more difficult to maintain/update in the future?

So technically we have three completely different types of checks/heuristics to inline a method:

  1. Legality - we have a set of limitations (e.g. exception handling in a callee)
  2. Profitability - it's where we analyze IL to find signs that it will be profitable to inline with the given arguments at its call site. In that case we also have a completely different model to estimate final codegen size of the inlinee (basically "IL-opcode -- weight" state machine) - so we basically don't use ILSize directly here.
  3. Time budget - we try to estimate the additional time (say, in milliseconds) the JIT will spend if we inline this callee. We limit ourselves here for better JIT throughput. (BTW, we can probably ignore it completely for AOT)

So this PR only improves the 3rd one slightly. JIT used to incorrectly estimate "time it will take to compile this callee" for functions with foldable branches - it didn't take into account that a lot of dead code will be eliminated during import (won't be materialized into GenTree objects) so it will take much less time for JIT to compile than inliner estimated initially.
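
The ordering of the three checks can be sketched as below. This is an illustration under stated assumptions, not the inliner's real data model: `Candidate`, `Decision`, `EstimateTime`, the abstract time units, and the linear cost model are all hypothetical. The only point is that discounting dead IL changes the outcome of the third check and nothing else.

```cpp
#include <cassert>

// Hypothetical summary of an inline candidate.
struct Candidate
{
    bool     hasExceptionHandling; // the legality blocker named in item 1
    double   benefitScore;         // output of some profitability model
    unsigned rawIlBytes;           // IL size before import
    unsigned deadIlBytes;          // IL behind branches known to fold away
};

enum class Decision
{
    Inline,
    RejectIllegal,
    RejectUnprofitable,
    RejectOverBudget
};

// Time estimate in abstract units; assumed linear in imported IL for this sketch.
unsigned EstimateTime(const Candidate& c, bool discountDeadCode)
{
    assert(c.deadIlBytes <= c.rawIlBytes);
    return discountDeadCode ? c.rawIlBytes - c.deadIlBytes : c.rawIlBytes;
}

Decision Decide(const Candidate& c, unsigned remainingBudget, bool discountDeadCode)
{
    // 1. Legality
    if (c.hasExceptionHandling)
        return Decision::RejectIllegal;

    // 2. Profitability
    if (c.benefitScore <= 0.0)
        return Decision::RejectUnprofitable;

    // 3. Time budget: the only check whose input changes with the discount
    if (EstimateTime(c, discountDeadCode) > remainingBudget)
        return Decision::RejectOverBudget;

    return Decision::Inline;
}
```

For a callee shaped like LessThan, where most of the IL sits behind `typeof` comparisons that fold to false, the same candidate flips from over budget to inlined once the dead bytes are discounted.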

Member

@BruceForstall BruceForstall left a comment

LGTM

@am11
Member

am11 commented Dec 1, 2022

@EgorBo, I just noticed that if we extract the LHS (typeof(TKey)) to a variable, e.g. Type type = typeof(TKey);, and use the variable type in those conditions, the codegen of LessThan goes from 5 to 217 lines. It looks like the typeof inlining gives up on a variable and doesn't look into whether the variable is pointing to a constant value. Roslyn does that kind of extraction to a local variable in some pattern matching cases, so it might be useful to improve it for a number of scenarios.
examples: transpilation | codegen

@EgorBo
Member Author

EgorBo commented Dec 1, 2022

> @EgorBo, I just noticed that if we extract the LHS (typeof(TKey)) to a variable, e.g. Type type = typeof(TKey);, and use the variable type in those conditions, the codegen of LessThan goes from 5 to 217 lines. It looks like the typeof inlining gives up on a variable and doesn't look into whether the variable is pointing to a constant value. Roslyn does that kind of extraction to a local variable in some pattern matching cases, so it might be useful to improve it for a number of scenarios. examples: transpilation | codegen

Thanks for the examples. They're not related to this PR since they uncover some inefficiency in the JIT for __Canon (minimal repro: https://godbolt.org/z/qTaYc87Pf), but it must be something we can fix! I'll file an issue.

@EgorBo EgorBo merged commit 69cdbf1 into dotnet:main Dec 1, 2022
@EgorBo EgorBo deleted the jit-overbudget-inlining branch December 1, 2022 12:00
@markples
Member

markples commented Dec 1, 2022

> So technically we have three completely different types of checks/heuristics to inline a method:

Thanks for the explanation; it's helpful. However, should this improvement also be used in other places that use m_CodeSize (such as the comparisons against alwaysInlineSize and maxCodeSize)?

@EgorBo
Member Author

EgorBo commented Dec 1, 2022

> > So technically we have three completely different types of checks/heuristics to inline a method:
>
> Thanks for the explanation; it's helpful. However, should this improvement also be used in other places that use m_CodeSize (such as the comparisons against alwaysInlineSize and maxCodeSize)?

> comparisons against alwaysInlineSize

this happens before we analyze IL so we don't have any info about foldable branches etc. And we can't move those checks to a later stage when we have that info because of throughput: IL > maxCodeSize -> bail out saves a lot of throughput for us. We still might want to implement a partial pre-scan at some point but we're not there yet.
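
The throughput argument can be illustrated with a small sketch. Everything here is hypothetical: `kMaxCodeSize` is an illustrative threshold rather than the JIT's configured value, `ScanForImportedSize` stands in for the expensive IL walk, and attribute handling such as AggressiveInlining is ignored. The cheap gate sees only the raw size, so it can reject a candidate without paying for a scan, at the cost of never learning how much of it was dead.

```cpp
#include <cassert>

// Illustrative threshold only; not the value the JIT actually uses.
constexpr unsigned kMaxCodeSize = 100;

// Counts how often the expensive step runs.
struct ScanStats
{
    unsigned scansPerformed = 0;
};

// Stand-in for the IL walk that discovers foldable branches.
unsigned ScanForImportedSize(unsigned rawIlBytes, unsigned deadIlBytes, ScanStats& stats)
{
    assert(deadIlBytes <= rawIlBytes);
    ++stats.scansPerformed;
    return rawIlBytes - deadIlBytes;
}

// Returns true if the candidate survives to the later stages.
// importedSizeOut is written only when a scan actually happened.
bool PassesSizeGate(unsigned rawIlBytes, unsigned deadIlBytes,
                    ScanStats& stats, unsigned& importedSizeOut)
{
    // Cheap check first: needs nothing but the raw IL size.
    if (rawIlBytes > kMaxCodeSize)
        return false; // bail out without scanning

    importedSizeOut = ScanForImportedSize(rawIlBytes, deadIlBytes, stats);
    return true;
}
```

A 500-byte callee with 450 dead bytes would have fit after a scan, but the gate rejects it up front. That is the trade-off described above, and a partial pre-scan would be one way to soften it.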

@EgorBo
Member Author

EgorBo commented Dec 6, 2022

Quite a few improvements on win-x64:
dotnet/perf-autofiling-issues#10365

@ghost ghost locked as resolved and limited conversation to collaborators Jan 5, 2023