Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Issue 620 - Incorrect IND(ADDR(LCL_VAR)) folding when types do not match #40059

Closed
wants to merge 10 commits into from

Conversation

briansull
Copy link
Contributor

@briansull briansull commented Jul 29, 2020

Fixes for morphing a GT_IND of a GT_ADDR of a GT_LCL_VAR
When the GT_IND size is a small type.
We can have a GT_IND as either a load or as a store,
it is a store when it is on the Left-Hand-Side of an assignment
Otherwise it is a load

The test case provide with this PR covers the various combinations that were incorrect.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 29, 2020
@briansull
Copy link
Contributor Author

Fixes #620

@briansull
Copy link
Contributor Author

briansull commented Jul 30, 2020

Diff is an improvement

Crossgen CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for x64 default jit
164 total methods with Code Size differences (118 improved, 46 regressed), 227498 unchanged.

Top file regressions (bytes):
          45 : Microsoft.CodeAnalysis.CSharp.dasm (0.00% of base)
          15 : Microsoft.CodeAnalysis.VisualBasic.dasm (0.00% of base)
          14 : System.Text.Json.dasm (0.00% of base)
           8 : System.Private.Xml.dasm (0.00% of base)
           6 : Microsoft.Extensions.Primitives.dasm (0.04% of base)
           4 : Microsoft.CodeAnalysis.dasm (0.00% of base)
           2 : System.Reflection.Metadata.dasm (0.00% of base)
           2 : System.Security.Cryptography.Algorithms.dasm (0.00% of base)
           2 : System.Security.Cryptography.Cng.dasm (0.00% of base)
           2 : System.Security.Cryptography.Pkcs.dasm (0.00% of base)
           1 : System.Net.HttpListener.dasm (0.00% of base)
           1 : System.Transactions.Local.dasm (0.00% of base)
Top file improvements (bytes):
       -1025 : System.Private.CoreLib.dasm (-0.03% of base)
         -15 : System.Data.Odbc.dasm (-0.01% of base)
         -10 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.00% of base)
          -3 : System.Diagnostics.PerformanceCounter.dasm (-0.00% of base)

Top method regressions (bytes):
          11 ( 1.36% of base) : System.Private.CoreLib.dasm - System.Reflection.MdConstant:GetValue(System.Reflection.MetadataImport,int,System.RuntimeTypeHandle,bool):System.Object
          10 ( 1.77% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Microsoft.CodeAnalysis.VisualBasic.VisualBasicCommandLineParser:PublicSymbolsToInternalDefines(System.Collections.Generic.IEnumerable`1[[System.Collections.Generic.KeyValuePair`2[[System.String, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Object, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]],System.String):System.Collections.Immutable.ImmutableDictionary`2[[System.String, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Microsoft.CodeAnalysis.VisualBasic.Syntax.InternalSyntax.CConst, Microsoft.CodeAnalysis.VisualBasic, Version=1.1.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]
           8 ( 1.86% of base) : System.Private.CoreLib.dasm - System.Diagnostics.Tracing.ManifestBuilder:AddChannel(System.String,int,System.Diagnostics.Tracing.EventChannelAttribute):this
           8 ( 0.58% of base) : System.Private.Xml.dasm - System.Xml.XmlTextReaderImpl:ParseNumericCharRefInline(int,bool,System.Text.StringBuilder,byref,byref):int:this

Top method improvements (bytes):
         -69 (-16.01% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:LastIndexOf(byref,ushort,int):int
         -61 (-14.39% of base) : System.Private.CoreLib.dasm - System.SpanHelpers:IndexOf(byref,ushort,int):int
         -40 (-57.97% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:IndexOfAny(System.ReadOnlySpan`1[Char],ushort,ushort,ushort):int
         -36 (-2.04% of base) : System.Private.CoreLib.dasm - System.Threading.Tasks.Task:RunContinuations(System.Object):this
         -26 (-50.00% of base) : System.Private.CoreLib.dasm - System.MemoryExtensions:IndexOfAny(System.ReadOnlySpan`1[Char],ushort,ushort):int
         -17 (-25.76% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:WriteInt16BigEndian(System.Span`1[Byte],short)
         -17 (-30.36% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:TryWriteInt16BigEndian(System.Span`1[Byte],short):bool
         -16 (-25.40% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:WriteUInt16BigEndian(System.Span`1[Byte],ushort)
         -16 (-30.19% of base) : System.Private.CoreLib.dasm - System.Buffers.Binary.BinaryPrimitives:TryWriteUInt16BigEndian(System.Span`1[Byte],ushort):bool
         -14 (-6.51% of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:AppendSpanFormattable(short):System.Text.StringBuilder:this
         -14 (-6.51% of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:AppendSpanFormattable(byte):System.Text.StringBuilder:this
         -13 (-3.76% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - Microsoft.Diagnostics.Tracing.Etlx.TraceActivity:ActivityKindToString(short):System.String
         -12 (-6.63% of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:AppendSpanFormattable(ushort):System.Text.StringBuilder:this
         -12 (-6.63% of base) : System.Private.CoreLib.dasm - System.Text.StringBuilder:AppendSpanFormattable(ubyte):System.Text.StringBuilder:this
         -12 (-29.27% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.GenericComparer`1[Boolean][System.Boolean]:Compare(bool,bool):int:this
         -12 (-60.00% of base) : System.Private.CoreLib.dasm - System.Collections.Generic.GenericEqualityComparer`1[Char][System.Char]:GetHashCode(ushort):int:this

Add check for loads using GTF_IND_ASG_LHS as we can read and normalize loads using a small type, but not stores.
For stores will will force the use of a GT_LCL_FLD whewn using small types
@briansull
Copy link
Contributor Author

@dotnet/jit-contrib PTAL

@briansull briansull requested a review from BruceForstall August 4, 2020 20:36
@briansull briansull requested a review from sandreenko August 5, 2020 02:15
Copy link
Contributor

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a concern about GT_LCL_FLD support.

src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/morph.cpp Show resolved Hide resolved
src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/morph.cpp Show resolved Hide resolved
src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
@briansull
Copy link
Contributor Author

@sandreenko PTAL

@briansull
Copy link
Contributor Author

dotnet-runtime-perf (Libraries Build Linux x64 Release) is currently broken:
APICompat failures on Linux #40416

Copy link
Contributor

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions and could you please trigger outerloop/stress job for this PR before merge?

temp = nullptr;
ival1 = 0;
isStore = (tree->gtFlags & GTF_DONT_CSE) != 0;
;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like an unintentional new line with ;.

Copy link
Contributor Author

@briansull briansull Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can fix this in a follow on checkin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there should be a comment here that this should be checking GTF_IND_ASG_LHS once the above cleanup item is addressed.

@@ -13678,8 +13688,9 @@ GenTree* Compiler::fgMorphSmpOp(GenTree* tree, MorphAddrContext* mac)
}
else if (temp->OperIsLocal())
{
unsigned lclNum = temp->AsLclVarCommon()->GetLclNum();
LclVarDsc* varDsc = &lvaTable[lclNum];
unsigned lclNum = temp->AsLclVarCommon()->GetLclNum();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still confusing that we are checking parent struct type for local fields even if these checks never pass and the code is working because of that.

I would suggest separate this else if (temp->OperIsLocal())
as

else if (temp->OperIs(GT_LCL_VAR)) { the new logic}
else if  (temp->OperIs(GT_LCL_FLD)) // the old logic that was applying to `LCL_FLD` only.
{
    // Assumes that when Lookup returns "false" it will leave "fieldSeq" unmodified (i.e.
    // nullptr)
    assert(fieldSeq == nullptr);
    bool b = GetZeroOffsetFieldMap()->Lookup(op1, &fieldSeq);
    assert(b || fieldSeq == nullptr);

    if (fieldSeq != nullptr) // that is from the initial CoreCLR commit, no idea why we do it only
                             // when we have a zero offset.
    {
        // Append the field sequence, change the type.
        temp->AsLclFld()->SetFieldSeq(
            GetFieldSeqStore()->Append(temp->AsLclFld()->GetFieldSeq(), fieldSeq));
        temp->gtType = typ;

        foldAndReturnTemp = true;
    }
} 

Copy link
Contributor Author

@briansull briansull Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can address any other change as a follow up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why there is a rush to check this in today; I think it's worth making the change @sandreenko suggests, as it doesn't seem reasonable to use the parent type in the `GT_LCL_FLD' case.

Copy link
Contributor Author

@briansull briansull Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this change:

else if  (temp->OperIs(GT_LCL_FLD)) // the old logic that was applying to `LCL_FLD` only.
else if (temp->OperIs(GT_LCL_VAR)) { the new logic}

But it results in a failure, to propagate a field sequence on x86:

fgMorphTree BB04, STMT00037 (before)
               [000221] -A----------              *  ASG       struct (copy)
               [000220] ------------              +--*  BLK       struct<System.DateTime, 8>
               [000219] ------------              |  \--*  ADDR      byref 
               [000218] -------N----              |     \--*  FIELD     struct Start
               [000215] ------------              |        \--*  ADDR      byref 
               [000216] -------N----              |           \--*  LCL_VAR   struct<System.Globalization.DaylightTimeStruct, 24> V11 tmp2         
               [000217] ------------              \--*  LCL_VAR   struct<System.DateTime, 8>(P) V24 tmp15        
                                                  \--*    long   V24._dateData (offs=0x00) -> V42 tmp33        

fgMorphTree BB04, STMT00037 (after)
               [000355] -A---+------              *  ASG       long  
               [000353] *------N----              +--*  IND       long  
               [000351] ------------              |  \--*  ADDR      byref 
<              [000352] U------N----              |     \--*  LCL_FLD   struct V11 tmp2         [+0] Fseq[Start, _dateData]
>              [000352] U------N----              |     \--*  LCL_FLD   struct V11 tmp2         [+0]
               [000354] ------------              \--*  LCL_VAR   long   V42 tmp33        

So that sequence is used by both GT_LCL_FLD and for GT_LCL_VAR that fail the optimization step.

Copy link
Contributor Author

@briansull briansull Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up changing it to this:

                    else if (temp->OperIsLocal())
                    {
                        if (temp->OperIs(GT_LCL_VAR))
                        {
                           // newer code
                        }
                        if (!foldAndReturnTemp)
                        {
                           // old code

src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
@briansull
Copy link
Contributor Author

I will trigger an outerloop/stress job for this PR before merge...

I have already run a JitStress=1 run on my system, so I don't expect any failures caused by my change,
Do you know if out outerloop Jit Stress is clean as of now?

I would ask that you sign off NOW so I can check in immediately after it finishes.

@briansull
Copy link
Contributor Author

FYI: JitStress outerloop has failed 9 of the last 9 runs,

Pipeline Insights

This pipeline has a pass rate of 28.57% ( 50%) in the last 14 days.

@briansull
Copy link
Contributor Author

Jit Stress outerloop tests now have started,

@sandreenko sandreenko dismissed their stale review August 6, 2020 20:40

I am not going to review or approve this change.

@briansull
Copy link
Contributor Author

@CarolEidt - have you analyzed the regressions? Are they expected? Are some of them GT_LCL_FLD cases?

Yes, I have spend a lot of time doing analysis on the differences and insuring that we don't have regressions.
I will post a sample diff for both a regression and an improvement later on this afternoon.

We typically don't see this pattern with a GT_LCL_FLD during the first morph phase, we only see them when we do a late remorph due to an assertionprop or a CSE subsitution.

Copy link
Contributor

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth being conservative here, and it makes sense to preserve the existing code for the GT_LCL_FLD case.

@@ -13678,8 +13688,9 @@ GenTree* Compiler::fgMorphSmpOp(GenTree* tree, MorphAddrContext* mac)
}
else if (temp->OperIsLocal())
{
unsigned lclNum = temp->AsLclVarCommon()->GetLclNum();
LclVarDsc* varDsc = &lvaTable[lclNum];
unsigned lclNum = temp->AsLclVarCommon()->GetLclNum();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why there is a rush to check this in today; I think it's worth making the change @sandreenko suggests, as it doesn't seem reasonable to use the parent type in the `GT_LCL_FLD' case.

temp = nullptr;
ival1 = 0;
isStore = (tree->gtFlags & GTF_DONT_CSE) != 0;
;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there should be a comment here that this should be checking GTF_IND_ASG_LHS once the above cleanup item is addressed.

src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/morph.cpp Outdated Show resolved Hide resolved
// We build a new 'result' tree to return, as we want to call fgMorphTree on it
//
GenTree* result = gtNewLclvNode(lclNum, lclType);
assert(result->OperGet() == GT_LCL_VAR);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it that we need to build a new result tree? It seems an unfortunate additional cost; why doesn't it work to clearn the GTF_DEBUG_NODE_MORPHED flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole clearing/setting the GTF_DEBUG_NODE_MORPHED flag is a total hack.

But fine I will revert the change to my original fix which is much simpler but introduces some regressions.

I believe that this fix is better, but will go with whatever you want.\

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree that the GTF_DEBUG_NODE_MOPRHED flag is a frustratingly fragile mechanism - though I do think it would be best to do a minimal fix for now.

@dotnet dotnet deleted a comment from CarolEidt Aug 7, 2020
@briansull
Copy link
Contributor Author

Replaced with #40535

@briansull briansull closed this Aug 7, 2020
@karelz karelz added this to the 5.0.0 milestone Aug 18, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect IND(ADDR(LCL_VAR)) folding when types do not match
5 participants