Fix Issue 620 - Incorrect IND(ADDR(LCL_VAR)) folding when types do not match #40059
Fixes #620
Diff is an improvement
Add a check for loads using GTF_IND_ASG_LHS, as we can read and normalize loads using a small type, but not stores. For stores we will force the use of a GT_LCL_FLD when using small types.
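To illustrate why loads and stores differ here, the following is a minimal toy model, not JIT code. It assumes a 32-bit local accessed through a 16-bit indirection at offset 0 with a little-endian layout, so the small access covers the low 16 bits. The function names (`loadSmall`, `storeSmallPartial`, `storeSmallFoldedWrong`) are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (assumption): a 32-bit local whose low 16 bits are accessed
// through a small-typed indirection. None of this is the real JIT API.

// Load: reading the whole local and then normalizing (sign-extending) the
// low 16 bits gives the same value as a 16-bit load from the local's address.
constexpr int32_t loadSmall(int32_t local)
{
    return static_cast<int16_t>(static_cast<uint16_t>(static_cast<uint32_t>(local) & 0xFFFFu));
}

// Store: a 16-bit store only replaces the low 16 bits; the upper 16 bits of
// the local must be preserved. This is a partial (field-like) update.
constexpr int32_t storeSmallPartial(int32_t local, int16_t value)
{
    uint32_t upper = static_cast<uint32_t>(local) & 0xFFFF0000u;
    uint32_t lower = static_cast<uint16_t>(value);
    return static_cast<int32_t>(upper | lower);
}

// One way a fold could go wrong: treating the small store as a store to the
// whole local replaces the upper bits with the sign extension of the value.
constexpr int32_t storeSmallFoldedWrong(int32_t /*local*/, int16_t value)
{
    return value;
}

static_assert(loadSmall(0x12348001) == -32767, "load is normalized to the small type");
static_assert(storeSmallPartial(0x12345678, 1) == 0x12340001, "upper bits preserved");
static_assert(storeSmallFoldedWrong(0x12345678, 1) == 1, "upper bits lost");
```

In this model a load can be expressed as "read the local, then normalize", while a store has to remain a partial access, which is roughly the role a GT_LCL_FLD plays in the description above.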
@dotnet/jit-contrib PTAL
There is a concern about GT_LCL_FLD support.
@sandreenko PTAL
dotnet-runtime-perf (Libraries Build Linux x64 Release) is currently broken:
A few questions, and could you please trigger an outerloop/stress job for this PR before merging?
src/coreclr/src/jit/morph.cpp
temp = nullptr;
ival1 = 0;
isStore = (tree->gtFlags & GTF_DONT_CSE) != 0;
;
Looks like an unintentional new line with `;`.
I can fix this in a follow-on check-in.
Also, there should be a comment here that this should be checking GTF_IND_ASG_LHS once the above cleanup item is addressed.
src/coreclr/src/jit/morph.cpp
@@ -13678,8 +13688,9 @@ GenTree* Compiler::fgMorphSmpOp(GenTree* tree, MorphAddrContext* mac)
}
else if (temp->OperIsLocal())
{
    unsigned lclNum = temp->AsLclVarCommon()->GetLclNum();
    LclVarDsc* varDsc = &lvaTable[lclNum];
    unsigned lclNum = temp->AsLclVarCommon()->GetLclNum();
It is still confusing that we check the parent struct type for local fields, even though these checks never pass and the code only works because of that.
I would suggest separating this `else if (temp->OperIsLocal())` as:
else if (temp->OperIs(GT_LCL_VAR)) { the new logic}
else if (temp->OperIs(GT_LCL_FLD)) // the old logic that was applying to `LCL_FLD` only.
{
    // Assumes that when Lookup returns "false" it will leave "fieldSeq" unmodified (i.e.
    // nullptr)
    assert(fieldSeq == nullptr);
    bool b = GetZeroOffsetFieldMap()->Lookup(op1, &fieldSeq);
    assert(b || fieldSeq == nullptr);
    if (fieldSeq != nullptr) // that is from the initial CoreCLR commit, no idea why we do it only
                             // when we have a zero offset.
    {
        // Append the field sequence, change the type.
        temp->AsLclFld()->SetFieldSeq(
            GetFieldSeqStore()->Append(temp->AsLclFld()->GetFieldSeq(), fieldSeq));
        temp->gtType = typ;
        foldAndReturnTemp = true;
    }
}
I think that we can address any other change as a follow up.
I'm not sure why there is a rush to check this in today; I think it's worth making the change @sandreenko suggests, as it doesn't seem reasonable to use the parent type in the `GT_LCL_FLD` case.
I tried this change:
else if (temp->OperIs(GT_LCL_FLD)) // the old logic that was applying to `LCL_FLD` only.
else if (temp->OperIs(GT_LCL_VAR)) { the new logic}
But it results in a failure to propagate a field sequence on x86:
fgMorphTree BB04, STMT00037 (before)
[000221] -A---------- * ASG struct (copy)
[000220] ------------ +--* BLK struct<System.DateTime, 8>
[000219] ------------ | \--* ADDR byref
[000218] -------N---- | \--* FIELD struct Start
[000215] ------------ | \--* ADDR byref
[000216] -------N---- | \--* LCL_VAR struct<System.Globalization.DaylightTimeStruct, 24> V11 tmp2
[000217] ------------ \--* LCL_VAR struct<System.DateTime, 8>(P) V24 tmp15
\--* long V24._dateData (offs=0x00) -> V42 tmp33
fgMorphTree BB04, STMT00037 (after)
[000355] -A---+------ * ASG long
[000353] *------N---- +--* IND long
[000351] ------------ | \--* ADDR byref
< [000352] U------N---- | \--* LCL_FLD struct V11 tmp2 [+0] Fseq[Start, _dateData]
> [000352] U------N---- | \--* LCL_FLD struct V11 tmp2 [+0]
[000354] ------------ \--* LCL_VAR long V42 tmp33
So that sequence is used both for GT_LCL_FLD nodes and for GT_LCL_VAR nodes that fail the optimization step.
I ended up changing it to this:
else if (temp->OperIsLocal())
{
    if (temp->OperIs(GT_LCL_VAR))
    {
        // newer code
    }
    if (!foldAndReturnTemp)
    {
        // old code
    }
}
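The control flow of that restructuring can be sketched in isolation. This is a hedged illustration only: `SketchOper`, `FoldResult`, `morphLocalSketch`, and both boolean parameters are hypothetical stand-ins, not the real GenTree API. `canFoldLclVar` models "the newer code succeeded" and `hasZeroOffsetFieldSeq` models the older check that a field sequence was found.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the JIT's node kinds; not the real GenTree API.
enum class SketchOper { LclVar, LclFld };

struct FoldResult
{
    bool        folded; // mirrors foldAndReturnTemp
    std::string path;   // which branch performed the fold: "new", "old", or "none"
};

FoldResult morphLocalSketch(SketchOper oper, bool canFoldLclVar, bool hasZeroOffsetFieldSeq)
{
    bool        foldAndReturnTemp = false;
    std::string path              = "none";

    if (oper == SketchOper::LclVar)
    {
        // newer code: only attempted for GT_LCL_VAR
        if (canFoldLclVar)
        {
            foldAndReturnTemp = true;
            path              = "new";
        }
    }
    if (!foldAndReturnTemp)
    {
        // old code: runs for GT_LCL_FLD, and for a GT_LCL_VAR that the newer
        // code did not fold, so the field sequence is still propagated
        if (hasZeroOffsetFieldSeq)
        {
            foldAndReturnTemp = true;
            path              = "old";
        }
    }
    return {foldAndReturnTemp, path};
}
```

The point of the nesting is that a GT_LCL_VAR that fails the newer optimization still falls through to the older field-sequence logic, which a strict `else if` split would skip.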
I will trigger an outerloop/stress job for this PR before merge. I have already run a JitStress=1 run on my system, so I don't expect any failures caused by my change. I would ask that you sign off NOW so I can check in immediately after it finishes.
FYI: JitStress outerloop has failed 9 of the last 9 runs. Pipeline Insights: This pipeline has a pass rate of 28.57% ( 50%) in the last 14 days.
JitStress outerloop tests have now started.
I am not going to review or approve this change.
@CarolEidt - have you analyzed the regressions? Are they expected? Are some of them GT_LCL_FLD cases?
Yes, I have spent a lot of time analyzing the differences and ensuring that we don't have regressions. We typically don't see this pattern with a GT_LCL_FLD during the first morph phase; we only see it when we do a late remorph due to an assertionprop or a CSE substitution.
I think it's worth being conservative here, and it makes sense to preserve the existing code for the GT_LCL_FLD case.
// We build a new 'result' tree to return, as we want to call fgMorphTree on it
//
GenTree* result = gtNewLclvNode(lclNum, lclType);
assert(result->OperGet() == GT_LCL_VAR);
Why is it that we need to build a new result tree? It seems an unfortunate additional cost; why doesn't it work to clear the GTF_DEBUG_NODE_MORPHED flag?
The whole clearing/setting of the GTF_DEBUG_NODE_MORPHED flag is a total hack.
But fine, I will revert the change to my original fix, which is much simpler but introduces some regressions.
I believe that this fix is better, but will go with whatever you want.
I totally agree that the GTF_DEBUG_NODE_MORPHED flag is a frustratingly fragile mechanism, though I do think it would be best to do a minimal fix for now.
Replaced with #40535
Fixes for morphing a GT_IND of a GT_ADDR of a GT_LCL_VAR when the GT_IND size is a small type.
A GT_IND can be either a load or a store: it is a store when it is on the left-hand side of an assignment; otherwise it is a load.
The test case provided with this PR covers the various combinations that were incorrect.
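The load/store distinction described above can be sketched with a toy tree. This is an illustration under stated assumptions, not the JIT's representation: `ToyOper`, `ToyNode`, and `isStoreInd` are hypothetical names, and the sketch inspects the parent node directly, whereas the PR discussion relies on node flags such as GTF_IND_ASG_LHS.

```cpp
#include <cassert>

// Hypothetical, simplified node kinds; not the real GenTree operators.
enum class ToyOper { Asg, Ind, Addr, LclVar };

struct ToyNode
{
    ToyOper  oper;
    ToyNode* op1 = nullptr; // first operand (left-hand side for Asg)
    ToyNode* op2 = nullptr; // second operand (right-hand side for Asg)
};

// An indirection is a store when it is the direct left-hand side of an
// assignment; in every other position it is a load.
bool isStoreInd(const ToyNode* parent, const ToyNode* ind)
{
    return ind != nullptr && ind->oper == ToyOper::Ind && parent != nullptr &&
           parent->oper == ToyOper::Asg && parent->op1 == ind;
}
```

For example, in `ASG(IND(ADDR(LCL_VAR)), x)` the indirection is a store, while in `ASG(x, IND(ADDR(LCL_VAR)))` the same shape of indirection is a load.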