-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Fix profiler tail call insertion logic for FIELD_LIST #76883
Conversation
This logic was not handling FIELD_LIST and was also not handling linear order appropriately. The logic is still a bit odd, it would probably be better to use the same kind of logic as CFG (moving PUTARG nodes ahead of the profiler hook instead), but in practice this seems to be ok. Fix dotnet#76879
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsThis logic was not handling FIELD_LIST and was also not handling linear order appropriately. The logic is still a bit odd, it would probably be better to use the same kind of logic as CFG (moving PUTARG nodes ahead of the profiler hook instead), but in practice this seems to be ok. Fix #76879
|
// GT_RELOAD/GT_COPY use the child node | ||
argNode = argNode->gtSkipReloadOrCopy(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing/skipping these here would be illegal IIUC. SPMI replay does not find any occurrences of this, and codegenxarch.cpp also does not have the equivalent here.
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
Azure Pipelines successfully started running 2 pipeline(s). |
cc @dotnet/jit-contrib PTAL @BruceForstall @kunalspathak |
// | ||
GenTree* Lowering::FindEarliestPutArg(GenTreeCall* call) | ||
{ | ||
size_t numMarkedNodes = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a 2-pass system? Is it the case that we have a required ordering where the putarg under the field_list are in order, such that we can (1) look for the first field_list and then (2) look for the first putarg under that, in list order?
I guess your solution seems more robust and doesn't require strict list ordering, but does require more traversals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I'd prefer that we don't make any assumptions on linear order vs operand order in LIR after rationalization. For example, it might be reasonable in the future to allow some trivial scheduling/reordering of the PUTARG_REG
order in lowering. One place today we are moving these nodes already is when control-flow guard is enabled (MoveCFGCallArg
-- but I think this ends up keeping the nodes in order).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One place where the operand order does not match the execution order is on x86, where the use list is reversed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that really true? I think the order matches execution order, but the arguments are pushed in that order so they end up on stack in opposite order of the other platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty sure?
// The code generator will push these fields in reverse order by offset. Reorder the list here s.t. the order
// of uses is visible to LSRA.
assert(fieldList->Uses().IsSorted());
fieldList->Uses().Reverse();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, for the FIELD_LIST
operands. Ok, interesting, was not aware of this. This logic skips x86 anyway.
} | ||
} | ||
|
||
if (numMarkedNodes <= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: size_t
is unsigned, so <
is unnecessary (maybe confusing? Or just "defense in depth"?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a habit of mine. I always write this check like this.
} | ||
else | ||
{ | ||
node->gtLIRFlags |= LIR::Flags::Mark; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be appropriate to add:
assert((node->gtLIRFlags & LIR::Flags::Mark) == 0);
I.e., how do we know the LIR doesn't have any Mark
set at this point of compilation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that would be appropriate, I can add the assert.
I.e., how do we know the LIR doesn't have any Mark set at this point of compilation?
Same assumption is made by LIR::Range::GetTreeRange
that we are using from lowering already. This code is somewhat like LIR::Range::GetMarkedRange
.
Few diffs where we now place the profiler hook a bit earlier, and some TP improvements from removing the The failures look known according to build analysis. |
This logic was not handling FIELD_LIST and was also not handling linear order appropriately.
The logic is still a bit odd, it would probably be better to use the same kind of logic as CFG (moving PUTARG nodes ahead of the profiler hook instead), but in practice this seems to be ok.
Fix #76879