Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't swap under STRESS_REVERSE_FLAG in gtSetEvalOrder. #49074

Closed
wants to merge 1 commit into from

Conversation

sandreenko
Copy link
Contributor

@sandreenko sandreenko commented Mar 3, 2021

There could be a situation when gtSetEvalOrder was called from gtPrepareCost <- fgOptimizeBranch <- fgReorderBlocks <- fgInsertGCPolls , in this case the users were not expecting such swaps and were failing when calling fgDebugCheckStmtsList.

The alternative fixes were:

  1. call gtSetEvalOrder with new bool doChanges = false from gtPrepareCost - quite disruptive, cases asm diffs in default;
  2. call fgSetBlockOrder in fgOptimizeBranch or fgReorderBlocks - makes TP worse in default scenario;
  3. add fgTreeCostCalculated and don't call gtPrepareCost when it is already known - quite disruptive, cases asm diffs in default;

so because the issue happened only in JitStress=2 only on arm64 apple in 1 test I decided to go with the simplest fix for now.

Fixes #48786.

@sandreenko sandreenko added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 3, 2021
@AndyAyersMS
Copy link
Member

I think the fact that there's no way to cost a tree without potentially reshaping the tree it is quite unfortunate.

So I'd like to see us follow a path where the side effects are optional.

@sandreenko
Copy link
Contributor Author

@AndyAyersMS do you mean option 1? I hesitate to do it in this PR, to be honest, because it would be too big or not clearer than what we have now. The problem is that unsigned gtSetEvalOrder(GenTree* tree); does several things:

 *  This function sets:
 *      1. GetCostEx() to the execution complexity estimate
 *      2. GetCostSz() to the code size estimate
 *      3. Sometimes sets GTF_ADDRMODE_NO_CSE on nodes in the tree.
 *      4. DEBUG-only: clears GTF_DEBUG_NODE_MORPHED.

and if I change it to unsigned gtSetEvalOrder(GenTree* tree, bool allowChanges) it would be strange because it won't "set evaluation order".
and it would cause asm diffs, because CSE phase calls gtSetEvalOrder and sometimes sees new costs and decides to swap something.

I can open a new issue to split gtSetEvalOrder into several if you think we will have time for it or additional benefits.

@AndyAyersMS
Copy link
Member

Yes, something like option 1. Generally I don't think we would want to set the costs on the nodes -- just have a method to cost a tree that uses the same /similar costing logic. Getting the right degree of sharing and flexibility might be tricky.

It would be useful to be able to cost trees early on, even if the costs are estimates, without altering the trees. For example, for inline direct sub we might consider making copies of cheap trees; for finally cloning we could better assess the cost of the clone, etc.

@sandreenko
Copy link
Contributor Author

PTAL @dotnet/jit-contrib , the failure is unrelated.

@AndyAyersMS
Copy link
Member

Seems odd to me to suppress the stress case with !fgStmtListThreaded but allow it for the cases just above. Why is swapping safe in those cases?

Do you know why for just those few operators we actually mutate the tree operands instead of setting GTF_REVERSE_OPS to defer reversal until later?

@sandreenko
Copy link
Contributor Author

Seems odd to me to suppress the stress case with !fgStmtListThreaded but allow it for the cases just above. Why is swapping safe in those cases?

I was investigating this and my understanding is that all other cases where we can swap for profitability (level < lvl2 ) are guarded with fgSetBlockOrder, it could happen when we calculate the costs for the first time or when we change them because of CSE, all future calls without stress mode won't change the order.

Do you know why for just those few operators we actually mutate the tree operands instead of setting GTF_REVERSE_OPS to defer reversal until later?

The main problem is that gtSetEvalOrder is called from many different phases, GTF_REVERSE_OPS is mainly a global morph flag, setting it after it (in CSE or VN phases) makes little sense in my understanding.

Another example of such inconsistency: we clear all GTF_REVERSE_OPS during Rationalizer::RewriteNode but then codegen checks this flag in genCreateAddrMode .

@sandreenko
Copy link
Contributor Author

closing for now.

@sandreenko sandreenko closed this Apr 5, 2021
@JulieLeeMSFT JulieLeeMSFT added this to the 6.0.0 milestone Apr 5, 2021
@ghost ghost locked as resolved and limited conversation to collaborators May 5, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apple Silicon eventsourceerror failure in jitstress2 CI
3 participants