Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: move return merging earlier #93997

Closed
wants to merge 6 commits into from

Conversation

AndyAyersMS
Copy link
Member

Instead of merging returns to the common return block in morph, do all the merging in fgAddInternal (where we already did some merging). This removes a case where morph would add a control flow edge in a way that might disrupt an ongoing RPO.

Earlier merging also opens up the possibility of tail merging some of the copies into the canonical return local, and possibly even some of the computations that feed the copies.

Modify the flow alterations done by morph. Previously if a tail call was expressed via a call to a CORINFO_TAILCALL_HELPER, morph would change the block kind to BBJ_RETURN and then merge the return, changing the block kind to BBJ_ALWAYS. Since merging now happens before moprh, morph needs to leave the block kind alone.

Generalize the post-tail-call sanity check in morph to recognize one new case that can come up.

Contributes to #93246.

Instead of merging returns to the common return block in morph, do all the
merging in `fgAddInternal` (where we already did some merging). This removes
a case where morph would add a control flow edge in a way that might disrupt
an ongoing RPO.

Earlier merging also opens up the possibility of tail merging some of the copies
into the canonical return local, and possibly even some of the computations that
feed the copies.

Modify the flow alterations done by morph. Previously if a tail call was expressed
via a call to a `CORINFO_TAILCALL_HELPER`, morph would change the block kind to
`BBJ_RETURN` and then merge the return, changing the block kind to `BBJ_ALWAYS`.
Since merging now happens before moprh, morph needs to leave the block kind alone.

Generalize the post-tail-call sanity check in morph to recognize one new case
that can come up.

Contributes to dotnet#93246.
@ghost ghost assigned AndyAyersMS Oct 25, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Oct 25, 2023
@ghost
Copy link

ghost commented Oct 25, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Instead of merging returns to the common return block in morph, do all the merging in fgAddInternal (where we already did some merging). This removes a case where morph would add a control flow edge in a way that might disrupt an ongoing RPO.

Earlier merging also opens up the possibility of tail merging some of the copies into the canonical return local, and possibly even some of the computations that feed the copies.

Modify the flow alterations done by morph. Previously if a tail call was expressed via a call to a CORINFO_TAILCALL_HELPER, morph would change the block kind to BBJ_RETURN and then merge the return, changing the block kind to BBJ_ALWAYS. Since merging now happens before moprh, morph needs to leave the block kind alone.

Generalize the post-tail-call sanity check in morph to recognize one new case that can come up.

Contributes to #93246.

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

cc @dotnet/jit-contrib

A modest number of diffs, small net decrease in code size.

Comment on lines +2371 to +2373
// In principle we should be able to remove the proctection
// we add here, and remove the genReturnBB if it becomes unreachable,
// since we now introduce all the flow to it during this phase.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also some somewhat random code in lowering and codegen that references this block, so the deleted comment is still accurate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all those can be replaced by checks for the "right" BBJ_RETURN (the one and only without BBF_HAS_JMP), though I didn't want to push things quite that far.

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Oct 26, 2023

I'm puzzled by some the CI runs... eg the libraries x64 linux failure:

Assert failure(PID 71575 [0x00011797], Thread: 72229 [0x11a25]): Assertion failed 'retExpr->gtOper == GT_RETURN'
    File: /__w/1/s/src/coreclr/jit/morph.cpp Line: 15300

In my sources this assert is on line 15305.

@BruceForstall
Copy link
Member

In my sources this assert is on line 15305.

Because locally you don't have #93610?

@AndyAyersMS
Copy link
Member Author

In my sources this assert is on line 15305.

Because locally you don't have #93610?

Ah, could be. Thanks.

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@AndyAyersMS
Copy link
Member Author

/azp run runtime-coreclr libraries-jitstress

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@AndyAyersMS
Copy link
Member Author

libraries jitstress failure: physical promotion is adding read-backs after a tail call

Replacing in BB04 [01B..028) -> BB06 (always), preds={BB03} succs={BB06}

STMT00003 ( 0x01B[E-] ... 0x027 )
               [000031] DAC-G------                         *  STORE_LCL_VAR struct<Microsoft.CodeAnalysis.CSharp.Symbols.TypeWithState, 16> V04 tmp2         
               [000019] --C-G------                         \--*  CALL      struct Microsoft.CodeAnalysis.CSharp.Symbols.TypeWithState:Create(Microsoft.CodeAnalysis.CSharp.Symbols.TypeSymbol,ubyte):Microsoft.CodeAnalysis.CSharp.Symbols.TypeWithState
               [000017] ----------- arg0                       +--*  LCL_VAR   ref    V05 tmp3          (last use)
               [000018] ----------- arg1                       \--*  CNS_INT   int    0
Processing block operation [000031] that involves replacements
Fields of [000031] in range [000..016) need to be read back: cannot decompose store
  V07 (V04.[000..008)) marked
  V08 (V04.[008..009)) marked
Reading back replacement V04.[000..008) -> V07 near the end of BB04:
STMT00008 ( ??? ... ??? )
               [000038] DA---------                         *  STORE_LCL_VAR ref    V07 tmp5         
               [000037] -----------                         \--*  LCL_FLD   ref    V04 tmp2         [+0]
Reading back replacement V04.[008..009) -> V08 near the end of BB04:
STMT00009 ( ??? ... ??? )
               [000040] DA---------                         *  STORE_LCL_VAR int    V08 tmp6         
               [000039] -----------                         \--*  LCL_FLD   ubyte  V04 tmp2         [+8]

and this makes the post-tail call sanity checker unhappy.

Assert failure(PID 10101 [0x00002775], Thread: 10115 [0x2783]): Assertion failed 'srcLclNum == callResultLclNumber' in 'Microsoft.CodeAnalysis.CSharp.NullableWalker:<VisitArgumentOutboundAssignmentsAndPostConditions>g__applyPostConditionsUnconditionally|242_3(Microsoft.CodeAnalysis.CSharp.Symbols.TypeWithState,int):Microsoft.CodeAnalysis.CSharp.Symbols.TypeWithState' during 'Morph - Global' (IL size 42; hash 0x2bb65d02; Tier1)

    File: /home/andy/repos/runtime0/src/coreclr/jit/morph.cpp Line: 15343
    Image: /home/andy/repos/runtime0/artifacts/bin/testhost/net9.0-linux-Release-x64/dotnet

One of the flaws of merging returns during fgAddInternal is that blocks with tail call candidates become attractive nuisances... they are not yet definite return points, so code after the call is potentially not dead, but if they become tail calls, we have to prove we don't lose anything by not executing that code.

We could avoid this by doing the tail call conversion earlier or the return merging later. The former seems difficult. The latter loses the potential benefits we might get from tail merges of the return value copies. So neither looks great but if necessary we can defer merging until just before morph.

Or we could try to avoid adding code to these spots... In this case the read backs could be avoided since along this path to the return the struct fields already have the right values... but along other paths they may not, so the writebacks in the return block would have to be hoisted into a pred for non-tail-call paths:

STMT00005 ( ??? ... ??? )
               [000029] -A---------                         *  RETURN    struct
               [000048] -A---------                         \--*  COMMA     struct
               [000047] UA---------                            +--*  STORE_LCL_FLD ref    V04 tmp2         [+0]
               [000046] -----------                            |  \--*  LCL_VAR   ref    V07 tmp5         
               [000051] -A---------                            \--*  COMMA     struct
               [000050] UA---------                               +--*  STORE_LCL_FLD ubyte  V04 tmp2         [+8]
               [000049] -----------                               |  \--*  LCL_VAR   int    V08 tmp6         
               [000028] -------N---                               \--*  LCL_VAR   struct<Microsoft.CodeAnalysis.CSharp.Symbols.TypeWithState, 16> V04 tmp2          (last use)

not sure how feasible this would be. @jakobbotsch any thoughts?

Even if we fix this it might be an uphill battle to keep other phases in the JIT from trying to add code after the calls. Right now it does not seem to be widespread problem. If we went this route we could beef up the diagnostic checker since we know the problematic blocks are those that end with a tail call and are marked with BBF_RETURN_MERGED; it would need to allow the same sort of post-call code that fgCheckStmtAfterTailCall is ok with (incidentally that checker and fgValidateIRForTailCall seem like they have a lot of overlap, and perhaps the former can be removed?).

@jakobbotsch
Copy link
Member

jakobbotsch commented Oct 27, 2023

How many tailcalls do we lose if we just discard the cases that we cannot prove to be safe? We have fgValidateIRForTailCall that should be a sound checker and which should be complete for the patterns produced by the JIT before this change. I do not think that fgCheckStmtAfterTailCall is a sound checker today -- I actually didn't realize we had that. I would be fine with just discarding the cases we cannot prove to be safe with fgValidateIRForTailCall. Presumably this is going to be a super rare case after this PR, and seems like long-term it makes things more robust.

@AndyAyersMS
Copy link
Member Author

How many tailcalls do we lose if we just discard the cases that we cannot prove to be safe?

Let me try this and see what the diff is.

@AndyAyersMS
Copy link
Member Author

Going to abandon this and just teach morph RPO to tolerate new edges being added to genReturnBB.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants