Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Give up on the tail call if there are unexpected blocks after it #50806

Merged
merged 7 commits into from
Apr 15, 2021

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Apr 6, 2021

Fixes #50492 assert.

It only reproduces with the following env. vars:

COMPlus_TieredCompilation=1
COMPlus_TC_QuickJitForLoops=1
COMPlus_ReadyToRun=0
COMPlus_ZapDisable=1
COMPlus_TieredPGO=1

on ILSpy app (https://github.com/icsharpcode/AvaloniaILSpy) after some time randomly clicking, doesn't repro with TieredPGO=0

JitDump of a failed method https://gist.github.com/EgorBo/f0f58d04ec537ae815203f947309df65
(initially I wanted to perform fgCompactBlocks in such cases, but the sequence of blocks after the taill call is more complicated than fgCompact can handle)

I don't know how to make a unit test out of it, perhaps I can write it in IL.

diffs are empty.

/cc @AndyAyersMS @dotnet/jit-contrib

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 6, 2021
@AndyAyersMS
Copy link
Member

Can you share out a dump? I am curious how this happens (perhaps GDV of a tail call candidate?)

@EgorBo
Copy link
Member Author

EgorBo commented Apr 6, 2021

Can you share out a dump? I am curious how this happens (perhaps GDV of a tail call candidate?)

Ah, yes, it happens because of multiple GDVs, trying to make a quick repro - I guess we just need to reset the tailcall prefix in such cases?

@AndyAyersMS
Copy link
Member

Surprised we haven't run across this already somewhere. Seems like it ought to be easy to repro.

I guess we just need to reset the tailcall prefix in such cases?

We could also try and fix the GDV transform to not add blocks after the call, if the call is a tail call candidate.

@EgorBo
Copy link
Member Author

EgorBo commented Apr 7, 2021

Seems like it ought to be easy to repro.

I couldn't manage to create a repro for it 🙁 maybe I'm missing something.

We could also try and fix the GDV transform to not add blocks after the call, if the call is a tail call candidate.

Do you mean to not inline tail calls after GDV?

Minimal repro: create a net5.0 console app, add ICSharpCode.Decompiler nuget package.

using System;
using System.Threading;
using ICSharpCode.Decompiler;
using ICSharpCode.Decompiler.CSharp;
using ICSharpCode.Decompiler.TypeSystem;

class Program
{
    static void Main()
    {
        var decompiler = new CSharpDecompiler(typeof(object).Assembly.Location, new DecompilerSettings());
        for (int i = 0; i < 100; i++)
        {
            decompiler.DecompileTypeAsString(new FullTypeName("System.__ComObject"));
            Thread.Sleep(16);
        }
        Console.WriteLine("done");
    }
}

To get a jit dump:

DOTNET_JitDump=ICSharpCode.Decompiler.TypeSystem.Implementation.NullabilityAnnotatedType:AcceptVisitor(ICSharpCode.Decompiler.TypeSystem.TypeVisitor):ICSharpCode.Decompiler.TypeSystem.IType:this

@AndyAyersMS
Copy link
Member

Here's a relatively simple repro

using System;
using System.Runtime.CompilerServices;
using System.Threading;

class B 
{
    public virtual int F(B[] b, bool p)
    {
        int r = 0;
        for (int i = 0; i < 10; i++)
        {
            r = i * i;
        }
        for (int i = 0; i < 10; i++)
        {
            r = i * i;
        }
        return r;
    }
}

class D : B
{
    public override int F(B[] b, bool p) => p ? b[2].F(b, p) : 5;
}

class E : D
{
    public override int F(B[] b, bool p) => b[1].F(b, p);
}

class X
{
    static B[] b = { new E(), new D(), new B() };

    public static int Main()
    {
        int r = 0;

        for (int i = 0; i < 100; i++)
        {
            r += b[0].F(b, true);
            Thread.Sleep(15);
        }

        for (int i = 0; i < 100; i++)
        {
            r += b[0].F(b, true);
            Thread.Sleep(15);
        }

        Console.WriteLine($"r = {r}");
        return r;
    }
}

The problem arises in E.F. When we have implicit tail call candidates that are also GDV candidates, and we do multiple levels of GDV expansion, we can end up with multiple blocks between the call and the return, each of which just copies the return value from one temp to another.

In the example above, E.F expands its call via GDV to check for D, then on the happy path, inlines D.F as a tail call candidate; then expands the the call to F in D.F via GDV, leaving a call to B.F in tail position on the happy path. This call doesn't get inlined, and we hit the assert as it is several blocks away from the return by this point.

Since all this happens on the happy path, I think we should try and fix this instead of bailing out on the tail call, either by coalescing temps during GDV (if the return value of the original call is already going into a return value temp, just use that for the GDV temp, instead of creating a new one), or by making the check in morph recognize that we can see these sorts of chains of returns.

@EgorBo EgorBo force-pushed the fix-tail-call-assert branch from 14bf44f to 7f750ab Compare April 8, 2021 13:26
@EgorBo
Copy link
Member Author

EgorBo commented Apr 8, 2021

Since all this happens on the happy path, I think we should try and fix this instead of bailing out on the tail call, either by coalescing temps during GDV (if the return value of the original call is already going into a return value temp, just use that for the GDV temp, instead of creating a new one), or by making the check in morph recognize that we can see these sorts of chains of returns.

Thanks for the repro! I've implemented the second option but will check if I can easily track the fact that GT_RET_EXPR is already assigned to a temp so we don't need to introduce another one.

@AndyAyersMS
Copy link
Member

I have a fix for the first option in the works. It leverages the preexistingSpillTemp mechanism on the inline candidate.

@AndyAyersMS
Copy link
Member

Here's what I came up with: main...AndyAyersMS:FixGDVTailCallIssue.

It might make sense to take both our changes...

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider picking up my fix as well.

AndyAyersMS and others added 3 commits April 9, 2021 15:34
Inlinee guarded devirtualization candidates that are also tail call
candidates can share the return temp with the parent inline.

Blocks containing only GT_NOP or only GT_PHI are empty.
@EgorBo
Copy link
Member Author

EgorBo commented Apr 9, 2021

Consider picking up my fix as well.

Nice! Thanks, I've cherry-picked it to my branch.

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, but since some of this is my code, maybe we should get a second review.

@AndyAyersMS
Copy link
Member

@dotnet/jit-contrib anyone else want to review? This has changes from both me and @EgorBo.

@AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS merged commit 40a4cb6 into dotnet:main Apr 15, 2021
@JulieLeeMSFT JulieLeeMSFT added this to the 6.0.0 milestone Apr 15, 2021
AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this pull request Apr 15, 2021
The detection of blocks with only PHI assignments was broken by dotnet#50806.
Fix by using existing helper to find the first non-PHI assignment.

Closes dotnet#51326.
BruceForstall pushed a commit that referenced this pull request Apr 16, 2021
The detection of blocks with only PHI assignments was broken by #50806.
Fix by using existing helper to find the first non-PHI assignment.

Closes #51326.
@ghost ghost locked as resolved and limited conversation to collaborators May 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: Assertion failed 'nextNextBlock->bbJumpKind == BBJ_RETURN'
3 participants