Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Fix weights in fgOptimizeUncondBranchToSimpleCond #50490

Closed
wants to merge 2 commits into from

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Mar 31, 2021

Fixes #50419

Repro (COMPlus_TieredPGO=1):

using System.Runtime.CompilerServices;
using System.Threading;

class Program
{
    static void Main()
    {
        // Collect some profile, wait for tier0->tier1 promotion.
        for (int i = 0; i < 100; i++)
        {
            Test(new object());
            Thread.Sleep(16);
        }
    }

    static object someObj;

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Test(object o)
    {
        if (o != someObj)
            o = null;

        if (o != null)
            Test1();
    }

    [MethodImpl(MethodImplOptions.NoInlining)] static void Test1() { }
}

Codegen diff between main and this PR: https://www.diffchecker.com/iqSxlzMG (note "edge weights are invalid" on the left)

image

The edges' weights aren't even rendered on the left (same phase).

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 31, 2021
@EgorBo
Copy link
Member Author

EgorBo commented Mar 31, 2021

I've tested this on a heavy desktop app (https://github.com/icsharpcode/AvaloniaILSpy)

Before: 367 methods are reported to have invalid weights
After: 338

(investigating other failures)

@EgorBo
Copy link
Member Author

EgorBo commented Mar 31, 2021

PTAL @AndyAyersMS @dotnet/jit-contrib

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it needs to be more general than this. Try some cases where the condition is not always true or not always false.

We have something like this to begin with. Say we know all the block and edge weights and they are consistent and weight(X) is not zero (if it is we perhaps should not do this opt):

image - 2021-03-31T132808 350

We want to duplicate X into A (expecting that the one of the paths out of A will get optimized away):

image - 2021-03-31T133131 051

So the block weight of X needs to decrease (it was A+B, it should now be just B). The block weight of T is unknowable -- we have to make a guess here. There is no good answer; the simplest is to give the outgoing edges of A the same weight ratio as the outgoing edges of X. Then we're consistent if we don't optimize, but inconsistent if we do; that's ok as it means we've learned something contextual. We'll clean such things up in some future work items.

So the updates should be something like:

ScaleA = weight (A)  / weight(X)
ScaleB = weight (B)  / weight(X)   ;; or 1-ScaleA

weight(X->C) = weight(X->C) * ScaleB
weight(X->D) = weight(X->D) * ScaleB
weight(A->C) = weight(X->C) * ScaleA
weight(A->T)  = weight(X->D) * ScaleA
weight(T) = weight(X->D) * ScaleA
weight(X) = weight(B)

@AndyAyersMS
Copy link
Member

Since there can actually be many other preds, a cleaner expression is perhaps:

ScaleA = weight(A) / weight(X)
ScaleNotA = 1 - ScaleA

weight(X)    = weight(X)    * ScaleNotA
weight(X->C) = weight(X->C) * ScaleNotA
weight(X->D) = weight(X->D) * ScaleNotA

weight(A->C) = weight(X->C) * ScaleA
weight(A->T) = weight(X->D) * ScaleA
weight(T)    = weight(X->D) * ScaleA

@EgorBo
Copy link
Member Author

EgorBo commented Apr 26, 2021

Ping myself

@EgorBo
Copy link
Member Author

EgorBo commented Apr 29, 2021

@AndyAyersMS could you please take a look, I've integrated your changes

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, but there's an assert in crossgen2 to investigate.

// C - target->bbJumpDest
// D - target->bbNext

const BasicBlock::weight_t scaleA = block->bbWeight / target->bbWeight;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably should watch for division by zero here.

Comment on lines +3211 to +3215
// The optimization doesn't make sense in this case
if (target->bbWeight == 0.0)
{
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The optimization doesn't make sense in this case
if (target->bbWeight == 0.0)
{
return false;
}
// The optimization doesn't make sense in this case
if (target->isRunRarely())
{
return false;
}

Also this check is quite cheap so I'd move it up to the top of the method as the first check.

@EgorBo
Copy link
Member Author

EgorBo commented Jun 28, 2021

Ping myself

@EgorBo EgorBo closed this Sep 6, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Oct 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: PGO: fgOptimizeUncondBranchToSimpleCond corrupts profile data
3 participants