Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: PGO: fgOptimizeUncondBranchToSimpleCond corrupts profile data #50419

Open
Tracked by #74873
EgorBo opened this issue Mar 30, 2021 · 3 comments
Open
Tracked by #74873

JIT: PGO: fgOptimizeUncondBranchToSimpleCond corrupts profile data #50419

EgorBo opened this issue Mar 30, 2021 · 3 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged tenet-performance Performance related issue
Milestone

Comments

@EgorBo
Copy link
Member

EgorBo commented Mar 30, 2021

It looks like fgOptimizeUncondBranchToSimpleCond (as part of the Optimize Layouts phase) produces invalid weights in some cases. E.g.

Before Optimize layout:
image

^ all edges look valid and satisfy Kirchhoff's Law.

After Optimize Layouts:

image

^
invalid edges:
BB06 (in: 0 + 0.5, out: 1)
BB09 (in: 0.25, out: 0)

I guess invalid edges are fine and will be re-calculated, but there are also invalid blocks such as BB09 which is a predecessor of a cold block but has weight 0.5 (BB05 should also be cold then?)

As the result, in the end I have:

; with IBC profile data, edge weights are invalid, and fgCalledCount is 36

But if I disable fgOptimizeUncondBranchToSimpleCond the final weights will be reported as valid:

image

^ "After Optimize Layout" without fgOptimizeUncondBranchToSimpleCond, all blocks look valid again (edges aren't though).

/cc @AndyAyersMS

category:correctness
theme:profile-feedback
skill-level:beginner
cost:small
impact:small

@EgorBo EgorBo added the tenet-performance Performance related issue label Mar 30, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI untriaged New issue has not been triaged by the area owner labels Mar 30, 2021
@EgorBo EgorBo added this to the 6.0.0 milestone Mar 30, 2021
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Mar 30, 2021
@EgorBo EgorBo self-assigned this Mar 30, 2021
@AndyAyersMS
Copy link
Member

optOptimizeLayout as a whole has similar problems -- don't recall if I tracked down which bits were responsible. See #48364 (comment) and #48476.

@EgorBo
Copy link
Member Author

EgorBo commented Mar 31, 2021

Simple repro:

using System.Runtime.CompilerServices;
using System.Threading;

class Program
{
    static void Main()
    {
        for (int i = 0; i < 100; i++)
        {
            Test(new object());
            Thread.Sleep(16);
        }
    }

    static object someObj;

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Test(object o)
    {
        if (o != someObj)
            o = null;

        if (o != null)
            Test1();
    }

    [MethodImpl(MethodImplOptions.NoInlining)] static void Test1() { }
}

codegen for Test (tier1 with TieredPGO):

; Assembly listing for method Program:Test(System.Object)
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; Tier-1 compilation
; optimized code
; optimized using profile data
; rsp based frame
; partially interruptible
; with IBC profile data, edge weights are invalid, and fgCalledCount is 36
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  4,  4   )     ref  ->  rcx         class-hnd
;  V01 OutArgs      [V01    ] (  1,  1   )  lclBlk (32) [rsp+00H]   "OutgoingArgSpace"
;
; Lcl frame size = 40

G_M3485_IG01:              ;; offset=0000H
       4883EC28             sub      rsp, 40
						;; bbWeight=1    PerfScore 0.25
G_M3485_IG02:              ;; offset=0004H
       48B8402D9FBF06020000 mov      rax, 0x206BF9F2D40
       483B08               cmp      rcx, gword ptr [rax]
       7505                 jne      SHORT G_M3485_IG04
						;; bbWeight=1    PerfScore 3.25
G_M3485_IG03:              ;; offset=0013H
       4885C9               test     rcx, rcx
       7505                 jne      SHORT G_M3485_IG05
						;; bbWeight=1    PerfScore 1.25
G_M3485_IG04:              ;; offset=0018H
       4883C428             add      rsp, 40
       C3                   ret      
						;; bbWeight=1    PerfScore 1.25
G_M3485_IG05:              ;; offset=001DH
       E8CEEAAFFF           call     Program:Test1()
       EBF4                 jmp      SHORT G_M3485_IG04
						;; bbWeight=0    PerfScore 0.00

image

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Mar 31, 2021
@EgorBo
Copy link
Member Author

EgorBo commented Jul 28, 2021

A minor issue, doesn't affect perf/correctness so moving to 7.0

@EgorBo EgorBo modified the milestones: 6.0.0, 7.0.0 Jul 28, 2021
@EgorBo EgorBo modified the milestones: 7.0.0, 8.0.0 Jul 6, 2022
@EgorBo EgorBo modified the milestones: 8.0.0, Future Apr 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI in-pr There is an active PR which will close this issue when it is merged tenet-performance Performance related issue
Projects
Development

Successfully merging a pull request may close this issue.

2 participants