Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Invalid negated result #57640

Closed
jakobbotsch opened this issue Aug 18, 2021 · 11 comments · Fixed by #57651
Closed

JIT: Invalid negated result #57640

jakobbotsch opened this issue Aug 18, 2021 · 11 comments · Fixed by #57651
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI blocking-release
Milestone

Comments

@jakobbotsch
Copy link
Member

Repros on multiple platforms:

// Generated by Fuzzlyn v1.2 on 2021-08-15 23:15:19
// Run on .NET 6.0.0-dev on Arm Linux
// Seed: 18219619158927602726
// Reduced from 82.6 KiB to 0.3 KiB in 00:02:54
// Debug: Outputs 14270
// Release: Outputs 4294953026
public class Program
{
    static long[] s_28 = new long[]{1};
    public static void Main()
    {
        var vr10 = s_28[0];
        for (int vr13 = 0; vr13 < 2; vr13++)
        {
            uint vr12 = (uint)(0 - (-14270 * vr10));
            System.Console.WriteLine(vr12);
        }
    }
}

Seems we end up with 1 negation instead of 2 or 0:

; Assembly listing for method Program:Main()
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 loc0         [V00,T03] (  2,  2   )    long  ->  rcx         single-def
;  V01 loc1         [V01,T00] (  4, 13   )     int  ->  rsi
;  V02 OutArgs      [V02    ] (  1,  1   )  lclBlk (32) [rsp+00H]   "OutgoingArgSpace"
;  V03 tmp1         [V03,T01] (  3,  6   )     ref  ->  rcx         single-def "arr expr"
;  V04 cse0         [V04,T02] (  2,  5   )     int  ->  rdi         "CSE - aggressive"
;
; Lcl frame size = 40

G_M27646_IG01:              ;; offset=0000H
       57                   push     rdi
       56                   push     rsi
       4883EC28             sub      rsp, 40
                                                ;; bbWeight=1    PerfScore 2.25
G_M27646_IG02:              ;; offset=0006H
       48B9A8C8F1EBFC7F0000 mov      rcx, 0x7FFCEBF1C8A8
       BA01000000           mov      edx, 1
       E8A660E65E           call     CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
       48B9902E492D79010000 mov      rcx, 0x1792D492E90
       488B09               mov      rcx, gword ptr [rcx]
       83790800             cmp      dword ptr [rcx+8], 0
       7623                 jbe      SHORT G_M27646_IG05
       488B4910             mov      rcx, qword ptr [rcx+16]
       33F6                 xor      esi, esi
       69F9BE370000         imul     edi, ecx, 0x37BE
                                                ;; bbWeight=1    PerfScore 11.00
G_M27646_IG03:              ;; offset=0039H
       8BCF                 mov      ecx, edi
       F7D9                 neg      ecx
       E8A6FCFFFF           call     System.Console:WriteLine(int)
       FFC6                 inc      esi
       83FE02               cmp      esi, 2
       7CF0                 jl       SHORT G_M27646_IG03
                                                ;; bbWeight=4    PerfScore 12.00
G_M27646_IG04:              ;; offset=0049H
       4883C428             add      rsp, 40
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret
                                                ;; bbWeight=1    PerfScore 2.25
G_M27646_IG05:              ;; offset=0050H
       E87BD79D5E           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3
                                                ;; bbWeight=0    PerfScore 0.00

; Total bytes of code 86, prolog size 6, PerfScore 36.10, instruction count 25, allocated bytes for code 86 (MethodHash=cb019401) for method Program:Main()
; ============================================================

I'll take a look.

@jakobbotsch jakobbotsch added blocking-release area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Aug 18, 2021
@jakobbotsch jakobbotsch added this to the 6.0.0 milestone Aug 18, 2021
@jakobbotsch jakobbotsch self-assigned this Aug 18, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Aug 18, 2021
@ghost
Copy link

ghost commented Aug 18, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

Repros on multiple platforms:

// Generated by Fuzzlyn v1.2 on 2021-08-15 23:15:19
// Run on .NET 6.0.0-dev on Arm Linux
// Seed: 18219619158927602726
// Reduced from 82.6 KiB to 0.3 KiB in 00:02:54
// Debug: Outputs 14270
// Release: Outputs 4294953026
public class Program
{
    static long[] s_28 = new long[]{1};
    public static void Main()
    {
        var vr10 = s_28[0];
        for (int vr13 = 0; vr13 < 2; vr13++)
        {
            uint vr12 = (uint)(0 - (-14270 * vr10));
            System.Console.WriteLine(vr12);
        }
    }
}

Seems we end up with 1 negation instead of 2 or 0:

; Assembly listing for method Program:Main()
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; partially interruptible
; No PGO data
; Final local variable assignments
;
;  V00 loc0         [V00,T03] (  2,  2   )    long  ->  rcx         single-def
;  V01 loc1         [V01,T00] (  4, 13   )     int  ->  rsi
;  V02 OutArgs      [V02    ] (  1,  1   )  lclBlk (32) [rsp+00H]   "OutgoingArgSpace"
;  V03 tmp1         [V03,T01] (  3,  6   )     ref  ->  rcx         single-def "arr expr"
;  V04 cse0         [V04,T02] (  2,  5   )     int  ->  rdi         "CSE - aggressive"
;
; Lcl frame size = 40

G_M27646_IG01:              ;; offset=0000H
       57                   push     rdi
       56                   push     rsi
       4883EC28             sub      rsp, 40
                                                ;; bbWeight=1    PerfScore 2.25
G_M27646_IG02:              ;; offset=0006H
       48B9A8C8F1EBFC7F0000 mov      rcx, 0x7FFCEBF1C8A8
       BA01000000           mov      edx, 1
       E8A660E65E           call     CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
       48B9902E492D79010000 mov      rcx, 0x1792D492E90
       488B09               mov      rcx, gword ptr [rcx]
       83790800             cmp      dword ptr [rcx+8], 0
       7623                 jbe      SHORT G_M27646_IG05
       488B4910             mov      rcx, qword ptr [rcx+16]
       33F6                 xor      esi, esi
       69F9BE370000         imul     edi, ecx, 0x37BE
                                                ;; bbWeight=1    PerfScore 11.00
G_M27646_IG03:              ;; offset=0039H
       8BCF                 mov      ecx, edi
       F7D9                 neg      ecx
       E8A6FCFFFF           call     System.Console:WriteLine(int)
       FFC6                 inc      esi
       83FE02               cmp      esi, 2
       7CF0                 jl       SHORT G_M27646_IG03
                                                ;; bbWeight=4    PerfScore 12.00
G_M27646_IG04:              ;; offset=0049H
       4883C428             add      rsp, 40
       5E                   pop      rsi
       5F                   pop      rdi
       C3                   ret
                                                ;; bbWeight=1    PerfScore 2.25
G_M27646_IG05:              ;; offset=0050H
       E87BD79D5E           call     CORINFO_HELP_RNGCHKFAIL
       CC                   int3
                                                ;; bbWeight=0    PerfScore 0.00

; Total bytes of code 86, prolog size 6, PerfScore 36.10, instruction count 25, allocated bytes for code 86 (MethodHash=cb019401) for method Program:Main()
; ============================================================

I'll take a look.

Author: jakobbotsch
Assignees: jakobbotsch
Labels:

blocking-release, area-CodeGen-coreclr

Milestone: 6.0.0

@jakobbotsch jakobbotsch removed the untriaged New issue has not been triaged by the area owner label Aug 18, 2021
@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib

@jakobbotsch
Copy link
Member Author

Seems like a standard "outdated VN" issue during hoisting:

Hoisting a copy of [000053] into PreHeader for loop L00 <BB02..BB02>:
N005 (  7,  9) [000053] ------------                NEG       int    <l:$146, c:$145>
N004 (  6,  8) [000022] ------------              └──▌  MUL       int    <l:$2c6, c:$2c5>
N002 (  1,  1) [000021] C-----------                 ├──▌  LCL_VAR   int    V00 loc0         u:2 <l:$2c4, c:$2c3>
N003 (  1,  4) [000051] ------------                 └──▌  CNS_INT   int    -0x37be $87

New Basic Block BB04 [0005] created.

Created PreHeader (BB04) for loop L00 (BB02 - BB02), with weight = 1   
Setting edge weights for BB01 -> BB04 to [0 .. 3.402823e+38]
Setting edge weights for BB01 -> BB04 to [100 .. 100]
Setting edge weights for BB04 -> BB02 to [0 .. 3.402823e+38]
Setting edge weights for BB04 -> BB02 to [100 .. 100]
This hoisted copy placed in PreHeader (BB04):
               [000069] ------------                COMMA     void  
     (  6,  8) [000065] -------H----              ├──▌  MUL       int    <l:$2c6, c:$2c5>
     (  1,  1) [000066] C------H----                ├──▌  LCL_VAR   int    V00 loc0         u:2 <l:$2c4, c:$2c3>
               [000070] ------------                └──▌  CNS_INT   int    0x37BE
               [000068] ------------              └──▌  NOP       void  

[000022] and [000065] should not have the same VNs.

@EgorBo
Copy link
Member

EgorBo commented Aug 18, 2021

@EgorBo
Copy link
Member

EgorBo commented Aug 18, 2021

it needs mulOrDiv->SetVNsFromNode(tree);

@jakobbotsch
Copy link
Member Author

Indeed. In general it seems pretty dangerous that we morph with optValnumCSE_phase = false during hoisting, doesn't it?
I'm also not completely sure what the point is of morphing here since CSE will also do its own morph later.

@EgorBo
Copy link
Member

EgorBo commented Aug 18, 2021

Indeed. In general it seems pretty dangerous that we morph with optValnumCSE_phase = false during hoisting, doesn't it?
I'm also not completely sure what the point is of morphing here since CSE will also do its own morph later.

I suspect all optValnumCSE_phase checks have to be revised and either converted to if (fgGlobalMorph) or carefully update VNs in case of calling from Hoisting phase

@SingleAccretion
Copy link
Contributor

They always have to update the VNs (or be under the fgGlobalMorph guard), that's the model we have for morphing.

I would agree that calling morphing from hoisting seems unnecessary. The point of morphing in the optimizier is three-fold:

  1. Update side effect flags - not applicable
  2. Update args table for when we insert, e. g., assignments under calls - not applicable.
  3. Perform optimization - not applicable as well.

In theory, we should be able to get away with calling into the "optimizing morph" only in assertion propagation, where actual new opportunities are discovered. But then we have RangeCheck after assertion propagation. If we can guarantee that RangeCheck will not use stale VNs, then we may be able to eliminate the need to maintain VNs in morph completely.

@jakobbotsch
Copy link
Member Author

We should probably audit them, at least.
I guess optValnumCSE_phase checks are intended to ensure we don't fold away CSE defs but I guess this might also guard against outdated VNs in a lot of cases.

Currently one must be explicit about preserving VNs when calling SetOper. I wonder if we should have the same mechanism when setting gtOp1/gtOp2 (or, at least a flag that marks the VN as "dirty" in debug if the new operand VN is different).

jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Aug 18, 2021
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 18, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Aug 18, 2021
@JulieLeeMSFT
Copy link
Member

@jakobbotsch please backport it to release/6.0 to seek approval to merge to RC1.

@jakobbotsch
Copy link
Member Author

I backported it to release/6.0 in #57686.

jeffschwMSFT pushed a commit that referenced this issue Aug 19, 2021
)

* Fix incorrect VN when folding GT_NEG(GT_MUL(A, C))

Fixes #57640

* Fix test name

Co-authored-by: Jakob Botsch Nielsen <jakob.botsch.nielsen@gmail.com>
@ghost ghost locked as resolved and limited conversation to collaborators Sep 17, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI blocking-release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants