Test failure: access violation in cpblk tests #76506

Closed
BruceForstall opened this issue Oct 2, 2022 · 11 comments · Fixed by #76532
Assignees
Labels
arch-arm64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs
Milestone

Comments

@BruceForstall
Member

BruceForstall commented Oct 2, 2022

Affected tests:

  • JIT/Directed/PREFIX/volatile/1/cpblk/cpblk.sh
  • JIT\Directed\PREFIX\volatile\1\cpblk\cpblk.cmd

Runfo hits for last 30 days as of 10/3:

  • First occurrence on 9/30 in rolling run 36073
  • Occurred 2-3 times in each rolling run over the last 3 days. All failures are on arm64, across Linux, OSX, and Windows.

Original Report

arm64, JitStress

Also fails in MinOpts

https://dev.azure.com/dnceng-public/public/_build/results?buildId=37829&view=ms.vss-test-web.build-test-results-tab&runId=754244&paneView=debug&resultId=108471

Started with 20220929.1 build:
https://dev.azure.com/dnceng-public/public/_build/results?buildId=37829&view=results

Tests:

JIT\Directed\PREFIX\unaligned\2\cpblk\cpblk.cmd
JIT\IL_Conformance\Old\Conformance_Base\ldc_c_cpblk\ldc_c_cpblk.cmd
JIT\Directed\PREFIX\unaligned\4\cpblk\cpblk.cmd
JIT\IL_Conformance\Old\Base\cpblk\cpblk.cmd
JIT\IL_Conformance\Old\Conformance_Base\c_cpblk\c_cpblk.cmd
JIT\IL_Conformance\Old\Conformance_Base\cpblk\cpblk.cmd
JIT\Directed\PREFIX\unaligned\1\cpblk\cpblk.cmd

Example:

  Starting:    JIT.Directed.XUnitWrapper (parallel test collections = on, max threads = 8)
    JIT\Directed\PREFIX\unaligned\4\cpblk\cpblk.cmd [FAIL]
      Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
         at _cpblk.main(System.String[])

@dotnet/jit-contrib

Report

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 0 | 0 | 0 |

@BruceForstall BruceForstall added JitStress CLR JIT issues involving JIT internal stress modes area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Oct 2, 2022
@BruceForstall BruceForstall added this to the 8.0.0 milestone Oct 2, 2022
@ghost

ghost commented Oct 2, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@BruceForstall BruceForstall added the blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs label Oct 2, 2022
@karelz karelz added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' and removed JitStress CLR JIT issues involving JIT internal stress modes labels Oct 3, 2022
@karelz
Member

karelz commented Oct 3, 2022

Yes, this is failing quite a bit outside of JIT stress too; it is blocking clean CI.

@karelz karelz added arch-arm64 os-linux Linux OS (any supported distro) os-mac-os-x macOS aka OSX and removed os-linux Linux OS (any supported distro) os-mac-os-x macOS aka OSX labels Oct 3, 2022
@jakobbotsch
Member

jakobbotsch commented Oct 3, 2022

Codegen for tier 0 for JIT\Directed\PREFIX\unaligned\2\cpblk\cpblk.cmd:

        52800081          mov     w1, #4
        93407C21          sxtw    x1, w1
        D292E882          movz    x2, #0x9744
        F2ABDFA2          movk    x2, #0x5EFD LSL #16
        CB020022          sub     x2, x1, x2
        B9000040          str     w0, [x2]

AV on the str. The address looks like 4 - <large constant>, which seems odd.
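
To make the "4 - large constant" observation concrete, here is a minimal Python sketch that decodes the two move-wide encodings from the listing above and replays the sub. The helper names are illustrative and not part of any JIT or disassembler API. The field layout used (imm16 in bits 20:5, hw in bits 22:21) is the standard A64 move-wide-immediate encoding.

```python
MASK64 = (1 << 64) - 1

def move_wide_fields(encoding):
    """Return (imm16, shift) for an A64 MOVZ/MOVK encoding."""
    imm16 = (encoding >> 5) & 0xFFFF        # immediate, bits 20:5
    shift = ((encoding >> 21) & 0x3) * 16   # hw field, bits 22:21
    return imm16, shift

def movz(encoding):
    """MOVZ: write the shifted immediate, zeroing all other bits."""
    imm16, shift = move_wide_fields(encoding)
    return imm16 << shift

def movk(reg, encoding):
    """MOVK: replace one 16-bit lane of reg, keeping the others."""
    imm16, shift = move_wide_fields(encoding)
    return (reg & ~(0xFFFF << shift) & MASK64) | (imm16 << shift)

x1 = 4                          # mov w1, #4 ; sxtw x1, w1
x2 = movz(0xD292E882)           # movz x2, #0x9744
x2 = movk(x2, 0xF2ABDFA2)       # movk x2, #0x5EFD LSL #16
store_address = (x1 - x2) & MASK64   # sub x2, x1, x2 (wraps modulo 2**64)

print(hex(x2), hex(store_address))
```

With these inputs x2 is 0x5efd9744 and the subtraction wraps to 0xffffffffa10268c0, which is far from where user-mode data would normally live and so is consistent with the AV on the store.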

@jakobbotsch jakobbotsch self-assigned this Oct 3, 2022
@jakobbotsch
Member

Bisected to bfef7cc

@jakobbotsch
Member

jakobbotsch commented Oct 3, 2022

IR looks like:

Generating: N014 (  1,  2) [000010] -----------                   t10 =    CNS_INT   int    0xFFFFFFFF REG x0
IN0006:                           movn    w0, #0
                                                                        /--*  t10    int    
Generating: N016 (  2,  3) [000013] -c---------                   t13 = *  INIT_VAL  int    REG NA
Generating: N018 (  1,  2) [000007] -----------                    t7 =    CNS_INT   int    4 REG x1
IN0007:                           mov     w1, #4
                                                                        /--*  t7     int    
Generating: N020 (  2,  4) [000008] -----------                    t8 = *  CAST      long <- int REG x1
IN0008:                           sxtw    x1, w1
                                                                        /--*  t8     long   
Generating: N022 (  6, 17) [000009] -c---------                    t9 = *  LEA(b+-789485380) long   REG NA
                                                                        /--*  t9     long   
                                                                        +--*  t13    int    
Generating: N024 (  9, 19) [000012] -A-X-------                         *  STORE_BLK struct<4> (init) (Unroll) REG NA
IN0009:                           movz    x2, #0x9744
IN000a:                           movk    x2, #0x2F0E LSL #16
IN000b:                           sub     x2, x1, x2
IN000c:                           str     w0, [x2]
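
As a cross-check on the dump above, the magnitude of the LEA offset matches the movz/movk pair that precedes the sub. The Python sketch below splits a value into 16-bit chunks the way a movz/movk sequence would carry them. The function name is hypothetical and this is not the JIT's actual constant-materialization code.

```python
def split_move_wide(value):
    """Split a non-negative 64-bit value into (imm16, shift) chunks,
    lowest chunk first, skipping chunks that are zero."""
    chunks = []
    for shift in range(0, 64, 16):
        imm16 = (value >> shift) & 0xFFFF
        if imm16:
            chunks.append((imm16, shift))
    return chunks

lea_offset = -789485380   # from LEA(b+-789485380) in the dump

# The code materializes the positive magnitude and then subtracts it.
for imm16, shift in split_move_wide(-lea_offset):
    print(f"#0x{imm16:04X} LSL #{shift}")
```

This prints #0x9744 LSL #0 and #0x2F0E LSL #16, the same immediates as IN0009 and IN000a.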

@jakobbotsch
Member

Not sure that we are supposed to be creating LEA nodes with constants this large on ARM64.

jakobbotsch added a commit to jakobbotsch/runtime that referenced this issue Oct 3, 2022
The offset here can be a "base" address due to various JIT
transformations so we should ensure the range [offset, offset+size) does
not overflow.

Fix dotnet#76506
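
The kind of check the commit message describes can be sketched as follows. This is an illustration under assumptions, not the code from the commit: the function name and the default signed 32-bit bounds are hypothetical, chosen only to show a range test on [offset, offset+size).

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1

def range_fits(offset, size, lo=INT32_MIN, hi=INT32_MAX):
    """Return True if every byte of [offset, offset + size) lies in [lo, hi].

    Python integers do not wrap, so the sum below is exact. In a
    fixed-width language the addition itself would need an overflow-safe
    form, e.g. comparing offset against hi - (size - 1).
    """
    if size <= 0:
        return False
    return lo <= offset and offset + size - 1 <= hi

print(range_fits(4, 4))                # small offset: fits
print(range_fits(0x7ffcd0ed68bc, 4))   # a full 64-bit address: does not fit
print(range_fits(INT32_MAX - 2, 4))    # start fits, but the last byte does not
```

The third case is the one the commit message calls out: checking the start of the range alone is not enough, the end has to be checked as well.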
@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Oct 3, 2022
@kunalspathak
Member

kunalspathak commented Oct 3, 2022

If possible, could you paste a diff of JitDump after your change in #76532?

@jakobbotsch
Member

> If possible, could you paste a diff of JitDump after your change in #76532?

The diff in IR looks like:

@@ -1033,8 +1033,10 @@ N002 (  2,  3) [000013] -c---------                   t13 = *  INIT_VAL  int
 N003 (  1,  2) [000007] -----------                    t7 =    CNS_INT   int    4
                                                             /--*  t7     int    
 N004 (  2,  4) [000008] -----------                    t8 = *  CAST      long <- int
+N005 (  3, 12) [000006] H----------                    t6 =    CNS_INT(h) long   0x7ffcd0ed68bc static Fseq[DATA]
                                                             /--*  t8     long   
-N006 (  6, 17) [000009] -c---------                    t9 = *  LEA(b+-789747524) long  
+                                                            +--*  t6     long   
+N006 (  6, 17) [000009] -----------                    t9 = *  ADD       long  
                                                             /--*  t9     long   
                                                             +--*  t13    int    
 N007 (  9, 19) [000012] -A-X-------                         *  STORE_BLK struct<4> (init) (Unroll)
@@ -1122,8 +1124,10 @@ N002 (  2,  3) [000013] -c---------                   t13 = *  INIT_VAL  int
 N003 (  1,  2) [000007] -----------                    t7 =    CNS_INT   int    4
                                                             /--*  t7     int    
 N004 (  2,  4) [000008] -----------                    t8 = *  CAST      long <- int
+N005 (  3, 12) [000006] H----------                    t6 =    CNS_INT(h) long   0x7ffcd0ed68bc static Fseq[DATA]
                                                             /--*  t8     long   
-N006 (  6, 17) [000009] -c---------                    t9 = *  LEA(b+-789747524) long  
+                                                            +--*  t6     long   
+N006 (  6, 17) [000009] -----------                    t9 = *  ADD       long  
                                                             /--*  t9     long   
                                                             +--*  t13    int    
 N007 (  9, 19) [000012] -A-X-------                         *  STORE_BLK struct<4> (init) (Unroll)

Attached full jitdumps if you want to look further.
out_base.txt
out.txt
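
One observation about the numbers in the diff, offered as a reading of the data rather than something stated in the thread: the offset in the removed LEA is exactly the low 32 bits of the static address 0x7ffcd0ed68bc reinterpreted as a signed value. A minimal Python sketch of that arithmetic (the helper name is illustrative):

```python
def to_int32(value):
    """Keep the low 32 bits of value and reinterpret them as signed."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

handle = 0x7ffcd0ed68bc      # CNS_INT(h) address from the "+" lines
print(to_int32(handle))      # compare with LEA(b+-789747524)
```

This prints -789747524, which matches the LEA offset in the "-" lines. After the change the address stays a separate 64-bit constant feeding an ADD, so nothing has to fit in a narrower offset.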

@jakobbotsch
Member

jakobbotsch commented Oct 3, 2022

This containment should probably also be skipped for handles, but let me open a separate issue about it to avoid rerunning CI in #76532.

(Opened #76552)

jakobbotsch added a commit that referenced this issue Oct 3, 2022
The offset here can be a "base" address due to various JIT
transformations so we should ensure the range [offset, offset+size) does
not overflow.

Fix #76506
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Oct 3, 2022
@JulieLeeMSFT
Member

Thanks @jakobbotsch for quickly fixing this blocking issue.

@ghost ghost locked as resolved and limited conversation to collaborators Nov 3, 2022