Test failure JIT\HardwareIntrinsics\General\Vector256\Vector256_r\Vector256_r.cmd #76280
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch

Issue Details

Run: runtime-coreclr jitstress 20220926.3
Failed test:
Error message:
The error is:
with:
@tannergooding, please check if this needs to be backported to 7.0.
The lowest 32 bits are being corrupted somehow.
Notably, this is only happening on the "upper half" of the Vector256.
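For context, "upper half" here means the second 128-bit lane (elements 4..7 for float, i.e. the upper two qwords). A minimal sketch, assuming only the standard System.Runtime.Intrinsics APIs, of how the halves map to elements:

```csharp
using System;
using System.Runtime.Intrinsics;

class UpperHalfDemo
{
    static void Main()
    {
        // Illustrative only: a Vector256<float> is two 128-bit lanes.
        // The "upper half" is elements 4..7 (the upper two qwords).
        Vector256<float> v = Vector256.Create(0f, 1f, 2f, 3f, 4f, 5f, 6f, 7f);

        Vector128<float> lower = v.GetLower(); // elements 0..3
        Vector128<float> upper = v.GetUpper(); // elements 4..7

        Console.WriteLine(lower); // <0, 1, 2, 3>
        Console.WriteLine(upper); // <4, 5, 6, 7>
    }
}
```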
I can't actually repro this, and it also isn't reproducing in CI anymore as of the latest run. Notably, there isn't anything "obvious" in the commit range (from the last failing CI run to the latest passing one) either: 6c2cfa4...789b420. If this reproduces again, I'll take another look.
@tannergooding A similar case failed again: coreclr windows x64 Checked zapdisable @ Windows.10.Amd64.Open
Could you remind me what … It's very odd this still isn't repro'ing locally. Perhaps it's some determinism issue with where the test runs in relation to the RNG it uses for inputs. It's interesting that it changed to a …
From the console log you can see:
I'm not sure if …
Still not able to repro this locally. I have tried 5 different full runs and over 20 different runs of just …
Actually looks like the testcase failing for my PR was different:
Copying from previous failure before logs go away:
I hit failures in another configuration. This one is ildasm/ilasm round-tripping for an ilasm change, so theoretically it could be the change, but that seems unlikely. This uses …
Failed again in: runtime-coreclr jitstress 20221022.1
Failed test:
Error message:
This seems to be failing only in …
Details from the latest hit (build 62054):
Failing test:
Expected:
Actual:
Environment:
Is there any way we can pull this specific machine from the pool to do manual testing on? Given we've only seen this for Vector256, and only with the upper half, my presumption is that there is either a bug in the upper-half save/restore logic (either in the JIT or in the thread save/resume logic) -or- it's something like the microcode patch that was called out above. We could also try to get more info out of … My current install has …
Yes, these details are masked out. Helix runs on Azure VMs, so you cannot even be sure that the test runs on the same physical machine the whole time; the running VM can be migrated to a different physical machine in the middle of the test. I have asked on the eng system support chat about an interactive session on a Helix VM.
@tannergooding You are cc'ed on the Teams discussion with the eng team about getting access to the Helix VM.
A recent job failed with newly added FailFast instrumentation:
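For context, a rough sketch of what per-element FailFast-style validation could look like; the method and names here are hypothetical, not the actual instrumentation that was added:

```csharp
using System;
using System.Runtime.Intrinsics;

static class FailFastValidationSketch
{
    // Hypothetical sketch: compare bit patterns element by element and fail fast
    // with the mismatching values in the message so they land in the crash dump.
    public static void ValidateResult(Vector256<double> expected, Vector256<double> actual)
    {
        for (int i = 0; i < Vector256<double>.Count; i++)
        {
            long e = BitConverter.DoubleToInt64Bits(expected.GetElement(i));
            long a = BitConverter.DoubleToInt64Bits(actual.GetElement(i));

            if (e != a)
            {
                Environment.FailFast(
                    $"Element {i} mismatch: expected 0x{e:X16}, actual 0x{a:X16}");
            }
        }
    }
}
```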
Here are the corrupted values from the last 4 crashes (…):
The consistent pattern is that the low 23 bits of the higher 2 qwords are zeroed out. 23 bits is an unusual number. Where can it come from?
I was thinking that it is half of the double mantissa, and then immediately realized that my math is wrong (2 * 23 != 52). Good point about the single-precision mantissa!
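To make that concrete: 23 bits is exactly the width of the IEEE 754 single-precision (float) mantissa. A minimal, self-contained illustration of what zeroing the low 23 bits of a qword looks like (the input value below is made up):

```csharp
using System;

class Mask23Demo
{
    static void Main()
    {
        const ulong low23Mask = (1UL << 23) - 1; // 0x7F_FFFF, the float mantissa width

        ulong original  = 0x4028_9ABC_DEF1_2345UL; // arbitrary made-up qword
        ulong corrupted = original & ~low23Mask;   // low 23 bits zeroed out

        Console.WriteLine($"original:  0x{original:X16}");
        Console.WriteLine($"corrupted: 0x{corrupted:X16}");
    }
}
```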
There is no good way to get remote access to the exact hardware that this is failing on. Azure uses multiple different processor models for the machine category used by Helix VMs, and creating a Helix VM of the same machine category tends to give you a different processor model (I have tried multiple times). If we need to gather more information about the machine config, the best way to do that is to add extra logging before the temporary FailFast in the Vector256 test.
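As a hedged sketch of what such extra logging could capture (Windows-only; the registry location and the availability of the Microsoft.Win32.Registry APIs are assumptions, not part of the existing test), something like the following could dump the processor identity and the loaded microcode revision:

```csharp
using System;
using Microsoft.Win32;

static class CpuInfoLoggingSketch
{
    // Hypothetical sketch of extra diagnostics to emit before the FailFast.
    // Assumes the standard Windows registry location for CPU info.
    public static void LogProcessorInfo()
    {
        using RegistryKey? key = Registry.LocalMachine.OpenSubKey(
            @"HARDWARE\DESCRIPTION\System\CentralProcessor\0");

        if (key is null)
            return;

        Console.WriteLine($"ProcessorNameString: {key.GetValue("ProcessorNameString")}");
        Console.WriteLine($"Identifier:          {key.GetValue("Identifier")}");

        // "Update Revision" holds the loaded microcode revision as raw bytes.
        if (key.GetValue("Update Revision") is byte[] rev)
            Console.WriteLine($"Update Revision:     {Convert.ToHexString(rev)}");
    }
}
```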
My current concern is that if this is a microcode issue, then no amount of logging will provide the required information. The only real way to validate is likely to get a machine, reliably repro, patch, and then try to repro again. |
What would you do if you got a VM that reproduces it semi-reliably? We should be able to do the same in CI; the feedback loop would just be slower. (I am running out of ideas on what to do to diagnose this further.)
I'd likely attach Intel VTune or AMD uProf and collect a system-wide trace that includes when the process yields its timeslice. From what we've seen in the dumps, all the disassembly looks correct, and the input values are being corrupted somewhere between when the correct result is computed and when the validation happens. So my guess is that it's either some state save/restore issue -or- something like what Egor linked above. Given this is effectively only happening in Vector256_r, I'd speculate that the combination of "debug" codegen (and therefore frequent spilling/loading) causes heavy enough Vector256 usage to trigger the issue Egor had found. In which case we'd patch the machine (install the latest Windows/Microsoft updates to start) and see if it continues reproing.
To reiterate, this is what we know:

This is the window where the corruption occurs: