[Performance] OSR causes crash (0x80000004) during Microbenchmarks on Windows x64 - .NET 11 Preview #123691

@LoopedBard3

Description

Running BenchmarkDotNet MicroBenchmarks on AMD Zen 4 Windows x64 machines causes an immediate crash with exit code 0x80000004 (STATUS_SINGLE_STEP). The crash occurs in ClrRestoreNonvolatileContextWorker during exception handling/context restoration. The issue reproduces on AMD EPYC but not Intel Xeon, suggesting it may be AMD-specific.

Reproduction Steps

git clone https://github.com/dotnet/performance
cd performance
python .\scripts\benchmarks_ci.py --csproj .\src\benchmarks\micro\MicroBenchmarks.csproj --incremental no --architecture x64 -f net11.0 --dotnet-versions 11.0.100-preview.1.26076.102 --bdn-artifacts .\BenchmarkDotNet.Artifacts --partition=0 --bdn-arguments="--anyCategories Libraries Runtime --logBuildOutput --generateBinLog --partition-count 15 --partition-index 0"

This installs dotnet and runs the performance tests, which fail as soon as the microbenchmarks actually start running. For quicker reproduction, run the command above once until it fails, then run only the final command below (all of the generated setup is reusable), replacing dotnet with the path to the installed dotnet:

.\tools\dotnet\x64\dotnet.exe run --project .\src\benchmarks\micro\MicroBenchmarks.csproj --configuration Release --framework net11.0 --no-restore --no-build -- --anyCategories Libraries Runtime --logBuildOutput

Expected behavior

Benchmarks run successfully

Actual behavior

Process exits immediately with code 2147483652 (0x80000004 - STATUS_SINGLE_STEP).
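
For anyone triaging similar logs, here is a minimal sketch of how the decimal exit code maps to the NTSTATUS value quoted above. The helper name `describe_exit_code` and the lookup table are hypothetical and not part of dotnet/performance or BenchmarkDotNet. The table lists only a few well-known codes.

```python
# Hypothetical triage helper; not part of any dotnet tooling.
# Only a handful of well-known Windows status codes are listed.
KNOWN_CODES = {
    0x80000003: "STATUS_BREAKPOINT",
    0x80000004: "STATUS_SINGLE_STEP",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xE0434352: "CLR managed exception",
}


def describe_exit_code(code: int) -> str:
    """Format a process exit code as 32-bit hex plus a known name, if any."""
    # Some tools report the code as a signed 32-bit integer, so mask it
    # back to its unsigned form before looking it up.
    unsigned = code & 0xFFFFFFFF
    name = KNOWN_CODES.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"


if __name__ == "__main__":
    print(describe_exit_code(2147483652))  # 0x80000004 (STATUS_SINGLE_STEP)
```

The masking step means the same helper also handles the signed form of the code (-2147483644) that some runners log.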

Debugger shows crash in: coreclr!ClrRestoreNonvolatileContextWorker+0xb1

A CLR exception (e0434352) occurs just before the crash, suggesting the issue is in exception handling when OSR-compiled code is on the stack.

Regression?

The last successful run (https://dev.azure.com/dnceng/internal/_build/results?buildId=2880806&view=results) used dotnet version 11.0.100-alpha.1.26065.110 and commit 93c450d (although this is reproducible without a corerun/runtime build dependency). Runs after that were hit by dotnet/sdk#52542, which is unrelated to the error above, so it is unclear exactly when this break was introduced.

Also interestingly, our Windows x86 runs are not hitting this issue.

Known Workarounds

Setting any one of these environment variables prevents the crash:
- DOTNET_TC_OnStackReplacement=0 ✓
- DOTNET_TC_QuickJit=0 ✓
- DOTNET_TieredCompilation=0 ✓
- DOTNET_JitMinOpts=1 ✓

DOTNET_JitEnableGuardedDevirtualization=0 does NOT fix the issue.

This suggests the bug is in On-Stack Replacement (OSR) code generation or exception handling when OSR is active.
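
The workaround matrix above could be re-checked on another machine with a small driver like the one below. This is only a sketch: the names `single_setting_envs` and `run_matrix` are hypothetical, and it assumes the quick repro command from the reproduction steps is passed in as an argument list. Each run sets exactly one variable on top of the current environment, so the results are not confounded by combinations.

```python
import os
import subprocess

# Settings reported in this issue as preventing the crash.
WORKAROUNDS = {
    "DOTNET_TC_OnStackReplacement": "0",
    "DOTNET_TC_QuickJit": "0",
    "DOTNET_TieredCompilation": "0",
    "DOTNET_JitMinOpts": "1",
}

# Setting reported in this issue as NOT preventing the crash.
NON_FIXES = {"DOTNET_JitEnableGuardedDevirtualization": "0"}


def single_setting_envs(base_env, settings):
    """Yield (name, env) pairs, each env adding one setting to base_env."""
    for name, value in settings.items():
        env = dict(base_env)  # copy so the runs stay independent
        env[name] = value
        yield name, env


def run_matrix(command, cwd=None):
    """Run command once per setting; return {setting: unsigned exit code}."""
    results = {}
    all_settings = {**WORKAROUNDS, **NON_FIXES}
    for name, env in single_setting_envs(dict(os.environ), all_settings):
        completed = subprocess.run(command, env=env, cwd=cwd)
        # Mask to 32 bits so a crash shows up as 0x80000004 either way.
        results[name] = completed.returncode & 0xFFFFFFFF
    return results
```

Calling `run_matrix` with the `dotnet.exe run ...` command split into a list, and `cwd` set to the performance repo root, would be expected to return 0x80000004 only for the guarded devirtualization entry if the behavior matches this report.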

Configuration

  • .NET Version: 11.0.0-preview.1.26076.102
  • OS: Windows 11 (Build 22621)
  • Architecture: x64
  • CPU (failing): AMD EPYC 9124 16-Core Processor (Zen 4)
  • CPU (working): Intel Xeon Platinum

Other information

Debugger analysis (Summarized with Copilot):

 EXCEPTION_RECORD:
 ExceptionAddress: coreclr!ClrRestoreNonvolatileContextWorker+0xb1
 ExceptionCode: 80000004 (Single step exception)

 FAULTING_SOURCE_FILE: D:\a\_work\1\s\src\runtime\src\coreclr\vm\amd64\Context.asm
 FAULTING_SOURCE_LINE_NUMBER: 74

 FAILURE_BUCKET_ID: SINGLE_STEP_80000004_coreclr.dll!ClrRestoreNonvolatileContextWorker

 Stack trace at crash:
 coreclr!ClrRestoreNonvolatileContextWorker+0xb1
 0x00007ffa0487e92f  (JIT'd code)
 0x00007ffa0487e7ca  (JIT'd code)
 System_Linq+0x4c0bf
 coreclr!CallDescrWorkerInternal+0x83
 coreclr!RunMainInternal+0x16c
 coreclr!RunMain+0x111
 coreclr!Assembly::ExecuteMainMethod+0x1c7

The issue was discovered in a CI/CD pipeline running performance benchmarks and is 100% reproducible on AMD Zen 4 machines.

Metadata

Assignees

No one assigned

    Labels

    needs-area-label: An area label is needed to ensure this gets routed to the appropriate area owners
    perf-pipeline: Issues with dotnet-runtime-perf, or runtime-wasm-perf pipelines
