Closed
Labels
arch-arm32, area-CodeGen-coreclr, blocking-clean-ci, runtime-async
Description
Seeing this across all System.Net tests for the last couple of weeks. Even after Copilot analysis I can't find any existing issues, so here it is:
Environment
- Queue: `azurelinux.3.arm64.open` with Docker image `mcr.microsoft.com/dotnet-buildtools/prereqs:debian-13-helix-arm32v7`
- Architecture: ARM32 (armv7) running on ARM64 host via Docker
- Branch: `refs/heads/main`
- Build: 1291383
- Helix Job: `1e1717e0-bb28-49fe-95f2-cbfb05aafe11`
- Workitems: `System.Net.Requests.Tests`, `System.Net.Security.Tests` (at least)
- Product version: 11.0.0
Crash Details
- Signal: 7 (SIGBUS — unaligned memory access)
- Exit code: 135
- Crashing thread: `0x2e`
- Phase: During test execution (OuterLoop tests)
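The exit code and the signal number above are two views of the same fact: by the usual POSIX shell convention, a process killed by signal N is reported with exit code 128 + N. A minimal sketch of that arithmetic follows; `decode_exit_code` is a hypothetical helper name, not part of Helix or the runtime, and the name lookup depends on the platform the script runs on.

```python
import signal


def decode_exit_code(exit_code: int):
    """Return the terminating signal number, or None if the code does not
    follow the 128 + N convention used by shells for signal deaths."""
    if 128 < exit_code < 256:
        return exit_code - 128
    return None


signum = decode_exit_code(135)  # 135 - 128 = 7

# Signal numbering is platform-specific; on Linux, signal 7 is SIGBUS.
try:
    name = signal.Signals(signum).name
except ValueError:
    name = "unknown"

print(f"exit code 135 -> signal {signum} ({name})")
```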
Multiple test suites in the same Helix job crashed with the identical pattern. At least two different test methods were affected:
- `HttpWebRequestTest.ReadWriteTimeout_CancelsResponse(Boolean)` (System.Net.Requests.Tests, thread `0x2e`)
- `CertificateValidationRemoteServer.DefaultConnect_EndToEnd_Ok(String)` (System.Net.Security.Tests, thread `0x2d`)
Both methods were being invoked through the interpreter (InterpretedInvoke_Method), not JIT-compiled code.
Console logs
- System.Net.Requests.Tests: https://helixr18s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-1e1717e0bb2849fe95/System.Net.Requests.Tests/1/console.a0b33339.log?helixlogtype=result
- System.Net.Security.Tests: https://helixr18s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-heads-main-1e1717e0bb2849fe95/System.Net.Security.Tests/1/console.17fb1494.log?helixlogtype=result
Managed Stack Trace (from dotnet-dump)
OS Thread Id: 0x2e (crashed)
[PrestubMethodFrame: e8766cf8] System.Net.Tests.HttpWebRequestTest.ReadWriteTimeout_CancelsResponse(Boolean)
System.Net.Tests.HttpWebRequestTest.ReadWriteTimeout_CancelsResponse(Boolean)
[InlinedCallFrame: e8766fc8]
System.Reflection.MethodBaseInvoker.InterpretedInvoke_Method(System.Object, IntPtr*)
System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(...)
System.Reflection.MethodBaseInvoker.InvokeWithOneArg(...)
System.Reflection.RuntimeMethodInfo.Invoke(...)
System.Reflection.MethodBase.Invoke(System.Object, System.Object[])
Xunit.Sdk.TestInvoker`1.CallTestMethod(System.Object)
... (xunit test runner / thread pool frames)
Native-level analysis (lldb)
Loading the System.Net.Security.Tests core dump in lldb on ARM32 confirmed:
- Thread #9 (tid 45): stop reason = signal SIGBUS at PC = `0x00000072`
Register dump at crash:
r0-r7 = 0x00000000 (all zeroed)
r8 = 0x00000032
r9 = 0xe8569b0c
r10-r12 = 0x00000000
sp = 0x00000000 ← stack pointer is NULL
lr = 0x00000072 ← link register (return address)
pc = 0x00000072 ← program counter (faulting instruction)
cpsr = 0x0000009c
Key observations:
- SP is NULL: The stack pointer is `0x00000000`, so the thread's entire stack context is gone. lldb cannot unwind beyond frame #0.
- Nearly all registers zeroed: r0–r7, r10–r12 are all zero. This is not consistent with a normal unaligned-access SIGBUS, where you'd expect valid register values with one holding the misaligned address.
- PC = LR = `0x00000072`: Both point to the same invalid address in the unmapped zero page.
- The zeroed register state suggests this is not a simple alignment bug: the thread's register context was corrupted or never properly initialized before the method dispatch. This looks more like a corrupted method descriptor or prestub table entry that caused the runtime to load an invalid context and branch to `0x72`.
- The identical PC value (`0x00000072`) across both crashes suggests a deterministic bug.
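The observations above can be checked mechanically against the register dump. The following is a small sketch, not a tool used in the original analysis: the register values are copied from the lldb output in this issue, while `summarize` and the 4 KiB page size are assumptions made for illustration.

```python
# Register values copied from the lldb dump in this issue.
REGS = {f"r{i}": 0x00000000 for i in range(8)}
REGS.update({
    "r8": 0x00000032,
    "r9": 0xE8569B0C,
    "r10": 0x00000000,
    "r11": 0x00000000,
    "r12": 0x00000000,
    "sp": 0x00000000,
    "lr": 0x00000072,
    "pc": 0x00000072,
    "cpsr": 0x0000009C,
})

PAGE_SIZE = 4096  # assumption: 4 KiB pages on the test machine


def summarize(regs):
    """Restate the key observations as boolean checks on the dump."""
    gprs = [f"r{i}" for i in range(13)]  # r0..r12
    return {
        "sp_is_null": regs["sp"] == 0,
        "zeroed_gprs": [r for r in gprs if regs[r] == 0],
        "pc_equals_lr": regs["pc"] == regs["lr"],
        "pc_in_first_page": regs["pc"] < PAGE_SIZE,
    }


for key, value in summarize(REGS).items():
    print(f"{key}: {value}")
```

Running the same checks on the dump from the second crash would be one way to confirm that both failures share this register pattern and not just the PC value.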
Notes
- The console logs' stack traces were completely unsymbolized (`?? at ??:0:0` for all frames). The crash report JSON and `dotnet-dump` (run in an ARM32 QEMU container) were needed to obtain the managed stacks.
- The `PrestubMethodFrame` at the top with IP `0x00000072` indicates the crash happened during interpreter method invocation: the runtime was trying to call the interpreted method but jumped to an invalid address instead.
- Both crashes have the identical call chain: `PrestubMethodFrame` → test method → `[InlinedCallFrame]` → `InterpretedInvoke_Method` → `InvokeDirectByRefWithFewArgs` → xunit runner. The only difference is the test method name.
- This does not appear to be test-specific: the crash is in the interpreter/runtime layer, not in the test logic itself. Any test method invoked via the interpreter on ARM32 could potentially hit this bug.
Possibly Related
- [ARM][ARM64] Problems with empty struct passing #104369: ARM32 SIGBUS in `VolatileLoad<long long>` / `FieldDesc::GetInstanceField` (different code path but same signal/architecture)
Known Issue Error Message
Fill the error message using step by step known issues guidance.
{
"ErrorMessage": "",
"ErrorPattern": "",
"BuildRetry": false,
"ExcludeConsoleLog": false
}
Report
Summary
| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
|---|---|---|
| 0 | 0 | 0 |