[NativeAOT] System.Runtime.Tests occasionally crashes on linux-arm-NativeAOT #99079
Comments
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
Issue Details: The crash does not happen often - just once in a few CI runs. For example seen in #99031:
|
Preliminary notes:
|
I once saw a failure at this stack:
Not necessarily because of I kind of think if this system was configured to fail on misaligned accesses, it would fail in more tests. Also this looks like an AV, not alignment problem, so maybe it was something else. |
Speaking of align8 - does NAOT respect align8 for e.g. static readonly double[] arr = new double[100]; during preinitialization? |
Last time I checked It is not a lot of changes to enable this, but it would be "a bit here, a bit there." |
// Currently we don't support frozen objects with special alignment requirements
// TODO: We should also give up on arrays of doubles on 32-bit platforms.
// (we currently never allocate them on frozen segments)
#if FEATURE_64BIT_ALIGNMENT
if (type->RequiresAlign8)
{
    // Align8 objects are not supported yet
    return null;
}
#endif
So, nope. |
The code you quoted is used to allocate objects on the non-GC heap at run time; that is different from preinitialization, which also allocates on the non-GC heap. |
AFAIR, some floating-point-related loads/stores (as well as atomicity-related ones) cannot be fixed up by the OS and may fail |
Right, that is not even in NAOT. Sorry. Generally aligning frozen arrays should be doable. Just do what GC does - overallocate space and pad in front with empty objects (dummy byte[] would work too). I do not see any code attempting to do that. |
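To make the "overallocate and pad in front" idea concrete, here is a minimal sketch of the arithmetic only. It is not NativeAOT compiler code: `FrozenAlignmentSketch`, `PaddingBefore`, and `ArrayDataOffset32` are hypothetical names, and the 32-bit array layout (element 0 at offset 8 from the object start) is an assumption.

```csharp
using System;

// Hypothetical helper, not actual NativeAOT code: computes how many filler
// bytes must precede an object so that its payload lands on an aligned address.
public static class FrozenAlignmentSketch
{
    // Assumed 32-bit array layout: 4-byte MethodTable pointer + 4-byte length,
    // so element 0 sits at offset 8 from the start of the object.
    public const uint ArrayDataOffset32 = 8;

    // Returns the number of padding bytes to place before an object that would
    // otherwise start at `address`, so that (objectStart + payloadOffset) is a
    // multiple of `alignment`. `alignment` must be a power of two.
    public static uint PaddingBefore(uint address, uint payloadOffset, uint alignment)
    {
        if (alignment == 0 || (alignment & (alignment - 1)) != 0)
            throw new ArgumentException("alignment must be a power of two", nameof(alignment));

        uint misalignment = (address + payloadOffset) & (alignment - 1);
        return misalignment == 0 ? 0 : alignment - misalignment;
    }
}
```

For example, `FrozenAlignmentSketch.PaddingBefore(0x1004, FrozenAlignmentSketch.ArrayDataOffset32, 8)` asks how much filler is needed when the next free address is 0x1004. If the frozen segment has to stay walkable, the gap would presumably also need to be large enough to hold a valid filler object, which this sketch does not model.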
That's what I suspected too. Perhaps the OS was configured to handle misaligned accesses, but could not handle all the cases. I am not very familiar with how robust the emulation is. I'd assume it is robust, since it is in the OS, but would not be too surprised if there are limitations. |
Actually, since it is not in the real heap, I am not sure if it needs to be parseable/walkable, so maybe just leaving gaps in front will work too. One way to find out. :-) |
In CoreCLR it's still walkable in two cases:
|
So here is a test we can try:
using System.Runtime.CompilerServices;
public class Prog
{
    static void Main()
    {
        Test(doubles);
    }
    static readonly double[] doubles = new double[100];
    [MethodImpl(MethodImplOptions.NoInlining)]
    static double Test(double[] d)
    {
        return d[0];
    }
}
I can't check real codegen for NAOT-arm32 (perhaps, @filipnavara can check?) but on CoreCLR AltJIT I am seeing:
; Assembly listing for method Prog:Test(double[]):double (FullOpts)
G_M31521_IG01: ;; offset=0x0000
000000 push {r11,lr}
000004 mov r11, sp
G_M31521_IG02: ;; offset=0x0006
000006 movs r3, 0
000008 ldr r2, [r0+0x04]
00000A cmp r3, r2
00000C bhs SHORT G_M31521_IG04
00000E vldr d0, [r0+0x08] ;;;;;;; <--------------------
G_M31521_IG03: ;; offset=0x0012
000012 pop {r11,pc}
G_M31521_IG04: ;; offset=0x0016
000016 movw r3, 0x7bd0
00001A movt r3, 0x8248
00001E blx r3 // CORINFO_HELP_RNGCHKFAIL
000020 bkpt
; Total bytes of code 34
And that |
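One way to see whether the payload that `vldr` reads is actually 8-byte aligned would be to print the address of element 0 modulo 8. This is only a sketch: `AlignmentProbe`, `PayloadMisalignment`, and `Report` are hypothetical names, and pinning through `GCHandle` is just one way to obtain the address without unsafe code.

```csharp
using System;
using System.Runtime.InteropServices;

public static class AlignmentProbe
{
    // Same shape as the static in the test above.
    static readonly double[] doubles = new double[100];

    // Returns the address of element 0 modulo 8; 0 means the payload is 8-byte aligned.
    public static long PayloadMisalignment(double[] array)
    {
        // Pin the array so its address is stable while we inspect it.
        GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            // For a pinned array, AddrOfPinnedObject is the address of the first element.
            return handle.AddrOfPinnedObject().ToInt64() & 7;
        }
        finally
        {
            handle.Free();
        }
    }

    public static void Report()
    {
        Console.WriteLine($"double[] payload address mod 8 = {PayloadMisalignment(doubles)}");
    }
}
```

Calling `AlignmentProbe.Report()` from `Main` on the linux-arm NativeAOT build would show whether the static array ended up on an 8-byte boundary; a nonzero result would be consistent with the misalignment theory, though it would not by itself prove that this is what crashes the tests.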
That is mildly disturbing. |
Wouldn't Align8 be also needed for 8B |
FWIW ARMv7+ handles unaligned accesses in hardware, with a few notable exceptions:
Several of the test crashes point to misalignment of the |
I'll simply block these in #99104. It's not clear if this is the problem for these tests, but it's necessary for correctness anyway. |
This collected a dump. You can get it by running:
The stacktrace of the crash is:
Looks like it crashed somewhere in |
That's the issue tracked in #98795. Dump is useful. |
I got the payload from
I'll need to figure out which linker [version] did that. In my experiments the thunks were generated in a separate section and LLD source code shows that was the intended behavior. |
While analyzing this I found one pattern that we don't recognize in
Furthermore, some of |
I think this is resolved now. |
Yes, I haven't seen crashes in the arm legs in the past week. Thank you Filip! |
The crash does not happen often - just once in a few CI runs.
So far I was unable to get a dump or any other info for the failure. I'd need to set up a local repro.
For example seen in #99031 :
https://helixre8s23ayyeko0k025g8.blob.core.windows.net/dotnet-runtime-refs-pull-99031-merge-ad45d5f9cac945e0af/System.Runtime.Tests/1/console.a6a20f16.log?helixlogtype=result