[mono][ios] Mono prevents crash logs from being generated in case SIGSEGV is raised from native code #106064

Closed
ivanpovazan opened this issue Aug 7, 2024 · 10 comments · Fixed by #110741
Labels
area-Diagnostics-mono in-pr There is an active PR which will close this issue when it is merged os-ios Apple iOS

@ivanpovazan
Member

ivanpovazan commented Aug 7, 2024

Description

While investigating the crash reported in #105245 on an iOS device, we noticed that such an error does not:

  1. Produce any useful information in Console.app
  2. Generate a crash report on the device

Repro

  1. Install official .NET 9 preview 6 release
  2. Install MAUI workloads
  3. Create a MAUI template app
  4. Build/run it on a physical device in Debug mode:
dotnet build -f net9.0-ios -r ios-arm64 -t:Run -p:_DeviceName=XXXXXX
  5. Console.app shows only that the program exited:
default  14:14:51.131793+0200  SpringBoard  [app<com.companyname.myapp(80F7F2DD-E9BB-43D8-AB25-7A0DE1A64558)>:6714] Process exited: <RBSProcessExitContext| voluntary>.
  6. No crash logs are generated on the device

Investigation

As initially assumed, the crash was not caused by an assertion in the runtime (where we properly call abort()). Instead, the code was trying to read from an invalid memory address (something like 0xa8), so SIGSEGV was raised.
Mono's signal handler catches this, but does not remove itself as the SIGSEGV handler when it starts handling the signal.
After the handling is done, the handler returns, but the same signal is caught again, and on this second pass the program exits with -1 because it is treated as a double fault:

} else {
	g_async_safe_printf ("\nAn error has occurred in the native fault reporting. Some diagnostic information will be unavailable.\n");
	g_async_safe_printf ("\nExiting early due to double fault.\n");
	_exit (-1);

This causes the app to exit silently, without any information about the crash.
Additionally, the messages that the signal handler prints with g_async_safe_printf (such as the native and managed stack traces) are not shown in the system log.
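
To make the sequence above concrete, here is a minimal POSIX C sketch of the double-fault behavior. It is not Mono's code: `sticky_handler`, `touch_invalid_address`, and `run_double_fault_child` are hypothetical names, and the fault counter is a simplified stand-in for the runtime's double-fault detection. The sketch assumes that reading address 0xa8 raises SIGSEGV, which is what an unmapped low address typically does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counts how many times the handler has been entered. */
static volatile sig_atomic_t fault_count = 0;

/* Simplified stand-in for a crash handler that stays installed. */
static void sticky_handler(int signo, siginfo_t *info, void *ctx)
{
    (void)signo; (void)info; (void)ctx;
    if (fault_count++ == 0) {
        /* First fault: pretend to report, then return. The faulting
         * instruction is re-executed and raises SIGSEGV again. */
        return;
    }
    /* Second entry: treated as a double fault, exit "voluntarily". */
    _exit(-1);
}

/* Reads from an invalid address (assumption: 0xa8 is unmapped). */
static void touch_invalid_address(void)
{
    volatile uintptr_t addr = 0xa8;
    volatile int *p = (volatile int *)addr;
    int value = *p;
    (void)value;
}

/* Runs the crash in a child process and returns its wait status. */
int run_double_fault_child(void)
{
    pid_t pid = fork();
    assert(pid != -1);
    if (pid == 0) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sigemptyset(&sa.sa_mask);
        sa.sa_sigaction = sticky_handler;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);
        touch_invalid_address();
        _exit(0); /* not reached if the fault occurs */
    }
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    return status;
}
```

In this sketch the parent sees a normal exit with status 255 rather than a termination by signal. That matches the "voluntary" exit reported in Console.app, and it would explain why the OS has no crash to report.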

FWIW, in the signal handler we do unregister the runtime's handlers for some signals, but not for SIGSEGV:

g_assert (sigaction (SIGABRT, &sa, NULL) != -1);
/* On some systems we get a SIGILL when calling abort (), because it might
* fail to raise SIGABRT */
g_assert (sigaction (SIGILL, &sa, NULL) != -1);
/* Remove SIGCHLD, it uses the finalizer thread */
g_assert (sigaction (SIGCHLD, &sa, NULL) != -1);
/* Remove SIGQUIT, we are already dumping threads */
g_assert (sigaction (SIGQUIT, &sa, NULL) != -1);

Proposal

  1. Investigate whether we can unregister the SIGSEGV handler once we have detected that the signal is not coming from managed code (to distinguish it from a NullReferenceException)
  2. Try using a platform-specific logging mechanism so that messages from the signal handler end up in the system log when a fatal error occurs
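
The first proposal could look roughly like the following POSIX C sketch. It is only an illustration under stated assumptions, not the eventual fix: `fault_is_from_managed_code`, `segv_handler_with_reset`, and `run_reset_child` are hypothetical names, and the managed-code check is a placeholder that always answers "native". The idea is that the handler restores the default SIGSEGV disposition before returning, so the re-executed faulting instruction terminates the process by signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Default disposition, prepared before the handler is installed so the
 * handler itself only needs to call sigaction(). */
static struct sigaction default_action;

/* Hypothetical placeholder: a real runtime would inspect the faulting
 * context here. This sketch treats every fault as native. */
static int fault_is_from_managed_code(void *ctx)
{
    (void)ctx;
    return 0;
}

static void segv_handler_with_reset(int signo, siginfo_t *info, void *ctx)
{
    (void)info;
    if (fault_is_from_managed_code(ctx)) {
        /* A real runtime would raise a managed exception instead.
         * Not modelled here; unreachable in this sketch. */
        return;
    }
    /* Native fault: unregister ourselves. After we return, the faulting
     * instruction runs again and the default action kills the process. */
    sigaction(signo, &default_action, NULL);
}

/* Runs the crash in a child process and returns its wait status. */
int run_reset_child(void)
{
    pid_t pid = fork();
    assert(pid != -1);
    if (pid == 0) {
        memset(&default_action, 0, sizeof default_action);
        sigemptyset(&default_action.sa_mask);
        default_action.sa_handler = SIG_DFL;

        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sigemptyset(&sa.sa_mask);
        sa.sa_sigaction = segv_handler_with_reset;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        /* Assumption: address 0xa8 is unmapped, so this read faults. */
        volatile uintptr_t addr = 0xa8;
        volatile int *p = (volatile int *)addr;
        int value = *p;
        (void)value;
        _exit(0); /* not reached if the fault occurs */
    }
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    return status;
}
```

Here the parent observes a child killed by SIGSEGV instead of a clean exit. A termination by signal is the kind of outcome a platform crash reporter can act on.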

PS Thanks @lambdageek for assistance

@ivanpovazan ivanpovazan added this to the 9.0.0 milestone Aug 7, 2024
@ivanpovazan
Member Author

/cc: @rolfbjarne

@rolfbjarne
Member

> The mono's signal handler catches this, but does not remove itself as a SIGSEGV handler when it starts handling the signal.
> After the handling is done, the handler returns

This sounds a bit weird.

If the SIGSEGV comes from managed code, the signal handler should raise a NullReferenceException.

This didn't happen, so the SIGSEGV didn't come from managed code, and in that case the signal handler should chain to the previous signal handler.

This didn't happen (presumably because there was no previous signal handler), and in that case Mono will print native crash info and call abort:

mono_handle_native_crash (mono_get_signame (info->si_signo), &mctx, info);
abort ();

But that didn't happen?

Instead we got a double fault... it sounds like the second SIGSEGV didn't come from returning from the first signal handler, but instead occurred during the first signal handler.

@ivanpovazan
Member Author

> This didn't happen (presumably because there was no previous signal handler), and in that case Mono will print native crash info and call abort:
>
> mono_handle_native_crash (mono_get_signame (info->si_signo), &mctx, info);
> abort ();
>
> But that didn't happen?

If I am not mistaken, the code you linked is the SIGABRT handler.

We are instead reaching this code:

MONO_SIG_HANDLER_FUNC (, mono_sigsegv_signal_handler)

which handles the signal further and then returns, since signal chaining is enabled:

mono_handle_native_crash (mono_get_signame (SIGSEGV), &mctx, (MONO_SIG_HANDLER_INFO_TYPE*)info);
if (mono_do_crash_chaining) {
	mono_chain_signal (MONO_SIG_HANDLER_PARAMS);
	return;
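
For readers unfamiliar with signal chaining, the general idea can be sketched in plain POSIX C as follows. This is a generic illustration and not how `mono_chain_signal` is implemented: `previous_handler`, `chain_to_previous`, `runtime_handler`, and `demo_chaining` are hypothetical names, and SIGUSR1 is used instead of SIGSEGV so the example runs without crashing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* The action that was installed before ours, saved by sigaction(). */
static struct sigaction previous_action;
static volatile sig_atomic_t runtime_handler_ran = 0;
static volatile sig_atomic_t previous_handler_ran = 0;

/* Stands in for a handler installed earlier by someone else. */
static void previous_handler(int signo)
{
    (void)signo;
    previous_handler_ran = 1;
}

/* Forwards the signal to the saved handler; returns 1 if it chained. */
static int chain_to_previous(int signo, siginfo_t *info, void *ctx)
{
    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction == NULL)
            return 0;
        previous_action.sa_sigaction(signo, info, ctx);
        return 1;
    }
    if (previous_action.sa_handler == SIG_DFL ||
        previous_action.sa_handler == SIG_IGN)
        return 0; /* nothing to chain to */
    previous_action.sa_handler(signo);
    return 1;
}

/* Stands in for the runtime's handler: do our work, then chain. */
static void runtime_handler(int signo, siginfo_t *info, void *ctx)
{
    runtime_handler_ran = 1;
    chain_to_previous(signo, info, ctx);
}

/* Installs both handlers, raises the signal, reports whether both ran. */
int demo_chaining(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = previous_handler;
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = runtime_handler;
    sa.sa_flags = SA_SIGINFO;
    rc = sigaction(SIGUSR1, &sa, &previous_action);
    assert(rc == 0);
    (void)rc;

    raise(SIGUSR1);
    return runtime_handler_ran && previous_handler_ran;
}
```

Note that when the saved action is the default one, `chain_to_previous` has nothing to call and simply returns 0. In that situation control goes back to the code that raised the signal, which is the path the quoted SIGSEGV handler takes when it returns.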

@rolfbjarne
Member

> If I am not mistaken the code you linked is the SIGABRT handler.

Ah my bad, sorry about the confusion.

I'm just really surprised this isn't a regression, because I'm certain I've seen crash reports for SIGSEGVs originating in native code before (but then again I might be misremembering - I just tried in .NET 8, and I got the double fault with no crash report).

@vitek-karas vitek-karas modified the milestones: 9.0.0, 10.0.0 Aug 22, 2024
@rolfbjarne
Member

FWIW I've seen customers run into this now.

@ivanpovazan
Member Author

> FWIW I've seen customers run into this now.

Do we have a specific issue to link to this one, so we can raise the priority of fixing it?

@rolfbjarne
Member

> > FWIW I've seen customers run into this now.
>
> Do we have a specific issue to link to this one + raise a priority on fixing this?

The customer in question is the one who reported this: #107641

Although much of the interaction happened via email, so reading the issue isn't all that useful.

@Redth
Member

Redth commented Dec 2, 2024

@ivanpovazan @steveisok @vitek-karas this one is happening more frequently with customers now - could we get this prioritized? I think @StephaneDelcroix can provide a repro...

@steveisok
Member

> I think @StephaneDelcroix can provide a repro...

If you have a repro, great and I'll work with @vitek-karas to get this prioritized.

@ivanpovazan
Member Author

I just synced with @vitek-karas; I will look into this as soon as possible.
