[android] Add runtime logging support for CoreCLR on Android #112865

Open
4 of 7 tasks
ivanpovazan opened this issue Feb 24, 2025 · 6 comments

@ivanpovazan
Member

ivanpovazan commented Feb 24, 2025

Description

This issue tracks the work needed to enable runtime logging for CoreCLR on Android.

Logging

Currently, most of the messages logged by the runtime end up in /dev/null, either because they
are disabled in release builds or because they are written to stdio, which doesn't work on Android.

Logcat is the only way to get information from remote devices, especially via Google Play Console.

We should log to logcat:

  • C++ exception messages
  • abort() messages / fatal errors
  • warnings
  • errors

We should add a subsystem that exposes a single output function whose implementation is platform-specific.
The API should take a severity, the actual message, and possibly a flag indicating whether the process should
be aborted (that decision could also be based on the severity). Severity values should be shared between all
targets; each target can then translate them to its platform's own value(s), if any.
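
The API described above could look roughly like the following C sketch. All names here (`rt_log_severity`, `rt_log_to_android_priority`, `rt_log_write`, and the `DOTNET` tag) are hypothetical and are not existing runtime APIs; only `__android_log_write` and the `ANDROID_LOG_*` priorities come from the NDK's `<android/log.h>`. Off-device the sketch falls back to stderr so it can be built on a host machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef __ANDROID__
#include <android/log.h>
#else
/* Host fallback so the sketch builds off-device; values mirror android_LogPriority. */
enum {
    ANDROID_LOG_VERBOSE = 2,
    ANDROID_LOG_DEBUG,
    ANDROID_LOG_INFO,
    ANDROID_LOG_WARN,
    ANDROID_LOG_ERROR,
    ANDROID_LOG_FATAL
};
#endif

/* Hypothetical severity shared by all targets. */
typedef enum {
    RT_LOG_VERBOSE,
    RT_LOG_DEBUG,
    RT_LOG_INFO,
    RT_LOG_WARNING,
    RT_LOG_ERROR,
    RT_LOG_FATAL
} rt_log_severity;

/* Translate the shared severity to the Android-specific priority. */
static int rt_log_to_android_priority(rt_log_severity severity)
{
    switch (severity) {
    case RT_LOG_VERBOSE: return ANDROID_LOG_VERBOSE;
    case RT_LOG_DEBUG:   return ANDROID_LOG_DEBUG;
    case RT_LOG_INFO:    return ANDROID_LOG_INFO;
    case RT_LOG_WARNING: return ANDROID_LOG_WARN;
    case RT_LOG_ERROR:   return ANDROID_LOG_ERROR;
    case RT_LOG_FATAL:   return ANDROID_LOG_FATAL;
    default:             return ANDROID_LOG_INFO;
    }
}

/* Single output function: severity, message, and an optional abort flag. */
static void rt_log_write(rt_log_severity severity, const char *message, bool abort_process)
{
    assert(message != NULL);
    int priority = rt_log_to_android_priority(severity);
#ifdef __ANDROID__
    __android_log_write(priority, "DOTNET", message); /* tag is an assumption */
#else
    fprintf(stderr, "[%d] %s\n", priority, message);
#endif
    /* The abort decision may come from the flag or from the severity itself. */
    if (abort_process || severity == RT_LOG_FATAL)
        abort();
}
```

A caller would then write, for example, `rt_log_write(RT_LOG_ERROR, "unhandled C++ exception", false);`, and a fatal-error path would pass `RT_LOG_FATAL` so the message reaches logcat before the process aborts.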

Tasks

References on runtime logging

https://github.com/dotnet/runtime/blob/9ff850ea5d29f487961d1c773bd495630aa8d2ea/docs/design/coreclr/botr/logging.md

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Feb 24, 2025
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Feb 24, 2025
@ivanpovazan
Member Author

/cc: @janvorli @elinor-fung @grendello @lateralusX

Please fill in the description as you see fit

@ivanpovazan ivanpovazan added area-VM-coreclr and removed untriaged New issue has not been triaged by the area owner needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Feb 24, 2025
@ivanpovazan ivanpovazan added this to the 10.0.0 milestone Feb 24, 2025
@am11
Member

am11 commented Feb 24, 2025

I think it would be nice to consolidate all logging implementation in src/native/minipal, where we have various similar shared components.

Log/assert-like native code in the repo that overlaps:

mono_log_write_logcat (const char *log_domain, GLogLevelFlags level, mono_bool hdr, const char *message)

void mono_log_open_logcat (const char *path, void *userData);

#define SEE_LOGCAT_MESSAGE_LEN (int32_t)(sizeof(see_logcat_message))

logger.opener = mono_log_open_logcat;

#define _ASSERTE_MSG(expr, msg) \

#define assert_msg(cond, msg, val) do \

@jkotas
Member

jkotas commented Feb 25, 2025

Enable runtime logging

The most valuable form of runtime logging is stresslog. Stresslog is high-performance logging that is not printed to the console. Instead, it is recorded in a circular buffer that can be extracted from a crash dump via the SOS DumpLog command. Stresslog is compiled into release builds and we depend on it to diagnose the toughest bugs, like intermittent crashes.

It would be great if we can create a documented workflow for how to use stresslog on Android.

@lateralusX
Member

> Enable runtime logging
>
> The most valuable form of runtime logging is stresslog. Stresslog is high-performance logging that is not printed to the console. Instead, it is recorded in a circular buffer that can be extracted from a crash dump via the SOS DumpLog command. Stresslog is compiled into release builds and we depend on it to diagnose the toughest bugs, like intermittent crashes.
>
> It would be great if we can create a documented workflow for how to use stresslog on Android.

Regarding stress log: on Android we won't get access to core dumps, at least not what's currently reported to things like Google Play Console or standard Android bug reports. Those reports are also limited in the data they provide, but they do include logcat output, tombstone files, ANR call stack dumps, device events, etc. Logcat is a set of internal circular memory buffers managed by the Android OS. For apps running on a regular device not hooked up to adb, I believe the logcat buffer is rather small (64KB), and I don't expect users to run stress log directly on end-user devices (there is no simple way for users to enable runtime features without running adb). When you can access the device using adb, it is possible to increase the logcat buffers, making logcat a potential output target for all our Android logging needs. On Mono we centralized all our runtime logging in one place and categorized it into log levels that map pretty much 1:1 to logcat log levels.

Since we won't get hold of core dumps or any data beyond what's included in the default bug report, storing the stress log in an internal runtime memory buffer will mainly be useful for internal live debugging using a native debugger and the SOS DumpLog command. External users are more likely to help diagnose issues using adb and logcat output. In that scenario logcat would serve a similar purpose to our stress log memory buffer, except that it is accessible through regular adb tooling without the need for a native debugger or SOS. I have not looked too deep into stress log, but what I have seen is that it's rather optimized to only store raw pointers to the format string and arguments, optimizing the internal storage and postponing any string formatting until the log gets extracted. Each thread has its own "buffer", meaning that there is no contention when logging as long as there is still free space available. Since logcat works closer to regular printf logging, it will have a higher per-message cost than the in-memory stress log.
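
To make the cost difference concrete, here is a minimal C sketch of the deferred-formatting idea as I understand it. It is not the actual stress log implementation; every name (`sketch_log_record`, `sketch_log_format`, the capacity constants) is made up for illustration, and it assumes format strings are literals that use `%lu` for each argument. Recording only stores a pointer and two integers in a per-thread circular buffer; `snprintf` runs only when the log is extracted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_LOG_CAPACITY 4 /* entries per thread; tiny on purpose */
#define SKETCH_LOG_MAX_ARGS 2

typedef struct {
    const char *format;                  /* must outlive the entry (string literal) */
    uintptr_t args[SKETCH_LOG_MAX_ARGS]; /* raw argument values, not formatted */
} sketch_log_entry;

typedef struct {
    sketch_log_entry entries[SKETCH_LOG_CAPACITY];
    size_t written; /* total number of records ever written on this thread */
} sketch_log_buffer;

/* One buffer per thread, so recording needs no lock. */
static _Thread_local sketch_log_buffer sketch_thread_log;

/* Hot path: store the pointer and arguments, overwrite the oldest entry when full. */
static void sketch_log_record(const char *format, uintptr_t a0, uintptr_t a1)
{
    assert(format != NULL);
    sketch_log_entry *entry =
        &sketch_thread_log.entries[sketch_thread_log.written % SKETCH_LOG_CAPACITY];
    entry->format = format;
    entry->args[0] = a0;
    entry->args[1] = a1;
    sketch_thread_log.written++;
}

/* Number of entries currently retained for the calling thread. */
static size_t sketch_log_count(void)
{
    return sketch_thread_log.written < SKETCH_LOG_CAPACITY
        ? sketch_thread_log.written
        : SKETCH_LOG_CAPACITY;
}

/* Cold path: format the index-th oldest retained entry; returns -1 if out of range. */
static int sketch_log_format(size_t index, char *out, size_t out_size)
{
    size_t count = sketch_log_count();
    if (index >= count)
        return -1;
    size_t first = sketch_thread_log.written - count;
    const sketch_log_entry *entry =
        &sketch_thread_log.entries[(first + index) % SKETCH_LOG_CAPACITY];
    return snprintf(out, out_size, entry->format,
                    (unsigned long)entry->args[0], (unsigned long)entry->args[1]);
}
```

Sending the stress log to logcat would mean running the equivalent of `sketch_log_format` on every record instead of only at extraction time, which is where the extra per-message cost comes from.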

Based on the above, I see a couple of different scenarios:

  • For installed apps running in the wild, I don't think there will be an easy way to enable stress logging or even change the default runtime log levels unless the embedding application adds support for this in the retail application and integrates it with some UI where users can enable/disable additional diagnostics. Even if we get there, a crash won't give us the core, the logcat buffer on retail apps is probably too small, and there is no way to submit additional files in generated bug reports (in case we would put the stress log in a separate file). To get more information out of the device, we would need to rely on third-party crash tooling or implement something ourselves that could upload the needed additional data in case of a crash, with the risk of interfering with third-party crash tooling already used throughout the Android ecosystem.
  • External users/developers who have a repro and access to adb: hook up the device, enable stress log using adb, increase the logcat buffer if needed, and run the repro. That would give us a stress log up until the point of the crash that could be shared with us.
  • Internal debugging or advanced external developers: either use the above strategy or attach a native debugger and use the SOS DumpLog command when the app crashes under the debugger.

So regardless, I believe there is value in making sure we can get the stress log into logcat on Android, even if it adds more overhead than the in-memory stress log because it calls a printf-style third-party API that formats the string before putting it into the logcat memory buffer.

There is still value in having the option to record the stress log in an internal memory buffer, so we can use the same debugging pattern for live debugging with a native debugger + SOS once we have SOS working on Android.

@lateralusX
Member

lateralusX commented Feb 28, 2025

Added two new tasks to this issue. One is to "Figure out" stress log on Android, in line with the comment above. The other captures the idea above to consolidate native logging into minipal. To hook up runtime logging more quickly, making it simpler to diagnose runtime errors and problems during the initial work on the CoreCLR Android port, I assume we would do the "Enable Runtime logging" task based on what we currently have, just adding support for logcat, and then change it when/if we end up consolidating logging, since that is probably a longer-running task.

@jkotas
Member

jkotas commented Feb 28, 2025

> on Android we won't get access to core dumps, at least not what's currently reported to things like Google Play Console or standard Android bug reports

It is the case for CoreCLR too. The regular bug reports or crash dumps do not have stress log enabled by default.

The typical workflow is to work with the customer to enable the stresslog, reproduce the issue with stress log enabled, and capture the crash dump. The stress log size typically needs to be configured to be on the order of megabytes to successfully diagnose GC-related crashes, which are the most common use case for stress log. Also, you typically need both the stresslog and the dump to figure out what happened.

Here is an example of my go-to stresslog settings and how the workflow was used to diagnose a hard-to-reproduce crash: #45557 (comment).

> I have not looked too deep into stress log, but what I have seen is that it's rather optimized to only store raw pointers to the format string and arguments, optimizing the internal storage and postponing any string formatting until the log gets extracted. Each thread has its own "buffer", meaning that there is no contention when logging as long as there is still free space available.

The main reason for this architecture is to minimize the timing differences introduced by enabling stresslog. Stress bugs typically require very specific timing to reproduce. If the logging disrupts the timing, there is a high chance that the bug won't reproduce anymore.
