SIGSEGV when user interaction instrumentation is enabled #3653

Open
OlivierGenez opened this issue Aug 23, 2024 · 27 comments
@OlivierGenez

OlivierGenez commented Aug 23, 2024


❗ EDIT by the maintainers:

  • The issue has been fixed by Google, see issue on the ART issue tracker: https://issuetracker.google.com/issues/361129298#comment7
  • Google's rollout plans are currently not communicated, but it is to be expected that the fix will be rolled out with the next system/security updates, similar to how the code was rolled out that caused the issue with the August/September updates
  • In the meantime, you can mitigate this issue by deactivating User Interaction Tracing and/or Profiling (see snippets below), so that the Sentry Profiler no longer starts the crashing method tracer from the Android Tracer. (Note that other code in your app might still start it and cause these crashes independently of Sentry.)
options.isEnableUserInteractionTracing = false
options.profilesSampleRate = 0.0
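
For context, here is a minimal sketch of where the two mitigation lines could live. The `MyApplication` class name is hypothetical, and the sketch assumes the DSN is supplied through the manifest; only the two option assignments come from the note above.

```kotlin
import android.app.Application
import io.sentry.android.core.SentryAndroid

// Hypothetical Application subclass; register it in the manifest as usual.
class MyApplication : Application() {
    override fun onCreate() {
        super.onCreate()
        SentryAndroid.init(this) { options ->
            // Stop creating transactions for clicks, swipes and scrolls.
            options.isEnableUserInteractionTracing = false
            // Never sample a profile, so the SDK does not start the method tracer.
            options.profilesSampleRate = 0.0
        }
    }
}
```

Either line can be applied on its own; later comments in this thread report different results for each, so it may be worth verifying against your own crash data.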

Integration

sentry-android

Build System

Gradle

AGP Version

8.3.2

Proguard

Disabled

Version

7.12.1

Steps to Reproduce

My team has observed an increase in this type of crash in Sentry/Android vitals with the latest update of our app:

Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0x<sanitized>, nullptr=(null)) 

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 28018 >>> <Application ID redacted> <<<

backtrace:
  #00  pc 0x0000000000058290  /apex/com.android.runtime/lib64/bionic/libc.so (__strlen_aarch64+16)
  #01  pc 0x00000000005b510c  /apex/com.android.art/lib64/libart.so (art::Thread::DumpState(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, art::Thread const*, int)+556)
  #02  pc 0x00000000005b487c  /apex/com.android.art/lib64/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, unwindstack::AndroidLocalUnwinder&, bool, bool) const+52)
  #03  pc 0x00000000005b6814  /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+216)
  #04  pc 0x000000000054eeb0  /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*, bool)+684)
  #05  pc 0x00000000005b6148  /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+292)
  #06  pc 0x0000000000933e24  /apex/com.android.art/lib64/libart.so (art::AbortState::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+204)
  #07  pc 0x000000000093023c  /apex/com.android.art/lib64/libart.so (art::Runtime::Abort(char const*)+712)
  #08  pc 0x00000000000160fc  /apex/com.android.art/lib64/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_0::__invoke(char const*)+80)
  #09  pc 0x00000000000156d0  /apex/com.android.art/lib64/libbase.so (android::base::LogMessage::~LogMessage()+516)
  #10  pc 0x00000000005b74ec  /apex/com.android.art/lib64/libart.so (art::Thread::~Thread()+1512)
  #11  pc 0x000000000030b2b4  /apex/com.android.art/lib64/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+708)
  #12  pc 0x000000000063eec8  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+2208)
  #13  pc 0x000000000063e618  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallbackWithUffdGc(void*)+8)
  #14  pc 0x000000000006efbc  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+204)
  #15  pc 0x0000000000060d60  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)

This is not a new issue (we've seen reports as far back as a year ago) but there has been a significant increase in crash reports.

Our app's sentry config has user interaction instrumentation enabled:

SentryAndroid.init(context) { options ->
    // [...]
    options.tracesSampleRate = 1.0
    options.profilesSampleRate = 1.0
    // [...]
    options.isEnableUserInteractionTracing = true
    // [...]
}

After some investigation, we've been able to replicate the issue in the debug version of our app (i.e., R8 is disabled) on Pixel 6a and Pixel 7a devices with Android 14 by:

  1. opening the app
  2. tapping any of our bottom navigation bar items in very rapid succession until the app crashes

Based on Sentry/Android vitals crash reports this definitely occurs on a wide variety of devices with standard app usage, but this is one way we've been able to replicate the issue somewhat consistently.

Expected Result

The application proceeds as normal and doesn't crash.

Actual Result

After a while, the interactions slow down a bit, then the application crashes:

Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0x<sanitized>, nullptr=(null)) 

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 28018 >>> <Application ID redacted> <<<

backtrace:
  #00  pc 0x0000000000058290  /apex/com.android.runtime/lib64/bionic/libc.so (__strlen_aarch64+16)
  #01  pc 0x00000000005b510c  /apex/com.android.art/lib64/libart.so (art::Thread::DumpState(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, art::Thread const*, int)+556)
  #02  pc 0x00000000005b487c  /apex/com.android.art/lib64/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, unwindstack::AndroidLocalUnwinder&, bool, bool) const+52)
  #03  pc 0x00000000005b6814  /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+216)
  #04  pc 0x000000000054eeb0  /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*, bool)+684)
  #05  pc 0x00000000005b6148  /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+292)
  #06  pc 0x0000000000933e24  /apex/com.android.art/lib64/libart.so (art::AbortState::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+204)
  #07  pc 0x000000000093023c  /apex/com.android.art/lib64/libart.so (art::Runtime::Abort(char const*)+712)
  #08  pc 0x00000000000160fc  /apex/com.android.art/lib64/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_0::__invoke(char const*)+80)
  #09  pc 0x00000000000156d0  /apex/com.android.art/lib64/libbase.so (android::base::LogMessage::~LogMessage()+516)
  #10  pc 0x00000000005b74ec  /apex/com.android.art/lib64/libart.so (art::Thread::~Thread()+1512)
  #11  pc 0x000000000030b2b4  /apex/com.android.art/lib64/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+708)
  #12  pc 0x000000000063eec8  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+2208)
  #13  pc 0x000000000063e618  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallbackWithUffdGc(void*)+8)
  #14  pc 0x000000000006efbc  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+204)
  #15  pc 0x0000000000060d60  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)

Attached is a full crash dump: tombstone.txt.

The issue cannot be replicated when user interaction instrumentation is disabled:

SentryAndroid.init(context) { options ->
    // [...]
    options.isEnableUserInteractionTracing = false
    // [...]
}
@ash-wtag

I am facing the same issue, any update on this?

Sentry version:

io.sentry.android.gradle:4.5.1
io.sentry:sentry-android: 6.19.0
SentryAndroid.init(app) { options: SentryAndroidOptions ->
    options.dsn = token
    options.environment = buildType
    options.release = releaseName
}

Here's the stack trace:

Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0x<sanitized>, nullptr=(null)) 

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 17593 >>> ch.pickebike <<<

backtrace:
  #00  pc 0x0000000000097390  /apex/com.android.runtime/lib64/bionic/libc.so (__strlen_aarch64+16)
  #01  pc 0x00000000005b510c  /apex/com.android.art/lib64/libart.so (art::Thread::DumpState(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, art::Thread const*, int)+556)
  #02  pc 0x00000000005b487c  /apex/com.android.art/lib64/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, unwindstack::AndroidLocalUnwinder&, bool, bool) const+52)
  #03  pc 0x00000000005b6814  /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+216)
  #04  pc 0x000000000054eeb0  /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*, bool)+684)
  #05  pc 0x00000000005b6148  /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+292)
  #06  pc 0x0000000000933e24  /apex/com.android.art/lib64/libart.so (art::AbortState::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+204)
  #07  pc 0x000000000093023c  /apex/com.android.art/lib64/libart.so (art::Runtime::Abort(char const*)+712)
  #08  pc 0x00000000000160fc  /apex/com.android.art/lib64/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_0::__invoke(char const*)+80)
  #09  pc 0x00000000000156d0  /apex/com.android.art/lib64/libbase.so (android::base::LogMessage::~LogMessage()+516)
  #10  pc 0x00000000005b74ec  /apex/com.android.art/lib64/libart.so (art::Thread::~Thread()+1512)
  #11  pc 0x000000000030b2b4  /apex/com.android.art/lib64/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+708)
  #12  pc 0x000000000063eec8  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+2208)
  #13  pc 0x000000000010ba80  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208)
  #14  pc 0x000000000009f690  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)

@markushi
Member

Hey everyone, thanks for reaching out!

This looks like another issue with Android's built-in profiler, similar to #2604 and #3561.

Disabling user interaction instrumentation just hides the real culprit: user interaction instrumentation creates transactions, which in turn create profiles, which themselves use the built-in Android profiler.

Could you try to disable profiling instead?

SentryAndroid.init(context) { options ->
    options.profilesSampleRate = 0.0
}

On top of that: Is your app using any native (C/C++) code in combination with some custom threading?
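
To check whether other code in an app also drives the built-in profiler, it can help to know what such a call looks like. The sketch below uses the platform's `android.os.Debug` method-tracing API. The thread does not show which call the SDK makes internally, so treat this only as an illustration of the kind of call to search for. The `traceSampled` helper, the file name, the buffer size and the sampling interval are all assumptions made for this example.

```kotlin
import android.content.Context
import android.os.Debug
import java.io.File

// Hypothetical helper, not Sentry code: records a sampled method trace around `block`.
fun traceSampled(context: Context, block: () -> Unit) {
    val traceFile = File(context.cacheDir, "sample.trace")
    // Arbitrary example values: 8 MB trace buffer, one sample every 10,000 microseconds.
    Debug.startMethodTracingSampling(traceFile.path, 8 * 1024 * 1024, 10_000)
    try {
        block()
    } finally {
        // Always stop tracing, even if `block` throws.
        Debug.stopMethodTracing()
    }
}
```

Searching a codebase and its dependencies for `startMethodTracing` is one way to find such callers.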

@OlivierGenez
Author

Could you try to disable profiling instead?

We actually had tried this when debugging the issue and found that it seemed to prevent crashes from happening as well. Would you advise disabling profiling instead of user interaction instrumentation?

On top of that: Is your app using any native (C/C++) code in combination with some custom threading?

Our app doesn't use native code "directly", but some libraries we depend on do. The code is not open source though and is not shared with us, so I can't tell exactly how it deals with threading.

@kahest
Member

kahest commented Aug 27, 2024

For reference:

@markushi
Member

Could you try to disable profiling instead?

[...]Would you advise disabling profiling instead of user interaction instrumentation?

@OlivierGenez Yes, we would advise disabling profiling in the meantime instead.

@markushi
Member

Let's try to reproduce this issue in a minimal environment (Android 14, as seen in the attached tombstone).

@kahest
Member

kahest commented Sep 2, 2024

Update from Google on the issue tracker:

We have shared this with our product and engineering team and will update this issue with more information as it becomes available.

@markushi
Member

markushi commented Sep 4, 2024

@OlivierGenez

My team has observed an increase in this type of crashes in Sentry/Android vitals with the latest update of our app

Is there any configuration change you did in the "latest update" of your app? E.g. did you change the sampling rate, enable a specific feature, bump an SDK version, etc.?

@empowerDan

Hi @markushi, just a heads up that this will occur even with options.profilesSampleRate = 0.0.
It's also happening on Android 12, 13 and 14.

@ashwin-coles

I can confirm with the latest update to disable profiling, we are still observing crashes. As @OlivierGenez mentioned, turning off profiling and disabling isEnableUserInteractionTracing reduced events of crashes resulting from aggressive monkey-taps, but lifecycle events seem to be the last listed event in some of the breadcrumbs in crashes. We have now disabled all tracing and are waiting to see if that helps at all.

options.isEnableActivityLifecycleTracingAutoFinish = false
options.isEnableAutoActivityLifecycleTracing = false
options.isEnableTimeToFullDisplayTracing = false
options.isEnableUserInteractionTracing = false

@romtsn
Member

romtsn commented Sep 13, 2024

@empowerDan @ashwin-coles could you share the backtrace of these crashes (after disabling profiling)? Is it the same as the other ones in this thread?

@empowerDan

empowerDan commented Sep 13, 2024

Yep, same - also can confirm that @ashwin-coles's snippet brings all art::Thread::DumpState errors down to 0; however, this silences quite a lot of other things too, so it's not a very viable long-term solution as a paying customer.

options.isEnableActivityLifecycleTracingAutoFinish = false
options.isEnableAutoActivityLifecycleTracing = false
options.isEnableTimeToFullDisplayTracing = false
options.isEnableUserInteractionTracing = false

Do we know if the issue occurs on previous versions of Sentry too?

@markdrake-dev

markdrake-dev commented Sep 16, 2024

Hey guys, we've seen the same crash happening very often in our prod apps that also use Sentry; we don't use instrumentation. The specific crash we have is this:

Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0x<sanitized>, nullptr=(null))

and the backtrace is this:

  #00  pc 0x0000000000085ed0  /apex/com.android.runtime/lib64/bionic/libc.so (__strlen_aarch64+16)
  #01  pc 0x00000000005b510c  /apex/com.android.art/lib64/libart.so (art::Thread::DumpState(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, art::Thread const*, int)+556)
  #02  pc 0x00000000005b487c  /apex/com.android.art/lib64/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, unwindstack::AndroidLocalUnwinder&, bool, bool) const+52)
  #03  pc 0x00000000005b6814  /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+216)
  #04  pc 0x000000000054eeb0  /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*, bool)+684)
  #05  pc 0x00000000005b6148  /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+292)
  #06  pc 0x0000000000933e24  /apex/com.android.art/lib64/libart.so (art::AbortState::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+204)
  #07  pc 0x000000000093023c  /apex/com.android.art/lib64/libart.so (art::Runtime::Abort(char const*)+712)
  #08  pc 0x00000000000160fc  /apex/com.android.art/lib64/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_0::__invoke(char const*)+80)
  #09  pc 0x00000000000156d0  /apex/com.android.art/lib64/libbase.so (android::base::LogMessage::~LogMessage()+516)
  #10  pc 0x00000000005b74ec  /apex/com.android.art/lib64/libart.so (art::Thread::~Thread()+1512)
  #11  pc 0x000000000030b2b4  /apex/com.android.art/lib64/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+708)
  #12  pc 0x000000000063eec8  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+2208)
  #13  pc 0x00000000000fc230  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208)
  #14  pc 0x000000000008e310  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)

Hope we can hear of a solution soon 👍

@kahest
Member

kahest commented Sep 16, 2024

@markdrake-dev thank you for the report and the backtrace. Does "we don't use instrumentation" mean you have the options below all deactivated? Do you have an options.profilesSampleRate set?

options.isEnableActivityLifecycleTracingAutoFinish = false
options.isEnableAutoActivityLifecycleTracing = false
options.isEnableTimeToFullDisplayTracing = false
options.isEnableUserInteractionTracing = false

@markdrake-dev

@kahest Sorry for not checking those values first. I found out we were configuring it via manifest and here are the values we are using:

        <!-- enable automatic breadcrumbs for user interactions (clicks, swipes, scrolls) -->
        <meta-data android:name="io.sentry.traces.user-interaction.enable" android:value="true" />
        <!-- disable screenshot for crashes (could contain sensitive/PII data) -->
        <meta-data android:name="io.sentry.attach-screenshot" android:value="false" />
        <!-- enable view hierarchy for crashes -->
        <meta-data android:name="io.sentry.attach-view-hierarchy" android:value="true" />

        <!-- enable the performance API by setting a sample-rate, adjust in production env -->
        <meta-data android:name="io.sentry.traces.sample-rate" android:value="1.0" />
        <!-- enable profiling when starting transactions, adjust in production env -->
        <meta-data android:name="io.sentry.traces.profiling.sample-rate" android:value="1.0" />

@AdnanYupi

We also experience the same crash which went under our radar because, for some reason, Firebase Crashlytics didn't catch this crash until we got an email from Google that our crash rate exceeds the device bad behavior threshold of 8.0% on 8 device models affecting 8.46% of installs. Around 2K of our users are affected by this crash and many of them are paid users.

This is pretty bad guys, I hope this will be solved soon; in the meantime we will have to de-integrate Sentry from our Android project.

Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0x<sanitized>, nullptr=(null))

@kahest
Member

kahest commented Sep 17, 2024

@AdnanYupi do you have any more information on whether specific devices, OS versions, etc. are affected?

@AdnanYupi

@kahest Hey, thanks for reaching out. Yeah, I can share the list of affected devices. The majority, around 90%, are Pixel devices and one Samsung device. Pixels are from Pixel 6 to Pixel 8 Pro. For now that Samsung device is kinda irrelevant because we don't have many users using that specific device. Here is the screenshot from the Play Console:
Image

Judging by the rate percentage, Android 13 was affected most of the time, but it may just be that these devices are mostly on Android 13. Not sure.

@aakashchoubey

aakashchoubey commented Sep 17, 2024

We also experience the same crash which went under our radar because, for some reason, Firebase Crashlytics didn't catch this crash until we got an email from Google that our crash rate exceeds the device bad behavior threshold of 8.0% on 8 device models affecting 8.46% of installs. Around 2K of our users are affected by this crash and many of them are paid users.

This is pretty bad guys, I hope this will be solved soon; in the meantime we will have to de-integrate Sentry from our Android project.

Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0x<sanitized>, nullptr=(null))

Same issue with us.
2 primary crashes -

  1. [libc.so] abort
invalid pthread_t 0x<sanitized> passed to pthread_getcpuclockid
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 11317 >>> com.myapp <<<

backtrace:
  #00  pc 0x000000000008d394  /apex/com.android.runtime/lib64/bionic/libc.so (abort+168)
  #01  pc 0x00000000000f5870  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_internal_find(long, char const*)+200)
  #02  pc 0x00000000000f5788  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_internal_gettid(long, char const*)+12)
  #03  pc 0x00000000000f5548  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_getcpuclockid+28)
  #04  pc 0x000000000079e518  /apex/com.android.art/lib64/libart.so (art::Trace::CompareAndUpdateStackTrace(art::Thread*, std::__1::vector<art::ArtMethod*, std::__1::allocator<art::ArtMethod*> >*)+120)
  #05  pc 0x000000000079ec64  /apex/com.android.art/lib64/libart.so (art::Trace::RunSamplingThread(void*)+756)
  #06  pc 0x00000000000f5298  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208)
  #07  pc 0x000000000008ebdc  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68)
  2. [libc.so] __strlen_aarch64
Thread
Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0x<sanitized>, nullptr=(null)) 
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 11990 >>> com.myapp <<<

backtrace:
  #00  pc 0x0000000000096850  /apex/com.android.runtime/lib64/bionic/libc.so (__strlen_aarch64+16)
  #01  pc 0x00000000005b510c  /apex/com.android.art/lib64/libart.so (art::Thread::DumpState(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, art::Thread const*, int)+556)
  #02  pc 0x00000000005b487c  /apex/com.android.art/lib64/libart.so (art::Thread::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, unwindstack::AndroidLocalUnwinder&, bool, bool) const+52)
  #03  pc 0x00000000005b6814  /apex/com.android.art/lib64/libart.so (art::DumpCheckpoint::Run(art::Thread*)+216)
  #04  pc 0x000000000054eeb0  /apex/com.android.art/lib64/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*, bool)+684)
  #05  pc 0x00000000005b6148  /apex/com.android.art/lib64/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+292)
  #06  pc 0x0000000000933e24  /apex/com.android.art/lib64/libart.so (art::AbortState::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+204)
  #07  pc 0x000000000093023c  /apex/com.android.art/lib64/libart.so (art::Runtime::Abort(char const*)+712)
  #08  pc 0x00000000000160fc  /apex/com.android.art/lib64/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_0::__invoke(char const*)+80)
  #09  pc 0x00000000000156d0  /apex/com.android.art/lib64/libbase.so (android::base::LogMessage::~LogMessage()+516)
  #10  pc 0x00000000005b74ec  /apex/com.android.art/lib64/libart.so (art::Thread::~Thread()+1512)
  #11  pc 0x000000000030b2b4  /apex/com.android.art/lib64/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+708)
  #12  pc 0x000000000063eec8  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+2208)
  #13  pc 0x0000000000104fc4  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208)
  #14  pc 0x000000000009e764  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68)

We had to roll back Sentry Integration.
Our app is a mixture of Android + React Native.

Initialization was done like this -

// Android's Application Class
SentryAndroid.init(this)

// React Native's App.tsx
Sentry.wrap(App);
Sentry.init({
    dsn: SENTRY_DSN,
    release: version,
    tracesSampleRate: 1.0,
    environment: environment,
});

I'm willing to discuss this further over a call. Pinged @kahest over twitter DM.

@aakashchoubey

Our hunch is that there's no user impact, as there were no escalations on two of our apps.
The crash most likely happens when the user returns to the app from the background; the app then simply relaunches from the starting screen.

Android SDKs affected -
Android 14 (SDK 34)
Android 13 (SDK 33)
Android 12 (SDK 31)
Android 12L (SDK 32)

Device Distribution
We're seeing more OnePlus devices (25.7% for strlen crash and 18.4% for abort crash)
Samsung R8Q has 11.8% contribution in abort crash
But since numbers in specific devices are low, it could be just how some devices are more popular in the regions we serve (India, UAE, SGP)

Issue was not caught by Firebase but Sentry was able to catch it. Most likely we need Firebase NDK for firebase to be able to catch it.
Disabling Sentry's init behind a flag helped bring a flatline - so we are sure it's Sentry SDK.
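
One way to act on the OS versions listed above is to switch profiling off only on those versions. The sketch below is a rough heuristic, not a confirmed fix: `profilesSampleRateFor` and `REPORTED_AFFECTED_SDKS` are hypothetical names, the range is taken only from reports in this thread, and the SDK level is merely a proxy, since both the bug and its fix ship through system updates. Note also that one commenter above reports crashes even with `profilesSampleRate = 0.0`.

```kotlin
// SDK levels reported as affected in this thread: Android 12 (31), 12L (32), 13 (33), 14 (34).
val REPORTED_AFFECTED_SDKS: IntRange = 31..34

// Hypothetical helper: returns 0.0 on reported-affected OS versions, else the desired rate.
fun profilesSampleRateFor(sdkInt: Int, desiredRate: Double = 1.0): Double =
    if (sdkInt in REPORTED_AFFECTED_SDKS) 0.0 else desiredRate

fun main() {
    println(profilesSampleRateFor(30)) // below the reported range: keeps the desired rate
    println(profilesSampleRateFor(33)) // Android 13: profiling disabled
}
```

In an app this could be wired in as `options.profilesSampleRate = profilesSampleRateFor(Build.VERSION.SDK_INT)` inside the `SentryAndroid.init` callback.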

@kahest
Member

kahest commented Sep 17, 2024

@aakashchoubey thanks for the report. Please note that the first crash in your previous post is already tracked in a separate issue: #2604

Also based on the init snippets you shared, both User Interaction Tracing and Profiling are disabled in your app - can you double-check this please? About "Disabling Sentry's init behind a flag helped bring a flatline" - where do you see this flatline?

@aakashchoubey

aakashchoubey commented Sep 17, 2024

Hey @kahest, thanks for the reply.
I'm aware and have been tracking #2604. Would you recommend that I update this there as well, or start a new issue altogether?

both User Interaction Tracing and Profiling are disabled in your app
I think I missed the manifest part, adding here -

<meta-data android:name="io.sentry.auto-init" android:value="false" />
<!-- Required: set your sentry.io project identifier (DSN) -->
<meta-data
    android:name="io.sentry.dsn"
    android:value="myDSN"/>

<!-- enable automatic breadcrumbs for user interactions (clicks, swipes, scrolls) -->
<meta-data
    android:name="io.sentry.traces.user-interaction.enable"
    android:value="true" />

<!-- enable view hierarchy for crashes -->
<meta-data
    android:name="io.sentry.attach-view-hierarchy"
    android:value="true" />

<!-- enable the performance API by setting a sample-rate, adjust in production env -->
<meta-data
    android:name="io.sentry.traces.sample-rate"
    android:value="1.0" />
<!-- enable profiling when starting transactions, adjust in production env -->
<meta-data
    android:name="io.sentry.traces.profiling.sample-rate"
    android:value="1.0" />
<!-- enable app start profiling -->
<meta-data
    android:name="io.sentry.traces.profiling.enable-app-start"
    android:value="true" />

So as we can see, the flag is on natively.

Disabling Sentry's init behind a flag helped bring a flatline

Sure, adding a screenshot.
Image
So this is the screenshot for [libc.so] abort issue.
We had disabled a flag from backend, which would stop the init method call. This brought a flatline in the events, since the SDK had stopped initializing.
Timeline - Sentry was rolled out on 29th July, flag disabled on 9th August and enabled back on 31st August.
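
The flag-gated initialization described above could be sketched roughly as follows. The `RemoteFlags` interface, its `isSentryEnabled` property and the `MyApplication` class are hypothetical stand-ins for whatever backend flag mechanism is actually in use; the sketch relies on `io.sentry.auto-init` being set to `false` in the manifest, as in the snippet shared earlier in this comment.

```kotlin
import android.app.Application
import io.sentry.android.core.SentryAndroid

// Hypothetical flag source; in the setup described above the value comes from a backend.
interface RemoteFlags {
    val isSentryEnabled: Boolean
}

class MyApplication : Application() {
    // Placeholder implementation; replace with a real remote-config lookup.
    private val flags: RemoteFlags = object : RemoteFlags {
        override val isSentryEnabled: Boolean = false
    }

    override fun onCreate() {
        super.onCreate()
        // With auto-init disabled in the manifest, the SDK only starts if this call runs.
        if (flags.isSentryEnabled) {
            SentryAndroid.init(this)
        }
    }
}
```

With the flag off, the SDK never initializes, which would match the flatline in events described here.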

Few additional observations that I have -
The crash [libc.so] __strlen_aarch64 did not happen for us in initial rollout.
When we enabled the flag again, after the release on 31st August, we had updated the init method -

// before 31st August
Sentry.init({
	dsn: SENTRY_DSN,
	release: version,
	tracesSampleRate: 1.0,
});

// after 31st August
Sentry.init({
    dsn: SENTRY_DSN,
    release: version,
    tracesSampleRate: 1.0,
    environment: environment,  // where environment = 'production'
});

So, is it possible that adding the environment flag triggered this? Or maybe it could just be that the numbers were low.

There are two different stacktraces for this crash.

// first stacktrace
  #11  pc 0x000000000030b2b4  /apex/com.android.art/lib64/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+708)
  #12  pc 0x000000000063eec8  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+2208)
  #13  pc 0x0000000000104fc4  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208)

// second stacktrace
  #11  pc 0x000000000030b2b4  /apex/com.android.art/lib64/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+708)
  #12  pc 0x000000000063eec8  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+2208)
>>  #13  pc 0x000000000063e618  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallbackWithUffdGc(void*)+8)
  #14  pc 0x0000000000104fe4  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208)

Also there's another similar crash [libc.so] strlen_a15 - here's the stacktrace

Thread
Check failed: tlsPtr_.method_trace_buffer == nullptr (tlsPtr_.method_trace_buffer=0xb1f7f6c0, nullptr=(null)) 
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 32007 >>> com.myapp <<<

backtrace:
  #00  pc 0x000000000005f62c  /apex/com.android.runtime/lib/bionic/libc.so (strlen_a15+72)
  #01  pc 0x0000000000520b47  /apex/com.android.art/lib/libart.so (art::Thread::DumpState(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, art::Thread const*, int)+2310)
  #02  pc 0x0000000000535a6f  /apex/com.android.art/lib/libart.so (art::DumpCheckpoint::Run(art::Thread*)+646)
  #03  pc 0x0000000000530f29  /apex/com.android.art/lib/libart.so (art::ThreadList::RunCheckpoint(art::Closure*, art::Closure*, bool)+560)
  #04  pc 0x000000000053030b  /apex/com.android.art/lib/libart.so (art::ThreadList::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, bool)+1022)
  #05  pc 0x00000000004f742d  /apex/com.android.art/lib/libart.so (art::AbortState::Dump(std::__1::basic_ostream<char, std::__1::char_traits<char> >&) const+188)
  #06  pc 0x00000000004e59dd  /apex/com.android.art/lib/libart.so (art::Runtime::Abort(char const*)+1316)
  #07  pc 0x000000000000e0f1  /apex/com.android.art/lib/libbase.so (android::base::SetAborter(std::__1::function<void (char const*)>&&)::$_0::__invoke(char const*)+48)
  #08  pc 0x000000000000d965  /apex/com.android.art/lib/libbase.so (android::base::LogMessage::~LogMessage()+332)
  #09  pc 0x0000000000524521  /apex/com.android.art/lib/libart.so (art::Thread::~Thread()+1368)
  #10  pc 0x0000000000534eb1  /apex/com.android.art/lib/libart.so (art::ThreadList::Unregister(art::Thread*, bool)+596)
  #11  pc 0x0000000000518001  /apex/com.android.art/lib/libart.so (art::Thread::CreateCallback(void*)+1784)
  #12  pc 0x00000000000ad143  /apex/com.android.runtime/lib/bionic/libc.so (__pthread_start(void*)+40)
  #13  pc 0x00000000000642dd  /apex/com.android.runtime/lib/bionic/libc.so (__start_thread+30)

Hope this helps. Let me know if you need any further info on these.

@kahest
Member

kahest commented Sep 17, 2024

@aakashchoubey thanks for the details - let me answer one-by-one.

I'm aware and have been tracking #2604. Would you recommend that I update this there as well, or start a new issue altogether?

#2604 looks similar in many respects, but is a different root cause most likely, so we're trying not to conflate the two. If you have new info for the crash with pthread_getcpuclockid in the backtrace, please add it to #2604. No need to create a new issue.

So, is it possible that adding the environment flag triggered this? Or maybe it could just be that the numbers were low.

Environment should not affect this in any way, we can rule this out.

There are two different stacktraces for this crash.

These are almost identical, a difference in the 13th/14th frame is most likely not relevant, but thanks for pointing it out 👍

@sagarbhojaviya

@kahest can you please share the release plan for a fix for this bug?

@kahest
Member

kahest commented Sep 20, 2024

@sagarbhojaviya please see the updates at the top. There is currently no way for us to fix this on the SDK side, it will most likely require a fix inside of the Android Tracer. We will keep this issue updated.

@sagarbhojaviya

sagarbhojaviya commented Sep 23, 2024

@kahest can you please help me with this?

I have the meta-data below in my Android manifest file. Which tag do I need to add, or which existing meta-data value do I need to change, to stop this issue temporarily?

    <meta-data
        android:name="io.sentry.traces.user-interaction.enable"
        android:value="true" />

    <!-- enable screenshot for crashes -->
    <meta-data
        android:name="io.sentry.attach-screenshot"
        android:value="true" />

    <!-- enable view hierarchy for crashes -->
    <meta-data
        android:name="io.sentry.attach-view-hierarchy"
        android:value="true" />

    <!-- enable the performance API by setting a sample-rate, adjust in production env -->
    <meta-data
        android:name="io.sentry.traces.sample-rate"
        android:value="1.0" />

    <!-- enable profiling when starting transactions, adjust in production env -->
    <meta-data
        android:name="io.sentry.traces.profiling.sample-rate"
        android:value="1.0" />

    <meta-data
        android:name="io.sentry.anr.timeout-interval-mills"
        android:value="5000" />

    <!-- Required: set your sentry.io project identifier (DSN) -->
    <meta-data
        android:name="io.sentry.dsn"
        android:value="${dsnValueSentry}" />

@romtsn
Member

romtsn commented Sep 23, 2024

@sagarbhojaviya it's this:

  <meta-data
        android:name="io.sentry.traces.profiling.sample-rate"
        android:value="0.0" />
