Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGSEGV in MapRenderer #2579

Closed
kodebach opened this issue Jul 5, 2024 · 17 comments
Closed

SIGSEGV in MapRenderer #2579

kodebach opened this issue Jul 5, 2024 · 17 comments
Labels
android bug Something isn't working

Comments

@kodebach
Copy link
Contributor

kodebach commented Jul 5, 2024

We are seeing an issue similar to #2206 in our app. The crash occurs a lot, but so far we have not heard from users that it impacts them. So the crash might occur in the background or while leaving the app.

This is the current crash statistic from our Sentry instance

device model crashes in last 90 days
SM-A600FN 108
SM-A336B 39
SM-A405FN 36
SM-A325F 56
SM-A320FL 212
SM-G525F 584

To frame the number of crashes seen per device: This is a private app deployed to fewer than 500 devices, all of them Samsung phones. The list above is the complete list of devices in use, so as far as we can tell, it is not hardware specific (unless it is a general Samsung issue).

With Maplibre 11 we finally got a more detailed stack trace

OS Version: Android 13 (TP1A.220624.014.G525FXXSACXCA)
Report Version: 104

Exception Type: Unknown (SIGSEGV)

Application Specific Information:
Segfault

Thread 0 Crashed:
0   split_config.arm64_v8a.apk      0x70a5f1d3ec        std::__ndk1::__split_buffer<T>::~__split_buffer
1   split_config.arm64_v8a.apk      0x70a5dfcacc        mbgl::android::MapRenderer::render
2   split_config.arm64_v8a.apk      0x70a5dff5fc        jni::MakeNativeMethod<T>::lambda::__invoke<T>
3   base.odex                       0x70bea5c2d4        <unknown> + 484234871508

Not sure if that is helpful in any way, __split_buffer seems to be a an internal class of the C++ stdlib used in std::vector and possibly others. But I did not see any std::vector instances in MapRenderer.

Sentry also shows the __split_buffer frame as (although I'm not sure how it got that information)

std::__ndk1::__split_buffer<mbgl::util::DefaultStyle, std::__ndk1::allocator<mbgl::util::DefaultStyle>&>::~__split_buffer()

That would point to

std::vector<mbgl::util::DefaultStyle> defaultStyles;

But I did not check how this is connected to MapRenderer (if at all).

Originally posted by @kodebach in #2206 (comment)

@louwers
Copy link
Collaborator

louwers commented Jul 11, 2024

Is this already a symbolicated stack trace? Because I don't see any line numbers.

https://github.com/maplibre/maplibre-native/wiki/Symbolicating-Crash-Reports-MapLibre-Native-(Android)

@kodebach
Copy link
Contributor Author

No, I did not manually symbolicate the stack trace. I'll try doing that.

Is there a reason you publish the debug symbols separately instead of including them in the AAR on Maven? AFAIK the debug symbols are stripped by AGP anyway during a release build, but when building an app bundle AGP include the debug symbols separately so Google Play (and others) can use them to symbolicate traces automatically.

@louwers
Copy link
Collaborator

louwers commented Jul 11, 2024

Is there a reason you publish the debug symbols separately instead of including them in the AAR on Maven?

I remember looking into it, but I couldn't find a way to do it. If you manage to find a way, a PR would be welcome.

@kodebach
Copy link
Contributor Author

kodebach commented Jul 17, 2024

Not sure whether this is the same crash, but we now managed to get a full symbolicated stack trace:

OS Version: Android 14 (UP1A.231005.007.A336BXXS9EXE2)
Report Version: 104

Exception Type: Unknown (SIGSEGV)

Application Specific Information:
Segfault

Thread 0 Crashed:
0   split_config.arm64_v8a.apk      0x7391afc3ec        [inlined] mbgl::gfx::BackendScope::activate (backend_scope.cpp:55)
1   split_config.arm64_v8a.apk      0x7391afc3ec        mbgl::gfx::BackendScope::BackendScope (backend_scope.cpp:31)
2   split_config.arm64_v8a.apk      0x73919dbacc        mbgl::android::MapRenderer::render (map_renderer.cpp:186)
3   split_config.arm64_v8a.apk      0x73919de5fc        [inlined] jni::MakeNativeMethod<T>::lambda::operator()<T> (native_method.hpp:57)
4   split_config.arm64_v8a.apk      0x73919de5fc        jni::MakeNativeMethod<T>::lambda::__invoke<T> (native_method.hpp:53)
5   base.odex                       0x73b3bda2b4        <unknown> + 496936788660

Registers contents:

fp 0x000000743b3db6b0 x5 0x000000000ac1d7d2 x14 0x000000000ac1d7d2 x23 0x0000000000000001
lr 0x0000007391afc378 x6 0x00000000000003cc x15 0x000008a9d6fb38d6 x24 0xb400007497c06800
pc 0x0000007391afc3ec x7 0x00000000000001a1 x16 0x0000007391f7b300 x25 0x000000005b02c5a8
sp 0x000000743b3db6b0 x8 0x0000000000000000 x17 0x00000076e36f5460 x26 0x0000000012d09e50
x0 0x0000000000000000 x9 0x000000743b3dc040 x18 0x00000072f544a000 x27 0x0000000000000000
x1 0x0000000000000000 x10 0x0000000000000001 x19 0x000000743b3db718 x28 0x00000000130ec3a0
x2 0x0000000000000001 x11 0x0000000000000001 x20 0x0000000000000000    
x3 0x000008a9d6f99d45 x12 0x0000007715d63030 x21 0x0000000000000001    
x4 0x0100000000000000 x13 0x000000007fffffff x22 0x0000007391f870c0    

Not sure whether that helps. AFAICT the crash happens here

but backend is a RendererBackend& so it shouldn't be null. However activate is declared as

virtual void activate() = 0;

So maybe that causes the segfault.

PS. just to clarify, we're currently using version 11.0.1

@louwers
Copy link
Collaborator

louwers commented Jul 17, 2024

That is a pure virtual method which you cannot call.

On Android its implementation is a no-op:

So it can only crash there if RendererBackend& is a dangling reference.

@kodebach
Copy link
Contributor Author

I had another look at the logs for the crashes. It seems they happen exclusively after the map is no longer visible. Most crashes happen shortly (1 second or less) after Activity.onPause or Activity.onDestroy is called. That's the case for all crashes were we have the fully symbolicated stack trace.

For the older crashes, I can't be certain, it may be that some of the time the Activity is still alive, but the user switched screens. However, I suspect it the Activity is always closed/backgrounded.

The crash being related to the Activity closing/going into background would certainly explain, why we have lots of automated crash reports, but zero complaints from users.

@louwers
Copy link
Collaborator

louwers commented Jul 18, 2024

Should be fixed by #2631

Will be included in the next release.

@louwers louwers closed this as completed Jul 18, 2024
@kodebach
Copy link
Contributor Author

It seems the issue is not fully fixed, with version 11.1.0 we now get this stack trace for the segfault:

OS Version: Android 14 (UP1A.231005.007.G525FXXSCDXG2)
Report Version: 104

Exception Type: Unknown (SIGSEGV)

Application Specific Information:
Segfault

Thread 0 Crashed:
0   split_config.arm64_v8a.apk      0x764aa06668        [inlined] std::__ndk1::unique_ptr<T>::operator-> (memory:2651)
1   split_config.arm64_v8a.apk      0x764aa06668        mbgl::Renderer::render (renderer.cpp:35)
2   split_config.arm64_v8a.apk      0x764a84a928        mbgl::android::MapRenderer::render (map_renderer.cpp:197)
3   split_config.arm64_v8a.apk      0x764a84d47c        [inlined] jni::MakeNativeMethod<T>::lambda::operator()<T> (native_method.hpp:57)
4   split_config.arm64_v8a.apk      0x764a84d47c        jni::MakeNativeMethod<T>::lambda::__invoke<T> (native_method.hpp:53)
5   base.odex                       0x76641352d4        <unknown> + 508485128916

I think #2631 just fixed symptoms of a bigger issue, and this bigger issue now causes a crash somewhere else. The bigger issue AFAICT being that MapLibre tries to render after the Android has disposed of the Surface or some other resources.

It may also be the case, that we are not correctly calling MapLibre's lifecycle methods in our app, but at a quick glance it seems we do what the docs tell us to.

@louwers louwers reopened this Jul 29, 2024
@kodebach
Copy link
Contributor Author

Is there any update on this issue? Do you need any additional crash data? We're now seeing this crash at a rate of about 5-10 per day.

@louwers
Copy link
Collaborator

louwers commented Aug 27, 2024

@kodebach It's really weird that there are not more people affected by this. Any chance you can share how you are using the library? A reproducer would be ideal...

In any case, thanks for the reminder, I can put in some more defensive programming there.

@kodebach
Copy link
Contributor Author

We haven't been able to reproduce it in testing ourselves, so I can't provide a reproducer.

We are using Jetpack Compose with our own wrapper similar to https://github.com/ramani-maps/ramani-maps (the library didn't exist when we switched to Compose). Maybe there is a subtly bug there, but AFAICT we do the same Lifecycle mapping as ramani-maps.

I will check the code again, and maybe investigate switching to ramani-maps. AFAIK it doesn't have any known lifecycle issues.

@kodebach
Copy link
Contributor Author

We tried switching to ramani-maps, but it doesn't currently support all the features we need. However, we have updated our lifecycle logic with a mix of code from ramani-maps and https://github.com/googlemaps/android-maps-compose. We are now less aggressive with calling MapView.onDestroy and instead accept possible memory leaks in some edge cases.

Let's see if this fixes the issue.

@kodebach
Copy link
Contributor Author

Seems that it indeed was an error in our lifecycle code. We've not seen any crashes on the new versions.

@louwers
Copy link
Collaborator

louwers commented Sep 23, 2024

@kodebach Well, even if you mess up, that still shouldn't result in a crash.

I'm happy your issue is resolved, but it would be interesting to know what was happening here.

@kodebach
Copy link
Contributor Author

AFAICT in some situation we called MapView.onDestory twice. We relied on a DisposableEffect from compose, but used an extra key parameter which meant the onDispose callback and there for MapView.onDestroy could be called, even if the MapView instance was the same. Our new solution uses the onRelease lambda of the AndroidView composable (which didn't exist at the time the old code was written). Using onRelease is a lot more reliable, because it is guaranteed to be called exactly once when the the View instance is disposed forever, which exactly matches the contract for MapView.onDestroy.

@elican-doenyas
Copy link

@louwers I experienced the same issue when I upgraded to use the latest map libre android sdk (v11.6.1). Looking into the issue, there seems to be a nullptr dereference on MapRenderer:render method. I believe this happens when the style.json does not include a renderer property in its metadata. I was able to resolve the issue by adding;

"maputnik:renderer": "mbgljs"

value to the metadata section of the style.json I am using.

Can you please verify this and possibly introduce a fix so it does not crash in the absence of this property? We were not aware that it was required as the metadata is listed as 'optional' in the style specs here: https://maplibre.org/maplibre-style-spec/root/#metadata

Note 1: Previously, our company was using the v10.2.0 maplibre android sdk, and it was working fine without this property in the style.

Note 2: The relevant crash log is attached.
maplibre-nullptr.txt

@louwers
Copy link
Collaborator

louwers commented Nov 20, 2024

@elican-doenyas That sounds like a separate issue. Could you open an issue and ideally share the style that is causing a crash?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
android bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants