Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Send Android Telemetry Events At App Close #4066

Closed
bleege opened this issue Feb 22, 2016 · 17 comments
Closed

Send Android Telemetry Events At App Close #4066

bleege opened this issue Feb 22, 2016 · 17 comments
Assignees
Labels
Android Mapbox Maps SDK for Android telemetry Integration with Mapbox Telemetry libraries

Comments

@bleege
Copy link
Contributor

bleege commented Feb 22, 2016

Currently when an app is terminated either via the OS or directly by user only the location listening service and wakelock are removed. This should be updated to also include flushing the events queue as it's currently only sending every 2 minutes potentially stranding a lot of data.

Related Issue: #2421

@bleege bleege added Android Mapbox Maps SDK for Android telemetry Integration with Mapbox Telemetry libraries labels Feb 22, 2016
@bleege bleege self-assigned this Feb 22, 2016
@bleege bleege added this to the android-v4.0.0 milestone Feb 22, 2016
@bleege bleege mentioned this issue Feb 22, 2016
3 tasks
@bleege
Copy link
Contributor Author

bleege commented Feb 22, 2016

Tried testing the initial logic, but only the following printed out. The staging server has now been converted to V2 so I'll pause this work until the data model is V2 ready.

02-22 14:24:40.538 19155-19155/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...
02-22 14:24:40.539 19155-19155/com.mapbox.mapboxsdk.testapp D/MapboxEventManager: turnstile event pushed.
02-22 14:24:40.540 19155-19208/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: FlushTheEventsTask.doInBackground() called...
02-22 14:24:40.541 19155-19208/com.mapbox.mapboxsdk.testapp D/MapboxEventManager: original = 1824d76a-9611-4a46-9a9c-a4544f85be6d; hex = 02225E39096914AA24BCEDB5243AB3CE46CF5315
02-22 14:24:40.554 19155-19208/com.mapbox.mapboxsdk.testapp D/MapboxEventManager: original = 1824d76a-9611-4a46-9a9c-a4544f85be6d; hex = 02225E39096914AA24BCEDB5243AB3CE46CF5315
02-22 14:24:40.562 19155-19208/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: url = https://cloudfront-staging.tilestream.net/events/v1?access_token=sk.eyJ1IjoiYmxlZWdlIiwiYSI6InNpcml1c2x5In0.KyT-boMyC_xZYTYojTc8zg
02-22 14:24:41.283 19155-19208/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: Response Code from Mapbox Events Server: 422 for 2 events sent in.
02-22 14:24:50.459 19155-19411/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: FlushTheEventsTask.doInBackground() called...
02-22 14:24:50.460 19155-19411/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: No events in the queue to send so returning.
02-22 14:25:55.783 19155-19155/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...
02-22 14:25:55.784 19155-20499/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: FlushTheEventsTask.doInBackground() called...
02-22 14:25:55.785 19155-20499/com.mapbox.mapboxsdk.testapp D/MapboxEventManager: original = 1824d76a-9611-4a46-9a9c-a4544f85be6d; hex = 02225E39096914AA24BCEDB5243AB3CE46CF5315

@mb12
Copy link

mb12 commented Feb 22, 2016

@bleege If an application acquires a wake lock, it needs to be declared in the application's manifest file. Google Play will show the following under Permissions for that app.

  • Prevent device from sleeping

IMHO this is too high a permission for telemetry or any component that only makes REST calls to a server.

@bleege
Copy link
Contributor Author

bleege commented Feb 22, 2016

If an application acquires a wake lock, it needs to be declared in the application's manifest file.

Correct. There are a variety of wake locks available on the Android platform, and the one currently in use is the least obtrusive PARTIAL_WAKE_LOCK which is limited to CPU. This allows the screen to fully sleep. The need for the wake lock is so that Location changes can be listened for when the screen sleeps.

@mb12
Copy link

mb12 commented Feb 23, 2016

Typical running/fitness apps acquire partial_wake_lock and register as Location Receiver . And they always end up as top battery draining apps under Android settings. Is there any reason you expect this to be different with telemetry turned on?

We should pause telemetry collection when the Activity/Fragment containing the MapView receives onPause event. It is more useful to collect data when user is actually interacting with the application containing the map(not when one moves away from the app by tapping on home button). This is what Mapbox iOS SDK does as well. On iOS, the OS enforces it by adding a blue bar at the top of the home screen indicating that the app is using background location data to help developers avoid exactly these kind of bugs.

@bleege
Copy link
Contributor Author

bleege commented Feb 23, 2016

Is there any reason you expect this to be different with telemetry turned on?

Yes. The use of any hardware (GPS chip, Accelerometer, cellular modem, etc) in a device over time is typically the driver of battery use. The SDK switches to NO_POWER mode in the LOST Location library when the app is no longer the actively visible one on the device (aka "goes to the background"). This translates to using the stock Android LocationManager's PASSIVE_PROVIDER which does not initiate any requests for location data and therefore no calls to device hardware to provide it. It simply is notified of the new data when other apps or the OS get updated location data and then does minimal processing.

bleege added a commit that referenced this issue Feb 25, 2016
…ng Thread.sleep() to ensure it runs at shutdown
@bleege
Copy link
Contributor Author

bleege commented Feb 25, 2016

I found a way to to ensure, at least in local testing, that the app stays alive long enough for the events to be sent to the server when the application is stopped. This is done by putting a Thread.sleep(1000) command before the WAKE_LOCK is released. Granted the HTTP request happens on a different thread, but in practice the test location data that is only sent during shutdown always makes it to the server. The end result is that the Choreographer does complain about skipping a few frames in the logs, but there is nothing noticeable to the user in the device's UI. In fact the app closes and goes away just as fast as it always seems to.

The benefits of this technique is that HTTP requests with Telemetry data still only send every 2 minutes to minimize network traffic (and the underlying cellular modem hardware working), but safeguards the data by sending it if the app is shutdown in between the 2 minute cycle. Again, this only happens when the TelemetryService receives an onDestroy() for when the OS kills the process or an onTaskRemoved() when the user terminates the app via the app switcher.

Still before merging this in I want to consult with my colleagues @zugaldia @tobrun to get their thoughts on this as well as to see if they have any other suggestions. Assuming they're good with this I'll remove the logging and the test data from the new development branch and then commit.

@tobrun
Copy link
Member

tobrun commented Feb 26, 2016

@bleege I haven't got the whole global picture around telemetry or know about all the requirements but here are some ideas/thoughts on this:

Android Service and the main thread

An normal Android service runs on the main thread and this is why you are seeing skipped frames. The reason why you feel that the app closes as fast as always, is because the device is doing a lot of work when switching apps or killings apps and as a result it's doing a lot of GC'ing. That said, this isn't giving us the bandwidth to add a sleep on the main thread.

Holding the wakelock until work is done

if the primary reason for the sleep is to hold the wakelock until telemetry data has been send.
Than we should use a callback interface that is invoked from the onPostExecute from the AsyncTask. With this construction we are actually holding the wakelock for the actual amount needed to send the data. For example: you are now testing on fast network and data is being send within that 1 second timeframe, but if you are on a slow network, we will not receive the data.

Side note: used components

I feel this is worth mentioning: We are now using a normal Service + BroadcastReceiver + Asynctask.. I think we should try out a WakeFullBroadCastReceiver with an IntentService as highlighted by the Android docs.

Side note: batching/battery

Batching http request is a good way to minimise traffic and safe battery but sending these batches every 2 minutes is throwing all these benefits. The minimum time slot should be 10 minutes and preferable much much longer (10 minutes is needed to go from an active - stand by - sleep state of the cellular radio).

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

Thanks for the 👀 and insight @tobrun! To clarify the only thing that that TelemetryService does is handle the start / stop of the BroadcastReceiver in concert with the app starting and stopping so it's impact on the main thread is pretty minimal. This is done to make sure that location listening code stops when the app stops. It's also the main reason for using the AsyncTask to actually send the data because we don't want to block the main thread during regular activity.

if the primary reason for the sleep is to hold the wakelock until telemetry data has been send.

Nope, the primary reason for the (PARTIAL) wakelock is so that Location updates will continue when the screen sleeps. I think it's just coincidental in this case where I put the Thread.sleep() experiment.

I totally agree with the idea that putting a Timer.sleep() anywhere in use, especially on the main thread, is generally "not a good thing™". TBH, I'm not really happy with this solution which is why I wanted to touch base with @tobrun @zugaldia on this. That said, in this case it literally is one of the last things that the app would do as it's in the process of being shutdown so there's no UX nor UI degradation at all. The stuff I mentioned about the Choreographer was just log messages in LogCat... ie, something that the end user never sees. It's also something that bigger picture wise doesn't matter as the app's process is going to be terminated. Still, my inner programmer doesn't like it. 😄

In talking with @zugaldia this morning, he suggested trying to use a Handler as that might be a way to ensure that the actual send data process completes. I'll investigate that next to see how it plays. Hopefully it works as it seems like a more legitimate solution.

Side Notes

I think we should try out a WakeFullBroadCastReceiver with an IntentService

I agree. Let's look at this in the next iteration.

The minimum time slot should be 10 minutes and preferable much much longer (10 minutes is needed to go from an active - stand by - sleep state of the cellular radio).

Interesting. Is there a source for this information? I'd love to know more about how this work. Regardless, if we did go to 10 minutes the whole "flush at app close" scenario would become even more important as there'd likely be more data at stake.

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

I just tried converting to use a Handler.postAtFrontOfQueue() and based on the logs I see that it ran, but the test data never made it to server.

I'll try some other experiments, but will need to timebox this process.

02-26 11:17:32.698 26122-26157/com.mapbox.mapboxsdk.testapp I/mbgl: {Map}[Android]: Not deactivating as we are not ready
02-26 11:17:32.711 26122-26122/com.mapbox.mapboxsdk.testapp I/TelemetryService: shutdownTelemetry()
02-26 11:17:32.715 26122-26122/com.mapbox.mapboxsdk.testapp I/TelemetryService: done with shutdownTelemetry()...
02-26 11:17:32.716 26122-26122/com.mapbox.mapboxsdk.testapp I/TelemetryService: Handler.run() started
02-26 11:17:32.716 26122-26122/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...

@tobrun
Copy link
Member

tobrun commented Feb 26, 2016

@bleege

Is there a source for this information?

Yes, there are written sources but videos with Reto Meier are way cooler

he suggested trying to use a Handler

This is indeed the general accepted alternative to Thread.sleep.

test data never made it to server

I believe the reason why this isn't working is because you are posting at front of the queue and that message will be immediately executed. The alternative to Thread.sleep(1000) would be Handler.postDelayed(runnable,1000).

I'll try some other experiments, but will need to timebox this process.

Repeating myself a bit, sorry for being a broken record, why timebox when we can get a callback when work is done? If can quickly code up an example if you like.

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

why timebox when we can get a callback when work is done? If can quickly code up an example if you like.

That's just it. I think the app's process is terminating before any callback can be called. If there's no process to run there's nothing to call back to. Happy to be wrong about this though. 😄

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

I believe the reason why this isn't working is because you are posting at front of the queue. The alternative to Thread.sleep(1000) would be Handler.postDelayed(runnable,1000).

If I read the docs right, that will run the job 1 second in the future. The issue in regards to time is making sure that the HTTP request is made before the app's process is killed. The whole thing about using Thread.sleep() was to "distract" the app long enough to get the HTTP request made before the app's process was terminated.

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

I tried another option, which is to put the processing and sending of the Telemetry data into it's own standalone method that can be called immediately as well as by the FlushTheEventsTask AsyncTask. According to the logs below this is looking more promising, as both the FlushTheEventsTask worked and the timing of the immediate sendDataToServer() method looks to be in the right sequence. The only trouble was running into the NetworkOnMainThreadException. I'll see if OkHttp will let me fire this off in another thread.

02-26 13:11:00.720 28003-28003/com.mapbox.mapboxsdk.testapp I/TelemetryService: shutdownTelemetry()
02-26 13:11:00.722 28003-28003/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...
02-26 13:11:00.722 28003-28003/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past events in queue check...
02-26 13:11:00.723 28003-28003/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past network check...
02-26 13:11:00.723 28003-28003/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past json building ...
02-26 13:11:00.724 28003-28003/com.mapbox.mapboxsdk.testapp E/MapboxEventManager: FlushTheEventsTask borked: android.os.NetworkOnMainThreadException
02-26 13:11:00.725 28003-28003/com.mapbox.mapboxsdk.testapp I/TelemetryService: done with shutdownTelemetry()...

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

I tried converting the OkHttp client call to use Callbacks / background thread. While it did indeed solve the NetworkOnMainThreadException issue, I couldn't see that the data actually made it into the server.

02-26 13:30:27.074 29763-29791/com.mapbox.mapboxsdk.testapp I/mbgl: {Map}[Android]: Not deactivating as we are not ready
02-26 13:30:27.094 29763-29763/com.mapbox.mapboxsdk.testapp I/TelemetryService: shutdownTelemetry()
02-26 13:30:27.097 29763-29763/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...
02-26 13:30:27.097 29763-29763/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past events in queue check...
02-26 13:30:27.097 29763-29763/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past network check...
02-26 13:30:27.097 29763-29763/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past json building ...
02-26 13:30:27.099 29763-29763/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: Will try to send 21 in to the server.
02-26 13:30:27.104 29763-29763/com.mapbox.mapboxsdk.testapp I/TelemetryService: done with shutdownTelemetry()...

bleege added a commit that referenced this issue Feb 26, 2016
…ribute as it was causing validation errors
@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

@mick was able to confirm that the reason for the location events not showing up earlier was that there was that it was a validation error as the Application State value was being sent in as an empty string. I fixed that by adding a check to make it truly optional and then saw location events show up again (along with other map interactive events).

I then tried the usual test of quitting the app and seeing if the test quit location would show up. The same log trace appeared (see below), but I didn't see the event show up in the database. I worked with @mick to see if there was any validation errors on the server and he didn't see any. This tells me that that the HTTP request was never sent because the process was terminated first.

02-26 13:58:12.836 4357-4376/com.mapbox.mapboxsdk.testapp E/libEGL: call to OpenGL ES API with no current context (logged once per thread)
02-26 13:58:12.836 4357-4376/com.mapbox.mapboxsdk.testapp I/mbgl: {Map}[Android]: Not deactivating as we are not ready
02-26 13:58:12.850 4357-4357/com.mapbox.mapboxsdk.testapp I/TelemetryService: shutdownTelemetry()
02-26 13:58:12.853 4357-4357/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...
02-26 13:58:12.854 4357-4357/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past events in queue check...
02-26 13:58:12.854 4357-4357/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past network check...
02-26 13:58:12.854 4357-4357/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past json building ...
02-26 13:58:12.856 4357-4357/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: Will try to send 1 in to the server.
02-26 13:58:12.862 4357-4357/com.mapbox.mapboxsdk.testapp I/TelemetryService: done with shutdownTelemetry()...

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

Ok, so I tried the Handler, separating the send code into a single method on the same thread, and using the use of Callback all to no avail. I could never see the shutdown test location show up in the database nor in the validation logs.

On account of these attempts I'm not going to let the perfect be the enemy of the good and am going to go with the Thread.sleep() approach. If we can come up with a better solution in the future I'm all for it. However, in the mean time working code > no code. Sure enough after re-implementing it and running it I got the following log trace and more importantly saw the shutdown test location show up in the database.

02-26 14:43:02.755 7619-8205/com.mapbox.mapboxsdk.testapp I/mbgl: {Map}[Android]: Not deactivating as we are not ready
02-26 14:43:02.768 7619-7619/com.mapbox.mapboxsdk.testapp I/TelemetryService: shutdownTelemetry()
02-26 14:43:02.771 7619-7619/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...
02-26 14:43:02.773 7619-10364/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: FlushTheEventsTask.doInBackground()...
02-26 14:43:02.773 7619-10364/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past events in queue check...
02-26 14:43:02.773 7619-10364/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past network check...
02-26 14:43:02.774 7619-10364/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: past json building ...
02-26 14:43:03.335 7619-10364/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: Server Response: 204 from sending 1
02-26 14:43:03.775 7619-7619/com.mapbox.mapboxsdk.testapp I/TelemetryService: done with shutdownTelemetry()...
02-26 14:43:03.776 7619-7619/com.mapbox.mapboxsdk.testapp I/Choreographer: Skipped 60 frames!  The application may be doing too much work on its main thread.

@bleege
Copy link
Contributor Author

bleege commented Feb 26, 2016

Re-implemented the flush events when app terminates from 9fc37c2 on new branch of master as it was cleaner and safer than rebasing. Tested the the code and all events process as expected and when I tried the flush event at app termination my termination sample location data (not checked in) showed up in the database as expected. Considering this done for now.

02-26 15:36:28.994 11646-11687/com.mapbox.mapboxsdk.testapp I/mbgl: {Map}[Android]: Not activating as we are not ready
02-26 15:36:29.043 11646-11687/com.mapbox.mapboxsdk.testapp E/libEGL: call to OpenGL ES API with no current context (logged once per thread)
02-26 15:36:29.043 11646-11687/com.mapbox.mapboxsdk.testapp I/mbgl: {Map}[Android]: Not deactivating as we are not ready
02-26 15:36:29.063 11646-11646/com.mapbox.mapboxsdk.testapp I/TelemetryService: shutdownTelemetry()
02-26 15:36:29.065 11646-11646/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: flushEventsQueueImmediately() called...
02-26 15:36:29.907 11646-16930/com.mapbox.mapboxsdk.testapp I/MapboxEventManager: response code = 204 for events 2
02-26 15:36:30.067 11646-11646/com.mapbox.mapboxsdk.testapp I/Choreographer: Skipped 60 frames!  The application may be doing too much work on its main thread.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Android Mapbox Maps SDK for Android telemetry Integration with Mapbox Telemetry libraries
Projects
None yet
Development

No branches or pull requests

5 participants