Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Eliminate separate "MapContext" thread #2909

Merged
merged 7 commits into from
Apr 14, 2016
Merged

Eliminate separate "MapContext" thread #2909

merged 7 commits into from
Apr 14, 2016

Conversation

jfirebaugh
Copy link
Contributor

@jfirebaugh jfirebaugh commented Nov 17, 2015

We're currently rendering on the MapContext thread. However, when triggering a render call from the main thread, we're also blocking the main thread, so there's no advantage from parallelization here, but we have the overhead of managing the OpenGL context on the MapContext thread. Instead, we should split up Mapbox GL into two parts: the MapContext thread manages the state of the objects, coordinates tile loading and style updates, the Painter thread (which can run either on the MapContext thread, the main thread, or an entirely different thread) that just takes the state the MapContext thread produces, and renders a frame.


TODO list:

  • Eliminate use of ThreadContext for obtaining FileSource / GLObjectStore
    • Restore GLObjectStore, but pass it in wherever needed
  • iOS / OSX
    • Find the right place to call mbgl::gl::InitializeExtensions (fixes "Not using Vertex Array Objects" warning)
    • Convert RunLoop implementation to pure C++
    • Figure out why glfw app exits immediately
  • Android
  • Node

@jfirebaugh jfirebaugh self-assigned this Nov 11, 2015
@jfirebaugh
Copy link
Contributor

I'm not sure the "arbitrary thread" or thread division described above is feasible. Firstly, the styles API will expose virtually all state to the main thread in ways that will make it very difficult to safely or productively share that state with other threads. Once a full read-write API for the style is exposed, every method must essentially be an invokeSync to the thread owning the state, thus negating the point of having a separate thread. Secondly, as we already know, having a separate thread for rendering doesn't really work either due to the need to invokeSync the render method.

Therefore I'd like to reduce the scope of this issue to something simpler: move style state and rendering to the main thread. Though simpler, this will still be a somewhat difficult transition from our current model. Because the main thread already has a native event loop, we cannot continue to uv_run(loop, UV_RUN_DEFAULT) as our current MapContext thread does. We have to either:

  1. Replace all use of the libuv event loop with equivalent platform-specific constructs for cross-thread function invocation, timers, etc.
  2. Figure out how to embed a libuv event loop next to the native event loop on each platform.

Option 1

The first option is to replace our use of a libuv event loop for the map state and rendering with the use of the native event loop of the embedding application. This approach means we would need to replace the following usages of libuv:

  • uv_async used by MapContext to coalesce multiple updates and run them on the next tick
  • uv_timer used in this PR to expand the update coalescing across multiple ticks
  • uv_async used by FileSource implementation to dispatch asynchronous responses back to the requesting thread
  • uv_async used by WorkQueue (via RunLoop) to dispatch worker requests back to the MapContext thread

The use of uv_async looks feasible to replace:

  • For darwin, we can use GCD
  • For Android, I'm sure there's something
  • GLFW is most tricky, since the only cross thread notification mechanism available is glfwPostEmptyEvent. But we could replace every uv_async dispatch with a threadsafe method queue, ensure that each tick of the glfw event loop includes a check of this queue, and call glfwPostEmptyEvent whenever a method is pushed on the queue by another thread.

I don't know how we'd replace uv_timer in glfw. I guess it would involve running a separate thread that sleeps, then posts to the method queue? Seems like a pain. Is there any way we can avoid needing a timer?

Note that this approach does not necessarily mean we'd remove libuv entirely. We'd still continue to use a libuv event loop on the FileSource and Worker threads.

Option 2

The "embed a libuv event loop" approach. The essence of this approach is that on each platform, you insert into the native event loop something that calls uv_run(loop, UV_RUN_NOWAIT) on each native tick. The usual problem that you run into after that is a situation where the libuv loop has something to do, but there aren't any native events to wake up the native event loop so that it can tick the libuv loop. We're somewhat fortunate in having no async IO happening on the MapContext libuv loop: the only situations it has work to do are when a uv_async has been sent, or when the uv_timer goes off. In the former situation, we can post a native event (like glfwPostEmptyEvent) when sending the uv_async, to ensure the native loop wakes up. But then the uv_timer case is, once again, more tricky.

Since even with Option 2, we still have to implement most of the "wakeup" machinery we would need for Option 1, I lean toward Option 1.

Thoughts @kkaefer @tmpsantos @brunoabinader?

@tmpsantos
Copy link
Contributor

I like the option 1 and we already have that working for the Qt port. The Qt port doesn't depend on libuv at all (not entirely true, because we still use uv::rwlock and uv::tls, but that can be replaced with std::).

Pretty much what we need from a platform port is to implement Timer, AsyncTask and the RunLoop which for Qt was just a few lines of code and pretty easy to implement.

IMO that change will make Mapbox GL more "embedded friendly" as we can reduce the binary footprint on systems that have something else already doing what libuv does.

As for the GLFW port, IMO we can use libuv as its main loop on the main thread.

Finally, I pushed a branch for your appreciation. It is not fully functional, not all the tests are passing ATM, but if we reach a consensus here I can move forward with it.

https://github.com/mapbox/mapbox-gl-native/tree/tmpsantos-libuv_abstracted

@jfirebaugh
Copy link
Contributor

As for the GLFW port, IMO we can use libuv as its main loop on the main thread.

How would you do that? AFAICT, with GLFW you have to implement the event loop with glfwWaitEvents.

@tmpsantos
Copy link
Contributor

How would you do that? AFAICT, with GLFW you have to implement the event loop with glfwWaitEvents.

I wouldn't call the blocking glfwWaitEvents() anymore. Instead, we would wait on the main loop and have a timer ticking at a certain frequency (60Hz) calling glfwPollEvents() that doesn't block.

Bad side is the app will never sleep completely anymore, but I think it will work.

@jfirebaugh
Copy link
Contributor

glfwPollEvents()

Ah, good call. I pushed a proof of concept of that and converted this to a PR. It also includes stuff from tmpsantos-libuv_abstracted (with a few fixes).

@jfirebaugh
Copy link
Contributor

Next step here is to peel off the parts from tmpsantos-libuv_abstracted related to removing libuv dependencies from the FileSource thread. That's a bigger chunk of work, mainly due to the use of libuv IO functionality, that we don't need to bite off right now.

The model for render-main will be that the main thread uses the native event loop, with a RunLoop wrapper that isn't actually run() except for the GLFW app, and other threads use a libuv loop.

@jfirebaugh
Copy link
Contributor

Scratch that, it doesn't actually work, it'd require conflicting RunLoop implementations. The main thread needs a RunLoop backed by the native event loop, while the FileSource and worker threads need a RunLoop backed by libuv. I think the answer is to remove the direct dependency of MapContext on uv::async/AsyncTask entirely. "Update" should be an operation the MapContext asks the view to do, and the view takes care of throttling / deduplicating updates in a platform-specific manner. (We already have something like this in the vestigial View::notify method.) So I'm not going to use most of libuv_abstracted (yet) after all.

@jfirebaugh
Copy link
Contributor

...except that that doesn't work because the FileSource thread depends on the rendering thread having an active RunLoop for invokeWithCallback. Ugh, I keep thinking in circles here.

@kkaefer
Copy link
Member Author

kkaefer commented Nov 18, 2015

"Update" should be an operation the MapContext asks the view to do, and the view takes care of throttling / deduplicating updates in a platform-specific manner.

Yeah, we are sort of having something like that already with view.invalidate().

@tmpsantos
Copy link
Contributor

I'm working on this. The changes needed for adding a new platform RunLoop were pushed at #3139 and they don't change anything in the client API, thus can be already merged.

For rendering in the main thread, I'm also experimenting an alternative and less intrusive approach that would be a main loop integration (glib-style), but that would also depend on #3139.

@tmpsantos
Copy link
Contributor

I managed to create a main loop integration mechanism that is extremely simple and worked well. @jfirebaugh please have a look on the 5 last patches of this branch here (the rest of the delta with master is basically #3139):

https://github.com/tmpsantos/mapbox-gl-native/commits/render-main

@tmpsantos tmpsantos force-pushed the render-main branch 2 times, most recently from 7948ce8 to 652a6ce Compare December 3, 2015 14:42
@jfirebaugh
Copy link
Contributor

So this is the "Option 2" approach. How will Timer notifications wake the main thread with this approach?

@tmpsantos
Copy link
Contributor

So this is the "Option 2" approach. How will Timer notifications wake the main thread with this approach?

They won't work. I documented it here: 9a11287

We don't use timers on the main thread anyway and the commit adds an assertion for that. But this is not final, I'm still experimenting with this approach.

@jfirebaugh
Copy link
Contributor

👍

@tmpsantos
Copy link
Contributor

On linux, Annotations.FillAnnotation and Annotations.NonImmediateAdd core unit tests are failing due to an unknown issue. The "actual" rendering looks like this:

Made progress today on #4682. On that branch, all tests are passing on Linux.

@jfirebaugh
Copy link
Contributor

Thanks @tmpsantos. I integrated your changes into this PR. However, when running the Android test app, I only see a white screen, no map.

@@ -88,6 +89,7 @@ class NativeMapView : public mbgl::View, private mbgl::util::noncopyable {
JNIEnv *renderEnv = nullptr;

// Ensure these are initialised last
std::unique_ptr<mbgl::util::RunLoop> loop;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmpsantos Is it ok to create multiple RunLoops representing the main thread, if multiple map views are created? Or does this need to be a singleton somewhere else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, it should be a singleton. I will fix that.

@jfirebaugh
Copy link
Contributor

On linux, Annotations.FillAnnotation and Annotations.NonImmediateAdd core unit tests are failing due to an unknown issue.

This turned out to be caused by #3883.

@tmpsantos
Copy link
Contributor

However, when running the Android test app, I only see a white screen, no map.

It works on Nexus 5x just fine but throws an exception on Nexus 5 when trying to swap buffers.

I don't think it has to do with the RunLoop anymore, but how we manage the GL context. It is now over complicated for rendering in the main thread and we should do it on the Java side. I will give #1662 a try.

jfirebaugh and others added 7 commits April 14, 2016 13:44
Do not create a thread for the MapContext anymore.
This is needed for plugging into an existing Looper because
we won't call ::run() on the main thread (this is called
on the Java side of the application automatically by Android).

But we still need to wake up and process the events. For that
we create a pipe and write to the pipe to wake up the main
loop and we process events on the callback of the fd handler.

Fixes #4682
Not needed as everything is now running on the same thread.
Not needed, same thread already.
@jfirebaugh
Copy link
Contributor

Android rendering issues were due to missing eglSwapBuffers; fixed. I think this is ready!

@jfirebaugh jfirebaugh merged commit 7bb74ac into master Apr 14, 2016
@jfirebaugh jfirebaugh deleted the render-main branch April 14, 2016 21:10
This was referenced Apr 14, 2016
tmpsantos added a commit that referenced this pull request Apr 15, 2016
Apparently some versions of Android, notable 4.4.4 running
on my Nexus 5, creates a GLContext on the Android UIThread
which is the one we are now using for rendering after #2909.

If the context is not restored, nothing gets rendered (or
sometimes partially, or artifacts) because Android will try
to do GL stuff on the context used by Mapbox GL and mess things up.
tmpsantos added a commit that referenced this pull request Apr 18, 2016
Apparently some versions of Android, notable 4.4.4 running
on my Nexus 5, creates a GLContext on the Android UIThread
which is the one we are now using for rendering after #2909.

If the context is not restored, nothing gets rendered (or
sometimes partially, or artifacts) because Android will try
to do GL stuff on the context used by Mapbox GL and mess things up.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants