-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve texture memory usage, by unloading tiles, partly resolving #12906 #12924
Improve texture memory usage, by unloading tiles, partly resolving #12906 #12924
Conversation
@tristan-morris this is really impressive work, thanks so much for taking on this! Our memory handling definitely needs this kind of tightening up, and it's been a while since we did such a thorough hunt for memory leaks. For a little bit of background on texture retention — originally it was introduced 9 years ago (!) in #597 for performance reasons described in #591 — to have a pool of common size textures to reuse instead of continuously creating and deleting textures. Before we remove this mechanism, let's investigate to understand why exactly it leads to unbounded memory growth — in theory, if I remember correctly, the number of saved textures should be bounded by what's seen on screen, and any newly visible tiles should reuse textures left in the pool for those that went offscreen, thus balancing the total amount. Did we break that balance at some point? Or is the amount retained too much even if it's bounded, and it's better for performance to always delete and create new textures without reuse? |
Thanks @mourner. More than happy to help and it's been a good opportunity to understand the codebase in a lot more detail. The background is really useful and makes a lot of sense. My takeaway is the texture handling has evolved over time and for the different concerns you've outlined, this has somewhat dropped off, and become quite fragmented. This can be seen in My thinking so far is to find the last bits of code which are holding onto textures quite aggressively, and then re-optimise texture allocations with reuse across everything utilising a texture? There might be a bit of a resource trade-off here between CPU/GPU and VRAM allocation. As a comparison, if I load the pathological-flyto.html sample which utilises the MB Earthquake example, the texture usage is ~72MB and then in my app it's ~170MB. Similar enough, but I'm loading sig more data across ~7-8 sources. If I zoom in and out a few times on the same location with geojson loaded, this balloons to 1GB and 223MB, respectively. This high point is indicative of the texture caches being fully hydrated (tiles being retained due to current viewport & |
4972e34
to
b7c31ca
Compare
b7c31ca
to
65c5442
Compare
I think this is ready for another look @mourner. It looks like different gl unload codepaths weren't releasing their resources consistently, which meant the texture/raster tile cache could continually grow. There was an unloadTileData and an unloadTile call which would have pushed twice into the cache, the same texture I guess but then it would have been popped once. Then again (eventually) perhaps causing some weird texture changes(?) but I wonder if this quirk is worked around elsewhere so it's not obvious, or an unload check. Either way, it causes the cache to grow. The solution I implemented was to add types, and then destroy everything with the same Tile.destroy() method. The tile cache has remained pretty stable since (e.g. instrumenting the cache to check growth). I did have a fixed cap, but I think if it's all correctly pushed/popped, as you said, it should be fine. Feels slightly diiirty without a hard cap but ok. I refactored other cleanup logic so things aren't as easily missed. This becomes most problematic on Webkit, iOS, with dem or raster and doing more with the library (consuming higher mems). Higher dpi devices (iPhone etc.) also fetching the 2x textures chew more allocation too. So you kind of end up with the absolute worst case user - an iPhone target, with an allocator that behaves like a potato. With the changes: I'll write up a separate issue with larger geojson data sources running on a potato. |
@tristan-morris this is fantastic work, thanks so much for working further on this! I'm pretty sure we'll try to land this before v3 final — let me spend some time reviewing some details, but overall this looks great. |
if (tile.texture) { | ||
painter.saveTileTexture(tile.texture); | ||
} | ||
// Texture caching on unload occurs in unloadTile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stepankuzmin can you please check the logic here re CustomSource
? Looks like the way this is currently wired, after the change, it would stop caching textures — what's the best way to accommodate the new approach while making sure nothing regresses there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tile.destroy(true); | ||
|
||
// Save the texture to the cache | ||
if (tile.texture && tile.texture instanceof Texture) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: redundant check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that too! Seems that Flow picks up that tile
may have been modified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case you can usually placate Flow by separating the condition (or a texture reference) into a variable.
} else { | ||
tile.destroy(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering — would it simplify things if we moved this whole logic to tile.destroy
instead of having a conditional parameter? It might be better for consistency, since the counterpart getTileTexture
logic of reusing textures is inside tile.js
(setTexture
method).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that too. My thinking against was, if you call destroy() on something you think it should be destroyed. Or at least provide a mechanism which makes sense. I also didn't want to move the calls around too much :)
Co-authored-by: Volodymyr Agafonkin <agafonkin@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 given #12993 will be merged afterwards to address CustomSource
caching.
This partly PR resolves #12906, and likely contributes to a few other issues relating to loss of WebGL context related to excessive GPU memory.
The root cause is textures aren't being unloaded from GL memory when they're no longer needed. This is most pronounced when moving around the map, with
flyTo
being the most pathological case. From what I can tell, this impacts all platforms but particularly on resources constrained devices. Safari on iOS and macOS enforces lower memory limits, contingent on the device being used, and seems to manifest more often.The root cause of this problem was really difficult to identify originally. For me, this showed up initially in my app as WkWebView being terminated and raising webgl context lost errors. If you follow the device logs (not the app logs in XCode) it's possible to see warnings related to excessive memory usage, but the excessive memory use isn't attributed to VRAM usage! (I might raise this on their Bugzilla.) I've been able to replicate this issue for weeks with the pathological flyto (zipping between Australia and USA), but understanding the cause was another problem. Anyway... This PR is somewhat complete, but it needs some input from MB.
A few notes
debug/pathological-flyto.html
. This seems to really help identifying what is going on with GPU memory allocations.any
type here likely contributed to the missing derefs.Status
My poking around so far has been to consolidate the destroy codepaths, aggressively drop textures and ensure the map continues to work. I am super keen to get some feedback from MB devs who know the codebase a lot better. Otherwise, I'm happy to continue poking around and liberate all of the lost vrams.
There's one remaining path which seems to cause some significant memory retention - if you zoom into a location with lots of geojson loaded, and then zoom out the tiles remain loaded. And can use a lot of VRAM! See
src/source/source_cache.js
I think...Before
Top left shows an overlay with predicted vram usage in MB. By 40 seconds, Safari has terminated the tab, so ~7 pans/zooms.
Before.mp4
After
Max GL memory reaches ~210MB, with satellite raster and DEM. Safai seems happy.
After.mp4
Launch Checklist
@mapbox/map-design-team
@mapbox/static-apis
if this PR includes style spec API or visual changes@mapbox/gl-native
if this PR includes shader changes or needs a native portmapbox-gl-js
changelog:<changelog>Fix an issue with excessive video memory usage with raster sources, as textures weren't being unloaded from video memory when no longer needed. </changelog>
@