This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

offline-capable file source #2939

Closed
wants to merge 2 commits into from

Conversation

jfirebaugh
Contributor

First movements towards #584.

My current thinking here is to have ThreadContext return an array of ordered file sources, not just a lone one with which the map/map context/thread context was created. For example, you'd have an offline source, then the DefaultFileSource, and a fallback would occur if the offline one didn't have a resource (either by circumstance or design).

This should work during e.g. VectorTileMonitor::monitorTile() request kickoffs as they query the util::ThreadContext::getFileSource() API for each request, so that means the array could be mutated at various points during the lifecycle and such an API could return the latest iteration of the array.

Currently working on a basic mockup file source, then some structure to test this design out as the first steps.

I do think we should keep offline sources clear of DefaultFileSource and its associated (SQLite-backed) cache complexity.

@ljbade
Contributor

ljbade commented Nov 5, 2015

Yeah this sounds like a good design. Keeping offline and online well separated will reduce complexity and provide flexibility for future enhancements.

@incanus
Contributor Author

incanus commented Nov 5, 2015

I am currently thinking of having a vector of file sources per above, as well as adding a FileSource::canHandle() API to assist in the fallback above. For offline sources, they would answer true if they promise to properly provide the asset, which may be determined by performing an actual e.g. SQLite query synchronously to see if they can. Ideally, there is metadata (say, in MBTiles, which also requires a SQLite query, but more a lightweight one) which can answer this question more quickly.

This may not work out in practice, but I'm going to attempt it first.

@incanus
Contributor Author

incanus commented Nov 5, 2015

Ok, as of 0e568b2, we have multiple file sources in place and the first (mock offline) one provides local tiles if a) we are requesting tiles and b) they contain .png. It passes through all other requests (style, sprites, etc.) to DefaultFileSource as per normal.
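For readers following along, the ordered-sources idea with a `canHandle()` check can be sketched roughly as below. This is only an illustration under assumptions: `Kind`, `Resource`, `MockOfflineSource`, `OnlineSource`, and `pickSource` are hypothetical names, not the types in this branch, and the real sources answer requests asynchronously rather than returning a name.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical resource description; names are illustrative only.
enum class Kind { Style, Sprite, Glyphs, Tile };

struct Resource {
    Kind kind;
    std::string url;
};

// Minimal stand-in for a file source that can be asked whether it
// promises to provide a given resource.
class FileSource {
public:
    virtual ~FileSource() = default;
    virtual bool canHandle(const Resource& resource) const = 0;
    virtual std::string name() const = 0;
};

// Mock offline source: claims only tile requests whose URL contains ".png".
class MockOfflineSource : public FileSource {
public:
    bool canHandle(const Resource& resource) const override {
        return resource.kind == Kind::Tile &&
               resource.url.find(".png") != std::string::npos;
    }
    std::string name() const override { return "offline"; }
};

// Catch-all source standing in for the network-backed default.
class OnlineSource : public FileSource {
public:
    bool canHandle(const Resource&) const override { return true; }
    std::string name() const override { return "online"; }
};

// Walk the ordered list and return the first source that claims the
// resource, or nullptr if none does.
const FileSource* pickSource(const std::vector<std::unique_ptr<FileSource>>& sources,
                             const Resource& resource) {
    for (const auto& source : sources) {
        if (source->canHandle(resource)) {
            return source.get();
        }
    }
    return nullptr;
}
```

With the mock source first and the catch-all second, a `.png` tile request resolves to the offline source, while style, sprite, and non-PNG tile requests fall through to the online one.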

It lives!

Next up will be hooking up a more concrete MBTiles file source in place of FrontlineFileSource, plus building out the file source list runtime manipulation API.

@incanus
Contributor Author

incanus commented Nov 5, 2015

I should add that this is more a proof of concept of FileSource::canHandle() than it is specific to tiles. A true offline solution will need to provide all the types of assets we use to render a map. I'm just starting with raster tiles since they're so easy to debug visually.

@ljbade
Contributor

ljbade commented Nov 5, 2015

How will an app bundling the style files work with this?

E.g. if you want 100% offline map you would need MBTiles of all the vector data, plus locally bundled style JSON, sprites, patterns, fonts etc

@incanus
Contributor Author

incanus commented Nov 5, 2015

Yeah, working on that. I have some ideas.

@incanus
Contributor Author

incanus commented Nov 6, 2015

Ok, as of 6acda8c we have a basic hookup to a (raster) MBTiles (see here) as a proof of concept of the idea of multiple file sources and failover between them based on coverage.

[Animated GIF: anigif-1446771681]

Next up is to start experimenting with offline vector maps. Tiles are easy, but we need to think about related assets such as sprites, style, glyphs, etc. and if MBTiles is the right way to go there.

@incanus
Contributor Author

incanus commented Nov 6, 2015

The biggest variable I see here is font data: both how much of it there is, and how to determine which font assets are required for various regions. In some minimal browsing, requests tend to look like this:

https://gist.github.com/incanus/c402b7e32ff064028797

Sorting and de-duping, across 13 zoom levels that's:

  • 1 style JSON
  • 2 sprite assets (1 image, 1 JSON)
  • 98 vector tiles
  • 96 font assets

@incanus
Contributor Author

incanus commented Nov 9, 2015

Font glyphs are turning out to be the real wildcard here.

The way we determine which font glyphs need to be downloaded is:

  • Fonts are uploaded to our system, processed by node-fontnik, and served as pbf-encoded assets in ranges of 256 at a time based on UTF-32 code points.

    start should be a multiple of 256 from 0 to 2560, and end should be equal to start + 255

    • Say for a simple character n which comes up in a text label, the range of this pedestrian ASCII character is then 0-255.
    • URLs look like mapbox://fonts/mapbox/DIN%20Offc%20Pro%20Bold%2cArial%20Unicode%20MS%20Bold/0-255.pbf which are normalized to https://api.mapbox.com/fonts/v1/mapbox/DIN%20Offc%20Pro%20Bold%2cArial%20Unicode%20MS%20Bold/0-255.pbf?access_token=x
  • These glyph ranges are determined in SymbolBucket parsing and are dependent on the style currently in play (tile parsing only exists in the context of a given style, else you'd constantly parse out more tile features than you'd possibly need).

  • This means two things:

    1. Figuring out which font URLs to download for offline use is dependent upon the desired style(s) to be used offline.
    2. Change of the style while offline is somewhat restricted, by:
      • Only fonts available offline.
      • Only glyphs of those fonts available offline.

    So you couldn't change the font of a property beyond that, or change labelling from e.g. English to Spanish necessarily as additional code points may be introduced which weren't already cached (e.g. we didn't download the range containing ñ because the style at the time didn't introduce the need).
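As a rough illustration of the range scheme and URL normalization described above, here is a small sketch. The helper names (`glyphRangeFor`, `rangeString`, `normalizeFontURL`) are made up for this example and are not the library's API; the only facts taken from this thread are the 256-wide ranges and the two URL forms quoted above.

```cpp
#include <cstdint>
#include <string>

// Inclusive 256-code-point range that contains a given code point.
struct GlyphRange {
    std::uint32_t start;
    std::uint32_t end;
};

// Round the code point down to a multiple of 256 to find its range.
GlyphRange glyphRangeFor(char32_t codePoint) {
    const std::uint32_t start = (static_cast<std::uint32_t>(codePoint) / 256) * 256;
    return {start, start + 255};
}

// Format a range the way it appears in the URLs above, e.g. "0-255".
std::string rangeString(const GlyphRange& range) {
    return std::to_string(range.start) + "-" + std::to_string(range.end);
}

// Rewrite a mapbox://fonts/... URL to the HTTPS form quoted above.
// Hypothetical helper: URLs with any other prefix are returned unchanged.
std::string normalizeFontURL(const std::string& url, const std::string& accessToken) {
    const std::string prefix = "mapbox://fonts/";
    if (url.compare(0, prefix.size(), prefix) != 0) {
        return url;
    }
    return "https://api.mapbox.com/fonts/v1/" + url.substr(prefix.size()) +
           "?access_token=" + accessToken;
}
```

For example, `rangeString(glyphRangeFor(U'n'))` gives `"0-255"`, and a CJK character such as U+4E2D lands in `"19968-20223"`, so it needs a separate download for each font stack that renders it.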

Anything sound not right about this @mikemorris @kkaefer?

/cc @twbell re: possible offline feature restrictions.

@incanus
Contributor Author

incanus commented Nov 9, 2015

Of course an option here would be to just download all 10 ranges for a given font style upfront (as "0 to 2560" seems to imply that that's all there are, i.e. max 2.5k characters per font). Will do some measurement on relative sizes of glyphs compared to tiles themselves next.

@incanus
Contributor Author

incanus commented Nov 9, 2015

In that case, you could scan a style upfront for font strings, e.g. streets-v8 contains:

  • DIN Offc Pro Regular
  • DIN Offc Pro Italic
  • DIN Offc Pro Medium
  • Arial Unicode MS Regular

Sourced from mapbox://fonts/mapbox/{fontstack}/{range}.pbf.

Four fonts x 10 ranges each = 40 font resources to download to take all language possibilities of the style offline.

@incanus
Contributor Author

incanus commented Nov 9, 2015

In some basic testing, glyph assets are generally about as numerous as vector tiles, occasionally more so, and about the same size, occasionally larger.

Basic Asia-focused activity test:

  • 112 vector tile requests, min/max/mean 2.8kb/1.39mb/181kb
  • 181 glyph asset requests, min/max/mean 29kb/206kb/190kb

@incanus
Contributor Author

incanus commented Nov 9, 2015

Re: #2939 (comment) and just pre-downloading all ranges, @jfirebaugh indicates our docs might not be up to date and:

valid glyph ranges currently cover the entire Basic Multilingual Plane, i.e. there are 65,536 / 256 valid ranges, not 10

That's 256 things per font face to download, not 10. So that's just if we wanted to get ahead of possible style changes, but wouldn't be relevant if the fonts/labels stay the same while offline and only drawing properties (colors, strokes, fills, details hidden, etc.) are changed at runtime instead.
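To put a number on the pre-download option, here is a quick sketch that enumerates every range in the Basic Multilingual Plane and expands the `{fontstack}/{range}` template for a list of fonts. The function names are hypothetical, and font names are left unencoded for readability (real URLs would percent-encode them).

```cpp
#include <string>
#include <vector>

// All 256-wide ranges covering the Basic Multilingual Plane (0..65535).
std::vector<std::string> allBmpRanges() {
    std::vector<std::string> ranges;
    for (unsigned start = 0; start < 65536; start += 256) {
        ranges.push_back(std::to_string(start) + "-" + std::to_string(start + 255));
    }
    return ranges;
}

// Expand mapbox://fonts/mapbox/{fontstack}/{range}.pbf for every
// font stack and every BMP range.
std::vector<std::string> allGlyphURLs(const std::vector<std::string>& fontstacks) {
    std::vector<std::string> urls;
    const std::vector<std::string> ranges = allBmpRanges();
    for (const auto& fontstack : fontstacks) {
        for (const auto& range : ranges) {
            urls.push_back("mapbox://fonts/mapbox/" + fontstack + "/" + range + ".pbf");
        }
    }
    return urls;
}
```

For the four streets-v8 fonts listed earlier, that comes to 4 x 256 = 1024 glyph resources rather than 40.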

@mikemorris
Contributor

@incanus Was about to reply that the full range for a font is 0-65535, not 0-2560 (where did you see this range?) but you beat me to it. Other than that, sounds about right. The deep rabbit hole here is #260 for system fonts as fallbacks.

@incanus
Contributor Author

incanus commented Nov 10, 2015

The short of all of this is that it's becoming apparent, as best I can see so far, that fetch-and-store like we did on raster won't suffice for vector, as we will also have to parse vector tiles to find dependent glyphs to download. That's a bit more overhead, hopefully not tremendous, but does add quite a bit of complexity, particularly if we want to (rightfully, I think) reuse parts of our rendering-based tile parsing infrastructure as it already holds the logic combining vector tiles with applied styling.

@lucaswoj
Contributor

Have you thought about creating an API endpoint to determine glyphs needed for a bbox?

@incanus
Contributor Author

incanus commented Nov 10, 2015

That's a neat idea @lucaswoj. It'd still need to be crossed with a style to be efficient, but that's doable.

@incanus
Contributor Author

incanus commented Nov 10, 2015

An example why: a style might not show POIs, which would greatly reduce the number of strings needing to be parsed and glyph-ranged. Or a style might only use {name_en} and so wouldn't care about {name_es} or other multilingual fields in a tile.
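A sketch of why the style matters, assuming features are reduced to a map from field name to UTF-32 label text: only the field the style actually references contributes glyph ranges. `Feature` and `neededRangeStarts` are hypothetical names for illustration, and `name_ja` in the usage note is a made-up field.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical feature: named label strings, stored as UTF-32.
using Feature = std::map<std::string, std::u32string>;

// Collect the start of every 256-wide glyph range needed to render `field`
// across all features. Fields the style never references are ignored.
std::set<std::uint32_t> neededRangeStarts(const std::vector<Feature>& features,
                                          const std::string& field) {
    std::set<std::uint32_t> starts;
    for (const auto& feature : features) {
        const auto it = feature.find(field);
        if (it == feature.end()) {
            continue;  // this feature has no label for the styled field
        }
        for (char32_t codePoint : it->second) {
            starts.insert((static_cast<std::uint32_t>(codePoint) / 256) * 256);
        }
    }
    return starts;
}
```

A style that labels with `name_en` would need only range 0 for features named "Tokyo" and "Osaka", while the same features labeled through a CJK field such as `name_ja` would pull in additional ranges.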

@jfirebaugh jfirebaugh assigned jfirebaugh and unassigned incanus Nov 10, 2015
@jfirebaugh
Contributor

Good progress here. One thing that jumps out at me is that a synchronous and blocking FileSource::canHandle() is going to be problematic. In the current implementation, this would cause the rendering thread to block on file IO, and in turn the main thread would pile up behind that via renderSync. In the threading implementation we're looking to move towards, it's probably even more likely to wind up blocking the main thread.

I think we'll need to keep the FileSource API the same for consumers, move iteration over possible handlers into the implementation, and do blocking SQLite IO on a separate thread, as we do with SQLiteCache.

@incanus
Contributor Author

incanus commented Nov 12, 2015

Yeah, I naively started with blocking, with the goal of digging into it further. Thanks for the eyeballs @jfirebaugh.

@friedbunny friedbunny mentioned this pull request Nov 17, 2015
@incanus
Contributor Author

incanus commented Nov 20, 2015

Doing some tile fetching work over in https://github.com/incanus/TileCacher, which also serves as a good test of delegate render completion callbacks and could be used in future as a benchmarker once we have it hooked up directly to an atomic offline source, since we can measure render time given known local data.

@jfirebaugh
Contributor

I updated this to latest master. Going to start looking at the core API more closely.

@mb12

mb12 commented Dec 18, 2015

@incanus Is there any plan to add routing and search to the offline roadmap? Will it be developed in the open like the gl native client? It's one big item that's missing from the mapbox sdk.

@twbell

twbell commented Dec 18, 2015

@mb12 -- ancillary offline services are on our radar; focusing on getting the basics in-play first.

@jfirebaugh jfirebaugh force-pushed the 2939-jrm-offline-file-source branch from 89874f8 to d4feaf6 Compare December 18, 2015 23:19
@jfirebaugh
Contributor

The architectural changes needed here are:

  • Rename DefaultFileSource → OnlineFileSource
  • Write a new DefaultFileSource class that implements the behavior of first checking the OfflineFileSource, then falling back to OnlineFileSource internally.
  • Revert the changes to MGLMapView.{h,mm}, map.{hpp,cpp}, map_context.{hpp,cpp}, raster_tile_data.cpp, source.cpp, vector_tile.cpp, sprite_store.cpp, glyph_pbf.cpp, and thread_context.{hpp,cpp}. Given the above modifications, none of the APIs used or exposed by these files will change.
  • Remove handlesResource. Instead of asking OfflineFileSource if it can handle a request, and then making the actual request (2x the queries), just make the actual request and handle a "not found" response by falling back to online.
  • Remove the offlineMapPath option; use a fixed path instead.
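The shape of that facade, in a deliberately simplified synchronous form, might look like the sketch below. Everything here is a stand-in: the real sources are asynchronous and callback-based, the offline store would be SQLite-backed rather than an in-memory map, and `Response` with its `found` flag is a hypothetical type.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical response: `found` is false for a "not found" result.
struct Response {
    bool found;
    std::string data;
};

// Stand-in for OfflineFileSource, backed by an in-memory map.
class OfflineFileSource {
public:
    void put(const std::string& url, std::string data) {
        store[url] = std::move(data);
    }
    Response request(const std::string& url) const {
        const auto it = store.find(url);
        if (it == store.end()) {
            return {false, ""};
        }
        return {true, it->second};
    }

private:
    std::map<std::string, std::string> store;
};

// Stand-in for OnlineFileSource: fabricates a body instead of using the network.
class OnlineFileSource {
public:
    Response request(const std::string& url) const {
        return {true, "online:" + url};
    }
};

// Facade: make one offline request and fall back to online only on
// "not found", so there is no separate handlesResource-style query.
class DefaultFileSource {
public:
    DefaultFileSource(const OfflineFileSource& offline_, const OnlineFileSource& online_)
        : offline(offline_), online(online_) {}

    Response request(const std::string& url) const {
        const Response offlineResponse = offline.request(url);
        if (offlineResponse.found) {
            return offlineResponse;
        }
        return online.request(url);
    }

private:
    const OfflineFileSource& offline;
    const OnlineFileSource& online;
};
```

Callers see only `DefaultFileSource::request`, which is why the files listed above would not need to change.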

@jfirebaugh jfirebaugh force-pushed the 2939-jrm-offline-file-source branch from d4feaf6 to a5f0c0b Compare December 18, 2015 23:50
@incanus
Contributor Author

incanus commented Dec 21, 2015

@jfirebaugh Keep in mind here the use case of multiple offline documents, the set of which can differ at runtime. A concrete use case for this is a parks or golf courses app that uses a paywall to give access to individual sections of offline area. At runtime, the dev just wants to pass in the supported documents/regions (or perhaps alter them later as new ones become available, or maybe even just say "use all of the downloaded regions"), but does not want to have to reinitialize or reconfigure the map view (perhaps jarringly visually) just to change the attached regions. This would be most evident when the map viewport is made to change to a new region and associated backend configuration / singular referenced file path would have to change in response, possibly delaying actual map rendering as file sources are reconfigured.

My original goal here was an API that allows management of a set of OfflineFileSource objects (add/remove/insert/replace), ordered according to the caller's preference, each backed by their own filesystem path for the backing document. Then, when a map view zooms to a new region, it can quickly check (linear O(n) complexity) across the attached documents (especially if, per the spec suggestion for performance, the bounds metadata field is used) to find which one(s) can provide tiles and other assets for the current viewport.
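As a minimal sketch of that linear check, assuming each document's bounds have already been read from its metadata: `Bounds`, `OfflineDocument`, and `documentsFor` are hypothetical names, the file names in the usage note are invented, and antimeridian wrapping is ignored for brevity.

```cpp
#include <string>
#include <vector>

// Bounding box in degrees, in the MBTiles `bounds` order: left, bottom, right, top.
struct Bounds {
    double west;
    double south;
    double east;
    double north;

    // True if the two boxes overlap (edges touching counts as overlap).
    bool intersects(const Bounds& other) const {
        return west <= other.east && other.west <= east &&
               south <= other.north && other.south <= north;
    }
};

// Hypothetical offline document: a file path plus its cached bounds.
struct OfflineDocument {
    std::string path;
    Bounds bounds;
};

// Linear scan in the caller's preferred order; returns every attached
// document whose bounds overlap the viewport.
std::vector<const OfflineDocument*> documentsFor(const std::vector<OfflineDocument>& documents,
                                                 const Bounds& viewport) {
    std::vector<const OfflineDocument*> hits;
    for (const auto& document : documents) {
        if (document.bounds.intersects(viewport)) {
            hits.push_back(&document);
        }
    }
    return hits;
}
```

With, say, `yosemite.mbtiles` and `zion.mbtiles` attached, a viewport inside the first document's bounds returns only that document, and the set of attached documents can change without touching the map view itself.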

You can see a successful model of this in the old MBXMapKit idea of multiple MBXRasterTileOverlay layers on a map view at once, each of which controlled their own region and answered for it when relevant to the current viewport, or in the iOS SDK 1.x multiple tile source API, which did the same. The idea in this SDK was to use what we learned there to finally have an API which supported multiple regions/documents configured at init time and just did the right thing, showing as much as it could based on what had been previously cached for offline use, while keeping visual ordering hierarchy in the map unified style across multiple style sources.

Keep in mind also that while we haven't explicitly promised it, we may also want to allow for arbitrary MBTiles file path hookup (#3053) alongside the internally-managed and SDK-downloaded documents.

I think we need a multiple offline file source API for these reasons, and as the evolution of what we've learned through use and feedback of our past offline mobile APIs.

With regard to 2x the queries, yes, we could optimize this a bit; however, using the (optional) bounds metadata field for a document is a single query, and so is just attempting to pull a tile; we may find that the tile query is less performant given the potential larger database table size and index complexity — not sure yet. This could be measured.

@jfirebaugh
Contributor

Yes, good to keep those use cases in mind. However, I believe the management of "sets of offline resources" should be handled internally to DefaultFileSource and/or OfflineFileSource, and the FileSource and ThreadContext interfaces should not change.

@incanus
Contributor Author

incanus commented Dec 21, 2015

Yep, agree. My read on this was instead of obtaining fs synchronously and kicking off the request asynchronously, the whole thing should kick off asynchronously and the right file source(s) would be determined internally.

@jfirebaugh jfirebaugh force-pushed the 2939-jrm-offline-file-source branch from e7ded52 to 8f3d5fa Compare December 21, 2015 20:21
@jfirebaugh jfirebaugh force-pushed the 2939-jrm-offline-file-source branch 3 times, most recently from cbb6c31 to 765db30 Compare December 22, 2015 20:10
@jfirebaugh
Contributor

Build issues are now fixed, down to test failures.

@jfirebaugh jfirebaugh force-pushed the 2939-jrm-offline-file-source branch 3 times, most recently from be04d73 to 920f8e9 Compare December 23, 2015 21:52
@jfirebaugh
Contributor

#3715

@jfirebaugh jfirebaugh closed this Jan 27, 2016
@jfirebaugh jfirebaugh deleted the 2939-jrm-offline-file-source branch January 27, 2016 23:04
@robmaceachern robmaceachern mentioned this pull request Mar 17, 2017
9 participants