Should this be independent of the cache API? #3
I decided against this because I didn't want to create yet another storage system in the browser, and instead lean on the request/response store we already have. Another question that came up is "can we give access to the in-progress response?" for cases when you have enough of a podcast to play. Having this feature in the cache API would be cool too, which saves us having to define it twice.
I would prefer we spec'd this something like:
I would like this approach for both implementation and spec reasons. From an implementation point of view we probably don't want to write directly to Cache API anyway. Cache API does not support restarting downloads. It would make more sense to download to http cache or another disk area in chunks. We can then restart the download at the last chunk if we need to. At the end we stitch it all together and send it where it needs to go. From a spec perspective, writing to Cache API would raise these questions:
I imagine we would probably spec things to open the cache and do the Cache.put() when the download is complete. If we are going to do that, we might as well let the js script decide what to do with the Response. Anyway, just my initial thoughts.
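For concreteness, here's a rough JS sketch of the "download in chunks, restart at the last chunk, stitch it together at the end" idea from the previous comment. It's purely illustrative: a real implementation would live in the browser's network stack and write chunks to disk, and the chunk size, URL handling and Range logic here are assumptions.

```js
// Illustrative only: assumes the server supports Range requests.
async function chunkedDownload(url, chunkSize = 1024 * 1024) {
  const chunks = [];   // completed chunks (on disk in a real implementation)
  let offset = 0;      // byte offset to resume from after an interruption

  while (true) {
    // Ask the server for the next slice of the resource.
    const res = await fetch(url, {
      headers: { Range: `bytes=${offset}-${offset + chunkSize - 1}` },
    });
    if (res.status !== 206 && res.status !== 200) {
      throw new Error(`unexpected status ${res.status}`);
    }
    const buf = await res.arrayBuffer();
    chunks.push(buf);
    offset += buf.byteLength;

    // A 200 means the server sent the whole thing; otherwise stop once the
    // Content-Range total says we've reached the end of the resource.
    const range = res.headers.get('Content-Range'); // e.g. "bytes 0-1048575/31457280"
    const total = range ? Number(range.split('/')[1]) : NaN;
    if (res.status === 200 || (total && offset >= total)) break;
  }

  // "Stitch it all together" and hand it off as a single Response.
  return new Response(new Blob(chunks));
}
```

Restarting after an interruption then just means persisting the offset and the chunks already written.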
Agreed. And this means my "in-progress" response idea doesn't really work. We'd be better off making a general way to get pending fetches from same-origin fetch groups.
I started off with background-fetch and thought I was simplifying standardisation and implementation by rolling it into the cache. If it isn't doing that, I'm happy to split it back up. Background-fetch is a more meaningful name too. My gut instinct is developers won't much care about the extra step for adding to the cache.
Why do we need this?
Background caching a movie, but I'd like to start watching it now that it's 90% fetched.
I guess I'd rather put a getter on the background download registration to get a Response for the in-progress fetch. I don't think a window or worker would be in the same "fetch group" as this background thing (per my understanding of gecko load groups, anyway).
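For illustration, a sketch of what that getter might look like; none of these names are specced, they're placeholders for the shape being discussed.

```js
// Hypothetical shape only; the lookup and getter names are placeholders.
const reg = await registration.backgroundFetch.get('my-movie'); // hypothetical lookup
const partial = reg.getResponse();                              // hypothetical getter
// The Response might only be a snapshot of the bytes fetched so far, so new
// bytes wouldn't show up until the getter is called again.
const blobSoFar = await partial.blob();
```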
Yeah, that's why I said "same-origin fetch groups". The reason I'm pondering making this general is that we've seen a few requests for knowing about general in-progress fetches in the service worker repo. FWIW I think we can make the 90% playback case v2 (but the kind of v2 we actually do).
It makes sense for your background-fetch API to expose a list of the pending downloads and their progress. This needs to be tracked anyway, and there are UX benefits for the user. Adding this introspection for all fetches, though, seems like asking for trouble. The requests for the ability to introspect pending fetches in the SW repo seem motivated by a lack of understanding of, or confidence in, the HTTP cache. The SW spec could likely do with more references to http://httpwg.org/specs/rfc7234.html or similar, to make it clear that the HTTP cache exists, knows how to unify requests, and is generally very clever. For the movie use-case, knowing the download is 90% complete should provide confidence that the HTTP cache is sufficiently primed for straightforward use of the online URL. Because of the range requests issue, providing a Response from background-fetch may be the wrong answer until the file is entirely complete. I suspect it may be worth involving media/network experts for this specific scenario.
This feedback is great. Interested to hear from other implementers, but leaning towards making this background-fetch rather than background-cache.
I've raised a (hopefully!) coherent request for feedback from Firefox/Gecko network and media experts on the Mozilla dev-platform list at https://groups.google.com/forum/#!topic/mozilla.dev.platform/C2CwjW9oPFM |
My testing suggests that the Firefox HTTP cache does not re-use any in-progress requests. See: Edit: Don't click this unless you want to download 200+ MB!
Andrew pointed out my file was too big. We have some size thresholds in our HTTP cache that were preventing the in-progress request sharing from working. I've updated it now to use a 10MB file, which does get the request sharing: (downloads 30MB on FF and maybe 50MB on other browsers with fetch)
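For reference, a minimal sketch of the kind of test described above (the actual test page isn't reproduced here, and the URL is a placeholder): start overlapping fetches of the same file and compare the bytes received with the bytes transferred on the network.

```js
// Placeholder URL for a ~10MB test file; the real test page isn't linked here.
const url = 'https://example.com/10mb-file';

// Start several overlapping fetches of the same resource. If the HTTP cache
// shares the in-progress request, the network panel should show roughly one
// file's worth of transfer rather than one per fetch.
const buffers = await Promise.all([
  fetch(url).then(r => r.arrayBuffer()),
  fetch(url).then(r => r.arrayBuffer()),
  fetch(url).then(r => r.arrayBuffer()),
]);
console.log(buffers.map(b => b.byteLength));
```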
Sooooo this kind of thing isn't good for video/podcasts? |
Well, it means a getter on the background download request is a good idea, both for this reason and for requests restarted after browser shutdown, etc. The HTTP cache heuristics are tuned for the common request cases.
Maybe one of the network people will comment, but I think the size threshold is there due to the constrained cache size. If any single resource is a large enough percentage of the total http cache, then the cache becomes much less useful in general. You don't want to evict 25% of the cache for a single video file. I think anyway.
Yeah, a getter would solve this, and it's something we can add later as long as we keep it in mind. I'm just worried that we're going to end up needing to create the same thing for the cache API. |
Exactly. We have a rule of thumb right now that we don't store resources larger than 50 MB in the HTTP cache. (Back in the days when the entire HTTP cache was 50 MB max, the rule was nothing larger than 1/8 of the entire cache, and IIRC that's still true for mobile if the cache there is set to be small enough). It's quite likely that we could bump that limit up by possibly a lot if it's useful. The old HTTP cache couldn't start reading a resource that was being written until the write ended. I know we put a lot of effort into fixing that in the new cache (I also seem to recall that there are at least some cases where we still can't do it, but I think most of the time we can--I can check with the cache folks). We don't have an API right now that lets you know when, for instance, enough of a video file has been stored in the cache to make playing the video possible. But we could add one if needed. The HTTP cache right now doesn't count towards quota limits--that might be an issue? Happy to talk more about this, or you can contact Honza Bambas and/or Michal Novotny directly.
That's not a problem. This background-fetch thing is different than the normal HTTP cache. It could be implemented in the HTTP cache, but it's not necessary. The question was more whether we need an API to "get in-progress requests" in general. For most requests I think this is overkill and the HTTP cache semantics already DTRT.
Wait... what are you talking about here? One of the stated goals is: "Allow the OS to handle the fetch, so the browser doesn't need to continue running". Then I don't understand why Necko should be involved in such a fetch or upload at all, and why we are testing the behaviour of the Necko HTTP cache at all. Also remember that DOM Cache (the serviceworkers API) is completely separate from Necko's HTTP cache. It uses a different storage area (disk folder) and a different storage format. What I mean is that moving from the HTTP cache to the DOM cache might not be a trivial task. But, if that above-mentioned goal is something "in the stars", then I still don't think you should rely on the HTTP cache. The response and the physical data have to end up in the DOM cache. We had a similar discussion when DOM cache was being developed, and the final and only logical :) conclusion was to not use/rely on HTTP caching at all.
Ok, so we'd likely add a "get in-progress" API for background fetch. Are we likely to need this for the cache API too, and does that warrant merging these APIs? We could look at this at TPAC. |
I've been raising the HTTP cache issue because:
@mayhemer It's sounding like the answer is indeed to stay out of the HTTP cache for background-fetch, but I figured it was worth asking rather than assuming. And it would be great if we could determine whether Firefox/Gecko might need to do something like "the background-fetch in-progress Response snapshots the existing download and new bytes won't magically show up until you call the getter again" or not. If the answer is going to be very Gecko-specific and doesn't have spec implications, maybe we should take this to the Mozilla dev-platform thread.
I've been reading requirements like this as a combination of:
I would expect that in Firefox/Gecko we would implement this entirely in the browser and expose the downloads via browser chrome using the existing downloads UI.
Authors could use MSE for playback and break the resource into chunks. It sounds like that would solve this problem.
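A minimal MSE sketch of that approach, assuming the movie is available as an fMP4 init segment plus media segments (the URLs and codec string are placeholders):

```js
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  // Codec string and segment URLs are placeholders for whatever the author
  // actually fetched (e.g. via background-fetch).
  const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');
  const segments = ['/movie/init.mp4', '/movie/seg-1.m4s', '/movie/seg-2.m4s'];

  for (const segment of segments) {
    const data = await (await fetch(segment)).arrayBuffer();
    sourceBuffer.appendBuffer(data);
    // appendBuffer is asynchronous; wait for it before appending the next chunk.
    await new Promise(resolve =>
      sourceBuffer.addEventListener('updateend', resolve, { once: true }));
  }
  mediaSource.endOfStream();
});
```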
Done ead8574
This could be entirely separate to the cache API and be called "background-fetch". The "bgfetchcomplete" event could hold a map of requests to responses, and it's down to the developer to put them wherever they want.
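For illustration, a service worker sketch of that shape; only the "bgfetchcomplete" name and the map of requests to responses come from the text above, while the event property name and cache name are assumptions.

```js
self.addEventListener('bgfetchcomplete', event => {
  event.waitUntil((async () => {
    const cache = await caches.open('podcasts'); // cache name is the developer's choice
    // event.fetches is assumed here to be the map of requests to responses;
    // the developer decides where each response ends up.
    for (const [request, response] of event.fetches) {
      await cache.put(request, response);
    }
  })());
});
```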