A way to specify a pattern of destination URLs to hit/skip SW #1454
Comments
/cc @jakearchibald @n8schloss @wanderview @mattto a potential alternative approach I've mentioned in the other thread. Want to know if this could be useful for experimenting something |
This sounds really good to me! I think it will fit our use case and allow for rapid testing. |
Is this just about navigation preload or are there use-cases where the ServiceWorker would intentionally not respondWith() to the initial navigation (non-subresource) request? |
This is about subresource requests. Even if the SW is just going to respondWith the subresource request, putting the SW on the critical path for all these resources introduces a bottleneck and can slow things down. |
My confusion is about when the web server gets an opportunity to provide the "Service-Worker-Fetch-Scope" in a way that impacts a given registration and the expected use cases around it. If the ServiceWorker invokes respondWith() on the navigation fetch and there's no navigation preload, then there's no network request for the server to send overrides. If there's navigation preload, that does provide an opportunity for the server to impact things. It also works if the browser is offline. If the ServiceWorker doesn't respondWith to the navigation request, there's also an opportunity to tie the headers to the registration, but the website is now broken if the user is offline because the page didn't load. I'm wondering if this is a use-case that's envisioned or if it's really just about the navigation preload scenario used by mega-sites. |
I'm not sure how that follows? The response it passes to respondWith can have this header added, either because the server added it when the resource the SW returns was cached/fetched from the server, or because the SW explicitly adds the header to the response it is returning? |
I'm parsing bullet 1's reference of "main resource request" to be a non-subresource/navigation request. Do I have that backwards? |
I think that is right. It is the non-subresource/navigation request, which is presumably intercepted by a service worker, because if the page weren't controlled, subresources wouldn't be intercepted either. And the response to that request can set this header to influence which subresource requests will bypass the controlling service worker and instead go directly to the network? |
I interpreted the proposal the same way @mkruisselbrink did. In other words, either the service worker code itself can add the I like the proposal. It would be nice if there was a similar lightweight API that allowed bypassing the service worker for navigation requests too. |
Yeah, the intention is that the response header (to a non-subresource/navigation request) can be set by the service worker or from the network / server (it can be also cached in the cache storage), and it affects all the subresource requests for the client afterwards. |
Okay, so as I understand it, the general use case we're trying to satisfy are the use cases discussed in #1195 and #1026 where a (mega)site is using the ServiceWorker for latency optimization. The sites have no interest in involving the ServiceWorker if it doesn't make things faster.
From an implementation perspective, there's 2 key things going on:
The 2nd part seems fairly problematic. It adds complexity to fetch and would seem to result in a lot of potential for ordering races, plus the ability for a ServiceWorker to accidentally disable itself by caching a magic Response with the "Service-Worker-Fetch-Scope" header and serving it up. And it's not clear that it adds much more beyond exposing an API to the ServiceWorkers to manipulate the new hidden prefix. In particular, for the case where a ServiceWorker needs to bring itself up-to-date, it seems like the ServiceWorker should know and be able to decide when it's up-to-date. And it also seems like the ServiceWorker would need to know it's not up-to-date in order to start updating itself. So why not just have the server tell the SW it's out-of-date in the navigation preload response and then the ServiceWorker uses an API to manipulate the state variable at that point, and then set it back when it's updated. It does sound reasonable to think about exposing such a mechanism via API (so the 1st half of the implementation plus API exposure instead of response headers) if we aren't able to gain traction on static routing at TPAC. |
Yes, this is trying to solve problems that are the same as or similar to those of #1195 and #1126. For implementation I was imagining that we'd add a hidden state variable to the service worker client (e.g. window or workers) but not to the registrations. The state should be determined and fixed when the client is instantiated (e.g. when a navigation commits for a frame in chrome's implementation for window cases), and therefore should not race with SW registration updates. Could something like this make sense to you? Re: the risk of a SW accidentally caching the magic Response with the header, we can probably make a minor modification to the proposal so that on storing the response to CacheStorage the response header should be discarded / ignored? Then the header would always only affect the current response (which might come either from the server or might be modified by the service worker on-the-fly). Re: the possibility of having a similar mechanism via API: I agree that it'd be reasonable to also explore a potential API surface for this kind of mechanism. |
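To make the "discard the header when storing to CacheStorage" idea concrete, here is a minimal sketch. The helper name and the use of plain [name, value] entries are assumptions for illustration only; nothing like this exists in the Cache API today.

```javascript
// Hypothetical helper, not part of any spec or browser API.
const FETCH_SCOPE_HEADER = 'service-worker-fetch-scope';

// Returns a copy of the header entries without the fetch-scope header.
// Header names are case-insensitive, so the comparison is done in lower case.
function stripFetchScopeHeader(headerEntries) {
  return headerEntries.filter(
    ([name]) => name.toLowerCase() !== FETCH_SCOPE_HEADER
  );
}

// Example: what would be kept if the header were dropped at cache time.
const storedHeaders = stripFetchScopeHeader([
  ['Content-Type', 'text/html'],
  ['Service-Worker-Fetch-Scope', '/profile/'],
]);
console.log(storedHeaders);
```

In a service worker the entries could come from `[...response.headers]`, and the filtered list could be passed as the `headers` option of a new `Response` before calling `cache.put()`.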
Just want to note that while all the examples here use a path only, we also need to be sure to support full URL entries since subresources can be cross-origin. This is a difference from the existing SW scope concept. |
@wanderview yep that's right, URL can be cross-origin and full URL entries should be supported. Thanks for clarifying! |
@asutherland if I'm understanding the proposal correctly, the state would sit with the client, not the registration. https://fetch.spec.whatwg.org/#http-fetch - before step 3, if it's a subresource request, we'd look at the request's client, which would contain data about the URLs the service worker should handle. |
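A rough model of that client-side state, as I read it: the state is computed once from the navigation response when the client is created, and subresource fetches only consult it. All names here are hypothetical, and treating a bare `none` token as the special value is an assumption.

```javascript
// Hypothetical model of the per-client state; none of these names come from a spec.
// fetchScopeHeader is the raw header value from the navigation response,
// or null when the header is absent.
function createClientState(pageUrl, fetchScopeHeader) {
  let fetchScope = null; // null means "no restriction"
  if (fetchScopeHeader === 'none') {
    fetchScope = 'none';
  } else if (fetchScopeHeader !== null) {
    // Resolve the scope against the page URL once, at client creation.
    fetchScope = new URL(fetchScopeHeader, pageUrl).href;
  }
  // Frozen: the state is fixed for the lifetime of the client.
  return Object.freeze({ pageUrl, fetchScope });
}

// Returns the service-workers mode a subresource request would get:
// 'all' (goes through the SW) or 'none' (bypasses it).
function serviceWorkersModeFor(client, requestUrl) {
  if (client.fetchScope === null) return 'all';
  if (client.fetchScope === 'none') return 'none';
  const absolute = new URL(requestUrl, client.pageUrl).href;
  return absolute.startsWith(client.fetchScope) ? 'all' : 'none';
}

const client = createClientState('https://example.com/', '/profile/');
console.log(serviceWorkersModeFor(client, '/profile/avatar.png'));
console.log(serviceWorkersModeFor(client, '/video/intro.mp4'));
```

Because the state lives on the client and is frozen, a later service worker update would not change which requests an already-loaded page routes through the SW.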
@mattto and I chatted about this, so here's a lump of thoughts: I assume that URLs are resolved relative to the page? Since the value can be a url or a 'special value' like "none", we need a way of differentiating between the two. We could probably use the same rules as module specifiers. As in, we treat it as a URL if it's one of the following:
Otherwise we treat it as an enum, eg "none". What happens with:
Is this discarded as an unknown value, or does it activate the feature with no matching URLs (same as 'none')? Is this allowed?
…and will fire fetch events for subresources starting
We probably need to think of a name that doesn't include 'scope', as it may be confused with service worker scope. But meh bikeshedding.
We need to make sure this header is processed before any headers that trigger subresource fetches, eg
I assume that this would only work as a genuine header, not some
I guess this will work for other client types like workers?
It's difficult to express "bypass the service worker for urls starting
In cases where we inherit the controller of the parent document (eg about:blank, srcdoc etc) does it also inherit the fetch scope rules?
Can this be feature detected in any way?
The ergonomics of adding a header to a response from the cache or network aren't totally friendly:

addEventListener('fetch', event => {
  event.respondWith((async function() {
    if (event.request.mode === 'navigate') {
      const response = await fetch(event.request);
      const responseCopy = new Response(response.body, response);
      responseCopy.headers.set('Service-Worker-Fetch-Scope', '/profile/');
      return responseCopy;
    }
    return fetch(event.request);
  })());
});

With the declarative routes proposal, I tied the state to the service worker. This means the same thing that specifies the routes, also specifies the handling:

addEventListener('install', event => {
  event.router.add({ url: { startsWith: '/video/' } }, 'network');
});
addEventListener('fetch', event => {
  // You will never see a request for /video/* here
});

The header-based proposal doesn't give the same guarantees, and I'm worried this will create some unexpected gotchas:

addEventListener('fetch', event => {
  event.respondWith((async function() {
    if (event.request.mode === 'navigate') {
      const response = await fetch(event.request);
      const responseCopy = new Response(response.body, response);
      responseCopy.headers.set('Service-Worker-Fetch-Scope', '/profile/');
      return responseCopy;
    }
    // Will you see subresource requests for /profile/* here?
  })());
});

In this example it looks like I'm forcing all controlled pages to have a fetch scope of
The controlled page may have been served by an earlier version of the service worker (due to
If you end up with items in the cache with the |
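For the question of telling a URL apart from a special value like "none", here is a small sketch of the module-specifier-style rule mentioned above. It assumes the rule is "starts with /, ./ or ../, or parses as an absolute URL", which is my understanding of how HTML resolves module specifiers; the function name and return shape are made up for illustration.

```javascript
// Hypothetical classifier for a Service-Worker-Fetch-Scope value.
function classifyScopeValue(value, pageUrl) {
  // Relative forms that module specifiers accept are treated as URLs
  // and resolved against the page URL.
  if (value.startsWith('/') || value.startsWith('./') || value.startsWith('../')) {
    return { kind: 'url', url: new URL(value, pageUrl).href };
  }
  // Anything that parses as an absolute URL is also a URL.
  try {
    return { kind: 'url', url: new URL(value).href };
  } catch {
    // Otherwise treat it as an enum-like keyword, e.g. "none".
    return { kind: 'enum', value };
  }
}

const page = 'https://example.com/app/';
console.log(classifyScopeValue('none', page));
console.log(classifyScopeValue('/profile/', page));
console.log(classifyScopeValue('./img/', page));
console.log(classifyScopeValue('https://cdn.example.net/img/', page));
```

Under this rule a bare word such as `foo` would be an unknown enum value rather than a relative URL, so whether it is discarded or treated like "none" still needs a decision.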
Thanks, all good feedback. I felt that following might need more discussions among others:
More comments inline:
Yes that's my current thinking.
I'd vote for discarding but can be cool with either.
Didn't mention this in the initial post as I didn't have strong opinion. (Could be open to either)
Error out and ignore if both are given?
That sounds reasonable.
Good question. Maybe add some special request header to indicate that?
Yep, you're right that there could be a race. My impression around skipWaiting has been that it inherently adds some race, and therefore it could be probably okay to have the race like this, but maybe not. I'm interested in learning how concerning/critical does this race look to you (and everyone)!
I'm wondering if making this header always stripped away when cached could introduce more or less confusion. |
Yeah, discarding seems good.
We should treat:
…and…
…the same (due to how headers work). So I guess it would fire fetch events for subresources starting
Maybe. Depends if we'd want to support:
You could say longest match wins. Not sure what happens with:
…though.
It's more that you're taking over a page that may have some fetch behaviour dictated by an older service worker. This is certain to happen if you use Maybe it doesn't matter because:
But I think it's much easier to understand if the fetch event handler and "when to use the fetch event" have the same lifetime. The benefit of tying it to the client is you can give different clients different instructions. Is that useful @n8schloss?
It's definitely a different confusion! 😀 |
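If both an allow-style and an ignore-style header were supported, "longest match wins" could look roughly like the sketch below. The function names and the example paths are hypothetical, and the equal-length case is deliberately reported as ambiguous because that case is left open above.

```javascript
// Returns the longest prefix in `prefixes` that `path` starts with, or null.
function longestPrefix(path, prefixes) {
  let best = null;
  for (const prefix of prefixes) {
    if (path.startsWith(prefix) && (best === null || prefix.length > best.length)) {
      best = prefix;
    }
  }
  return best;
}

// Hypothetical resolution between an allow list and a disallow list.
// Returns 'allowed', 'disallowed', 'ambiguous' (equal-length matches)
// or 'unmatched' (neither list matches).
function resolveByLongestMatch(path, allowed, disallowed) {
  const a = longestPrefix(path, allowed);
  const d = longestPrefix(path, disallowed);
  if (a === null && d === null) return 'unmatched';
  if (d === null) return 'allowed';
  if (a === null) return 'disallowed';
  if (a.length === d.length) return 'ambiguous';
  return a.length > d.length ? 'allowed' : 'disallowed';
}

const allowedScopes = ['/foo/', '/bar/'];
const disallowedScopes = ['/foo/posts/'];
console.log(resolveByLongestMatch('/foo/posts/1', allowedScopes, disallowedScopes));
console.log(resolveByLongestMatch('/foo/app.js', allowedScopes, disallowedScopes));
```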
I would vote for not magically stripping headers when in cache. I don't think we have anything else like that in the platform... Can someone clarify for me which response the headers are evaluated on? Is it:
Sorry if that's defined somewhere in here, but the thread has got a bit long. If the answer is (1), then I assume the service worker FetchEvent handler can manually add/remove these headers before returning a Response back to respondWith(), correct? |
That's the proposal, yeah.
Yeah, see the examples in #1454 (comment) |
If we think the header state sticks to clients, does exposing the white list and the black list through the client API make sense?

self.addEventListener('activate', e => {
  // waitUntil() needs a promise, so the async function is invoked immediately.
  e.waitUntil((async () => {
    for (const client of await self.clients.matchAll()) {
      // The properties and setters below are the API being proposed here;
      // they do not exist on Client today.
      // client.fetchEventAllowedScopes == ['/previously-set-allowed-scope']
      // client.fetchEventDisallowedScopes == ['/previously-set-disallowed-scope']
      await Promise.all([
        client.setFetchEventAllowedScopes(['/foo/', '/bar/']),
        client.setFetchEventDisallowedScopes(['/foo/posts/']),
      ]);
      // client.fetchEventAllowedScopes == ['/foo/', '/bar/']
      // client.fetchEventDisallowedScopes == ['/foo/posts/']
    }
  })());
});

That said, I'm feeling that it might look more like the dynamic routing and be a big hammer. |
Thoughts for TPAC:
|
TPAC resolution:
|
We often hear a demand for specifying a certain set of destination URLs to be intercepted by SW (e.g. #1026, #1373), and this is yet another (lightweight) proposal to address a subset of these using HTTP response headers (therefore it's ephemeral & scoped only to the current navigation).
Proposal: Service-Worker-Fetch-Scope header
Add a "Service-Worker-Fetch-Scope: <url-scope>" HTTP response header to main resource requests, so that they can express what URLs should be hit by their SW for the service worker client that is instantiated by the main resource.
<url-scope> can be either "none" or a scope URL.
The service-workers mode of all subresource fetch requests from the service worker client that do NOT prefix-match the given scope is set to "none", i.e. this makes only the subresource requests that match the scope hit the Service Worker.
Examples:
Set Service-Worker-Fetch-Scope: '/sw_cache/' on document requests so that only the subresources whose URLs start with /sw_cache/ are intercepted by the SW.
Set Service-Worker-Fetch-Scope: "none" after a certain period so that it can skip SW after that. (This could be slightly more efficient than scripting this in SWs as UA can skip routing requests entirely if UA can see it in NavPreload response before/during starting a SW) [EDIT: This use case is a bit hand-wavy, maybe we should focus on the first use-case only]
An alternative can be a header that works the opposite way, e.g. Service-Worker-Ignore-Fetch or something, while I've heard some sentiment that allowlisting URLs could be handier.
Technically this can be a subset of the Declarative Routing proposal (#1373) and can potentially be subsumed by that proposal if we decide to implement it. The motivation of this proposal is to see if we can have a smaller, incremental iteration that can be easier to experiment with.
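As a rough illustration of the first example, the sketch below splits a page's subresource URLs into those the SW would intercept and those that would bypass it. The function name and the example URLs are hypothetical. It also assumes that the quotes shown in the example are not part of the header value and that the scope is resolved against the page URL.

```javascript
// Hypothetical helper: partitions subresource URLs by prefix-matching them
// against a fetch scope resolved relative to the page URL.
function partitionSubresources(scope, pageUrl, subresourceUrls) {
  const scopeUrl = new URL(scope, pageUrl).href;
  const intercepted = [];
  const bypassed = [];
  for (const raw of subresourceUrls) {
    const absolute = new URL(raw, pageUrl).href;
    // Only requests whose absolute URL starts with the scope hit the SW.
    (absolute.startsWith(scopeUrl) ? intercepted : bypassed).push(absolute);
  }
  return { intercepted, bypassed };
}

const result = partitionSubresources('/sw_cache/', 'https://example.com/index.html', [
  '/sw_cache/app.js',
  '/static/logo.png',
  'https://cdn.example.net/sw_cache/lib.js',
]);
console.log(result.intercepted);
console.log(result.bypassed);
```

Since the match is on the full URL, the cross-origin CDN entry is bypassed even though its path starts with /sw_cache/. Covering it would need a full URL entry as the scope.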