Prefetch and double-key caching #82

yoavweiss · 2018-08-29T08:34:52Z

Moving a private discussion with @kinu and @igrigorik to a public forum

#78 raised questions regarding which origin a navigation prefetch should be tied to in terms of service workers.

Similar questions also arise when thinking about prefetch and double key caching.
Let's say host A is prefetching a linked document from host B.

If we were to consider A as the origin used as the secondary key for the document, then when the user navigates to B the prefetched resource won't be used and another copy will be downloaded instead, resulting in a slower experience and sadness.

So, it probably makes sense to consider B the double-key origin for the prefetched document, when double-keying is applied.
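To make the difference concrete, here is a minimal sketch of a double-keyed cache. Everything in it is an illustrative assumption rather than any browser's implementation: the `DoubleKeyedCache` class, the `(key origin, URL)` tuple used as the key, and the `a.example` / `b.example` hosts are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model of a double-keyed HTTP cache: entries are keyed by
# (secondary-key origin, resource URL). Names are illustrative only.
CacheKey = Tuple[str, str]


class DoubleKeyedCache:
    def __init__(self) -> None:
        self._entries: Dict[CacheKey, bytes] = {}

    def store(self, key_origin: str, url: str, body: bytes) -> None:
        self._entries[(key_origin, url)] = body

    def lookup(self, key_origin: str, url: str) -> Optional[bytes]:
        return self._entries.get((key_origin, url))


A = "https://a.example"
B = "https://b.example"
doc_url = B + "/article"

# Option 1: key the prefetched document by the prefetching origin (A).
keyed_by_a = DoubleKeyedCache()
keyed_by_a.store(A, doc_url, b"<html>...</html>")

# Option 2: key it by the destination origin (B) instead.
keyed_by_b = DoubleKeyedCache()
keyed_by_b.store(B, doc_url, b"<html>...</html>")

# A top-level navigation to B looks the document up under B's key.
miss = keyed_by_a.lookup(B, doc_url)  # None: a second download is needed
hit = keyed_by_b.lookup(B, doc_url)   # the prefetched bytes are reused
```

In this toy model only the second option lets the navigation to B reuse the prefetched document.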

The plot thickens when talking about prefetching subresources. If they are same origin as the document that will use them, then we can consider caching them similarly to documents, using their origin as the secondary key. But if they are cross-origin, we'd need to explicitly state which document/origin they are prefetched for. Not sure that's worth the complexity though.

Thoughts?

/cc @wanderview @cdumez @youennf

youennf · 2018-09-29T10:17:51Z

Prefetch makes most sense for navigation loads so it might be best to focus on this specific scenario.
Something like the following might work with double key caching:

- Prefetched resources are loaded with: credentials=omit, referrerPolicy=no-referrer, redirect=manual
- Prefetch loads bypass service workers.
- Prefetch loads are optional: low power mode/network cache already having an entry
- Prefetched resources are stored in a non-partitioned memory-based cache, cache entries are cleared after some limited time.
- Prefetched resources can only match top level document navigation.
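The cache-related points in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `PrefetchCache` class, the 300-second default lifetime, and the injectable `clock` parameter are hypothetical and are not taken from any specification or browser.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Request settings from the proposal, written as a plain dict.
PREFETCH_REQUEST_SETTINGS = {
    "credentials": "omit",
    "referrerPolicy": "no-referrer",
    "redirect": "manual",
}


@dataclass
class PrefetchEntry:
    body: bytes
    stored_at: float


class PrefetchCache:
    """Non-partitioned, memory-only cache whose entries expire after a while."""

    def __init__(self, ttl_seconds: float = 300.0,  # assumed lifetime
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Keyed by URL only: there is no secondary (partition) key.
        self._entries: Dict[str, PrefetchEntry] = {}

    def store(self, url: str, body: bytes) -> None:
        self._entries[url] = PrefetchEntry(body, self._clock())

    def match(self, url: str, is_top_level_navigation: bool) -> Optional[bytes]:
        # Only top-level document navigations may consume a prefetched entry.
        if not is_top_level_navigation:
            return None
        entry = self._entries.get(url)
        if entry is None:
            return None
        if self._clock() - entry.stored_at > self._ttl:
            del self._entries[url]  # entry outlived its limited lifetime
            return None
        return entry.body
```

Subresource and subframe loads never match in this sketch, which mirrors the last bullet of the proposal.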

kinu · 2018-10-09T08:57:30Z

Thanks @youennf, I think this is a pretty good/clear proposal to start with. Hoping that we can discuss more at TPAC but giving some quick thoughts here too:

Prefetched resources are loaded with: credentials=omit, referrerPolicy=no-referrer, redirect=manual

To clarify, do we even want to avoid going with credentials=same-origin?

Prefetch loads bypass service workers.

Have been thinking about this a while, but I think this makes a lot of sense at least to start with. (One interesting option @wanderview mentioned off-thread is to skip service workers for prefetch but use the prefetch as NavigationPreload for the service worker when the real navigation occurs. I actually like this idea but given that NavigationPreload is not yet widely supported we can put off considering this further)

Prefetch loads are optional: low power mode/network cache already having an entry

Agreed, and I believe this is currently spec'ed.

Prefetched resources are stored in a non-partitioned memory-based cache, cache entries are cleared after some limited time.

Prefetched resources can only match top level document navigation.

Sounds sensible to me.

One related question is whether the spec helps prefetches for top-level navigations be distinguishable from others (so that UAs can make better decisions). One way is to use as=document as a signal (although it can't tell whether it's for top-level frames or subframes, and it's proposed to be deprecated).

youennf · 2018-10-09T17:45:01Z

Prefetched resources are loaded with: credentials=omit, referrerPolicy=no-referrer, redirect=manual

To clarify, do we even want to avoid going with credentials=same-origin?

Agreed we should tackle this.
I restricted it this way for simplicity and since this is the biggest issue right now.
Same-origin prefetches do not require all these protections, we could decide to special case them for instance.

Also, in the case of prefetch, it is not clear how it is interacting with the fetch spec, its browsing context, if it is attached to a browsing context, whether it should be cancelled or kept alive when the context goes away...

yoavweiss · 2018-10-12T08:49:37Z

Prefetch loads bypass service workers.

Have been thinking about this a while, but I think this makes a lot of sense at least to start with. (One interesting option @wanderview mentioned off-thread is to skip service workers for prefetch but use the prefetch as NavigationPreload for the service worker when the real navigation occurs. I actually like this idea but given that NavigationPreload is not yet widely supported we can put off considering this further)

I'm concerned that this will trigger cases of double download in scenarios where the SW is e.g. modifying the request for a navigation request.

At the same time, this seems necessary for privacy protection - otherwise the destination SW can leak the fact that the prefetch happened.

Also, in the case of prefetch, it is not clear how it is interacting with the fetch spec, its browsing context, if it is attached to a browsing context, whether it should be cancelled or kept alive when the context goes away...

Agree we need to better specify how prefetch relates to Fetch, how the prefetched resources are cached, etc.

igrigorik · 2018-10-18T06:42:29Z

👍 to the above.

As a brief aside, I'd actually propose we pull out prefetch from RH (Resource Hints) into a standalone spec doc, or spec it directly in Fetch. WDYT?

yoavweiss · 2018-10-23T11:26:54Z

Specifying a processing model that ties directly into HTML's <link> processing model (similar to what we ended up doing with preload) seems the best approach to me. I think Fetch already has all the primitives we'd need for this. I'll sketch something up.

yoavweiss · 2018-10-23T14:36:52Z

I think Fetch already has all the primitives we'd need for this

That's actually not true. We need to introduce the concept of a "speculative fetch" and the concept of a "prefetch cache" that would not be partitioned.

domfarolino · 2019-07-25T03:26:12Z

Prefetched resources are loaded with: credentials=omit, referrerPolicy=no-referrer, redirect=manual

Most of this makes sense to me, however I'm wondering if someone could clarify the following:

What is the significance of redirect=manual? I think this would mean upon redirect, a redirect response would be stored in the underlying cache instead of the final resource. Is this the intention? Are there privacy reasons to not follow the redirect?
The credentials mode and referrerpolicy seem fixed, does this mean if the developer supplies crossorigin or referrerpolicy attribute values, they should be ignored?
The request mode has not been talked about here. Today, this is influenced by the crossorigin attribute (i.e., no-cors => cors mode). Should the request mode be unaffected by the presence of a crossorigin attribute too, and default to 'no-cors'?

/cc @yutakahirano

youennf · 2019-08-02T22:46:36Z

What is the significance of redirect=manual? I think this would mean upon redirect, a redirect response would be stored in the underlying cache instead of the final resource. Is this the intention? Are there privacy reasons to not follow the redirect?

Yes, that is the intention. The principle is to emulate a navigation load whose redirect mode is manual.
Prefetching is speculative so keeping it small seems good.
Not following redirections forbids the request to go to various domains and simplifies the implementation. For instance, if we were to store all redirections, it is not clear what we should do if the actual navigation goes directly to the second redirection.

The credentials mode and referrerpolicy seem fixed, does this mean if the developer supplies crossorigin or referrerpolicy attribute values, they should be ignored?

credentials should be same-origin.
crossorigin does not make a lot of sense here since we are trying to emulate a navigation load.
I would disregard it. no-referrer limits tracking risks.

The request mode has not been talked about here. Today, this is influenced by the crossorigin attribute (i.e., no-cors => cors mode). Should the request mode be unaffected by the presence of a crossorigin attribute too, and default to 'no-cors'?

I would tend to disregard the crossorigin attribute.
This is a navigate-like load so things like CORP checks do not make sense.

domfarolino · 2019-08-05T01:58:19Z

For instance, if we were to store all redirections, it is not clear what we should do if the actual navigation goes directly to the second redirection.

Is the current proposal entirely clear though? I think you're saying it is clear that a redirect response in the prefetch cache should be matched if it is the first one, but maybe not otherwise. Is there a reason that matching the first one is more appealing/obvious than later ones in the chain?

I would disregard it

Sounds good to me.

I would tend to disregard the crossorigin attribute.
This is a navigate-like load so things like CORP checks do not make sense.

Also sounds good to me.

I think another question is: What should we do when we ignore those attributes? Cancel the request? Or optionally throw a console warning indicating some attributes have been disregarded, and continue as usual? @yutakahirano prefers cancelling the request, but it seems worth discussing as I think it will need to be reflected in the spec.

yutakahirano · 2019-08-05T09:49:31Z

What is the significance of redirect=manual? I think this would mean upon redirect, a redirect response would be stored in the underlying cache instead of the final resource. Is this the intention? Are there privacy reasons to not follow the redirect?

Yes, that is the intention. The principle is to emulate a navigation load whose redirect mode is manual.
Prefetching is speculative so keeping it small seems good.
Not following redirections forbids the request to go to various domains and simplifies the implementation. For instance, if we were to store all redirections, it is not clear what we should do if the actual navigation goes directly to the second redirection.

In that case can we use "error" redirect mode? Redirect starts in https://fetch.spec.whatwg.org/#http-fetch, after storing the response into the cache in https://fetch.spec.whatwg.org/#http-network-or-cache-fetch, so I think you will get what you want with "error" redirect mode. I prefer using "error" because it's simpler and easier to understand.
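A toy model of the ordering described above might look like this. It is a deliberately simplified sketch, not the Fetch algorithm: the `SERVER` table, the `prefetch` helper and the returned strings are hypothetical, and the only point illustrated is that the redirect response is written to the cache before the redirect mode is consulted.

```python
from typing import Dict, Optional, Tuple

REDIRECT_STATUSES = {301, 302, 303, 307, 308}

# Toy origin server: URL -> (status, Location header or None). Made-up data.
SERVER: Dict[str, Tuple[int, Optional[str]]] = {
    "https://b.example/old": (301, "https://b.example/new"),
    "https://b.example/new": (200, None),
}


def prefetch(url: str, redirect_mode: str,
             http_cache: Dict[str, Tuple[int, Optional[str]]]) -> str:
    """Return what the prefetch caller observes for the given redirect mode."""
    status, location = SERVER[url]
    # Modelled on the claim above: the response is cached first...
    http_cache[url] = (status, location)
    # ...and only afterwards is the redirect mode taken into account.
    if status in REDIRECT_STATUSES and location is not None:
        if redirect_mode == "error":
            return "network error"
        if redirect_mode == "manual":
            return "opaque-redirect"
        return prefetch(location, redirect_mode, http_cache)  # "follow"
    return "ok"


cache_error: Dict[str, Tuple[int, Optional[str]]] = {}
result_error = prefetch("https://b.example/old", "error", cache_error)

cache_manual: Dict[str, Tuple[int, Optional[str]]] = {}
result_manual = prefetch("https://b.example/old", "manual", cache_manual)

cache_follow: Dict[str, Tuple[int, Optional[str]]] = {}
result_follow = prefetch("https://b.example/old", "follow", cache_follow)
```

In this model "error" and "manual" leave the same single redirect entry in the cache and differ only in what the caller sees, while "follow" also fetches and caches the redirect target.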

annevk · 2019-08-07T12:50:41Z

Apologies for weighing in late, but I'm not sure I fully understand all the rationale here. Do we have any data on how prefetch is used (subresource vs navigation; same-origin vs cross-origin) today? I suppose google.com still uses it for navigations? (I see you all are focused on navigations, but https://developer.mozilla.org/en-US/docs/Web/HTTP/Link_prefetching_FAQ advocates using it for subresources afaict, so it'd be good to have some data.)

If the user navigated to the prefetched resource before, it's highly likely they'll get a better experience if cookies are included. Does the proposed setup make sense for a majority of resources or do we end up with a lot of cache mismatches (nobody sets Vary: Cookie afaik)? (If I'm not missing anything here I really wonder why it's still worth supporting this feature for Safari rather than ignoring the feature altogether or recommending it be exclusively used for same-top-level-origin subresources.)

(Bypassing the service worker seems problematic as the service worker is no longer in control of some of the document's network traffic, making it less reliable. This is already true to some extent, but I'm not a big fan of continuing to carve out small exceptions.)

kinu · 2019-08-07T13:54:42Z

@annevk we're working on gathering more data. google.com uses it both for subresources and navigations but we're communicating that x-origin subresource prefetch won't be able to work with double-keyed caching (at least until we come up with a workable, privacy-preserving solution).

I can imagine that the cookieless navigation part is debatable, although the site that triggers the prefetch can also choose to do so only for pages that are unlikely to need cookies.

annevk · 2019-08-12T14:22:48Z

That would make it really hard to use the feature correctly though.

domfarolino · 2019-08-29T15:52:19Z

Just to be clear on the data collection bit: right now Chrome is only measuring how many prefetches redirect, to estimate how impactful changing the redirect mode would be. We're not sure how to accurately measure the impact of credentials (especially since, as we've mentioned, Vary: Cookie is quite underused). I guess we could also put a use counter on the referrer policy attribute specifically for prefetches too.

annevk · 2019-08-30T06:04:47Z

I see, the main problem is credentials (or cross-origin navigations) though... Our current plan in Firefox is to use the top-level origin as additional key for this cache, at which point it'll be mostly useless for a number of scenarios. One of the things we're considering is dropping support.

I'm curious to know though if @youennf has found that Safari's approach has measurable benefits.

youennf · 2019-08-30T07:16:54Z

I see you all are focused on navigations

My understanding is that prefetch is for navigations, preload for subresources.
In general, hoping that one website will efficiently preload subresources for another website seems like a fragile design to me.

I'm curious to know though if @youennf has found that Safari's approach has measurable benefits.

Safari's implementation is experimental and incomplete at the moment.
I understand Firefox's position of limiting prefetch to same-origin navigations only (in which case it is very similar to preloads).

In general, cross-site tracking protection will probably continue increasing the cost to do cross-site navigation. It would be nice to have some safe ways to mitigate these costs.

In terms of scenarios, search engines come to mind. Web packaging has a similar constraint so the same scenarios might apply.

That would make it really hard to use the feature correctly though.

As long as a resource is cacheable by intermediaries, it should be safe to prefetch it. Or am I too optimistic?

I agree it makes the feature harder to use, although I think the whole feature is quite hard to use with or without this restriction.
For instance, the website has to determine the probability for the user to actually navigate to the prefetched destination, how much the prefetch will slow down other important resource loading...

annevk · 2019-08-30T07:52:15Z

To be clear, I'm not sure what Safari's approach is. I assumed it to be #82 (comment), but maybe it's something not stated in this thread?

This CL renames the PrefetchRedirectError flag to PrefetchPrivacyChanges so the flag can be generalized to encapsulate more privacy-preserving changes proposed in [1]. Also implements the usage of kNoReferrer referrer policy when the privacy changes flag is enabled. A LinkLoader unit test is added to test that the referrer policy is set and persists correctly.

It is likely too early to invest in WPTs for this change, since standards discussion must take place before we can determine this is the correct way forward.

[1]: w3c/resource-hints#82

R=kinuko@chromium.org, yhirano@chromium.org
Bug: 988956
Change-Id: Id01771a1c077b0e018b311983e2d198733fec23b
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1781303
Reviewed-by: Kinuko Yasuda <kinuko@chromium.org>
Reviewed-by: Yutaka Hirano <yhirano@chromium.org>
Commit-Queue: Dominic Farolino <dom@chromium.org>
Cr-Commit-Position: refs/heads/master@{#693033}

kinu · 2019-09-10T14:06:04Z

Reg: credentials and (cross-origin) navigations, one option we thought of is to add an opt-in http header for the target site to explicitly express that "making uncredentialed prefetches and navigations to this site is okay", say, allowed-uncredentialed-navigation. Then UA can cancel the prefetch if it doesn't see the header in the response for cross-origin prefetch requests. How would something like that sound?

Thinking about this space a bit further, I suspect we'll need similar restrictions (like being uncredentialed) for any cross-origin speculative loading, e.g. prerender (by the way I'm trying to put up a possible threat model for cross-origin speculative loading here: https://github.com/kinu/speculative-loading#threat-model).

As youenn mentioned, these features are always a bit hard to use anyway, but they could still be useful to accelerate navigations if used appropriately. I think it'd be worth exploring the most plausible design that could work with reasonable trade-offs.

annevk · 2019-09-13T15:10:49Z

That could work, but at that point I wonder whether we should use a new opt-in keyword as well (and drop the current feature) as everything currently annotated as prefetch won't have that and would result in a redundant fetch and cache miss.

yoavweiss · 2019-09-17T06:02:07Z

Discussed at the WebPerfWG F2F: For compat and confusion avoidance reasons, it would make sense to define a new keyword. @achristensen07 suggested "prenavigate" which seems like a good option.

/cc @ericlaw

kinu · 2019-09-30T09:58:49Z

Some of us also discussed this in a breakout discussion during TPAC on Friday (@annevk, @youennf, @yoavweiss, @domfarolino, @yutakahirano, @jyasskin, @bslassey, @kinu and some others were there), and here's a quick summary:

For cross-origin prenavigate, one of the concerns is that always requiring an opt-in header will likely limit adoption. As an alternative, the following two-path approach (instead of an opt-in-only solution) was discussed:

Case A: If there are no credentials stored for the site:
- Just send prenavigate request as a regular request (no cookies will be sent)
- If its response has set-cookie headers, set them in an ephemeral, isolated cookie store
- If next navigation happens on the same URL, commit the cookies change made by the prenavigate. Otherwise discard the cookie store
Case B: Otherwise (some credentials are stored)
- Send prenavigate request as uncredentialed (no cookies will be sent)
- If its response doesn’t have an ‘Allow-Uncredentialed-Navigation’ header, just abort the prenavigate.

In either case the prenavigate request itself will be always sent without credentials, and nothing should be observable if the response is not used (i.e. no credentials changes are committed, no onerror/onload should be propagated).
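The two cases could be sketched as the decision logic below. This is only an illustration of the summary above, with several assumptions: the function and class names are hypothetical, cookies are modelled as a plain dict, the opt-in header is checked for presence only (no header value was specified), and in Case B no cookies are staged because the summary does not say what should happen to them.

```python
from dataclasses import dataclass, field
from typing import Dict

OPT_IN_HEADER = "allow-uncredentialed-navigation"


@dataclass
class PrenavigateResult:
    kept: bool  # False means the prenavigate was aborted
    pending_cookies: Dict[str, str] = field(default_factory=dict)


def evaluate_prenavigate(site_has_credentials: bool,
                         response_headers: Dict[str, str],
                         set_cookies: Dict[str, str]) -> PrenavigateResult:
    """Decide what to do with the response to an uncredentialed prenavigate."""
    if not site_has_credentials:
        # Case A: keep the response; stage cookies in an ephemeral store.
        return PrenavigateResult(kept=True, pending_cookies=dict(set_cookies))
    # Case B: credentials exist, so the response must carry the opt-in header.
    header_names = {name.lower() for name in response_headers}
    if OPT_IN_HEADER not in header_names:
        return PrenavigateResult(kept=False)
    return PrenavigateResult(kept=True)  # assumption: no cookies are staged


def commit_on_navigation(result: PrenavigateResult, prenavigated_url: str,
                         navigated_url: str,
                         cookie_store: Dict[str, str]) -> bool:
    """Commit staged cookies only if the next navigation targets the same URL."""
    if result.kept and navigated_url == prenavigated_url:
        cookie_store.update(result.pending_cookies)
        return True
    return False  # the ephemeral cookies are simply discarded
```

If the navigation goes elsewhere, `commit_on_navigation` leaves the real cookie store untouched, matching the requirement that nothing be observable when the response is not used.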

One of the concerns was that Fetch spec integration could be a bit tricky, and one option that was discussed was to introduce a new credentials mode like prenavigate.

Next step:

Write down the proposal (done in this comment)
Each will examine the proposal

annevk · 2019-09-30T12:25:28Z

Could you also write down the proposed model for prefetch? I believe it was stated that the preference was to keep that around as well.

kinu · 2019-10-01T07:41:41Z

Let me try. For prefetch my current understanding is as follows (if anyone has a different view please chime in):

- It can be more strictly for prefetching (sub)resources for same-origin navigations, i.e. the prefetched resources may not be available in next navigations if the HTTP cache is split
- Prefetch loads are optional (same as before)
- Restrictions that were discussed on this thread probably do not need to be applied? (I.e. it feels it can be a regular subresource loading, but I don't think this was explicitly discussed)

Reg: whether we want to keep it around, or can it be just prenavigate and preload? -- we probably want to keep it around, and afair the following were stated as the differences between prefetch and preload:

- There are types that are hard to support as preload, e.g. workers (and to specifically talk about chrome impl it doesn't support as=document and all media types)
- prefetch loading is optional while preload is not
- preload is supposed to be for the current navigation, and will give the bytes back to the page (while prefetch only populates things in the HTTP cache)

(While, I started to feel that the difference between prefetch and preload might look more subtle now)

annevk · 2019-10-01T09:48:19Z

I guess the other question is how this integrates with Fetch as the above models don't make everything clear. Does prefetch bypass service workers? Can the creation of a shared worker use the prefetch cache of another document? (Also, in a world with service workers prefetch being optional can lead to surprises between browsers that do and those that do not, especially if developers only code against a browser that does. That does not seem desirable.)

addyosmani · 2019-10-05T01:31:19Z

My understanding is that we are proposing keeping around prefetch (for same-origin optional prefetches) and introducing prenavigate as a form of prefetch which is always sent without credentials. Is that correct?

For what it's worth, I and other JavaScript library authors who rely on prefetch (for Quicklink, instant.page and Flying Pages) would like to avoid renaming it for the same-origin use-case if possible. It also appears there's reasonable usage of prefetch in the wild.

I guess the other question is how this integrates with Fetch as the above models don't make everything clear. Does prefetch bypass service workers?

+1 to more clearly defining this part of the prefetch model.

kinu · 2019-10-15T08:27:33Z

My understanding is that we are proposing keeping around prefetch (for same-origin optional prefetches)

That's my understanding as well, which means most existing prefetch (for same-origin) can stay as is.

I guess the other question is how this integrates with Fetch as the above models don't make everything clear. Does prefetch bypass service workers? Can the creation of a shared worker use the prefetch cache of another document? (Also, in a world with service workers prefetch being optional can lead to surprises between browsers that do and those that do not, especially if developers only code against a browser that does. That does not seem desirable.)

Assuming that prefetch is for same-origin only, a strawperson could look like the following:

- doesn't bypass service workers
- shared worker (which is same-origin) can use the prefetch cache

josephrocca · 2020-02-13T06:40:46Z

Sorry to jump in here as a non-expert, but can I ask: Would prenavigate be suitable for a site like jsbin or codepen which embeds an iframe which is served from a subdomain (like embed.jsbin.com or embed.codepen.io)? Serving from a different origin is needed because these sites allow arbitrary JS to be run within the embed. The main page might be https://jsbin.com/abc123, and it would include this:

<link rel="prenavigate" href="https://embed.jsbin.com/abc123" as="document">

So it can start loading the document that will be put in the iframe embed right at the moment the main page (with the code editor interface) begins loading, rather than having to wait for the main page to render and thus loading the main page and the embed in a serial manner.

Is this a use case covered by prenavigate?

achristensen07 · 2020-02-13T22:24:59Z

No, prenavigate would load https://embed.jsbin.com/abc123 as if it were navigated to in the main frame, and its resources would be cached in the partition of embed.jsbin.com, which would be unavailable for use by a page with the main frame on a different domain. Preload would be used for resources intended to be used from pages in the current partition.

josephrocca · 2020-02-14T02:22:30Z

@achristensen07 Ahh, okay, thank you. So I guess for that use case I'd need to wait for preload as=document to be supported in browsers.

yoavweiss · 2022-02-24T14:33:21Z

To be defined as part of #86

/cc @noamr

noamr · 2022-02-25T08:26:49Z

See #86 (comment) for action plan

noamr · 2023-01-02T09:30:08Z

As per #86 (comment), <link rel=prefetch> will work as it is today, meaning that it would not change anything about cache partitioning - those prefetches only work when the prefetching document and the consuming document are in the same partition.

noamr · 2023-03-27T07:39:45Z

See previous comment.

yoavweiss added the Prefetch label Oct 21, 2018

yoavweiss mentioned this issue Oct 23, 2018

Add prefetch processing model, including double-key caching privacy protections whatwg/html#4115

Closed

yoavweiss mentioned this issue Nov 26, 2018

Use <link rel=“prefetch”> for prefetching vercel/next.js#5737

Merged

jyasskin mentioned this issue Mar 18, 2019

Extend link HTTP header to support subresource signed exchange loading WICG/webpackage#347

Open

domfarolino mentioned this issue Jul 30, 2019

Prefetch request changes to improve privacy w3ctag/design-reviews#398

Closed

5 tasks

josephrocca mentioned this issue Feb 13, 2020

Warning: <link rel=preload> must have a valid as value ampproject/amphtml#2492

Closed

fmarier mentioned this issue Jun 3, 2020

Enable the prefetch-privacy-changes flag by default brave/brave-browser#8319

Closed

brettz9 mentioned this issue Jul 30, 2020

Support CDNs JonasKruckenberg/rollup-plugin-sri#3

Closed

jeremyroman added a commit to WICG/nav-speculation that referenced this issue Aug 11, 2020

link to w3c/resource-hints#82

571652f

jeremyroman mentioned this issue Aug 11, 2020

Proposal to define privacy-enhanced prefetching and prerendering WICG/proposals#2

Closed

yoavweiss mentioned this issue Aug 13, 2020

Update loading spec to support subresource substitution WICG/webpackage#591

Merged

noamr mentioned this issue Feb 25, 2022

Specify processing model in terms of Fetch #86

Closed

noamr mentioned this issue Mar 9, 2022

WIP: Define the prefetch cache whatwg/html#7693

Closed

3 tasks

yoavweiss added the Triaged label Mar 24, 2022

othermaciej mentioned this issue Dec 22, 2022

Request for position: aligning on <link rel=prefetch> WebKit/standards-positions#114

Closed

clelland assigned noamr Mar 13, 2023

noamr closed this as completed Mar 27, 2023
