new cache-policy & cache middleware structure to support full page caching #1856
@jsha @rust-lang/docs-rs I would love your feedback on this before I invest more time. Before merging I would probably add some more cache-policy checks in the rustdoc tests.
This generally looks great to me, and I like the direction. A couple of high-level comments:
Prior to #1569, docs.rs set no Cache-Control and no Expires on rustdoc pages, which put us in the heuristic caching world. Normally in heuristic caching, browsers look at the Last-Modified to figure out how long to cache (10% of time since Last-Modified). Since docs.rs also doesn't set Last-Modified, I think that means 0 cache time, but it's not guaranteed by spec.
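To make the 10% rule concrete, here is a minimal sketch of the heuristic. The function name is hypothetical, and treating a missing Last-Modified as zero freshness is an assumption that, as noted above, the spec does not guarantee.

```rust
use std::time::Duration;

/// Heuristic freshness lifetime: 10% of the time elapsed since Last-Modified.
/// `age_since_last_modified` is `None` when the response has no Last-Modified
/// header. Returning zero in that case is an assumption about typical cache
/// behaviour, not something the spec requires.
fn heuristic_freshness(age_since_last_modified: Option<Duration>) -> Duration {
    match age_since_last_modified {
        // Dividing a Duration by an integer is supported by the standard library.
        Some(age) => age / 10,
        None => Duration::ZERO,
    }
}
```

For example, a page last modified 10 days ago would be considered fresh for 1 day under this rule, which is why setting Last-Modified would make CDN invalidation unreliable.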
After #1569, we explicitly set max-age=0[, stale-while-revalidate=XX] on /latest/ pages. After #1856 (this PR), we will revert the max-age=0 part and go back to heuristic caching in browsers. I think the result probably winds up the same in practice, though it was nice to be able to be explicit. I don't see a way to continue to give the explicit instruction to browsers without interfering with CloudFront's default TTL.
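The conflict with the default TTL can be illustrated with a simplified model of how a CDN picks its TTL. This is a sketch, not CloudFront's actual implementation: the function and parameter names are hypothetical, and it ignores s-maxage, Expires, no-cache and similar directives.

```rust
/// Simplified model of CDN TTL selection (all values in seconds).
/// `origin_max_age` is the `max-age` the origin sent, if any.
/// Assumes `min_ttl <= max_ttl`; `clamp` panics otherwise.
fn cdn_ttl(origin_max_age: Option<u64>, min_ttl: u64, default_ttl: u64, max_ttl: u64) -> u64 {
    match origin_max_age {
        // An explicit max-age wins over the default, bounded by min/max TTL.
        Some(max_age) => max_age.clamp(min_ttl, max_ttl),
        // Without Cache-Control the CDN falls back to its default TTL.
        None => default_ttl.clamp(min_ttl, max_ttl),
    }
}
```

In this model, sending max-age=0 for the browser's benefit also yields a CDN TTL of 0 (with a minimum TTL of 0), while omitting the header lets the default TTL apply.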
The 10% of time since Last-Modified is good to know. It means we should never set Last-Modified, so we can be sure that invalidating in the CDN is enough.
Yeah, I thought the same, I like it better explicit. Without switching CDNs or using Lambda@Edge I don't see any alternative.
…edirects" This reverts commit cdedb65.
Nice! Looking forward to seeing this deployed.
Before deploy, do you want to collect some timing measurements so we get a before / after? For instance here are some popular pages that are likely to be available in the CloudFront cache shortly after the deploy:
https://docs.rs/reqwest/latest/reqwest/
https://docs.rs/clap/latest/clap/
https://docs.rs/serde_json/latest/serde_json/
https://docs.rs/regex/latest/regex/
https://docs.rs/chrono/0.4.0/chrono/struct.DateTime.html
https://docs.rs/anyhow/latest/anyhow/
https://docs.rs/rand/latest/rand/
It would also be interesting to see the drop in requests to origin after deploy.
To be clear: directly after deploy nothing will happen, apart from probably caching less in some cases because we explicitly set … I don't think we need to collect CDN-level metrics on this for now.
This we will definitely see in our metrics. In general: @jsha were you able to manually test too while reviewing? I would also love another pair of eyes on this from @Nemo157 or @jyn514, to make sure I didn't forget anything important.
I have not manually tested but I can try and do that later today.
Some notes from testing:
Thank you for investing the time!
valid point, I removed it for these cases.
I can see a point where builds.json should get
This PR is far from covering all edge cases where we could cache more; it mainly targets rustdoc pages. I would leave out adding new pages to cache for the sake of finishing this PR. Once this change is alive and kicking we can try to cache more.
Sounds good to me. 👍🏻
Resolves #1552
This is the implementation of #1552, following the idea in #1552 (comment).
Basically:
no-cache headers by default on all pages.
This enables us to keep things cached for longer in the CDN and invalidate only the needed paths when the content changes, without having to think about browser caches or other proxies / CDNs.
Invalidation after build was added in #1825.
For now I only focused on rustdoc pages and their redirects, and on putting the policy structure & safeguard middleware into place. I also added a cache policy to some other pages.
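A minimal sketch of what a policy structure with a safeguard could look like. The enum, its variants, and the function names are hypothetical and not taken from this PR's code. The exact header value used for the default is also an assumption.

```rust
/// Hypothetical cache policies a handler can attach to its response.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CachePolicy {
    /// Do not cache anywhere (the safe default).
    NoCaching,
    /// Send no Cache-Control at all, so the CDN applies its default TTL
    /// and browsers fall back to heuristic caching.
    CdnDefault,
}

impl CachePolicy {
    /// Value for the Cache-Control header, or `None` to omit the header.
    fn header_value(self) -> Option<&'static str> {
        match self {
            // Assumption: the "no-cache headers" mentioned above map to this value.
            CachePolicy::NoCaching => Some("no-cache"),
            CachePolicy::CdnDefault => None,
        }
    }
}

/// Safeguard step: a handler that did not choose a policy gets `NoCaching`,
/// so forgetting to set one can never lead to accidental caching.
fn effective_policy(chosen_by_handler: Option<CachePolicy>) -> CachePolicy {
    chosen_by_handler.unwrap_or(CachePolicy::NoCaching)
}
```

The point of the design is that caching is opt-in per handler, while the middleware only has to enforce the default.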
Regarding CSP: we believe that we can cache pages even with nonces (see #1569 (comment)), though online I can find sources that support both claims (see also https://serverfault.com/a/1064775/549071). Also keep in mind that we don't add CSP nonces to rustdoc pages yet, so the biggest part of the new caching is not affected by this for now.
The only option to combine random CSP nonces & CDN caching would be to use edge workers to rewrite the HTML and update the nonce in the header (with Fastly or Cloudflare we could even do this in Rust ;) ).
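As a rough illustration of the rewrite step only, here is a sketch using plain string replacement. The function name is hypothetical. A real edge worker would use the platform's own request/response API, generate the fresh nonce randomly per response, and probably use a proper HTML rewriter instead of string matching.

```rust
/// Swap the nonce stored in a cached page for a fresh one, in both the HTML
/// body (`nonce="..."` attributes) and the Content-Security-Policy header
/// value (`'nonce-...'` source expressions).
/// Returns the rewritten `(body, csp_header)` pair.
fn refresh_nonce(
    body: &str,
    csp_header: &str,
    cached_nonce: &str,
    fresh_nonce: &str,
) -> (String, String) {
    // Match the full attribute / source expression, not the bare nonce,
    // to avoid touching unrelated text that happens to contain it.
    let new_body = body.replace(
        &format!("nonce=\"{cached_nonce}\""),
        &format!("nonce=\"{fresh_nonce}\""),
    );
    let new_header = csp_header.replace(
        &format!("'nonce-{cached_nonce}'"),
        &format!("'nonce-{fresh_nonce}'"),
    );
    (new_body, new_header)
}
```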