Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Page caching is not detected (AMP Di AMP Project Contributors) #4638

Closed
4 tasks
KinG-InFeT opened this issue Jan 10, 2022 · 12 comments
Closed
4 tasks

Page caching is not detected (AMP Di AMP Project Contributors) #4638

KinG-InFeT opened this issue Jan 10, 2022 · 12 comments
Labels
3rd party compatibility Issues related to 3rd party compatibility like theme, plugin or hosting needs: r&d Needs research and development (R&D) before a solution can be proposed and scoped.

Comments

@KinG-InFeT
Copy link

Describe the bug
into wordpress Site Health out the cache error for plugin
Plugin WP-Rocket version: 3.10.5.1
Plugin AMP version: 2.2.0
Wp version: 5.8.3
Theme: Neve

To Reproduce
Steps to reproduce the behavior:

  1. Go to 'Site Health' into wordpress backend
  2. See error "Page caching is not detected"

Expected behavior

The AMP plugin performs at its best when page caching is enabled. This is because the additional optimizations performed require additional server processing time, and page caching ensures that responses are served quickly.

Page caching is detected by making three requests to the homepage and looking for Cache-Control: max-age=…, Expires, or Age HTTP response headers.

Backlog Grooming (for WP Media dev team use only)

  • Reproduce the problem
  • Identify the root cause
  • Scope a solution
  • Estimate the effort
@piotrbak
Copy link
Contributor

Hello @KinG-InFeT and thank you for the issue. The way in which AMP plugin is looking for a page caching is very unreliable. We'll check if there's anything we can do on our side, but most likely, it'll has to be fixed on the AMP plugin side.

@piotrbak piotrbak added 3rd party compatibility Issues related to 3rd party compatibility like theme, plugin or hosting needs: r&d Needs research and development (R&D) before a solution can be proposed and scoped. labels Jan 10, 2022
@KinG-InFeT
Copy link
Author

hi @piotrbak , i open new issue to AMP plugin for find a new solutions.

@westonruter
Copy link
Contributor

@piotrbak hi:

The way in which AMP plugin is looking for a page caching is very unreliable.

What is unreliable about it?

@piotrbak
Copy link
Contributor

piotrbak commented Jan 12, 2022

Hello @westonruter and thanks for your comment.

The existence of browser caching headers doesn't mean that the page is cached. Our plugin, to handle the browser caching and make sure that it's invalidated correctly uses Last-Modified header.

There's also ETag header that could be used in a similar way.

The Varnish part is of course totally correct, but some hosts are not using browser caching for text/html documents at all, and their headers are a bit different than Age:
-x-kinsta-cache
-x-cache, etc.

It's also possible the that websites without page caching applied will just use the browser caching headers for text/html documents. Then the warning won't be displayed at all while the page caching is not present.

@westonruter
Copy link
Contributor

Thanks for the additional information. We tested with several page caching plugins and they were consistently using Expires and/or Cache-Control. However, clearly we didn't test every plugin and hosting platform.

The existence of browser caching headers doesn't mean that the page is cached. ¶ It's also possible the that websites without page caching applied will just use the browser caching headers for text/html documents. Then the warning won't be displayed at all while the page caching is not present.

I recall considering ETag and/or Last-Modified but likewise they don't necessarily indicate that any page caching is present. It could be that pages are still being generated dynamically and they send the Last-Modified/ETag headers so that the client can send them in an If-Modified-Since/If-None-Match request header so the server can send back a 304 Not Modified response without any body. The benefit here is that no bytes are sent over the wire and the page in browser cache can be reused. But again, this doesn't necessarily indicate that there is page caching present.

It's clearly difficult to sniff with 100% accuracy whether page caching is present based on response headers. So what's important is to detect most cases and err on the side of false positives rather than false negatives.

The Varnish part is of course totally correct, but some hosts are not using browser caching for text/html documents at all, and their headers are a bit different than Age:
-x-kinsta-cache
-x-cache, etc.

Alas. Good to call that out. I suppose we'll have to hard-code such custom response headers.

Our plugin, to handle the browser caching and make sure that it's invalidated correctly uses Last-Modified header.

Wouldn't it be beneficial for you to also send the Expires and/or Cache-Control: max-age=X response headers? This would allow clients to avoid having to do conditional requests for pages that are not stale in the browser cache.

@piotrbak
Copy link
Contributor

piotrbak commented Jan 13, 2022

It's clearly difficult to sniff with 100% accuracy whether page caching is present based on response headers. So what's important is to detect most cases and err on the side of false positives rather than false negatives.
After adding my comment I read the conversation related to your PR and realised that, indeed, it's not easy to have universal answer to this.

The only idea that comes to my mind is to make calls to the cached and not cached version of the page and based on loading times determine whether it's served from static file or not:

  1. Make a calls to domain.com and domain.com?cache_check=[random-int]
  2. Make a second round of calls to domain.com and to domain.com?cache_check=[another-random-int]
    After the first call we should be sure that the domain.com is cached, the second call to domain.com should be much faster than the call to domain.com?cache_check=[another-random-int]. Of course the much faster value would be a problem to think about (and also slow websites that would timeout the PHP process during 2 calls to uncached pages 😄 ).

When it comes to the Last-Modified and 304 status, this is exactly what we're doing.

Wouldn't it be beneficial for you to also send the Expires and/or Cache-Control: max-age=X response headers? This would allow clients to avoid having to do conditional requests for pages that are not stale in the browser cache.
It could be beneficial, but it's also a bit risky due to the content not being updated during the second visit. Not sure if setting it to the very low number brings any value though, we'll do some research about it.

Alas. Good to call that out. I suppose we'll have to hard-code such custom response headers.
Those are some examples of headers that I was able to think of:

  • x-cache: HIT , could be also x-cache: HIT: 1
  • x-proxy-cache: HIT
  • x-kinsta-cache: HIT
  • cf-cache-status: HIT
  • cf-apo-via: tcache

At the end, how do you think we could move forward, to make sure not to confuse our mutual customers?

@westonruter
Copy link
Contributor

The only idea that comes to my mind is to make calls to the cached and not cached version of the page and based on loading times determine whether it's served from static file or not:

I think we tried going down that route. The problem with cache-busting query parameters is that caching is some caching plugins disable caching altogether when there is a query parameter present.

When it comes to the Last-Modified and 304 status, this is exactly what we're doing.

It seems like we'll need to just include Last-Modified and ETag among the signals for whether page caching is present. Since WordPress doesn't send these by default (except for /feed/ responses), it's unlikely they'd be present if a page caching plugin is not in place.

@piotrbak
Copy link
Contributor

The problem with cache-busting query parameters is that caching is some caching plugins disable caching altogether when there is a query parameter present.
Exactly, then there'll be a noticeable difference between not-cached request (because of the random query string) and the cached one.

It seems like we'll need to just include Last-Modified and ETag among the signals for whether page caching is present. Since WordPress doesn't send these by default (except for /feed/ responses), it's unlikely they'd be present if a page caching plugin is not in place.

This solution will resolve all problems between our products, so it's very good from our point of view.

Thank you and let me know if there's anything we can help you with.

@westonruter
Copy link
Contributor

Exactly, then there'll be a noticeable difference between not-cached request (because of the random query string) and the cached one.

I think the problem is it's not consistent. Some caching plugins may disable caching altogether when a random query parameter is present. Other plugins may strip it out silently and serve a response as if it wasn't present in the first place. Others will cache it separately from a page without the query parameter. All of this variability, along with the variability introduced by the network itself, seems to make a timing-based check for page caching infeasible.

The only sensible timing-based check seems to me to re-fetch the homepage 3 times, and if any of them response in less than 200ms then we can presume a page cache is responsible for such a “fast” response, or else the site is fast enough that page caching is not necessary. PageSpeed Insights recommends a TTFB under 200ms.

So I'd say we change the test so that instead of checking for page caching we instead check for the need for page caching. As part of that we:

  1. Check for a <200ms response
  2. Check for one of the recognized response headers for page caching.

The test results would then be as follows:

<200ms response Page caching detected Test Result
Yes Yes Good
Yes No Recommended
No No Critical

In the second case, this catches the case where your site responds just fine when it is getting little traffic. But if the site lacks page caching, it can quickly crash if no page caching is in place.

@westonruter
Copy link
Contributor

OK, we've merged the improved page caching test: ampproject/amp-wp#6849.

See the various test scenarios here: ampproject/amp-wp#6849 (comment)

@KinG-InFeT
Copy link
Author

OK, we've merged the improved page caching test: ampproject/amp-wp#6849.

See the various test scenarios here: ampproject/amp-wp#6849 (comment)

yes, now Site Health not return the error for cache 👍

@piotrbak
Copy link
Contributor

Thank you @westonruter, it looks great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3rd party compatibility Issues related to 3rd party compatibility like theme, plugin or hosting needs: r&d Needs research and development (R&D) before a solution can be proposed and scoped.
Projects
None yet
Development

No branches or pull requests

3 participants