Hello,
ATS does not update an existing cached 'negative' response, even when it fetches a fresh, equally negative response from the backend.
CONFIG proxy.config.http.negative_revalidating_enabled INT 1
CONFIG proxy.config.http.negative_revalidating_lifetime INT 3600
If the backend replies with a 502 response carrying Cache-Control: max-age=60, public, ATS caches it correctly, which is good. Once the entry becomes stale, i.e. more than 60 seconds later, ATS fetches a new response from the backend. If that response is equally negative, ATS discards it and sends the stale entry to the client. That means:
- The client cannot get a newer 'negative' response.
- Once the negative cache entry becomes stale, every incoming request triggers a request to the backend.
Not replacing a stale positive cache entry with a fresh negative response is of course good, but there is no reason not to replace a stale negative entry with a fresh negative response.
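The difference between the observed and the requested behaviour can be sketched as a small decision function. This is only an illustrative model, not ATS code: the `Response` type, the `revalidate` function, the `replace_negative` switch, and the set of statuses treated as negative are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumption: statuses counted as 'negative' in this sketch.
NEGATIVE_STATUSES = {500, 502, 503, 504}


@dataclass
class Response:
    status: int
    body: str


def is_negative(resp: Response) -> bool:
    return resp.status in NEGATIVE_STATUSES


def revalidate(cached: Response, fresh: Response, replace_negative: bool):
    """Return (response_to_serve, response_kept_in_cache) for a stale entry."""
    if not is_negative(fresh):
        # A good fresh response always replaces the stale entry.
        return fresh, fresh
    if is_negative(cached) and replace_negative:
        # Requested behaviour: a newer negative replaces an older negative.
        return fresh, fresh
    # Otherwise the fresh negative response is discarded and the stale
    # entry is served. For a positive stale entry this is desirable.
    return cached, cached


old = Response(502, "maintenance v1")
new = Response(502, "maintenance v2")

observed = revalidate(old, new, replace_negative=False)
requested = revalidate(old, new, replace_negative=True)

print(observed[0].body)   # maintenance v1
print(requested[0].body)  # maintenance v2
```

With `replace_negative=False` the model mirrors what is reported here; with `replace_negative=True` it mirrors what this issue asks for, while a stale positive entry is still protected from a fresh negative response in both cases.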
[background]
My idea is to switch between a 'normal' backend and a 'maintenance mode' backend using HAProxy behind ATS. The normal backend usually replies with a 200 status and Cache-Control: max-age=300, public. When we switch to the 'maintenance mode' backend, it replies to every HTML request with a nicely formatted maintenance information page that has a 502 status and Cache-Control: max-age=60, public. With this short max-age, we would like to avoid heavy load on the maintenance mode backend.
Thanks to the 502 status from the 'maintenance mode' backend, ATS tries to return a stale 'positive' cached response, if one exists, until the long-enough negative_revalidating_lifetime is reached, i.e. we can keep serving the normal site for as long as possible. When negative_revalidating_lifetime is reached, or no positive cache entry exists (for explicit no-store requests etc.), the client gets the maintenance information page.
But because of the behaviour described here, the client cannot get an updated maintenance information page during maintenance mode, and the maintenance mode backend has to respond to every request in vain.
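For anyone who wants to reproduce this, a stand-in for the 'maintenance mode' backend could look like the following. It is a minimal sketch using only the Python standard library, not our real backend: the handler name, the HTML body, and the local self-check at the end are made up for illustration. It answers every GET with a 502 status and Cache-Control: max-age=60, public.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content, only for this example.
MAINTENANCE_HTML = b"<html><body><h1>Down for maintenance</h1></body></html>"


class MaintenanceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every request gets the maintenance page as a cacheable 502.
        self.send_response(502)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Cache-Control", "max-age=60, public")
        self.send_header("Content-Length", str(len(MAINTENANCE_HTML)))
        self.end_headers()
        self.wfile.write(MAINTENANCE_HTML)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(port, path):
    """Fetch a path from the local server and return status, header, body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", path)
    resp = conn.getresponse()
    body = resp.read()
    conn.close()
    return resp.status, resp.getheader("Cache-Control"), body


# Port 0 lets the OS pick a free port for this self-check.
server = HTTPServer(("127.0.0.1", 0), MaintenanceHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, cache_control, body = fetch(server.server_port, "/index.html")
server.shutdown()
server.server_close()

print(status, cache_control)  # 502 max-age=60, public
```

To test against ATS, the server would be left running on a fixed port and used as the origin; changing `MAINTENANCE_HTML` between requests then shows whether the updated page ever reaches the client.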
Here is an example timeline.
00:00 ATS fetches a 200 response from the normal backend and serves it
00:01 (we switch to the maintenance mode backend)
00:03 ATS serves the normal cached response
00:10 cache is already stale, but ATS still serves the normal cached response thanks to negative_revalidating_lifetime
01:15 ATS fetches from the maintenance mode backend, then caches the response and serves it
01:16 (we update the maintenance page on the maintenance backend)
01:17 'negative' cache is already stale, thus ATS fetches again from the maintenance mode backend,
but serves the old stale 'negative' cache without updating it
01:20 'negative' cache is still stale, thus ATS fetches again from the maintenance mode backend.
If ATS supported stale-if-error, our responses would be:
- from 'normal' backend:
Cache-Control: max-age=...,stale-if-error=<7 days>
- from 'maintenance' backend:
Cache-Control: max-age=60 (without stale-if-error)
but it is not supported for now, so we are trying this approach using negative_revalidating_lifetime.
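For reference, the stale-if-error rule we would like to rely on (RFC 5861) can be sketched as below. This is a simplified model, not an ATS feature: the function name and parameters are hypothetical, ages are in seconds, and 7 days is written out as 604800 seconds. The normal backend's max-age of 300 is taken from the description above.

```python
SEVEN_DAYS = 7 * 24 * 3600  # 604800 seconds


def usable_when_origin_errors(age, max_age, stale_if_error=None):
    """Can a cached response be served when the origin returns an error?

    age:            current age of the cached response, in seconds
    max_age:        freshness lifetime from Cache-Control: max-age
    stale_if_error: extra window after becoming stale, or None if absent
    """
    if age < max_age:
        return True  # still fresh, no error handling needed
    if stale_if_error is None:
        return False  # stale, and no permission to serve on error
    # Stale: allowed only within the stale-if-error window.
    return (age - max_age) <= stale_if_error


# Normal page: one hour old, max-age=300, stale-if-error=7 days.
print(usable_when_origin_errors(3600, 300, SEVEN_DAYS))  # True

# Maintenance page: two minutes old, max-age=60, no stale-if-error.
print(usable_when_origin_errors(120, 60))  # False
```

Under this rule the normal page would survive a long outage, while a stale maintenance page would always be replaced by whatever the maintenance backend returns next.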
I verified this behaviour on ATS version 7.1.1 and 8.0.2.
Thanks in advance!
Kazuhiko
/cc @vpelletier