stale negative cache is not updated by fresh negative response

Hello,

ATS does not update an existing cached 'negative' response even when it fetches a fresh, equally negative response from the backend.

```
CONFIG proxy.config.http.negative_revalidating_enabled INT 1
CONFIG proxy.config.http.negative_revalidating_lifetime INT 3600
```

If the backend replies with a 502 response carrying `Cache-Control: max-age=60, public`, ATS caches it as expected, which is good. Once the entry becomes stale, i.e. more than 60 seconds later, ATS fetches a new version from the backend. If that new response is as negative as the cached one, ATS discards it and sends the stale entry to the client. That means:

 - The client cannot get a newer 'negative' response.
 - Once the negative cache becomes stale, all incoming requests will involve requests to the backend.
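
To reproduce this without a real application, a tiny origin along the following lines may help. It is only a sketch: the handler name, `make_server`, the port and the response counter in the body are made up for illustration and are not part of our actual setup. Each origin response carries a different body, so it is easy to see which one ATS serves, and the default request log on stderr shows every time ATS contacts the backend.

```python
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

# Counter so that every origin response has a distinguishable body.
_counter = itertools.count(1)


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in backend: always answers 502, cacheable for 60 s."""

    def do_GET(self):
        body = f"maintenance page, origin response #{next(_counter)}\n".encode()
        self.send_response(502)
        self.send_header("Cache-Control", "max-age=60, public")
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    # Port 8080 is an arbitrary choice; point the ATS remap rule at it.
    return HTTPServer(("127.0.0.1", port), MaintenanceHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```

With ATS in front of this origin, requesting the same URL more than 60 seconds apart should keep returning `origin response #1` while the origin log shows a new request each time, if the behaviour is as described above.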

Not replacing a stale positive cache entry with a fresh negative response is of course good, but there is no reason not to replace a stale negative entry with a fresh negative response.
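
The difference between the observed and the requested behaviour can be sketched as a toy model. This is not ATS code: `Entry`, `handle_request`, `simulate` and the `replace_negative` flag are hypothetical names, any 5xx status is assumed to count as 'negative', and the `negative_revalidating_lifetime` limit is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Entry:
    status: int
    body: str
    stored_at: int  # seconds since an arbitrary epoch
    max_age: int    # freshness lifetime in seconds

    def is_fresh(self, now: int) -> bool:
        return now - self.stored_at < self.max_age

    def is_negative(self) -> bool:
        # Assumption for this sketch: any 5xx status counts as "negative".
        return self.status >= 500


def handle_request(
    cached: Optional[Entry],
    fetch_origin: Callable[[int], Entry],
    now: int,
    replace_negative: bool,
) -> Tuple[Entry, Optional[Entry], bool]:
    """Return (response served, new cache content, origin contacted?)."""
    if cached is not None and cached.is_fresh(now):
        return cached, cached, False

    fresh = fetch_origin(now)
    if cached is None or not fresh.is_negative():
        return fresh, fresh, True   # ordinary store or update
    if cached.is_negative() and replace_negative:
        return fresh, fresh, True   # requested: fresh negative replaces stale negative
    return cached, cached, True     # stale entry is served and kept


def simulate(replace_negative: bool):
    """Replay requests at t=0, 61, 62 against an origin that always returns 502."""
    def origin(now: int) -> Entry:
        return Entry(502, f"maintenance page generated at t={now}", now, 60)

    cached = None
    log = []
    for now in (0, 61, 62):
        served, cached, contacted = handle_request(cached, origin, now, replace_negative)
        log.append((now, served.body, contacted))
    return log
```

In this model, `simulate(False)` contacts the origin on every request after t=60 and still serves the page generated at t=0, which matches what I observe. `simulate(True)` serves the page generated at t=61 and answers the request at t=62 from cache, which is the behaviour I would expect.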

[background]

My idea is to switch between a 'normal' backend and a 'maintenance mode' backend using HAProxy behind ATS. The normal backend usually replies with status 200 and `Cache-Control: max-age=300, public`. When we switch to the 'maintenance mode' backend, it replies to every HTML request with a beautiful maintenance information page with status 502 and `Cache-Control: max-age=60, public`. With this short max-age, we would like to avoid heavy load on the maintenance mode backend.

Thanks to the 502 status from the 'maintenance mode' backend, ATS tries to return a stale 'positive' cached response, if one exists, until the long-enough `negative_revalidating_lifetime` expires, i.e. we can keep serving the normal site for as long as possible. Once `negative_revalidating_lifetime` expires, or if no positive cache entry exists (for explicit `no-store` requests etc.), the client gets the beautiful maintenance information page.

But because of the behaviour described here, the client cannot get an updated maintenance information page during maintenance mode, and the maintenance mode backend has to respond to every request in vain.

Here is an example timeline (times in hh:mm).

```
00:00 ATS fetches a 200 response from the normal backend and serves it
00:01 (we switch to the maintenance mode backend)
00:03 ATS serves the normal cached response
00:10 the cache is already stale, but ATS still serves the normal cached response thanks to negative_revalidating_lifetime
01:15 ATS fetches from the maintenance mode backend, then caches the response and serves it
01:16 (we update the maintenance page on the maintenance backend)
01:17 the 'negative' cache is already stale, thus ATS fetches again from the maintenance mode backend,
      but serves the old stale 'negative' cache without updating it
01:20 the 'negative' cache is still stale, thus ATS fetches again from the maintenance mode backend.
```

If ATS supported `stale-if-error`, our responses would be:

- from 'normal' backend: `Cache-Control: max-age=...,stale-if-error=<7 days>`
- from 'maintenance' backend: `Cache-Control: max-age=60` (without `stale-if-error`)

but it is not supported for now, so we are trying this approach using `negative_revalidating_lifetime`.

I verified this behaviour on ATS versions 7.1.1 and 8.0.2.

Thanks in advance!
Kazuhiko
/cc @vpelletier

stale negative cache is not updated by fresh negative response #7417