Skip to content

Commit

Permalink
Document prepared query staleness quirk and force all background requ…
Browse files Browse the repository at this point in the history
…ests to AllowStale so we can spread service discovery load across servers.
  • Loading branch information
banks committed Sep 6, 2018
1 parent 74f71b8 commit 64580b5
Show file tree
Hide file tree
Showing 7 changed files with 45 additions and 14 deletions.
6 changes: 6 additions & 0 deletions agent/cache-types/catalog_services.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,12 @@ func (c *CatalogServices) Fetch(opts cache.FetchOptions, req cache.Request) (cac
reqReal.QueryOptions.MinQueryIndex = opts.MinIndex
reqReal.QueryOptions.MaxQueryTime = opts.Timeout

// Allways allow stale - there's no point in hitting leader if the request is
// going to be served from cache and endup arbitrarily stale anyway. This
// allows cached service-discover to automatically read scale across all
// servers too.
reqReal.AllowStale = true

// Fetch
var reply structs.IndexedServiceNodes
if err := c.RPC.RPC("Catalog.ServiceNodes", reqReal, &reply); err != nil {
Expand Down
1 change: 1 addition & 0 deletions agent/cache-types/catalog_services_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ func TestCatalogServices(t *testing.T) {
require.Equal(1*time.Second, req.QueryOptions.MaxQueryTime)
require.Equal("web", req.ServiceName)
require.Equal("canary", req.ServiceTag)
require.True(req.AllowStale)

reply := args.Get(2).(*structs.IndexedServiceNodes)
reply.QueryMeta.Index = 48
Expand Down
6 changes: 6 additions & 0 deletions agent/cache-types/health_services.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,12 @@ func (c *HealthServices) Fetch(opts cache.FetchOptions, req cache.Request) (cach
reqReal.QueryOptions.MinQueryIndex = opts.MinIndex
reqReal.QueryOptions.MaxQueryTime = opts.Timeout

// Allways allow stale - there's no point in hitting leader if the request is
// going to be served from cache and endup arbitrarily stale anyway. This
// allows cached service-discover to automatically read scale across all
// servers too.
reqReal.AllowStale = true

// Fetch
var reply structs.IndexedCheckServiceNodes
if err := c.RPC.RPC("Health.ServiceNodes", reqReal, &reply); err != nil {
Expand Down
1 change: 1 addition & 0 deletions agent/cache-types/health_services_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ func TestHealthServices(t *testing.T) {
require.Equal(1*time.Second, req.QueryOptions.MaxQueryTime)
require.Equal("web", req.ServiceName)
require.Equal("canary", req.ServiceTag)
require.True(req.AllowStale)

reply := args.Get(2).(*structs.IndexedCheckServiceNodes)
reply.QueryMeta.Index = 48
Expand Down
8 changes: 7 additions & 1 deletion agent/cache-types/prepared_query.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,13 +19,19 @@ type PreparedQuery struct {
func (c *PreparedQuery) Fetch(opts cache.FetchOptions, req cache.Request) (cache.FetchResult, error) {
var result cache.FetchResult

// The request should be a DCSpecificRequest.
// The request should be a PreparedQueryExecuteRequest.
reqReal, ok := req.(*structs.PreparedQueryExecuteRequest)
if !ok {
return result, fmt.Errorf(
"Internal cache failure: request wrong type: %T", req)
}

// Allways allow stale - there's no point in hitting leader if the request is
// going to be served from cache and endup arbitrarily stale anyway. This
// allows cached service-discover to automatically read scale across all
// servers too.
reqReal.AllowStale = true

// Fetch
var reply structs.PreparedQueryExecuteResponse
if err := c.RPC.RPC("PreparedQuery.Execute", reqReal, &reply); err != nil {
Expand Down
1 change: 1 addition & 0 deletions agent/cache-types/prepared_query_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ func TestPreparedQuery(t *testing.T) {
req := args.Get(1).(*structs.PreparedQueryExecuteRequest)
require.Equal("geo-db", req.QueryIDOrName)
require.Equal(10, req.Limit)
require.True(req.AllowStale)

reply := args.Get(2).(*structs.PreparedQueryExecuteResponse)
reply.QueryMeta.Index = 48
Expand Down
36 changes: 23 additions & 13 deletions website/source/api/index.html.md
Original file line number Diff line number Diff line change
Expand Up @@ -134,34 +134,44 @@ Combining `?cached` with `?consistent` is an error.

Endpoints supporting simple caching may return a result directly from the local
agent's cache without a round trip to the servers. By default the agent caches
results for a relatively long time (days) such that it can still return a result
even if the servers are unavailable for an extended period to enable "fail
static" semantics.
results for a relatively long time (3 days) such that it can still return a
result even if the servers are unavailable for an extended period to enable
"fail static" semantics.

That means that with no other arguments, cached queries might receive a response
which is days old. To request better freshness, the HTTP `Cache-Control` header
may be set with a directive like `max-age=<seconds>`. In this case the agent
will attempt to re-fetch the result from the servers if the cached value is
older than the given `max-age`. If the servers can't be reached a 500 is
returned as normal.
That means that with no other arguments, `?cached` queries might receive a
response which is days old. To request better freshness, the HTTP
`Cache-Control` header may be set with a directive like `max-age=<seconds>`. In
this case the agent will attempt to re-fetch the result from the servers if the
cached value is older than the given `max-age`. If the servers can't be reached
a 500 is returned as normal.

To allow clients to maintain fresh results in normal operation but allow stale
ones if the servers are unavailable, the `stale-if-error=<seconds>` directive
may be additionally provided in the `Cache-Control` header. It is meaningless to
provide this without a `max-age`. The `Age` response header can be used to
determine if this has occurred.
may be additionally provided in the `Cache-Control` header. This will return the
cached value anyway even it it's older than `max-age` (provided it's not older
than `stale-if-error`) rather than a 500. It must be provided along with a
`max-age` or `must-revalidate`. The `Age` response header, if larger than
`max-age` can be used to determine if the server was unreachable and a cached
version returned instead.

For example, assuming there is a cached response that is 65 seconds old, and
that the servers are currently unavailable, `Cache-Control: max-age=30` will
result in a 500 error, while `Cache-Control: max-age=30 stale-if-error=259200`
will result in the cached response being returned.

A request setting either `max-age=0` or `must-revalidate` directives will cause
the agent to re-fetch the response from servers always. Either can be combined
the agent to always re-fetch the response from servers. Either can be combined
with `stale-if-error=<seconds>` to ensure fresh results when the servers are
available, but falling back to cached results if the request to the servers
fails.

Requests that do not use `?cached` currently bypass the cache entirely so the
cached response returned might be more stale than the last uncached response
returned on the same agent. If this causes problems, it is possible to make
requests using `?cached` and setting `Cache-Control: must-revalidate` to have
always-fresh results yet keeping the cache populated with the most recent
result.

In all cases the HTTP `X-Cache` header is always set in the response to either
`HIT` or `MISS` indicating whether the response was served from cache or not.

Expand Down

0 comments on commit 64580b5

Please sign in to comment.