Add memcached support to index cache #1881
Conversation
Couple of nits, otherwise LGTM.
Force-pushed from a302126 to 38a9a41.
Nice, awesome work, thanks for this 👍 Generally looking good; some suggestions so far, plus responses to the open questions.
> Is this change set feasible to be admitted into Thanos?
Yes, happy to deploy this on our setup as well.
> How should memcached config options be specified? Through CLI flags or a config file?
I believe the same as tracing and objstore.config, so a config file.
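For reference, a hedged sketch of what such a config file could look like, following the same pattern as objstore.config. The type name, option names, and addresses below are illustrative only and may differ from what the PR finally implements:

```yaml
# Illustrative sketch only: field names and defaults may differ from the final implementation.
type: memcached
config:
  # Memcached server addresses (hypothetical hostnames).
  addresses:
    - "memcached-1.example.com:11211"
    - "memcached-2.example.com:11211"
  # Per-operation timeout (see the defaultTimeout discussion below).
  timeout: 500ms
  # Cap on concurrent GetMulti() batches sent to memcached.
  max_get_multi_concurrency: 100
  # Maximum number of keys per GetMulti() batch (0 disables batching).
  max_get_multi_batch_size: 0
```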
> Should MaxGetMultiBatchConcurrency be global (like right now) or a per-MultiGet() call limit? I would lean more towards the latter (see TODO in pkg/cacheutil/memcached_client.go).
Good question. I think in both cases we can congest at some point (local resources vs memcached server resources). With a global limit we can at least know that we are short of workers. (: Thoughts? @squat @brancz?
In any case it would be nice to have observability: latency metrics / traces for those.
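To illustrate the metrics side of that observability, a minimal sketch, not code from this PR: the metric name and the wrapper function are made up for the example, and only show how GetMulti latency could be observed with a Prometheus histogram.

```go
package cacheutil

import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// getMultiDuration tracks how long each underlying GetMulti call takes. If the
// timer is started before enqueueing, it also surfaces time spent waiting
// behind a concurrency limit. Registration with a registry is omitted for brevity.
var getMultiDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "memcached_getmulti_duration_seconds",
	Help:    "Duration of memcached GetMulti operations.",
	Buckets: []float64{0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1},
})

// instrumentGetMulti wraps a fetch function and observes its latency.
func instrumentGetMulti(fetch func(context.Context, []string) map[string][]byte) func(context.Context, []string) map[string][]byte {
	return func(ctx context.Context, keys []string) map[string][]byte {
		start := time.Now()
		defer func() { getMultiDuration.Observe(time.Since(start).Seconds()) }()
		return fetch(ctx, keys)
	}
}
```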
    // MemcachedClient is a high level client to interact with memcached.
    type MemcachedClient interface {
Do we need another interface?
I think the way we do it is to accept interfaces but return structs, as advised by Go (:
> Do we need another interface?
The interface is currently used to mock it in pkg/store/cache/memcached_test.go. It may also be needed to add multi-tenancy support in Cortex: an option may be having a single underlying memcached client and wrapping it through a per-tenant proxy that prefixes the keys with the tenant ID (but no decision has been taken and I'm open to suggestions).
> I think the way we do it is to accept interfaces but return structs, as advised by Go (:
Right, done.
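For illustration of the tenant-prefixing proxy idea mentioned above, here is a hypothetical sketch; it is not part of this PR, and the interface signatures are assumed rather than taken from the actual code.

```go
package cacheutil

import (
	"context"
	"strings"
	"time"
)

// Assumed shape of the client interface; the PR's actual signatures may differ.
type MemcachedClient interface {
	GetMulti(ctx context.Context, keys []string) map[string][]byte
	SetAsync(ctx context.Context, key string, value []byte, ttl time.Duration) error
}

// tenantMemcachedClient wraps a shared MemcachedClient and prefixes every key
// with the tenant ID, so a single underlying client can serve multiple tenants.
type tenantMemcachedClient struct {
	tenantID string
	client   MemcachedClient
}

func (c *tenantMemcachedClient) key(k string) string {
	return c.tenantID + ":" + k
}

func (c *tenantMemcachedClient) SetAsync(ctx context.Context, key string, value []byte, ttl time.Duration) error {
	return c.client.SetAsync(ctx, c.key(key), value, ttl)
}

func (c *tenantMemcachedClient) GetMulti(ctx context.Context, keys []string) map[string][]byte {
	prefixed := make([]string, 0, len(keys))
	for _, k := range keys {
		prefixed = append(prefixed, c.key(k))
	}
	hits := c.client.GetMulti(ctx, prefixed)

	// Strip the tenant prefix again so callers see their original keys.
	out := make(map[string][]byte, len(hits))
	for k, v := range hits {
		out[strings.TrimPrefix(k, c.tenantID+":")] = v
	}
	return out
}
```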
It sounds like YAGNI here. Does it mean creating/distributing each client implementation for each block? (: Sounds like something we can add once we have a clear decision here, but let's discuss.
I'm not sure whether the comment is about (a) removing the MemcachedClient interface or (b) merging memcachedClient with MemcachedIndexCache?
I was thinking about (c): merging this with IndexCache, as this layer might be a bit shallow, but the idea of using this for other caches like chunks is tempting.
I think I am happy with this for this iteration (:
pkg/cacheutil/memcached_client.go (outdated)
    const (
    	defaultTimeout = 100 * time.Millisecond
Not too short?
Maybe. Generally, we want the cache to be significantly faster than the backend storage. Assuming a 1ms network latency to communicate with a memcached cluster running within the same cloud region, 100ms should be enough (i.e. it's what we set in our Cortex clusters).
However, I do see that it's probably a low value for a default. What would you suggest?
Let's gather some data and start with something (:
In our Cortex clusters running on GCE we measure a 99th percentile memcached latency of about 15ms (pretty constant) and a max latency between 50ms and 250ms. I agree a default timeout should be conservative, so I'm going to set it to 500ms here; it can then be adjusted by the user (i.e. in our cluster I don't see a good reason to set it above 100ms, given we want to skip the cache if it's too slow for any reason).
Force-pushed from f744f29 to a54f6bd.
@bwplotka Thanks for your initial review! I've addressed your feedback (or commented otherwise), added the config file and written some docs. Could you take another look, please?
To respond to #1881 (comment):
Batch size feels ok to me 👍 So no batches by default.
Ok, IMO we do lots of micro-optimizations, which makes this PR quite hard to review and iterate over. But given they were mostly based on Cortex code, I think I am fine with adding this as long as there will be someone to maintain this code other than a few of us from the maintainer team. We can definitely consider adding someone from your side as a maintainer as well (:
It's an LGTM from me, modulo small suggestions.
It would be nice to see some review from others like @GiedriusS, @brancz, @squat or @metalmatze, but it takes some effort (:
docs/components/store.md (outdated)
    type: in-memory
    config:
      # Maximum number of bytes the cache can contain (defaults to 250MB).
      max_size_bytes: 262144000
How hard would it be to support string-based sizes? 🤔
Perhaps the same code from kingpin can be reused here: https://godoc.org/github.com/alecthomas/units#Base2Bytes ?
I've introduced storecache.Bytes, whose YAML marshalling is based on units.Base2Bytes, to allow configuring the in-memory index cache size using byte units.
What's your take?
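Roughly, the idea looks like the sketch below. It is based on the storecache.Bytes and units.Base2Bytes names mentioned in the comment, but the method bodies are an assumption and the actual implementation in the PR may differ in detail.

```go
package storecache

import (
	"github.com/alecthomas/units"
)

// Bytes supports YAML (un)marshalling of human-readable sizes such as "250MB".
type Bytes uint64

// UnmarshalYAML parses strings like "250MB" or "1GiB" into a byte count.
func (b *Bytes) UnmarshalYAML(unmarshal func(interface{}) error) error {
	var value string
	if err := unmarshal(&value); err != nil {
		return err
	}
	parsed, err := units.ParseBase2Bytes(value)
	if err != nil {
		return err
	}
	*b = Bytes(parsed)
	return nil
}

// MarshalYAML renders the byte count back as a human-readable string.
func (b Bytes) MarshalYAML() (interface{}, error) {
	return units.Base2Bytes(b).String(), nil
}
```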
    uid := ulid.MustNew(1, nil)

    tests := map[string]struct {
Usually we use a list for deterministic order, but that's fine here.
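To illustrate the trade-off with a hypothetical test (not code from this PR): a map gives named sub-tests but Go randomizes iteration order, while a slice keeps the execution order deterministic.

```go
package cacheutil_test

import "testing"

func TestBatchSplitting(t *testing.T) {
	// Map-keyed cases: readable names, but the sub-tests run in a
	// different order on every execution.
	tests := map[string]struct {
		keys      int
		batchSize int
		expected  int // expected number of batches
	}{
		"no keys":           {keys: 0, batchSize: 100, expected: 0},
		"exactly one batch": {keys: 100, batchSize: 100, expected: 1},
		"two batches":       {keys: 150, batchSize: 100, expected: 2},
	}

	for name, tc := range tests {
		t.Run(name, func(t *testing.T) {
			got := (tc.keys + tc.batchSize - 1) / tc.batchSize
			if got != tc.expected {
				t.Fatalf("expected %d batches, got %d", tc.expected, got)
			}
		})
	}
}
```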
pkg/cacheutil/memcached_client.go (outdated)
    	batchEnd = len(keys)
    }

    c.getMultiQueue <- &memcachedGetMultiBatch{
We lost the discussion on GitHub, but I guess at the end we decided on per-client concurrency for now?
Thinking out loud: if we choose this, I would say we need good observability: how do we find latency caused by starvation of this queue? The latency metric is per single call to memcached (which is fair from the memcached client perspective). I think tracing would help, as we already create some spans here, e.g. in getMultiSingle, so it should be enough.
Maybe our current gate https://github.com/thanos-io/thanos/blob/master/pkg/store/gate.go could be adapted here somehow? It would be nice to have some metric here or in getMultiSingle, because otherwise all users would need to have some kind of tracing enabled in case this becomes a problem (and most of the time these things do, as we all know :P).
I like the metric standpoint, that's a good point. Do you think, @pracucci, we can adapt the existing way of queuing requests via a gate in this way, as suggested by @GiedriusS? I don't have a strong opinion, but I can see these pros & cons:
Advantages:
- Reuse
- Latency + queue size metric
- We can switch from a global to a per-request limit easily by moving the gate here and there
Downsides:
- Each gated request requires a new goroutine. A bit more goroutine overhead and coordination.
I think for this iteration I would be happy with what we have now.
I personally see a big value in having such metrics. Also, the gate would simplify the implementation (less code).
> We lost the discussion on GitHub, but I guess at the end we decided on per-client concurrency for now?
Re-iterating on this, I think the current implementation is confusing. We're currently limiting the batched GetMulti() but not the non-batched GetMulti(). For example, if you have MaxGetMultiBatchSize=1000 and MaxAsyncConcurrency=10 and you concurrently call MemcachedClient.GetMulti(1K keys) 100 times, we end up with 100 concurrent underlying requests (each with 1K keys, for a total of 100K keys). However, if you call MemcachedClient.GetMulti(100K keys) once, we end up with 10 concurrent underlying requests, each with 1K keys, until all 100K keys are fetched.
I see three options:
1. Keep it as is (but weird)
2. Enforce the concurrency limit even for non-batched requests
   - Pro: the concurrency limit is an effective max number of concurrent fetch requests
   - Con: a single request with a bunch of keys will slow down everything else because of the queue
3. Switch to a per-request gate
   - Pro: a single request with a bunch of keys doesn't slow down others
   - Con: there's no way to set a cap on the max number of connections towards memcached
I'm currently more keen on option 2, because it looks like the safest option.
> Do you think @pracucci we can adapt the existing way of queuing requests via a gate in this way, as suggested by @GiedriusS?
I think so, and I believe it would make sense. Observability, at this stage, is more important than goroutine overhead, and we could implement it in a way that goroutines are created only for batched requests (so as long as batching is disabled or requests do not exceed the max batch size, there will be no goroutine overhead).
In this commit I've:
- Switched the max GetMulti concurrency to a gate
- Applied the gate also to non-batched requests
What's your take?
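A simplified sketch of that approach for reference: the real change adapts the existing Thanos gate, whereas the code below uses a plain buffered channel as the gate, and the type and function names are made up for illustration. The point it shows is that every batch goes through the same gate, and a non-batched request is simply a single batch, so the same concurrency limit covers it too.

```go
package cacheutil

import (
	"context"
	"sync"
)

// gate is a minimal semaphore: at most cap(g) batches may run concurrently.
type gate chan struct{}

func newGate(maxConcurrent int) gate { return make(gate, maxConcurrent) }

// Start blocks until a slot is free or the context is cancelled.
func (g gate) Start(ctx context.Context) error {
	select {
	case g <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// Done releases a slot.
func (g gate) Done() { <-g }

// getMultiBatched splits keys into batches of at most maxBatchSize, fetches
// every batch through the gate, and merges the results.
func getMultiBatched(
	ctx context.Context,
	g gate,
	fetch func(context.Context, []string) map[string][]byte,
	keys []string,
	maxBatchSize int,
) map[string][]byte {
	if maxBatchSize <= 0 {
		maxBatchSize = len(keys) // Batching disabled: treat all keys as one batch.
	}

	out := make(map[string][]byte, len(keys))
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)

	for start := 0; start < len(keys); start += maxBatchSize {
		end := start + maxBatchSize
		if end > len(keys) {
			end = len(keys)
		}
		batch := keys[start:end]

		if err := g.Start(ctx); err != nil {
			break // Context cancelled: return whatever was fetched so far.
		}
		wg.Add(1)
		go func(batch []string) {
			defer wg.Done()
			defer g.Done()

			hits := fetch(ctx, batch)
			mu.Lock()
			for k, v := range hits {
				out[k] = v
			}
			mu.Unlock()
		}(batch)
	}

	wg.Wait()
	return out
}
```

With this shape, switching between a global limit and a per-request limit is only a matter of whether the gate is shared across calls or created per call.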
LGTM
    // Wait for all batch results. In case of error, we keep
    // track of the last error occurred.
    items := make([]map[string]*memcache.Item, 0, numResults)
Maybe sync.Pool could be used here? Because this will be invoked quite a lot, so it should help.
Maybe, but fine to play with it with some microbenchmark on later PRs.
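For the record, a minimal sketch of what that could look like; it is hypothetical, not part of this PR, and the helper names are made up.

```go
package cacheutil

import (
	"sync"

	"github.com/bradfitz/gomemcache/memcache"
)

// itemsPool reuses the per-call result slices so that repeated GetMulti calls
// don't allocate a fresh slice every time.
var itemsPool = sync.Pool{
	New: func() interface{} {
		s := make([]map[string]*memcache.Item, 0, 16)
		return &s
	},
}

func getItemsBuffer() *[]map[string]*memcache.Item {
	return itemsPool.Get().(*[]map[string]*memcache.Item)
}

func putItemsBuffer(buf *[]map[string]*memcache.Item) {
	*buf = (*buf)[:0] // Reset the length but keep the capacity before pooling.
	itemsPool.Put(buf)
}
```

Whether this actually helps would need the microbenchmark mentioned above, since sync.Pool only pays off when allocations dominate.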
Default("250MB").Bytes() | ||
|
||
indexCacheConfig := extflag.RegisterPathOrContent(cmd, "index-cache.config", |
👍
LGTM (: Thanks, I think there are a few bits that we can tweak/optimize later on with some microbenchmarks, like adding a Gate or pooling, but this is fine for now (:
"github.com/alecthomas/units" | ||
) | ||
|
||
// Bytes is a data type which supports yaml serialization/deserialization |
Thanks for this 👍
Force-pushed from ee88dd5 to 54eefc4.
Thanks @bwplotka for your extensive review, and sorry for this big PR. Next time I will address changes in a more iterative way. I went through a self-review and further manual tests and it looks good to me, but I'll be happy to re-iterate on it in case of further feedback.
Sure, I'm OK maintaining it. Feel free to assign me memcached-related issues in the near future.
Force-pushed from 1249341 to cf41f71.
Last small conflict needs to be resolved before the final review. Thanks for your work!
Force-pushed from cf41f71 to f1482f2.
@GiedriusS I've rebased.
Let's merge on green 👍
Thanks @pracucci
This PR proposes to introduce memcached support for the index cache. A few reasons why memcached may make sense:
Open questions:
- Is this change set feasible to be admitted into Thanos?
- How should memcached config options be specified? Through CLI flags or a config file?
- Should MaxGetMultiBatchConcurrency be global (like right now) or a per-MultiGet() call limit? I would lean more towards the latter (see TODO in pkg/cacheutil/memcached_client.go)

Changes
- MemcachedIndexCache support, with the backend client inspired by Cortex, with some substantial differences:
  - SetAsync() instead of Set()
Verification
Manual and unit tests.