Options for geo-redundancy #3313
Mimir is designed for high availability and requires a low-latency connection within the cluster. High availability is typically achieved by running with the default replication factor of 3 and deploying Mimir across multiple availability zones within the same region. In my experience, cross-region redundancy is not a very common use case. That being said, assuming you really need cross-region redundancy (and you want full replication, including the object storage)...
I would consider starting with your first option (dual-writing from the agents to two independent clusters), which is the only battle-tested solution among the ones you listed. If a Mimir cluster is down, you can route queries to the other one. When the unhealthy cluster comes back, the agents will resume writing to it, so metrics are generally expected to reconcile (in that case, I would recommend enabling out-of-order ingestion with a time window configured to the maximum outage you want to handle, e.g. 12h).
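To make that concrete, here is a minimal sketch of the dual-write side, assuming Grafana Agent in static mode; the hostnames and the 12h window are illustrative placeholders, not tested values:

```yaml
# Grafana Agent (static mode): ship every scraped sample to both Mimir clusters.
metrics:
  wal_directory: /var/lib/agent/wal
  configs:
    - name: default
      remote_write:
        - url: https://mimir-region-a.example.com/api/v1/push
        - url: https://mimir-region-b.example.com/api/v1/push
```

And the matching out-of-order window on the Mimir side (a per-tenant limit), sized to the longest outage you want to be able to reconcile:

```yaml
# Mimir limits: accept samples up to 12h old so a recovered cluster can catch up.
limits:
  out_of_order_time_window: 12h
```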
Regarding your second option: the most recent data is kept in the ingesters, so the most recent metrics will not be available for querying from the "secondary" cluster. Also, with this approach you will not replicate the storage bucket to a different region, so you're not going to achieve full cross-region redundancy.
Regarding your third option (both clusters writing to the same bucket): this is uncharted territory. As far as I know, no one has ever battle tested it, so there may be unknowns I can't think of right now. For sure you would have to run the compactor in only one of the two clusters, but there may be more issues. Also, in this case you're not replicating the bucket.
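If you deploy with the mimir-distributed Helm chart, a minimal sketch of enforcing that single-compactor constraint (assuming the chart's per-component `replicas` value; verify against the chart version you run) is to scale the compactor to zero in the secondary cluster:

```yaml
# values-secondary.yaml (sketch): run no compactor in this cluster, so only the
# primary cluster's compactor touches the shared bucket.
compactor:
  replicas: 0
```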
---
I have a bit of a better understanding of what was going wrong now. The approaches I listed earlier resulted in multiple hash rings, causing all kinds of problems for the components that use the hash ring. For multi-region to work properly there needs to be a single, global hash ring.

What we are exploring now, and it looks promising, is a combination of a multi-k8s-cluster service mesh for networking and tweaking the zone awareness configuration. Thankfully the Helm chart was updated in the past month to simplify zone awareness configuration, but it still required some customizing.
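Roughly the shape of what we are experimenting with, assuming the mimir-distributed chart's `zoneAwareReplication` values and its `structuredConfig` passthrough; every name, zone, and address below is illustrative rather than a working configuration:

```yaml
# mimir-distributed values (sketch): treat each region as a "zone" and join a
# single memberlist gossip ring across both k8s clusters through the mesh.
ingester:
  zoneAwareReplication:
    enabled: true
    zones:
      - name: region-a                     # pods scheduled in cluster/region A
        nodeSelector:
          topology.kubernetes.io/region: us-east-1
      - name: region-b                     # pods scheduled in cluster/region B
        nodeSelector:
          topology.kubernetes.io/region: us-west-2
      # note: with the default replication factor of 3 you would normally want
      # at least three zones; two are shown here only to illustrate the shape.
mimir:
  structuredConfig:
    memberlist:
      join_members:
        # gossip endpoints in both clusters, reachable through the service mesh
        - mimir-gossip-ring.mimir.svc.cluster-a.example:7946
        - mimir-gossip-ring.mimir.svc.cluster-b.example:7946
```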
---
I'm looking for some guidance. I have a requirement for geo-redundancy and am wondering if there are any deployment options that would work. I have two k8s clusters in different regions and want queries to either cluster to return the same (or as close to the same as possible) results.
The options I have explored:

1. Two fully independent Mimir clusters, one per region, each with its own object storage, with the agents remote writing to both.
2. A "secondary" cluster that only reads the primary cluster's storage bucket, lowering `block_ranges_period` so blocks reach the bucket sooner. That causes increased write amplification, so I don't think this will scale well.
3. Pointing both clusters at the same object storage bucket. This results in `err-mimir-store-consistency-check-failed` errors, caused by what looks like missing blocks. From my understanding, this is to do with having two independent sets of compactors, but I'm not sure. Will disabling the compactors in one cluster solve all the problems here? Will the redundant data written by the two clusters be detected and de-duplicated?

I think option 3 seems the most promising, but I'm not sure it plays nicely with the various components and there are a lot of unanswered questions about its behaviour. Does anyone have any advice?
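For clarity, by "option 3" I mean both clusters configured with the same blocks storage, something along these lines (backend, endpoint, and bucket name are placeholders):

```yaml
# Mimir config shared by both clusters in option 3 (sketch).
blocks_storage:
  backend: s3
  s3:
    endpoint: s3.us-east-1.amazonaws.com
    bucket_name: mimir-blocks-shared
```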