Send Raft snapshots between follower replicas #42491

@nvb

Raft currently requires all snapshots to come from leader replicas. This generally works well and is straightforward to think about.

However, this requirement limits flexibility in ways that matter for certain classes of replica movement. Specifically, it is a problem when there is an asymmetry between the leader replica, an up-to-date follower replica, and a follower replica that requires a snapshot. When the two follower replicas are closer to each other than either is to the leader replica, it would be beneficial to be able to source the snapshot from the up-to-date follower replica. Here are three concrete cases where this would be important:

  1. replica movement within a region in a cross-region replication group.
  2. rehoming an entire replication group between regions in a cross-AZ replication group where the destination region already contains a learner replica.
  3. replica movement within a single host across disks.

Follower snapshots would allow the first two of these cases to avoid WAN traffic, opting for faster intra-region traffic instead. The mechanism would also be general enough to allow the third case to avoid traversing the network stack at all, opting for filesystem-level data movement instead.
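To make the idea concrete, here is a minimal sketch of how a snapshot sender might be chosen by proximity to the recipient. Everything in it is hypothetical: the `Replica` type, its fields, and the `proximity` and `chooseSnapshotSender` functions are illustrative names, not existing CockroachDB or Raft APIs, and locality is simplified to a region and a host.

```go
package main

import "fmt"

// Replica is a hypothetical, simplified view of a replica in a Raft group.
type Replica struct {
	ID       int
	Region   string
	Host     string
	IsLeader bool
	UpToDate bool // assumed to mean "able to generate a usable snapshot"
}

// proximity scores how close candidate c is to recipient r.
// 2 = same host, 1 = same region, 0 = different region.
func proximity(c, r Replica) int {
	switch {
	case c.Host == r.Host:
		return 2
	case c.Region == r.Region:
		return 1
	default:
		return 0
	}
}

// chooseSnapshotSender returns the closest up-to-date replica to the
// recipient. When two candidates are equally close, the leader wins,
// which preserves today's behavior whenever locality gives no advantage.
func chooseSnapshotSender(replicas []Replica, recipient Replica) (Replica, bool) {
	var best Replica
	found := false
	for _, c := range replicas {
		if c.ID == recipient.ID || !c.UpToDate {
			continue
		}
		if !found {
			best, found = c, true
			continue
		}
		pc, pb := proximity(c, recipient), proximity(best, recipient)
		if pc > pb || (pc == pb && c.IsLeader && !best.IsLeader) {
			best = c
		}
	}
	return best, found
}

func main() {
	replicas := []Replica{
		{ID: 1, Region: "us-east", Host: "n1", IsLeader: true, UpToDate: true},
		{ID: 2, Region: "eu-west", Host: "n2", UpToDate: true},
		{ID: 3, Region: "eu-west", Host: "n3"}, // needs a snapshot
	}
	sender, ok := chooseSnapshotSender(replicas, replicas[2])
	fmt.Println(sender.ID, ok) // prints "2 true"
}
```

In this example the recipient shares a region with follower 2 but not with the leader, so the snapshot would come from follower 2 and stay off the WAN. A same-host candidate would score higher still, which corresponds to the third case above.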

This might make a good intern project sometime in the future.

Epic: CRDB-5354

Labels

A-kv-distribution: Relating to rebalancing and leasing.
A-kv-replication: Relating to Raft, consensus, and coordination.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv: KV Team
