Make reparents more robust #5391
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Reparent: Move TER vtctl command from vttablet to wrangler
Reparent: add ability to watch shard data
* PlannedReparentShard: Allow retrying PRS to the existing master. This is an incremental first step toward making PRS more useful for repairing situations when replication across a shard is not fully consistent. The main thing this enables is retrying the step of reconfiguring all replicas (including the old master) to point to the new master. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* Fix PRS test: Old master should have no slave status. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* Fix comment. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
In particular, if we know we're master but the shard record is wrong, update it. And if another tablet takes over the shard record by having a more recent master term start time, we know we need to stop claiming to be master. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
tabletmanager: Keep tablet and shard in sync.
The new TER in wrangler skipped setting the master term start time. Now we start a master term if ChangeType() is called with type MASTER. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* Fix PlannedReparentShard unit tests. We should not explicitly call SetMaster on the old master, because PromoteSlaveWhenCaughtUp sets newMaster's tablet type to MASTER, which leads ShardSync to update the Shard record, which notifies the oldMaster's ShardSync, which calls SetMaster. Signed-off-by: deepthi <deepthi@planetscale.com>
* PromoteSlave should use a separate context and not reuse remoteCtx. Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Fix vtgate_buffer test
Duplicated relevant RPC tests for wrangler. Moved unrelated tests to a different file, fixed RPC tests to not error out during SetMaster Signed-off-by: deepthi <deepthi@planetscale.com>
…otected by mutex Signed-off-by: deepthi <deepthi@planetscale.com>
* Remove obsolete comments. These are talking about the serving graph, which no longer exists. Instead of storing serving state of each tablet in topo, we now have vtgate directly query serving state of every tablet. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* Make DemoteMaster idempotent. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…Cancel Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Unit tests for wrangler version of TabletExternallyReparented
Signed-off-by: deepthi <deepthi@planetscale.com>
unit tests for shard watch
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
…update shard master Signed-off-by: deepthi <deepthi@planetscale.com>
…arting with InitTablet Signed-off-by: deepthi <deepthi@planetscale.com>
applicable conditions vttablet InitTablet should check MasterTermStartTime and take over if necessary fix unit test setup to work with changes to InitTablet functions Signed-off-by: deepthi <deepthi@planetscale.com>
…n-zero Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…that new tablet is returned only if there is no error Signed-off-by: deepthi <deepthi@planetscale.com>
InitTablet should not update master alias on shard record
…er will do it (#5363) Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* PlannedReparentShard: Fix more known-recoverable problems. PlannedReparentShard should be able to fix replication as long as all tablets are reachable and all replication positions are in a mutually-consistent state. PRS also no longer trusts that the shard record contains up-to-date information on the master, because we update that record asynchronously now. Instead, it looks at MasterTermStartTime values stored in each master tablet's record, so it makes the same choice of master as vtgates. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* PlannedReparentShard: Add -lag_threshold flag. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* Fix expected error in reparent test. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* PRS: Add test case for graceful recovery. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* PRS: Measure replication progress instead of lag. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
@sougou Before merging this, please make sure you change from "Squash and merge" to "Create a merge commit" so we don't lose individual authorship. We already reviewed and squashed along the way as we merged PRs into the dev branch.
// WatchShard will set a watch on the Shard object.
// It has the same contract as conn.Watch, but it also unpacks the
// contents into a Shard object
func (ts *Server) WatchShard(ctx context.Context, keyspace, shard string) (*WatchShardData, <-chan *WatchShardData, CancelFunc) {
We'll eventually need to harden this to make sure it stays connected to the topo.
This is the implementation of the plan discussed in #5172. The main features of the new implementation include:
* PlannedReparentShard (PRS) can now be run even when the -new_master tablet is already the master. This means, for example, if PRS reports partial failure (e.g. some replicas couldn't be reached to reparent them), you can run it again to retry any failed operations.
* PRS now checks that the -new_master tablet is able to make progress replicating from the current master before setting the current master read-only. This avoids causing any disruption to the current master in the case when the candidate master is too far behind on replication to catch up within the timeout of the reparent operation.

RELEASE NOTE: ACTION REQUIRED
When updating from a version before this PR to a version after it, it is critical that you follow the recommended upgrade order. In particular, you must upgrade all the vttablets in the cluster before upgrading any of the vtctlds.
Similarly, if you need to downgrade from a version after this PR to a version before it, you must downgrade in the reverse order: downgrade all vtctlds before downgrading any vttablets.