Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PlannedReparent fixes #6050

Merged
merged 6 commits into from
Apr 15, 2020
Merged

PlannedReparent fixes #6050

merged 6 commits into from
Apr 15, 2020

Conversation

deepthi
Copy link
Member

@deepthi deepthi commented Apr 11, 2020

This PR implements parts 1-3 listed in #5991. Part 4 will be done as a separate PR.

RELEASE NOTE: ACTION REQUIRED

When updating from a version before this PR to a version after it, it is critical that you follow the recommended upgrade order. In particular, you must upgrade all the vttablets in the cluster before upgrading any of the vtctlds.

Similarly, if you need to downgrade from a version after this PR to a version before it, you must downgrade in the reverse order: downgrade all vtctlds before downgrading any vttablets.

Copy link
Contributor

@sougou sougou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. We'll wait for @enisoc to have a look also.

Copy link
Member

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't finished a full pass yet, but sending what I have so far.

go/mysql/slave_status.go Outdated Show resolved Hide resolved
go/vt/vttablet/tabletmanager/rpc_replication.go Outdated Show resolved Hide resolved
}
return mysql.EncodePosition(pos), nil
return mysql.EncodePosition(rs.RelayLogPosition), nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, if there are errant transactions on this replica, they'll show up in the returned position. Does that end up protecting us against any scenario? Do we lose that if we return the relay log pos instead?

Maybe we should check for this case and fail if our expected invariant does not hold? If one of relayPos or masterPos are a superset of the other (AtLeast()), then return the one that's farther ahead. If neither is a superset of the other, return an error saying we detected errant transactions and this replica should not be used as a master.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, if masterPos is a superset of relayPos, that is exactly the errant transaction situation. The latest commit addresses this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on experiments with relayLogPosition, we have decided to exclude that logic from this PR, and do it as a separate PR.

Copy link
Member

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be another one of those changes that makes it critical to use the recommended upgrade order since vtctld expects a new tabletmanager RPC, right? What's the process for making a sufficient amount of noise about that?

go/vt/vttablet/tabletmanager/rpc_replication.go Outdated Show resolved Hide resolved
go/vt/wrangler/reparent.go Outdated Show resolved Hide resolved
go/vt/wrangler/reparent.go Outdated Show resolved Hide resolved
go/vt/wrangler/reparent.go Outdated Show resolved Hide resolved
go/vt/wrangler/reparent.go Show resolved Hide resolved
go/vt/wrangler/reparent.go Outdated Show resolved Hide resolved
@@ -763,9 +797,11 @@ func (maxPosSearch *maxReplPosSearch) processTablet(tablet *topodatapb.Tablet) {
maxPosSearch.wrangler.logger.Warningf("failed to get replication status from %v, ignoring tablet: %v", topoproto.TabletAliasString(tablet.Alias), err)
return
}
replPos, err := mysql.DecodePosition(status.Position)
// The replica that is most progressed may actually be the one with the furthest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should check both positions and use the one that's a superset of the other. If neither is a true superset, then we have detected errant transactions and it isn't safe to promote this replica.

Copy link
Member Author

@deepthi deepthi Apr 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using RelayLogPosition has been deferred to a future PR.

@@ -934,27 +970,28 @@ func (wr *Wrangler) emergencyReparentShardLocked(ctx context.Context, ev *events
if !ok {
return fmt.Errorf("couldn't get master elect %v replication position", topoproto.TabletAliasString(masterElectTabletAlias))
}
masterElectPos, err := mysql.DecodePosition(masterElectStatus.Position)
// Use RelayLogPosition to determine most advanced tablet
masterElectPos, err := mysql.DecodePosition(masterElectStatus.RelayLogPosition)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same suggestion here about looking at both positions whenever both are known.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using RelayLogPosition has been deferred to a future PR.

@deepthi
Copy link
Member Author

deepthi commented Apr 14, 2020

This is going to be another one of those changes that makes it critical to use the recommended upgrade order since vtctld expects a new tabletmanager RPC, right? What's the process for making a sufficient amount of noise about that?

Fair point. We will announce this on vitess slack, and put it in release notes. I will also edit the PR description and add a notice.

… tablet is written to topo

Signed-off-by: deepthi <deepthi@planetscale.com>
PromoteSlaveWhenCaughtUp RPC

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
…th current master

Signed-off-by: deepthi <deepthi@planetscale.com>

if err := wr.tmc.SetReadWrite(rwCtx, masterElectTabletInfo.Tablet); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't put my finger on it, but this makes me feel vaguely uneasy. Is it clear in your mind how we're confident at this point that no other tablet's mysqld is read-write? I'm having trouble putting together that chain of reasoning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's replay the steps:

  1. Mark old master read-only
  2. Designate new master in topo (mastership+timestamp)
  3. Mark new master read-write
  4. Do the rest of the steps

If something fails after step 1, the topo will return the old one as master, and we just have to PRS on it.
If something fails after step 2, the topo will return the new one as master, and we perform steps 3 onwards.

This is probably your discomfort (which I also feel a bit): Is there a situation where the topo will return the old one as master after we complete step 2? By our reasoning of timestamps, it should not, because we compare all timestamps before identifying the true master.

I would feel slightly better if we marked the old master as replica before step 2, at least as best effort. But you and deepthi convinced me that it's unnecessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reasoning seems sound. I had included a new test that re-runs PRS on the candidate master after first simulating a failure in step 2. I could add more tests like that (simulating failures at other points in the sequence).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm convinced as well. Thanks.

Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Copy link
Member

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@deepthi deepthi merged commit 545c432 into vitessio:master Apr 15, 2020
@deepthi deepthi deleted the ds-prs-fixes branch May 14, 2020 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants