Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: stop pretending to deal with Raft leadership when draining #55619

Conversation

andreimatei
Copy link
Contributor

@andreimatei andreimatei commented Oct 15, 2020

The draining code was pretending to deal with transferring Raft
leadership, besides leases, but it was all an illusion. This patch stops
the pretending, simplifying the code and preventing mis-interpretation
(which I've suffered from).
There are a few cases to discuss:

  1. There is no lease. The code really looked like it will try to do
    something to the leadership, but ended up not doing anything because
    maybeTransferRaftLeadership() is a no-op when there's no lease.
  2. The leases needs to be moved. In this case, the code was first trying
    to move the lease and then, if that *failed, was trying to move the
    leadership. This was a no-op since maybeTransferRaftLeadership() only
    moves to the leadership to the leaseholder; if the leaseholder hasn't
    moved, it's a no-op.
  3. The lease doesn't need to be moved, but the leadership does. In this
    case the code could theoretically do the leadership transfer, but
    this case is very rare - the leadership moves as lease transfers
    apply, and if that fails, we'll continue attempting to move the
    leadership every Raft tick (every 200ms).

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion either way on this.

The draining code was pretending to deal with transferring Raft
leadership, besides leases, but it was all an illusion. This patch stops
the pretending, simplifying the code and preventing mis-interpretation
(which I've suffered from).
There are a few cases to discuss:
1) There is no lease. The code really looked like it will try to do
   something to the leadership, but ended up not doing anything because
   maybeTransferRaftLeadership() is a no-op when there's no lease.
2) The leases needs to be moved. In this case, the code was first trying
   to move the lease and then, if that *failed, was trying to move the
   leadership. This was a no-op since maybeTransferRaftLeadership() only
   moves to the leadership to the leaseholder; if the leaseholder hasn't
   moved, it's a no-op.
3) The lease doesn't need to be moved, but the leadership does. In this
   case the code could theoretically do the leadership transfer, but
   this case is very rare - the leadership moves as lease transfers
   apply, and if that fails, we'll continue attempting to move the
   leadership every Raft tick (every 200ms).

Release note: None
maybeTransferRaftLeadership -> maybeTransferRaftLeadershipToLeaseholder
to make it a lot more suggestive.

Release note: None
@andreimatei andreimatei force-pushed the drain.dont-transfer-leadership-when-lease-fails branch from 389f8e3 to 9b554b6 Compare October 16, 2020 20:44
@andreimatei andreimatei changed the title kvserver: dont drain Raft leadership when draining lease fails kvserver: stop pretending to deal with Raft leadership when draining Oct 16, 2020
@andreimatei
Copy link
Contributor Author

This has changed because the code previously mislead me. Please read again.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 3 of 3 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @nvanbenschoten, and @tbg)

@nvanbenschoten
Copy link
Member

  1. The lease doesn't need to be moved, but the leadership does. In this
    case the code could theoretically do the leadership transfer, but
    this case is very rare - the leadership moves as lease transfers
    apply, and if that fails, we'll continue attempting to move the
    leadership every Raft tick (every 200ms).

Why is this very rare? Just because we expect the new leaseholder to keep calling an election every 200ms? Are you at all concerned that an election is less efficient than a leadership transfer?

The change itself LGTM and I appreciate the simplification. Is this a backport target?

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lease doesn't need to be moved, but the leadership does. In this
case the code could theoretically do the leadership transfer, but
this case is very rare

The case where (at the start of the drain) we are a leader without a lease is rare (because we try to correct this state continuously, not just in the background). But I'm a bit more concerned that this code was already mishandling an aspect of case 2: if we transfer a lease, the leadership will follow, but if the former leader doesn't remain alive long enough, the range will have to wait for an election timeout. We need to make sure that as we're transferring leases away we don't shut down immediately when the last lease is gone without waiting for the leadership to move too. (I'm not sure if the entire drain process might make up for this, but the if leaseTransferred { needsRaftTransfer = false } line looks wrong to me)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @nvanbenschoten, and @tbg)

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+


Is this a backport target?

I wasn't planning on it...

Why is this very rare? Just because we expect the new leaseholder to keep calling an election every 200ms? Are you at all concerned that an election is less efficient than a leadership transfer?

I was not particularly aware of the perf differences of elections vs transfers, but I can't say I'm very concerned... Because we attempt the leadership transfer in leasePostApply and then periodically which I think is quite enough to make it rare that we get here.

The case where (at the start of the drain) we are a leader without a lease is rare (because we try to correct this state continuously, not just in the background).

Ben what did you mean here ("continuously" vs "in the background")? Did you mean there's another called to maybeTransferRaftLedership besides leasePostApply and tick? Cause I don't see it...

But I'm a bit more concerned that this code was already mishandling an aspect of case 2: if we transfer a lease, the leadership will follow, but if the former leader doesn't remain alive long enough, the range will have to wait for an election timeout. We need to make sure that as we're transferring leases away we don't shut down immediately when the last lease is gone without waiting for the leadership to move too. (I'm not sure if the entire drain process might make up for this, but the if leaseTransferred { needsRaftTransfer = false } line looks wrong to me)

Luckily #55460 introduces a sleep after draining which should allow time for the leaderships to follow the transferred leases.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bdarnell, @nvanbenschoten, and @tbg)

@craig
Copy link
Contributor

craig bot commented Oct 19, 2020

Build succeeded:

@craig craig bot merged commit 9083be9 into cockroachdb:master Oct 19, 2020
Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not particularly aware of the perf differences of elections vs transfers

The difference is mainly in availability instead of efficiency - an election implies 5-10 seconds of disruption while a transfer is just a few round trips. So this is a big deal, but I agree with you that it's rare enough that we'll be in this state when we get here.

Ben what did you mean here ("continuously" vs "in the background")?

Oops that's not what I meant to say at all. What I meant was "continuously and in the background" as opposed to "only during the drain process".

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants