kvserver: don't always refuse to take leases when draining #55624

Merged

Commits on Oct 26, 2020

  1. kvserver: introduce some testing knobs for node liveness

    Release note: None
    andreimatei committed Oct 26, 2020
    be25a97
  2. kvserver: delete TestRefreshPendingCommands

    This test either has completely rotted, or has always been confused. It
    tries to verify that different mechanisms trigger raft reproposals, but
    it doesn't seem to actually depend on reproposals. The test says that an
    increment depends on the reproposal of a previous lease request, but
    there's no such lease request. Also, technically reproposals of lease
    requests stopped being a thing a while ago. It also talks about Raft
    leadership changing, but that's not explicit in the test.
    
    The test fails with the next commit that changes how draining replicas
    handle lease requests. It's unclear to me what's salvageable from the
    test.
    
    Release note: None
    andreimatei committed Oct 26, 2020
    480c9f0
  3. kvserver: don't always refuse to take leases when draining

    Before this patch, a draining node would not take any new leases. This
    is a problem when all replicas of a range are draining at the same time
    (e.g. when draining a single-node cluster, or when draining multiple
    nodes at once, perhaps by mistake): in that case nobody wants the
    lease. Particularly because the liveness range is
    expiration-based (and thus permanently in need of new leases to be
    granted), this quickly results in nodes failing their liveness.
    It also becomes more of a problem with cockroachdb#55148, where we start refusing
    to take the lease on replicas that are not the leader - so if the leader
    is draining, we deadlock.
    
    This patch makes an exception for leaders, which no longer refuse the
    lease even when they're draining. The reasoning is that it's too easy
    to deadlock otherwise.
    
    Release note: None
    andreimatei committed Oct 26, 2020
    acc1ad1
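
    The rule described in commit 3 can be sketched roughly as below. This is
    a minimal illustration, not the actual kvserver code: the `replicaState`
    type, its fields, and `shouldRefuseLease` are hypothetical names made up
    for this example. It only assumes what the commit message states, namely
    that a draining replica refuses new leases unless it is the Raft leader.

```go
package main

import "fmt"

// replicaState is a hypothetical, simplified view of a replica.
// It is not a type from the CockroachDB code base.
type replicaState struct {
	draining     bool // the node hosting the replica is draining
	isRaftLeader bool // the replica is currently the Raft leader
}

// shouldRefuseLease sketches the behavior after the patch: a draining
// replica refuses to take a new lease, except when it is the leader.
// Before the patch the rule was simply "refuse if draining".
func shouldRefuseLease(r replicaState) bool {
	return r.draining && !r.isRaftLeader
}

func main() {
	cases := []replicaState{
		{draining: false, isRaftLeader: false},
		{draining: false, isRaftLeader: true},
		{draining: true, isRaftLeader: false},
		{draining: true, isRaftLeader: true},
	}
	for _, c := range cases {
		fmt.Printf("draining=%t leader=%t refuse=%t\n",
			c.draining, c.isRaftLeader, shouldRefuseLease(c))
	}
}
```

    Under this sketch, when every replica of a range is draining, the leader
    can still take the lease, so the range does not end up with no eligible
    leaseholder.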