Skip to content

Commit

Permalink
kvserver: add assertions for invariants around up liveness records
Browse files Browse the repository at this point in the history
Now that we have cockroachdb#53842, we maintain the invariant that there always
exists a liveness record for any given node. We can now simplify our
handling of liveness records internally: where previously we had code to
handle the possibility of empty liveness records (we created a new one
on the fly), we can change them to assertions to verify that's no longer
possible.

When retrieving the liveness record from our in-memory cache,
it's possible for us to not find anything due to gossip delays. Instead
of simply giving up then, now we can read the records directly from KV
(and update our caches while in the area). This PR introduces this
mechanism through usage of `getLivenessRecordFromKV`.

Finally, as a bonus, this PR also surfaces a better error when trying to
decommission non-existent nodes. We're able to do this because now we
can always assume that a missing liveness record, as seen in the
decommission codepath, implies that the user is trying to decommission a
non-existent node.

---

We don't intend to backport this to 20.2 due to the hazard described in
\cockroachdb#54216. We want this PR to bake on master and (possibly) trip up the
assertions added above if we've missed anything. They're the only ones
checking for the invariant we've introduced around liveness records.
That invariant will be depended on for long running migrations, so
better to shake things out early.

Release note: None
  • Loading branch information
irfansharif committed Sep 18, 2020
1 parent c37cff8 commit e8f6137
Show file tree
Hide file tree
Showing 5 changed files with 140 additions and 130 deletions.
6 changes: 6 additions & 0 deletions pkg/cli/error.go
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ import (
"strconv"
"strings"

"github.com/cockroachdb/cockroach/pkg/kv/kvserver"
"github.com/cockroachdb/cockroach/pkg/security"
"github.com/cockroachdb/cockroach/pkg/server"
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgcode"
Expand Down Expand Up @@ -359,6 +360,11 @@ func MaybeDecorateGRPCError(
return server.ErrClusterInitialized
}

// Are we trying to recommission/decommision a node that does not exist?
if strings.Contains(err.Error(), kvserver.ErrMissingLivenessRecord.Error()) {
return fmt.Errorf("node does not exist")
}

// Nothing we can special case, just return what we have.
return err
}
Expand Down
5 changes: 2 additions & 3 deletions pkg/kv/kvserver/helpers_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -515,11 +515,10 @@ func (nl *NodeLiveness) SetDrainingInternal(

func (nl *NodeLiveness) SetDecommissioningInternal(
ctx context.Context,
nodeID roachpb.NodeID,
liveness LivenessRecord,
oldLivenessRec LivenessRecord,
targetStatus kvserverpb.MembershipStatus,
) (changeCommitted bool, err error) {
return nl.setMembershipStatusInternal(ctx, nodeID, liveness, targetStatus)
return nl.setMembershipStatusInternal(ctx, oldLivenessRec, targetStatus)
}

// GetCircuitBreaker returns the circuit breaker controlling
Expand Down
Loading

0 comments on commit e8f6137

Please sign in to comment.