Skip to content

Commit

Permalink
storage: preserve consistency when applying widening preemptive snaps…
Browse files Browse the repository at this point in the history
…hots

Merges can cause preemptive snapshots that widen existing replicas. For
example, consider the following sequence of events:

    1. A replica of range A is removed from store S, but is not garbage
       collected.
    2. Range A subsumes its right neighbor B.
    3. Range A is re-added to store S.

In step 3, S will receive a preemptive snapshot for A that requires
widening its existing replica, thanks to the intervening merge.

Problematically, the code to check whether this widening was possible,
in Store.canApplySnapshotLocked, was incorrectly mutating the range
descriptor in the snapshot header! Applying the snapshot would then fail
to clear all of the data from the old incarnation of the replica, since
the bounds on the range deletion tombstone were wrong. This often
resulted in replica inconsistency. Plus, the in-memory copy of the range
descriptor would be incorrect until the next descriptor update--though
this usually happened quickly, as the replica would apply the change
replicas command, which updates the descriptor, soon after applying the
preemptive snapshot.

To fix the problem, teach Store.canApplySnapshotLocked to make a copy of
the range descriptor before it mutates it.

To prevent regressions, add an assertion that a range's start key is
never changed to the descriptor update path. With this assertion in
place, but without the fix itself,
TestStoreRangeMergeReadoptedLHSFollower reliably fails.

Fixes cockroachdb#29252.

Release note: None
  • Loading branch information
benesch committed Sep 7, 2018
1 parent 97e9830 commit a62b081
Show file tree
Hide file tree
Showing 3 changed files with 11 additions and 3 deletions.
5 changes: 5 additions & 0 deletions pkg/storage/replica.go
Original file line number Diff line number Diff line change
Expand Up @@ -1717,6 +1717,11 @@ func (r *Replica) setDescWithoutProcessUpdate(desc *roachpb.RangeDescriptor) {
log.Fatalf(ctx, "cannot replace initialized descriptor with uninitialized one: %+v -> %+v",
r.mu.state.Desc, desc)
}
if r.mu.state.Desc != nil && !r.mu.state.Desc.StartKey.Equal(desc.StartKey) {
ctx := r.AnnotateCtx(context.TODO())
log.Fatalf(ctx, "attempted to change replica's start key from %s to %s",
r.mu.state.Desc.StartKey, desc.StartKey)
}

newMaxID := maxReplicaID(desc)
if newMaxID > r.mu.lastReplicaAdded {
Expand Down
3 changes: 3 additions & 0 deletions pkg/storage/replica_raftstorage.go
Original file line number Diff line number Diff line change
Expand Up @@ -945,6 +945,9 @@ func (r *Replica) applySnapshot(
// Update the range and store stats.
r.store.metrics.subtractMVCCStats(*r.mu.state.Stats)
r.store.metrics.addMVCCStats(*s.Stats)
// TODO(benesch): the next line updates r.mu.state.Desc, but that's supposed
// to be handled by the call to setDescWithoutProcessUpdate below. This is not
// a correctness issue right now, but it's liable to become one.
r.mu.state = s
r.assertStateLocked(ctx, r.store.Engine())
r.mu.Unlock()
Expand Down
6 changes: 3 additions & 3 deletions pkg/storage/store_snapshot.go
Original file line number Diff line number Diff line change
Expand Up @@ -394,7 +394,7 @@ func (s *Store) canApplySnapshot(
func (s *Store) canApplySnapshotLocked(
ctx context.Context, snapHeader *SnapshotRequest_Header,
) (*ReplicaPlaceholder, error) {
desc := snapHeader.State.Desc
desc := *snapHeader.State.Desc
if v, ok := s.mu.replicas.Load(int64(desc.RangeID)); ok && (*Replica)(v).IsInitialized() {
// We have an initialized replica. Preemptive snapshots can be applied with
// no further checks if they do not widen the existing replica. Raft
Expand All @@ -420,7 +420,7 @@ func (s *Store) canApplySnapshotLocked(

// TODO(benesch): consider discovering and GC'ing *all* overlapping ranges,
// not just the first one that getOverlappingKeyRangeLocked happens to return.
if exRange := s.getOverlappingKeyRangeLocked(desc); exRange != nil {
if exRange := s.getOverlappingKeyRangeLocked(&desc); exRange != nil {
// We have a conflicting range, so we must block the snapshot.
// When such a conflict exists, it will be resolved by one range
// either being split or garbage collected.
Expand Down Expand Up @@ -454,7 +454,7 @@ func (s *Store) canApplySnapshotLocked(
}

placeholder := &ReplicaPlaceholder{
rangeDesc: *desc,
rangeDesc: desc,
}
return placeholder, nil
}
Expand Down

0 comments on commit a62b081

Please sign in to comment.