Skip to content

Commit

Permalink
server: introduce a hook for short-running migrations
Browse files Browse the repository at this point in the history
TODO: see if this migration is actually "short-running". That is,
in a sufficiently large cluster, does this cause significant load?

----

This is a baby version of cockroachdb#39182 that handles only short-running
migrations but is useful in itself because it allows us to migrate us
fully into the following two KV below-Raft migrations:

1. use the RangeAppliedState on all ranges
2. use the unreplicated TruncatedState on all ranges

These two migrations have been around for a while and it has been
getting in the way of things. While ideally we introduce cockroachdb#39182 in the
near future and use it to address a larger class of migration concerns,
this PR serves as a starting point and work done on cockroachdb#39182 should be
able to absorb this PR with relative ease.

With this PR, legacy code related to 1) and 2) above can be removed from
`master` once the 20.1 branch is cut, i.e. roughly in April 2020.

The main ingredients in this PR are

a) introduce a hook that is called during `SET CLUSTER SETTING version`,
   after the change has been validated but before it is made.
b) introduce a KV-level `Migrate` ranged write command that triggers
   the migrations for 1) and 2) above. It is called from the hook for
   all of the keyspace.
c) before returning to the client, `Migrate` waits for the command to
   be durably applied on all followers.

Trying to get this 100% correct has proven tricky, perhaps foreshadowing
similar issues that expect us once we try cockroachdb#39182 in earnest. For one,
the mechanism only reaches replicas that members of the raft group, that
is, it won't touch replicas which are gc'able. For the migrations at
hand this means that in 20.2 there may - in theory - still be replicas
that have a replicated truncated state. I believe that our recent
efforts around not re-using replicas across replicaIDs has made sure
that this isn't an issue for this particular migration, but in general
it will have to remain on the radar. Similarly, it seems hard to prove
conclusively that no snapshot is in-flight that would set up a new
follower with a state predating the explicit migration, though it would
be exceptionally rare in practice.

Release note (general change): version upgrades now perform maintenance
duties that may slightly delay the `SET CLUSTER SETTING version` command
and may cause a small amount of additional load on the cluster while an
upgrade is being finalized.
  • Loading branch information
tbg authored and irfansharif committed Mar 5, 2020
1 parent a415689 commit 11da41b
Show file tree
Hide file tree
Showing 39 changed files with 3,128 additions and 1,098 deletions.
597 changes: 589 additions & 8 deletions c-deps/libroach/protos/roachpb/api.pb.cc

Large diffs are not rendered by default.

563 changes: 539 additions & 24 deletions c-deps/libroach/protos/roachpb/api.pb.h

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
Original file line number Diff line number Diff line change
Expand Up @@ -65,6 +65,6 @@
<tr><td><code>trace.debug.enable</code></td><td>boolean</td><td><code>false</code></td><td>if set, traces for recent requests can be seen in the /debug page</td></tr>
<tr><td><code>trace.lightstep.token</code></td><td>string</td><td><code></code></td><td>if set, traces go to Lightstep using this token</td></tr>
<tr><td><code>trace.zipkin.collector</code></td><td>string</td><td><code></code></td><td>if set, traces go to the given Zipkin instance (example: '127.0.0.1:9411'); ignored if trace.lightstep.token is set</td></tr>
<tr><td><code>version</code></td><td>custom validation</td><td><code>19.2-14</code></td><td>set the active cluster version in the format '<major>.<minor>'</td></tr>
<tr><td><code>version</code></td><td>custom validation</td><td><code>19.2-15</code></td><td>set the active cluster version in the format '<major>.<minor>'</td></tr>
</tbody>
</table>
48 changes: 36 additions & 12 deletions pkg/clusterversion/cluster_version.pb.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion pkg/clusterversion/cluster_version.proto
Original file line number Diff line number Diff line change
Expand Up @@ -16,10 +16,11 @@ import "roachpb/metadata.proto";
import "gogoproto/gogo.proto";

// ClusterVersion represents a cluster's "active version". It is returned by the
// Version cluster setting. Its IsActive() method can be used to determine if a
// cluster version setting. Its IsActive() method can be used to determine if a
// particular migration is to be considered enabled or disabled.
message ClusterVersion {
option (gogoproto.goproto_stringer) = false;
option (gogoproto.equal) = true;

reserved 1;
// The version of functionality in use in the cluster. This value must
Expand Down
11 changes: 11 additions & 0 deletions pkg/clusterversion/cockroach_versions.go
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@ const (
VersionHashShardedIndexes
VersionCreateRolePrivilege
VersionStatementDiagnosticsSystemTables
VersionNoLegacyTruncatedAndAppliedState

// Add new versions here (step one of two).
)
Expand Down Expand Up @@ -427,6 +428,16 @@ var versionsSingleton = keyedVersions([]keyedVersion{
Key: VersionStatementDiagnosticsSystemTables,
Version: roachpb.Version{Major: 19, Minor: 2, Unstable: 14},
},
{
// VersionNoLegacyTruncatedAndAppliedState does not enable any new
// functionality, but we know that once it is active, the truncated
// state of all ranges in the cluster is unreplicated, and we are using
// the RangeAppliedState for all ranges as well. This means that in the
// 20.2 cycle we are free to remove any holdover code handling their
// predecessors.
Key: VersionNoLegacyTruncatedAndAppliedState,
Version: roachpb.Version{Major: 19, Minor: 2, Unstable: 15},
},
// Add new versions here (step two of two).

})
Expand Down
5 changes: 3 additions & 2 deletions pkg/clusterversion/versionkey_string.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 0 additions & 5 deletions pkg/keys/constants.go
Original file line number Diff line number Diff line change
Expand Up @@ -140,11 +140,6 @@ var (
LocalRangeMax = LocalRangePrefix.PrefixEnd()
// LocalQueueLastProcessedSuffix is the suffix for replica queue state keys.
LocalQueueLastProcessedSuffix = roachpb.RKey("qlpt")
// LocalRangeDescriptorJointSuffix is the suffix for keys storing
// range descriptors. The value is a struct of type RangeDescriptor.
//
// TODO(tbg): decide what to actually store here. This is still unused.
LocalRangeDescriptorJointSuffix = roachpb.RKey("rdjt")
// LocalRangeDescriptorSuffix is the suffix for keys storing
// range descriptors. The value is a struct of type RangeDescriptor.
LocalRangeDescriptorSuffix = roachpb.RKey("rdsc")
Expand Down
7 changes: 3 additions & 4 deletions pkg/keys/doc.go
Original file line number Diff line number Diff line change
Expand Up @@ -190,10 +190,9 @@ var _ = [...]interface{}{
// as a whole. They are replicated and addressable. Typical examples are
// the range descriptor and transaction records. They all share
// `LocalRangePrefix`.
QueueLastProcessedKey, // "qlpt"
RangeDescriptorJointKey, // "rdjt"
RangeDescriptorKey, // "rdsc"
TransactionKey, // "txn-"
QueueLastProcessedKey, // "qlpt"
RangeDescriptorKey, // "rdsc"
TransactionKey, // "txn-"

// 4. Store local keys: These contain metadata about an individual store.
// They are unreplicated and unaddressable. The typical example is the
Expand Down
10 changes: 0 additions & 10 deletions pkg/keys/keys.go
Original file line number Diff line number Diff line change
Expand Up @@ -369,16 +369,6 @@ func RangeDescriptorKey(key roachpb.RKey) roachpb.Key {
return MakeRangeKey(key, LocalRangeDescriptorSuffix, nil)
}

// RangeDescriptorJointKey returns a range-local key for the "joint descriptor"
// for the range with specified key. This key is not versioned and it is set if
// and only if the range is in a joint configuration that it yet has to transition
// out of.
func RangeDescriptorJointKey(key roachpb.RKey) roachpb.Key {
return MakeRangeKey(key, LocalRangeDescriptorJointSuffix, nil)
}

var _ = RangeDescriptorJointKey // silence unused check

// TransactionKey returns a transaction key based on the provided
// transaction key and ID. The base key is encoded in order to
// guarantee that all transaction records for a range sort together.
Expand Down
78 changes: 78 additions & 0 deletions pkg/kv/kvserver/batcheval/cmd_migrate.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
// Copyright 2014 The Cockroach Authors.
//
// Use of this software is governed by the Business Source License
// included in the file licenses/BSL.txt.
//
// As of the Change Date specified in that file, in accordance with
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0, included in the file
// licenses/APL.txt.

package batcheval

import (
"context"

"github.com/cockroachdb/cockroach/pkg/clusterversion"
"github.com/cockroachdb/cockroach/pkg/keys"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/result"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/storagepb"
"github.com/cockroachdb/cockroach/pkg/roachpb"
"github.com/cockroachdb/cockroach/pkg/storage"
"github.com/cockroachdb/cockroach/pkg/util/hlc"
)

func init() {
RegisterReadWriteCommand(roachpb.Migrate, declareKeysMigrate, Migrate)
}

func declareKeysMigrate(
_ *roachpb.RangeDescriptor, header roachpb.Header, req roachpb.Request, latchSpans, _ *spanset.SpanSet,
) {
latchSpans.AddNonMVCC(spanset.SpanReadWrite, roachpb.Span{Key: keys.RaftTruncatedStateLegacyKey(header.RangeID)})
}

// Migrate ensures that the range proactively carries out any outstanding
// below-Raft migrations. When this command returns, all of the below-Raft
// features known to be available are enabled.
func Migrate(
ctx context.Context, batch storage.ReadWriter, cArgs CommandArgs, resp roachpb.Response,
) (result.Result, error) {
newV := cArgs.Args.(*roachpb.MigrateRequest).NewVersion
if newV.Less(clusterversion.VersionByKey(clusterversion.VersionNoLegacyTruncatedAndAppliedState)) {
// This is only hit during testing or when upgrading an alpha to an unstable below
// the above version. In production, newV will be a major release and more precisely,
// it will equal the BinaryServerVersion of any binary in the cluster (they are all
// equal when the cluster version is bumped, mod pathological races due to user error).
return result.Result{}, nil
}

var legacyTruncatedState roachpb.RaftTruncatedState
legacyKeyFound, err := storage.MVCCGetProto(
ctx, batch, keys.RaftTruncatedStateLegacyKey(cArgs.EvalCtx.GetRangeID()),
hlc.Timestamp{}, &legacyTruncatedState, storage.MVCCGetOptions{},
)
if err != nil {
return result.Result{}, err
}

var pd result.Result
if legacyKeyFound {
// Time to migrate by deleting the legacy key. The downstream-of-Raft
// code will atomically rewrite the truncated state (supplied via the
// side effect) into the new unreplicated key.
if err := storage.MVCCDelete(
ctx, batch, cArgs.Stats, keys.RaftTruncatedStateLegacyKey(cArgs.EvalCtx.GetRangeID()),
hlc.Timestamp{}, nil, /* txn */
); err != nil {
return result.Result{}, err
}
pd.Replicated.State = &storagepb.ReplicaState{
// We need to pass in a truncated state to enable the migration.
// Passing the same one is the easiest thing to do.
TruncatedState: &legacyTruncatedState,
}
}
return pd, nil
}
7 changes: 7 additions & 0 deletions pkg/kv/kvserver/below_raft_protos_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,13 @@ var belowRaftGoldenProtos = map[reflect.Type]fixture{
emptySum: 7551962144604783939,
populatedSum: 12720006657210437557,
},
reflect.TypeOf(&enginepb.MVCCStats{}): {
populatedConstructor: func(r *rand.Rand) protoutil.Message {
return enginepb.NewPopulatedMVCCStats(r, false)
},
emptySum: 18064891702890239528,
populatedSum: 4476106263917291130,
},
reflect.TypeOf(&enginepb.RangeAppliedState{}): {
populatedConstructor: func(r *rand.Rand) protoutil.Message {
return enginepb.NewPopulatedRangeAppliedState(r, false)
Expand Down
86 changes: 86 additions & 0 deletions pkg/kv/kvserver/client_replica_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,11 +23,13 @@ import (
"time"

"github.com/cockroachdb/cockroach/pkg/base"
"github.com/cockroachdb/cockroach/pkg/clusterversion"
"github.com/cockroachdb/cockroach/pkg/config/zonepb"
"github.com/cockroachdb/cockroach/pkg/internal/client"
"github.com/cockroachdb/cockroach/pkg/keys"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/stateloader"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/storagebase"
"github.com/cockroachdb/cockroach/pkg/kv/kvserver/storagepb"
"github.com/cockroachdb/cockroach/pkg/roachpb"
Expand Down Expand Up @@ -2969,3 +2971,87 @@ func makeReplicationTargets(ids ...int) (targets []roachpb.ReplicationTarget) {
}
return targets
}

func TestMigrate(t *testing.T) {
defer leaktest.AfterTest(t)()
defer leaktest.AfterTest(t)
ctx := context.Background()

for _, testCase := range []struct {
name string
typ stateloader.TruncatedStateType
}{
{"ts=new,as=new", stateloader.TruncatedStateUnreplicated},
{"ts=legacy,as=new", stateloader.TruncatedStateLegacyReplicated},
{"ts=legacy,as=legacy", stateloader.TruncatedStateLegacyReplicatedAndNoAppliedKey},
} {
t.Run(testCase.name, func(t *testing.T) {
args := base.TestClusterArgs{}
args.ServerArgs.Knobs.Store = &kvserver.StoreTestingKnobs{TruncatedStateTypeOverride: &testCase.typ}
args.ServerArgs.Knobs.Server = &server.TestingKnobs{
DisableAutomaticVersionUpgrade: 1,
// VersionNoLegacyTruncatedAndAppliedState is special: it
// enables assertions against legacy truncated and applied state
// when it is active. Additionally, SET CLUSTER SETTING will
// call Migrate only when activating this version (so we're
// making sure it does not do it in this test).
BootstrapVersionOverride: clusterversion.VersionByKey(clusterversion.VersionNoLegacyTruncatedAndAppliedState - 1),
}
tc := testcluster.StartTestCluster(t, 3, args)
defer tc.Stopper().Stop(ctx)

visitAllRepls := func(f func(*kvserver.Replica) error) error {
for i := 0; i < tc.NumServers(); i++ {
return tc.Server(i).GetStores().(*kvserver.Stores).VisitStores(func(s *kvserver.Store) error {
var err error
s.VisitReplicas(func(repl *kvserver.Replica) bool {
err = f(repl)
return err == nil
})
return err
})
}
return nil
}

getLegacy := func() []string {
t.Helper()
var out []string
require.NoError(t, visitAllRepls(func(repl *kvserver.Replica) error {
sl := stateloader.Make(repl.RangeID)
_, legacy, err := sl.LoadRaftTruncatedState(ctx, repl.Engine())
if err != nil {
return err
}
if legacy {
// Legacy truncated state.
out = append(out, fmt.Sprintf("ts(r%d)", repl.RangeID))
}
as, err := sl.LoadRangeAppliedState(ctx, repl.Engine())
if err != nil {
return err
}
if as == nil {
// Not using AppliedState yet.
out = append(out, fmt.Sprintf("as(r%d)", repl.RangeID))
}
return nil
}))
return out
}

if out := getLegacy(); (len(out) == 0) != (testCase.typ == stateloader.TruncatedStateUnreplicated) {
t.Fatalf("expected no legacy keys iff bootstrapped with unreplicated truncated state, got legacy keys: %v", out)
} else {
// NB: we'll never spot a legacy applied state here. This is
// because that migration is so aggressive that it has already
// happened as part of the initial up-replication.
t.Logf("legacy keys before migration: %v", out)
}

_, err := tc.Conns[0].ExecContext(ctx, `SET CLUSTER SETTING version = $1`, clusterversion.TestingBinaryVersion.String())
require.NoError(t, err)
require.Zero(t, getLegacy())
})
}
}
Loading

0 comments on commit 11da41b

Please sign in to comment.