*: introduce pkg/migrations #56107

Closed

Commits on Oct 27, 2020

  1. server: clarify a logging message

    We'll want to eventually distinguish between sqlmigrations (only run at
    start up) and general purpose (and possibly long-running) migrations.
    We'll introduce the latter in future commits, within a new
    pkg/migration.
    
    Release note: None
    irfansharif committed Oct 27, 2020
    711245a
  2. sql: add scaffolding for version upgrade hook

    This callback will be called after validating a `SET CLUSTER SETTING
    version` but before executing it. It will be used in future commits to
    execute long-running migrations.
    
    Release note: None
    irfansharif committed Oct 27, 2020
    aad0bad
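
To make the ordering in the hook commit concrete, here is a minimal sketch of "validate, then call the hook, then execute". The `Version` type, the `UpgradeHook` signature, and the downgrade check are all assumptions made for illustration; they are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Version is a simplified stand-in for a cluster version.
type Version struct{ Major, Minor int }

// Less reports whether v sorts strictly before o.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// UpgradeHook is a hypothetical callback signature; the real one may differ.
type UpgradeHook func(from, to Version) error

// setClusterVersion validates the target, runs the hook, and only then
// applies the new version. A hook failure leaves the version untouched.
func setClusterVersion(cur *Version, target Version, hook UpgradeHook) error {
	// Assumed validation rule: downgrades are rejected.
	if target.Less(*cur) {
		return errors.New("cannot downgrade cluster version")
	}
	if hook != nil {
		if err := hook(*cur, target); err != nil {
			return fmt.Errorf("upgrade hook: %w", err)
		}
	}
	*cur = target
	return nil
}

func main() {
	cur := Version{Major: 20, Minor: 1}
	err := setClusterVersion(&cur, Version{Major: 20, Minor: 2},
		func(from, to Version) error {
			fmt.Printf("migrating %v -> %v\n", from, to)
			return nil
		})
	fmt.Println(cur, err)
}
```

Running the hook before the assignment means a failed long-running migration never results in a bumped version.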

Commits on Oct 29, 2020

  1. kvserver,roachpb: introduce Migrate request type

    This request will be fleshed out and used in future commits that
    introduce the long running migration orchestrator process.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    e27583a
  2. server,kvserver: introduce the EveryNode rpc

    ...and populate it with the `AckClusterVersion` op.
    
    EveryNode is the RPC that will power the long running migrations
    infrastructure. It'll let callers define and execute arbitrary commands
    across every node in the cluster. To motivate what this would look like,
    we introduce alongside it one such command, the `AckClusterVersion`
    operation. It isn't currently hooked up to anything, but this will
    eventually be the mechanism through which we'll propagate cluster
    version bumps across the crdb cluster, replacing the gossip-based
    distribution mechanism in place today. This will let the orchestrator
    bump version gates in a controlled fashion across the cluster.
    
    To achieve this, the stubbed out implementation of AckClusterVersion
    makes use of the same `StoreClusterVersionKey` otherwise used in
    callbacks attached to the gossip handler for cluster version bumps.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    c690d2d
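
As a rough picture of what "execute a command across every node" could look like, the sketch below fans a single operation out to a list of nodes. It models operations as Go closures and stops at the first error; both choices are assumptions for illustration, since the PR carries operations as request types over an RPC. `NodeID`, `Op`, `everyNode`, and this `ackClusterVersion` helper are hypothetical names.

```go
package main

import "fmt"

// NodeID identifies a node in this toy cluster.
type NodeID int

// Op is a hypothetical node-level operation; in the PR these are request
// types carried by the EveryNode RPC rather than Go closures.
type Op func(id NodeID) error

// everyNode runs op against each node in order, stopping at the first
// error (an assumed policy, not necessarily the PR's).
func everyNode(nodes []NodeID, op Op) error {
	for _, id := range nodes {
		if err := op(id); err != nil {
			return fmt.Errorf("n%d: %w", id, err)
		}
	}
	return nil
}

// ackClusterVersion returns an op that records the version each node acked.
func ackClusterVersion(acked map[NodeID]string, version string) Op {
	return func(id NodeID) error {
		acked[id] = version
		return nil
	}
}

func main() {
	acked := map[NodeID]string{}
	err := everyNode([]NodeID{1, 2, 3}, ackClusterVersion(acked, "20.2"))
	fmt.Println(len(acked), err)
}
```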
  3. kvserver: introduce GetLivenessesFromKV

    Now that we always create a liveness record on start up (cockroachdb#53805), we can
    simply fetch all liveness records from KV when wanting an up-to-date
    view of all nodes in the cluster. We add a helper to do as much,
    which we'll rely on in future commits. It's a bit unfortunate that we're
    further adding on to the NodeLiveness API without changing the
    underlying look-aside caching structure, but methods that fetch
    records from KV are the world we're hoping to start moving towards over
    time.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    c3801a5
  4. migration: introduce pkg/migrations

    Package migration captures the facilities needed to define and execute
    migrations for a crdb cluster. These migrations can be arbitrarily long
    running, are free to send out arbitrary requests cluster wide, change
    internal DB state, and much more. They're typically reserved for crdb
    internal operations and state. Each migration is idempotent in nature,
    is associated with a specific cluster version, and executed when the
    cluster version is made active on every node in the cluster.
    
    Examples of migrations that apply would be migrations to move all raft
    state from one storage engine to another, or purging all usage of the
    replicated truncated state in KV. A "sister" package of interest is
    pkg/sqlmigrations.
    
    ---
    
    This commit only introduces the basic scaffolding and wiring from
    existing functionality. We'll fill in the missing bits in future
    commits.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    32397d6
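
One way to picture "each migration is associated with a specific cluster version" is a list of migrations keyed by version, from which an upgrade picks those it crosses. This is only a sketch: the `Migration` struct, `pending`, `upgrade`, and the rule of running versions in the half-open range (from, to] in ascending order are assumptions, not the package's actual design.

```go
package main

import (
	"fmt"
	"sort"
)

// Version is a simplified stand-in for a cluster version.
type Version struct{ Major, Minor int }

// Less reports whether v sorts strictly before o.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// Migration ties a function to the cluster version that triggers it.
// Run is expected to be idempotent. The field names are illustrative.
type Migration struct {
	Version Version
	Run     func() error
}

// pending returns the migrations for versions in (from, to], ascending.
func pending(all []Migration, from, to Version) []Migration {
	var out []Migration
	for _, m := range all {
		if from.Less(m.Version) && !to.Less(m.Version) {
			out = append(out, m)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		return out[i].Version.Less(out[j].Version)
	})
	return out
}

// upgrade runs every pending migration in version order.
func upgrade(all []Migration, from, to Version) error {
	for _, m := range pending(all, from, to) {
		if err := m.Run(); err != nil {
			return fmt.Errorf("migration for %v: %w", m.Version, err)
		}
	}
	return nil
}

func main() {
	noop := func() error { return nil }
	all := []Migration{
		{Version: Version{Major: 20, Minor: 3}, Run: noop},
		{Version: Version{Major: 20, Minor: 2}, Run: noop},
	}
	from, to := Version{Major: 20, Minor: 1}, Version{Major: 20, Minor: 3}
	fmt.Println(len(pending(all, from, to)), upgrade(all, from, to))
}
```

Idempotency matters here because an interrupted upgrade can simply call `upgrade` again and re-run a migration that already partially completed.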
  5. migration: plumb in node dialer

    The migration manager will make use of the EveryNode RPC once it's
    properly wired up (in future commits). It'll need a node dialer to do
    so.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    9ccb3f9
  6. server,migration: generate EveryNode req/resp helpers

    We expect to add multiple req/resp types as individual operations for
    the EveryNode RPC. Each of these operations will correspond to a
    "primitive" of sorts for the (long running) migrations infrastructure.
    It's a bit cumbersome to wield this nested union type, so we
    autogenerate helper code to do it for us. We take precedent from
    api.proto and all the very many batch request/responses.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    c260c85
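
For readers unfamiliar with the pattern, the sketch below shows the kind of wrap/unwrap helpers that make a nested union type easier to handle. The operation names come from this PR, but the struct layout, the `Request` interface, and the `SetInner`/`GetInner` methods are hand-written assumptions for illustration; the generated code in the PR may look different.

```go
package main

import "fmt"

// The two request types below are named after operations in this PR;
// their fields are placeholders.
type AckClusterVersionRequest struct{ Version string }
type ValidateTargetClusterVersionRequest struct{ Version string }

// Request is a hypothetical interface implemented by every operation.
type Request interface{ Op() string }

func (*AckClusterVersionRequest) Op() string { return "ack-cluster-version" }
func (*ValidateTargetClusterVersionRequest) Op() string {
	return "validate-target-cluster-version"
}

// EveryNodeRequestUnion holds exactly one operation (assumed layout).
type EveryNodeRequestUnion struct {
	AckClusterVersion            *AckClusterVersionRequest
	ValidateTargetClusterVersion *ValidateTargetClusterVersionRequest
}

// SetInner stores r in the matching field, clearing any previous value.
// It reports false for request types the union does not know about.
func (u *EveryNodeRequestUnion) SetInner(r Request) bool {
	*u = EveryNodeRequestUnion{}
	switch t := r.(type) {
	case *AckClusterVersionRequest:
		u.AckClusterVersion = t
	case *ValidateTargetClusterVersionRequest:
		u.ValidateTargetClusterVersion = t
	default:
		return false
	}
	return true
}

// GetInner returns whichever operation is set, or nil if none is.
func (u EveryNodeRequestUnion) GetInner() Request {
	switch {
	case u.AckClusterVersion != nil:
		return u.AckClusterVersion
	case u.ValidateTargetClusterVersion != nil:
		return u.ValidateTargetClusterVersion
	}
	return nil
}

func main() {
	var u EveryNodeRequestUnion
	u.SetInner(&AckClusterVersionRequest{Version: "20.2"})
	fmt.Println(u.GetInner().Op())
}
```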
  7. server: expose GetLivenessesFromKV in node liveness interfaces

    To flesh out the migrations infrastructure (in future commits), we'll
    need a handle on all the nodes present in the system. Now that we always
    create a liveness record on start up (cockroachdb#53805), we can simply fetch all
    liveness records from KV. We add a helper to do so.
    
    It's a bit unfortunate that we're further adding on to the NodeLiveness
    API without changing the caching structure, but methods that fetch
    records from KV are the world we're hoping to move towards going forward.
    
    This does mean that we're introducing a direct dependency on
    NodeLiveness in the sql layer, and there are improvements to be made here
    around interfaces delineating between "system tenant only" sql code and
    everything else. Only system tenants have the privilege to set cluster
    settings (or at least the version setting specifically), which is what
    this API will look to power.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    829697b
  8. server,migration: plumb in node liveness

    Plumb in the view into node liveness that was fleshed out earlier. We
    use it to power the RequiredNodes primitive, which provides a handle on
    all nodes in the system.
    
    Copying from elsewhere:
    
        // RequiredNodes returns the node IDs for all nodes that are
        // currently part of the cluster (i.e. they haven't been
        // decommissioned away). Migrations have the pre-requisite that all
        // required nodes are up and running so that we're able to execute
        // all relevant node-level operations on them. If any of the nodes
        // are found to be unavailable, an error is returned.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    97fdbf2
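
The RequiredNodes comment quoted above can be read as a filter over liveness records. Below is a minimal sketch of that reading; the `Liveness` struct and its `Decommissioned` and `Live` fields are simplified assumptions, not the actual liveness record.

```go
package main

import "fmt"

// Liveness is a simplified, hypothetical liveness record.
type Liveness struct {
	NodeID         int
	Decommissioned bool
	Live           bool
}

// requiredNodes returns the IDs of all nodes that have not been
// decommissioned, and errors out if any of them is unavailable.
func requiredNodes(records []Liveness) ([]int, error) {
	var ids []int
	for _, l := range records {
		if l.Decommissioned {
			continue // no longer part of the cluster
		}
		if !l.Live {
			return nil, fmt.Errorf("n%d is required but unavailable", l.NodeID)
		}
		ids = append(ids, l.NodeID)
	}
	return ids, nil
}

func main() {
	ids, err := requiredNodes([]Liveness{
		{NodeID: 1, Live: true},
		{NodeID: 2, Decommissioned: true},
		{NodeID: 3, Live: true},
	})
	fmt.Println(ids, err)
}
```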
  9. [dnm] clusterversion,heartbeat: hack to bump versions willy nilly

    This commit is not going to be merged. It was added to test things
    quickly (in lieu of actual tests) by letting me bump versions willy
    nilly (ignoring max allowable version).
    
    Release note: None
    irfansharif committed Oct 29, 2020
    acc6553
  10. server: introduce ValidateTargetClusterVersionRequest

    We'll use this primitive in a future commit to introduce additional
    safeguards (not present today) around cluster version upgrades.
    Specifically, we'll use this EveryNode operation to validate that every
    node in the cluster is running a binary that's able to support the
    specified cluster version.
    irfansharif committed Oct 29, 2020
    7bfa878
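
The per-node check described in this commit might look roughly like the sketch below. The commit message only says the binary must be "able to support" the target version; interpreting that as "minimum supported version <= target <= binary version" is an assumption here, as are the function and parameter names.

```go
package main

import "fmt"

// Version is a simplified stand-in for a cluster version.
type Version struct{ Major, Minor int }

// Less reports whether v sorts strictly before o.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// validateTargetClusterVersion checks, on a single node, that the target
// falls within the range this binary can run (assumed semantics).
func validateTargetClusterVersion(binary, minSupported, target Version) error {
	if target.Less(minSupported) {
		return fmt.Errorf("target %v is below the minimum supported version %v",
			target, minSupported)
	}
	if binary.Less(target) {
		return fmt.Errorf("binary version %v is older than target %v",
			binary, target)
	}
	return nil
}

func main() {
	binary := Version{Major: 20, Minor: 2}
	minSupported := Version{Major: 20, Minor: 1}
	fmt.Println(validateTargetClusterVersion(binary, minSupported, Version{Major: 20, Minor: 2}))
	fmt.Println(validateTargetClusterVersion(binary, minSupported, Version{Major: 21, Minor: 1}))
}
```

Running this on every node before bumping anything is what lets the upgrade fail early if a single node is still on an older binary.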
  11. settings,clusterversion: disconnect cluster version from gossip

    ...in favor of direct RPCs to all nodes in the cluster. This commit in
    particular deserves a bit of scrutiny. It uses the building blocks we've
    added thus far to replace the use of gossip to disseminate the cluster
    version. It does so by sending out individual RPCs to each node in the
    cluster, informing them of a version bump, all the while retaining the
    same guarantees provided by our (now previously) gossip-backed mechanism.
    
    This diff has the following "pieces":
    - It disconnects version setting updates from gossip (by
      disconnecting the setting type within the updater process).
    - It uses the EveryNode RPC to send out RPCs to each node in the
      cluster, containing the payload that each node would otherwise receive
      through gossip.
    - It expands the clusterversion.Handle interface to allow setting the
      active version directly through it.
    - It persists any cluster versions received from other nodes first,
      within keys.StoreClusterVersionKey, before bumping the version gate.
      This was previously achieved by attaching callbacks on the version
      handle (look towards all the diffs around SetBeforeChange). This is an
      invariant we also maintained earlier.
    - It uses the active version provided by the join RPC to set the
      version setting directly (after having persisted it first). This too
      was previously achieved through gossip + the callback.
    
    While here, we add a few comments and chip away at the callback hooks
    that are no longer needed.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    87c6d56
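
The "persist first, then bump the version gate" invariant from this commit can be sketched as follows. The `store` and `node` types, the `failWrites` switch, and the method names are all hypothetical; the real code persists under keys.StoreClusterVersionKey and sets the version through the clusterversion.Handle.

```go
package main

import (
	"errors"
	"fmt"
)

// store stands in for an on-disk store; failWrites simulates a disk error.
type store struct {
	persisted  string
	failWrites bool
}

// writeClusterVersion persists v to the store.
func (s *store) writeClusterVersion(v string) error {
	if s.failWrites {
		return errors.New("write failed")
	}
	s.persisted = v
	return nil
}

// node holds its stores and the in-memory active version (the "gate").
type node struct {
	stores []*store
	active string
}

// ackClusterVersion persists v to every store before bumping the active
// version. If any write fails, the active version is left unchanged.
func (n *node) ackClusterVersion(v string) error {
	for i, s := range n.stores {
		if err := s.writeClusterVersion(v); err != nil {
			return fmt.Errorf("store %d: %w", i, err)
		}
	}
	n.active = v
	return nil
}

func main() {
	n := &node{stores: []*store{{}, {}}, active: "20.1"}
	fmt.Println(n.ackClusterVersion("20.2"), n.active)
}
```

With this ordering, a node that restarts after a crash never finds an active version on disk that is older than one it had already started acting on.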
  12. migration: plumb in an internal executor, kv.DB

    Just filling in a few missing dependencies we expect migration code to
    rely on going forward.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    a43a6bf
  13. migration: implement IterateRangeDescriptors

    It's not currently wired up to anything (there are no real migrations
    yet), but it's one of the primitives we expect future migrations to rely
    on (in future commits).
    
    Release note: None
    irfansharif committed Oct 29, 2020
    73bc6f9
  14. sql: introduce system.migrations

    We'll use it in a future commit to store migration state, for
    introspection.
    
    > SHOW CREATE system.migrations;
             table_name        |                     create_statement
    ---------------------------+------------------------------------------------------------
      system.public.migrations | CREATE TABLE public.migrations (
                               |     id INT8 NOT NULL DEFAULT unique_rowid(),
                               |     metadata STRING NOT NULL,
                               |     started TIMESTAMP NOT NULL DEFAULT now():::TIMESTAMP,
                               |     progress BYTES NULL,
                               |     CONSTRAINT "primary" PRIMARY KEY (id ASC),
                               |     FAMILY "primary" (id, metadata, started),
                               |     FAMILY progress (progress)
                               | )
    
    Release note: None
    irfansharif committed Oct 29, 2020
    678508a
  15. [dnm] clusterversion,heartbeat: remove hack to bump versions willy nilly

    This reverts an earlier commit that removed version upgrade safeguards. This
    commit will also not be merged.
    
    Release note: None
    irfansharif committed Oct 29, 2020
    4244094