release-20.2: server: always create a liveness record before starting up #54212

Merged 3 commits on Sep 15, 2020

Commits on Sep 11, 2020

  1. server: always create a liveness record before starting up

    Previously it used to be the case that it was possible for a node to be
    up and running, and for there to be no corresponding liveness record for
    it. This was a very transient situation as liveness records are created
    for a given node as soon as it sends out its first heartbeat. Still, given
    that this could take a few seconds, it led to a lot of complexity in
    our handling of node liveness where we had to always anticipate the
    possibility of there being no corresponding liveness record for a given
    node (and thus creating it if necessary).
    
    Having a liveness record for each node always present is a crucial
    building block for long-running migrations (cockroachdb#48843). There the intention
    is to have the orchestrator process look towards the list of liveness
    records for an authoritative view of cluster membership. Previously when
    it was possible for an active member of the cluster to not have a
    corresponding liveness record (no matter how unlikely or short-lived in
    practice), we could not generate such a view.
    
    ---
    
    This is an alternative implementation for cockroachdb#53805. Here we choose to
    manually write the liveness record for the bootstrapping node when
    writing initial cluster data. For all other nodes, we do it on the
    server-side of the join RPC. We're also careful to do it in the legacy
    codepath when joining a cluster through gossip.
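
    As a rough illustration of the two write points described above (bootstrap
    and the server side of the join RPC), here is a minimal sketch. All names
    in it (`livenessStore`, `cluster`, `bootstrap`, `join`) are hypothetical
    stand-ins backed by an in-memory map, not the actual CockroachDB types or
    APIs.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    )

    // NodeID and Liveness are simplified stand-ins for illustration only.
    type NodeID int

    type Liveness struct {
    	NodeID NodeID
    	Epoch  int64
    }

    var errExists = errors.New("liveness record already exists")

    // livenessStore is a hypothetical in-memory substitute for the
    // KV-backed liveness records.
    type livenessStore struct {
    	records map[NodeID]Liveness
    }

    // createIfAbsent writes an initial record and refuses to overwrite
    // an existing one.
    func (s *livenessStore) createIfAbsent(id NodeID) error {
    	if _, ok := s.records[id]; ok {
    		return errExists
    	}
    	s.records[id] = Liveness{NodeID: id}
    	return nil
    }

    // cluster models the state touched by bootstrap and join.
    type cluster struct {
    	store  *livenessStore
    	nextID NodeID
    }

    // bootstrap writes the first node's liveness record together with
    // the (here: trivial) initial cluster data.
    func bootstrap() *cluster {
    	c := &cluster{
    		store:  &livenessStore{records: make(map[NodeID]Liveness)},
    		nextID: 1,
    	}
    	id := c.nextID
    	c.nextID++
    	_ = c.store.createIfAbsent(id) // cannot fail on an empty store
    	return c
    }

    // join plays the role of the server-side join handler: it allocates
    // a node ID and persists the liveness record before replying, so the
    // joining node never runs without a record.
    func (c *cluster) join() (NodeID, error) {
    	id := c.nextID
    	c.nextID++
    	if err := c.store.createIfAbsent(id); err != nil {
    		return 0, err
    	}
    	return id, nil
    }

    // hasRecord reports whether a liveness record exists for id.
    func (c *cluster) hasRecord(id NodeID) bool {
    	_, ok := c.store.records[id]
    	return ok
    }

    func main() {
    	c := bootstrap()
    	id, err := c.join()
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println("bootstrap node has record:", c.hasRecord(1))
    	fmt.Println("joined node has record:", c.hasRecord(id))
    }
    ```

    In this sketch the set of keys in `records` can serve as the membership
    view, since no node ID is handed out without a record being written first.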
    
    Release note: None
    irfansharif committed Sep 11, 2020
    9b2c2be
  2. localtestcluster: re-order setting of gossip descriptor

    The heartbeat loop depends on gossip to retrieve the node ID. When
    stressing a few tests that make use of LocalTestCluster, I was seeing
    empty liveness records for empty node IDs being heartbeated. By
    re-ordering things as such we bring it closer to the Server
    initialization ordering.
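
    The ordering problem can be sketched as follows. This is a toy model, not
    the LocalTestCluster code: `gossip`, `heartbeater`, and both start
    functions are hypothetical names, and the heartbeat "loop" is reduced to
    explicit calls so the effect of the ordering is deterministic.

    ```go
    package main

    import "fmt"

    type NodeID int

    // gossip is a hypothetical stand-in that only tracks the local node ID.
    // The zero value means the node descriptor has not been set yet.
    type gossip struct {
    	nodeID NodeID
    }

    func (g *gossip) setNodeDescriptor(id NodeID) { g.nodeID = id }

    // heartbeater reads the node ID from gossip on every heartbeat and
    // remembers which IDs it heartbeated.
    type heartbeater struct {
    	g    *gossip
    	seen []NodeID
    }

    func (h *heartbeater) heartbeatOnce() {
    	h.seen = append(h.seen, h.g.nodeID)
    }

    // startHeartbeatFirst heartbeats before the descriptor is set, so the
    // first heartbeat is for the empty node ID (0).
    func startHeartbeatFirst(id NodeID) []NodeID {
    	g := &gossip{}
    	h := &heartbeater{g: g}
    	h.heartbeatOnce()
    	g.setNodeDescriptor(id)
    	h.heartbeatOnce()
    	return h.seen
    }

    // startDescriptorFirst sets the descriptor before any heartbeat, so
    // every heartbeat carries the real node ID.
    func startDescriptorFirst(id NodeID) []NodeID {
    	g := &gossip{}
    	g.setNodeDescriptor(id)
    	h := &heartbeater{g: g}
    	h.heartbeatOnce()
    	h.heartbeatOnce()
    	return h.seen
    }

    func main() {
    	fmt.Println(startHeartbeatFirst(1))  // prints [0 1]
    	fmt.Println(startDescriptorFirst(1)) // prints [1 1]
    }
    ```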
    
    Release note: None
    irfansharif committed Sep 11, 2020
    c3ce94b
  3. kvserver: address migration concern with node liveness

    In cockroachdb#53842 we introduced a change to always persist a liveness record on
    start up. As part of that change, we refactored how the liveness
    heartbeat codepath dealt with missing liveness records: it knew to fetch
    it from KV given we were now maintaining the invariant that it would
    always be present. Except that wasn't necessarily true, as demonstrated
    by the following scenario:
    
    ```
    // - v20.1 node gets added to v20.1 cluster, and is quickly removed
    //   before being able to persist its liveness record.
    // - The cluster is upgraded to v20.2.
    // - The node from earlier is rolled into v20.2, and re-added to the
    //   cluster.
    // - It's never able to successfully heartbeat (it didn't join
    //   through the join rpc, bootstrap, or gossip). Welp.
    ```
    Though admittedly unlikely, we should handle it all the same instead of
    simply erroring out. We'll just fall back to creating the liveness
    record in-place as we did in v20.1 code. We can remove this fallback in
    21.1 code.
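
    A minimal sketch of the fallback, under stated assumptions: `kv`,
    `heartbeat`, and the `allowCreate` flag are hypothetical names over an
    in-memory map, chosen only to contrast "error out on a missing record"
    with "create the record in-place". The real heartbeat codepath is more
    involved.

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    )

    type NodeID int

    // Liveness is a simplified stand-in; Heartbeats just counts updates.
    type Liveness struct {
    	NodeID     NodeID
    	Heartbeats int
    }

    var errNoRecord = errors.New("liveness record not found")

    // kv is a hypothetical in-memory substitute for the KV store.
    type kv struct {
    	records map[NodeID]Liveness
    }

    func (s *kv) get(id NodeID) (Liveness, error) {
    	l, ok := s.records[id]
    	if !ok {
    		return Liveness{}, errNoRecord
    	}
    	return l, nil
    }

    // heartbeat fetches the existing record and updates it. If the record
    // is missing and allowCreate is set, it falls back to creating the
    // record in-place instead of returning an error.
    func heartbeat(s *kv, id NodeID, allowCreate bool) error {
    	l, err := s.get(id)
    	if errors.Is(err, errNoRecord) {
    		if !allowCreate {
    			return err
    		}
    		l = Liveness{NodeID: id} // fallback: start from a fresh record
    	} else if err != nil {
    		return err
    	}
    	l.Heartbeats++
    	s.records[id] = l
    	return nil
    }

    func main() {
    	s := &kv{records: make(map[NodeID]Liveness)}

    	// Without the fallback, a node with no persisted record can never
    	// heartbeat successfully.
    	fmt.Println("strict:", heartbeat(s, 7, false))

    	// With the fallback, the first heartbeat creates the record.
    	fmt.Println("fallback:", heartbeat(s, 7, true))
    	fmt.Println("heartbeats recorded:", s.records[7].Heartbeats)
    }
    ```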
    
    Release note: None
    irfansharif committed Sep 11, 2020
    683f713