Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: integrate the TLS auto-negotiation inside the crdb code #60632

Closed
8 of 25 tasks
knz opened this issue Feb 16, 2021 · 4 comments
Closed
8 of 25 tasks

server: integrate the TLS auto-negotiation inside the crdb code #60632

knz opened this issue Feb 16, 2021 · 4 comments
Labels
A-authentication Pertains to authn subsystems A-security A-server-start-drain Pertains to server startup and shutdown sequences C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-server-and-security DB Server & Security

Comments

@knz
Copy link
Contributor

knz commented Feb 16, 2021

Fixes #32448.
Epic: CRDB-6663
Jira issue: CRDB-3167

Meta-issue to track the implementation for #51991

A couple of new components:

Follow-up work:

Clean-up work:

CLI commands:

  • cockroach connect - new, only responsible for TLS handshake and writing the certs to disk
    This will leverage the first 3 components identified above: cert gen primitives + HTTP client/server for handshake.
  • cockroach start - when provided an init token, must check if the host cert is known already and if not start the TLS handshake before the remainder of the start code server: integrate the TLS auto-negotiation in the start commands #63850
  • cockroach start-single-node - new flag --self-secure-init that auto-generates an init token and proceeds as per the start logic
  • cockroach demo - will be modified to leverage the self secure init code added to start-single-node
  • Add --init-token-file flag to protect the init handshake shared secret cli: --init-token exposes the init token to the ps command #61231

Bugs:

Technical question where the answer is needed as prereq to a number of points above:

  • how are the CN and OU fields populated
  • how is the SAN field populated
    • current assumption for prototype/MVP: the addresses provided on --join go into the SAN? or maybe --listen-addr? (Unsure, this is under-specified)
    • need a practical test with a multi-server experiment, to understand the design constraint
    • there may be some flags / extra logic needed to pick up reasonable + valid addresses to populate SAN

Possible action item: perform that experiment

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-security C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Feb 16, 2021
itsbilal added a commit to itsbilal/cockroach that referenced this issue Feb 16, 2021
Adds a new command stub, `connect`, with relevant args of
`--certs-dir`, `--init-token`, and list of peers. Implementation
code for this command is yet to come, and the command is not hooked
up to the outer `cockroach` command yet.

Very first part of cockroachdb#60632.

Release note: None.
itsbilal added a commit to itsbilal/cockroach that referenced this issue Feb 17, 2021
Adds a new command stub, `connect`, with relevant args of
`--certs-dir`, `--init-token`, and list of peers. Implementation
code for this command is yet to come, and the command is not hooked
up to the outer `cockroach` command yet.

Very first part of cockroachdb#60632.

Release note: None.
itsbilal added a commit to itsbilal/cockroach that referenced this issue Feb 17, 2021
Adds a new command stub, `connect`, with relevant args of
`--certs-dir`, `--init-token`, `--listen-addr`, `--join` (same as start).
Implementation code for this command is yet to come, and the command is not
hooked up to the outer `cockroach` command yet.

Very first part of cockroachdb#60632.

Release note: None.
craig bot pushed a commit that referenced this issue Feb 18, 2021
60627: bazel: fix up name of `forbiddenmethod` library target r=rickystewart a=rickystewart

This was moved wholesale from a directory called `descriptormarshal`,
so it retained its old library name. This isn't the normal style we're
using and I'm concerned about how Gazelle will handle this going
forward, so change it to the expected style. Also delete an accidentally
duplicated test.

Release note: None

60636: cli: Add connect command stub r=aaron-crl a=itsbilal

Adds a new command stub, `connect`, with relevant args of
`--certs-dir`, `--init-token`, and list of peers. Implementation
code for this command is yet to come, and the command is not hooked
up to the outer `cockroach` command yet.

Very first part of #60632.

Release note: None.

Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
aaron-crl pushed a commit to aaron-crl/cockroach that referenced this issue Feb 18, 2021
itsbilal added a commit to itsbilal/cockroach that referenced this issue Feb 19, 2021
…odes

This PR implements the init protocol phase of the cert-free setup
described in cockroachdb#51991. A lot of the code is pulled out of Aaron's
reference implementation of this protocol:
https://github.com/aaron-crl/toy-secure-init-handshake/tree/n-way-join

One part of cockroachdb#60632.

Release note: None.
itsbilal pushed a commit to itsbilal/cockroach that referenced this issue Feb 19, 2021
itsbilal added a commit to itsbilal/cockroach that referenced this issue Feb 19, 2021
…odes

This PR implements the init protocol phase of the cert-free setup
described in cockroachdb#51991. A lot of the code is pulled out of Aaron's
reference implementation of this protocol:
https://github.com/aaron-crl/toy-secure-init-handshake/tree/n-way-join

One part of cockroachdb#60632.

Release note: None.
aaron-crl pushed a commit to aaron-crl/cockroach that referenced this issue Feb 21, 2021
…tion

server: added utility function for bundling init certs
server: added init function that uses a recieved bundle to provision a node
security: added helper functions to support automatic certificate generation
security: added ClientCAKeyPath helper to align with ClientCACertPath

This is part of cockroachdb#60632 and provides functions for cockroachdb#60636.

Release note: None
aaron-crl pushed a commit to aaron-crl/cockroach that referenced this issue Feb 21, 2021
…tion

server: added utility function for bundling init certs
server: added init function that uses a recieved bundle to provision a node
security: added helper functions to support automatic certificate generation
security: added ClientCAKeyPath helper to align with ClientCACertPath

This is part of cockroachdb#60632 and provides functions for cockroachdb#60636.

Release note: None
aaron-crl pushed a commit to aaron-crl/cockroach that referenced this issue Feb 22, 2021
…tion

server: added utility function for bundling init certs
server: added init function that uses a recieved bundle to provision a node
security: added helper functions to support automatic certificate generation
security: added ClientCAKeyPath helper to align with ClientCACertPath

This is part of cockroachdb#60632 and provides functions for cockroachdb#60636.

Release note: None
craig bot pushed a commit that referenced this issue Feb 22, 2021
60705: server: added initial cert utilities for automatic cert generation r=knz a=aaron-crl

security: added helper functions to support automatic certificate generation

This doesn't do anything yet but will be updated to trigger off:
#60636 when it lands.

This is part of the work supporting #60632

Release note: None

Co-authored-by: Aaron Blum <aaron@cockroachlabs.com>
itsbilal added a commit to itsbilal/cockroach that referenced this issue Mar 22, 2021
Adds a CREATE JOINTOKEN statement for use in TLS auto-joins.
This statement, when run on a self-hosted single-tenant
Cockroach node, creates and returns a new join token. This
join token can then be copy-pasted to new nodes and used
to give them the set of certificates for secure auto TLS
initialization.

See RFC cockroachdb#51991. Part of cockroachdb#60632.

Release justification: New code path, gated behind an
experimental feature flag.
Release note: None.
itsbilal added a commit to itsbilal/cockroach that referenced this issue Mar 26, 2021
Adds a create_join_token builtin function for use in TLS auto-joins.
This function, when run on a self-hosted single-tenant
Cockroach node, creates and returns a new join token. This
join token can then be copy-pasted to new nodes and used
to give them the set of certificates for secure auto TLS
initialization.

See RFC cockroachdb#51991. Part of cockroachdb#60632.

Release note: None.
itsbilal added a commit to itsbilal/cockroach that referenced this issue Mar 29, 2021
Adds a create_join_token builtin function for use in TLS auto-joins.
This function, when run on a self-hosted single-tenant
Cockroach node, creates and returns a new join token. This
join token can then be copy-pasted to new nodes and used
to give them the set of certificates for secure auto TLS
initialization.

See RFC cockroachdb#51991. Part of cockroachdb#60632.

Release note: None.
craig bot pushed a commit that referenced this issue Mar 29, 2021
62053: sql: Add system table join_tokens, and create_join_tokens() builtin function r=knz a=itsbilal

This change adds a new system table, `join_tokens`, for the
exclusive use of storing join tokens. This is necessary as
we need guaranteed at-most-once semantics with these, which
transactions give us pretty easily. A related migration is also added
to create said table

This change also adds a new builtin function, `crdb_internal.create_join_token()`
that creates and persists a join token in that table.

Currently, there's no mechanism to remove expired join tokens.

See RFC #51991. Part of #60632.

Release note (general change): Add `crdb_internal.create_join_token()`
sql builtin function to create join tokens for use when joining
new nodes to a secure cluster. This functionality is hidden behind
a feature flag.

Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Apr 19, 2021
63589: server, security: Fix one-way connectivity with connect cmd r=knz a=itsbilal

Informs #60632.

Previously, non-trust-leader nodes couldn't connect back
to the trust leader due to the presence of the wrong
`ca-client.crt` on their disk; the main CA cert/key was
being written in four places.

This change fixes that bug,
and also creates a new `client.node.crt` certificate to prevent
other subsequent errors from being thrown.

Fixes #61624.

Release note: None.

63672: kvserver: fix write below closedts bug  r=andreimatei a=andreimatei

This patch fixes a bug in our closed timestamp management. This bug was
making it possible for a command to close a timestamp even though other
requests writing at lower timestamps are currently evaluating. The
problem was that we were assuming that, if a replica is proposing a new
lease, there can't be any requests in flight and every future write
evaluated on the range will wait for the new lease and the evaluate
above the lease start time. Based on that reasoning, the proposal buffer
was recording the lease start time as its assignedClosedTimestamp. This
was matching what it does for every write, where assignedClosedTimestamp
corresponds to the the closed timestamp carried by the command.

It turns out that the replica's reasoning was wrong. It is, in fact,
possible for writes to be evaluating on the range when the lease
acquisition is proposed. And these evaluations might be done at
timestamps below the would-be lease's start time. This happens when the
replica has already received a lease through a lease transfer. The
transfer must have applied after the previous lease expired and the
replica decided to start acquiring a new one.

This fixes one of the assertion failures seen in #62655.

Release note (bug fix): A bug leading to crashes with the message
"writing below closed ts" has been fixed.

63756: backupccl: reset restored jobs during cluster restore r=dt a=pbardea

Previously, jobs were restored without modification during cluster
restore. Due to a recently discovered bug where backup may miss
non-transactional writes written to offline spans by these jobs, their
progress may no longer be accurate on the restored cluster.

IMPORT and RESTORE jobs perform non-transactional writes that may be
missed. When a cluster RESTORE brings back these OFFLINE tables, it will
also bring back its associated job. To ensure the underlying data in
these tables is correct, the jobs are now set in a reverting state so
that they can clean up after themselves.

In-progress schema change jobs that are affected, will fail upon
validation.

Release note (bug fix): Fix a bug where restored jobs may have assumed
to have made progress that was not captured in the backup. The restored
jobs are now either canceled cluster restore.

63837: build: update the go version requirement for `make` r=otan a=knz

Fixes #63837.

The builder image already requires go 1.15.10. This patch
modifies the check for a non-builder `make` command to require
at least the same version.

Release note: None

Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Co-authored-by: Paul Bardea <pbardea@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
craig bot pushed a commit that referenced this issue May 6, 2021
63168: ALTER SCHEMA, CREATE SCHEMA, DROP SCHEMA diagrams r=ericharmeling a=ericharmeling

Release justification: non-production code changes

Release note: None

63492: server, security: Token-based add/join TLS functionality  r=knz a=itsbilal

Informs  #60632.

This change adds a new CLI command, `connect join`, that lets a new
node retrieve CA certificates off of an existing secure cluster
and bootstrap its own TLS certificates for joining that cluster.
This is achieved through the consumption of a join token
stored in the join_tokens table. Join tokens can be created
using the `create_join_tokens()` sql builtin function, added
in a previous change.

The previous `connect` command has now been renamed to
`connect init`.

A new set of GRPC endpoints have been created to handle the
server side of this change; to support retrieval of CAs as well
as entire initialization cert bundles. The cert bundles with
CAs and private keys aren't sent over the wire until a
node join token has been verified and consumed. The receiving
node (the one running `connect join`) then stores them in its
SSLCertsDir and bootstraps its own certificates off of it.

This feature is hidden behind a feature flag.

A couple other changes in this commit to clean up related code
in the `security` package:
 - Capitalize CA consistently in protobuf structs and method names
 - Write all CAs/certs from the right places in InitializeFromConfig
   instead of writing the InterNode one everywhere
 - Pass around raw byte certificates in auto_tls_init.go instead of
   doing excess PEM encodings and decodings.
 - Create a `client.node.crt` signed by `ca-client.crt`, otherwise
   nodes wouldn't be able to connect to each other. Fixes #61624.

Release note (cli change): Rename `connect` to `connect init`,
and add `connect join` command to retrieve certificates from
an existing secure cluster and setup a new node to connect with
it.

Co-authored-by: Aaron Blum <aaron@cockroachlabs.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>

Co-authored-by: Eric Harmeling <eric.harmeling@cockroachlabs.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
@dt
Copy link
Member

dt commented May 17, 2021

As discussed elsewhere, I think it might be worth considering switching to an off-the-shelf encoder like proto to handle join token encoding/decoding before we get locked into compatibility with a shipped scheme: #65344

@jlinder jlinder added the T-server-and-security DB Server & Security label Jun 16, 2021
@knz knz added A-server-start-drain Pertains to server startup and shutdown sequences A-authentication Pertains to authn subsystems labels Jul 29, 2021
@knz
Copy link
Contributor Author

knz commented Jul 12, 2023

Tracking the list of items here as a jira epic: CRDB-6663.

@knz knz closed this as completed Jul 12, 2023
craig bot pushed a commit that referenced this issue Jan 11, 2024
113893: cli: remove cockroach connect r=nvanbenschoten a=andrewbaptist

`connect` was implemented as an experiment to allow bootstrapping nodes from other nodes CA's (#60632). The details are described here: https://github.com/aaron-crl/toy-secure-init-handshake/tree/n-way-join

This implementation was never completed, and the visibility of this code can cause confusion. This PR removes all the code with the idea that we can bring it back later if necessary.

Epic: none

Release note (cli change): Removal of the `cockroach connect` functionality.

117542: sql: fix erroneous benchmark regressions r=mgartner a=mgartner

Several benchmarks incorrectly included cluster shutdown as part of the
benchmark timing. This caused major regressions in benchmarks
when #117116 was merged because it made cluster shutdown slower. The
benchmarks now stop the timer before initiating cluster shutdown to more
accurately measure the code in question.

Fixes #117494

Release note: None


117558: sql: add telemetry block regexps and fix flake r=mgartner a=mgartner

#### sqltestutils: allow blocking regexps in telemetry feature list

The `feature-allowlist` directive for telemetry tests has been renamed
`feature-list` and it now supports blocking regexps. Any line in the
`feature-list` that starts with "!" is a block regexp that prevents any
features matching the regexp from being included in the output of the
`feature-usage` and `feature-counters` directives. This is helpful
because Go's regexps do not support negative look-aheads, so adding
features that should not be matched is difficult and tedious.

Release note: None

#### sql: ignore "sql.schema.create_stats" in schema telemetry test

This commit ignores the `sql.schema.create_stats` in the `schema`
telemetry tests because it causes sporadic failures.

Fixes #117309

Release note: None


117569: authors: add homa31 to authors r=homa31 a=homa31

Release note: None
Epic: None

117572: *: Prep work for supporting CONFIGURE ZONE in declarative schema changer r=Xiang-Gu a=Xiang-Gu

This PR consists several preparation work that will be needed for properly supporting CONFIGURE ZONE stmts in declarative schema changer. They're separated out for easier review and bc I expect those to be a lot less controversial than the main work. See each commit for details.

Informs #117574
Epic: CRDB-31473

Co-authored-by: Andrew Baptist <baptist@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Howard Ma <howard.ma@cockroachlabs.com>
Co-authored-by: Xiang Gu <xiang@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-authentication Pertains to authn subsystems A-security A-server-start-drain Pertains to server startup and shutdown sequences C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-server-and-security DB Server & Security
Projects
None yet
Development

No branches or pull requests

4 participants