
[DNM] RFCS: draft of transactional schema changes #8

Merged
merged 1,920 commits
Nov 7, 2022

Conversation

ajwerner
Owner

@ajwerner ajwerner commented Sep 15, 2020

Many details left to be sussed out.

TODOs:

  • Add commentary about the mixed version state.
    • We'll almost certainly need to preserve roughly all of the existing behavior for this to still work
  • Provide proposals for less invasive default behavior that minimizes risks of restarts, especially without pessimistic mode.

Release note: None

@ajwerner
Owner Author

This is a space for note taking on TODOs on the RFC prior to publishing it for more public consumption.

@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch from 63e62e4 to cc8e784 Compare September 17, 2020 05:49

@thoszhang thoszhang left a comment


Just a quick first pass.

Comment on lines 24 to 25
1. When a user transaction commits, its DMLs all commit but its DDLs are
merely queued. It's possible that the async parts of its DDLs will fail and


This is a minor point, but there are some DDL statements (e.g., renaming a table column) that are committed in the transaction. One relatively small aspect of our problems is that it's probably not obvious to users which DDLs exhibit the non-transactional behavior and which ones don't, and which DDLs do or don't get rolled back if there are multiple within a transaction and one of the async ones fails.

Owner Author

Added to the first issue.


# Motivation

The lack of transactional schema changes boils down to two fundamental problems:


I think the effects of constraints being added on other writes is also worth mentioning here (either as a third distinct problem or a sub-problem of the first).

Owner Author

Added to the first

Comment on lines 91 to 99
the physical layout has not changed but the logical layout has. When version 3
(`DELETE_ONLY`) comes into effect, the physical layout may differ from
version 1. It's critical that by the time the physical layout begins to change
we know that no transactions are in version 1.


Might be good to explicitly define "physical layout" and "logical layout" up front.


(Roughly speaking, physical layout refers to table data being stored in KV; logical layout refers to metadata indicating how to read/write that data to KV. The idea here is that we can't update the physical layout until the logical layout catches up.)
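
For concreteness, a minimal sketch of the distinction, with entirely hypothetical types (not the actual descriptor code):

```
package main

import "fmt"

// Illustrative only: neither type corresponds to real CockroachDB code.
// The "logical layout" is descriptor metadata telling SQL how to encode and
// decode rows; the "physical layout" is the row data stored in KV itself.
type tableDescriptor struct {
	Version int
	Columns []string
}

type kvStore map[string]string // stand-in for physical rows in KV

// readRow decodes a physical KV value using the logical layout. If the two
// disagree, reads are wrong -- which is why the physical layout may only
// start changing once every node's logical view has caught up.
func readRow(desc tableDescriptor, kv kvStore, key string) string {
	return fmt.Sprintf("decoded %q using descriptor v%d (%d columns)",
		kv[key], desc.Version, len(desc.Columns))
}

func main() {
	desc := tableDescriptor{Version: 1, Columns: []string{"k", "v"}}
	kv := kvStore{"/table/52/1/1": "raw-bytes"}
	fmt.Println(readRow(desc, kv, "/table/52/1/1"))
}
```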

Owner Author

Added a sentence

Comment on lines 306 to 362
and populates its target by synthesizing SSTs using the `bulkAdder`. All rows
written by the index backfill have a timestamp that precedes any write due to
transactional traffic, thus preserving subsequent writes as they are.


I don't think this is quite true? IIUC we just pick Clock.Now() as the read and write timestamp and basically overwrite our existing writes. There's also a step where we do a KV scan at that timestamp over the index span to resolve intents and populate the timestamp cache to avoid writing under the SSTs. That's about as much as I know.
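
A rough sketch of that ordering, with illustrative stand-ins rather than the real bulk-ingestion APIs:

```
package main

import (
	"fmt"
	"time"
)

// Illustrative stand-ins only; none of this is the actual backfill code.
type span struct{ start, end string }

// A scan at ts resolves intents in the index span and populates the
// timestamp cache so that nothing can later write underneath the SSTs.
func resolveIntentsAndBumpTSCache(sp span, ts time.Time) {
	fmt.Printf("scanned %v at %v\n", sp, ts)
}

// The backfill then writes index entries via SSTs, all stamped at ts,
// overwriting whatever index entries already exist.
func ingestSSTsAt(sp span, ts time.Time) {
	fmt.Printf("ingested SSTs for %v at %v\n", sp, ts)
}

func main() {
	ts := time.Now() // roughly the Clock.Now() mentioned above
	idx := span{start: "/Table/52/2", end: "/Table/52/3"}
	resolveIntentsAndBumpTSCache(idx, ts)
	ingestSSTsAt(idx, ts)
}
```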

Owner Author

Right, I just learned about this this week (and it turned out to be sort of buggy: cockroachdb#54891).

Comment on lines 416 to 436
* This transaction will create a job that then builds the relevant indexes
but does not make them public.
* If the job fails, the statement returns an error
* Rollback should happen under the job too. We’ll wait for terminal.
* If it succeeds the user transaction may proceed.


What I'm getting here is that the job (1) only builds the index, unlike today's schema change jobs; and (2) is essentially a black box from the perspective of the schema change state machine, in the sense that we'll just wait for the job to reach a terminal state before proceeding, sort of like today's schema change jobs in the connExecutor. Is that right?

Owner Author

Yes, in fact it's not clear whether "job" is the right term.
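
The control flow being described amounts to polling until a terminal state, roughly like this (hypothetical names, not the jobs API):

```
package main

import (
	"errors"
	"fmt"
	"time"
)

type jobState int

const (
	jobRunning jobState = iota
	jobSucceeded
	jobFailed
)

// waitForTerminal blocks until the index-build "job" is no longer running,
// then either lets the user transaction proceed or surfaces an error.
func waitForTerminal(poll func() jobState) error {
	for {
		switch poll() {
		case jobSucceeded:
			return nil
		case jobFailed:
			return errors.New("index build failed; statement returns an error")
		default:
			time.Sleep(100 * time.Millisecond)
		}
	}
}

func main() {
	states := []jobState{jobRunning, jobRunning, jobSucceeded}
	i := 0
	err := waitForTerminal(func() jobState { s := states[i]; i++; return s })
	fmt.Println("terminal state reached, err =", err)
}
```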

Comment on lines 422 to 439
* Ensure that the user transaction is able to drop the leases to the old
table descriptor so that we don’t prevent version bumps.


What does this mean? What exactly happens if we already did some writes in the transaction with an older leased version of the table?

Owner Author

It's fine if we did some writes with an older leased version of the table, because we're going to backfill those writes into the new indexes, and the physical layout of the table can't change no matter how many versions we bump it through. I added a comment here.

* This is aided by the fact that we’re not re-writing rows.
* The pessimistic mode interactions here are important
* The backfiller should be able to read ignoring the parent’s locks.
* What does it mean to re-issue previous writes to tables with new indexes?


Probably want to clarify eventually that this refers to writes in the same transaction, as distinct from intents from other writing transactions.

Owner Author

Done.

schema changes.
* Should jobs be used in any way for dealing with backfills and validations
underneath a user transaction?
* What are the pros? Lots of cons.


I think the pros are basically just that we get all the jobs observability "for free." But I guess we could hook backfills into the query progress somehow. It's unclear whether being able to pause these jobs is a good idea, but I think probably not.

Owner Author

Added.


### Child Transactions

The primary new KV concept required by this proposal is the child transaction.


Is KV already planning to do this? Are we planning to?


When the above transaction commits, `foo` is in version 2 and has pending
mutations to drop the columns c1 and c2. One thing to note is that transactions
which happen after the above transaction may still observe those columns due to

This would be something that would be resolved with the proposed future work, right?


Are you referring to "One thing to note is that transactions which happen after the above transaction may still observe those columns"? It's true that with the proposed work, we'd no longer commit the transaction at this point; the transaction would be committed only when the entire column schema change is done. However, it's still true in general that because we lease/cache table descriptors, all descriptor changes still take place asynchronously across the cluster. We provide a guarantee that we don't return results to the client until the change has taken effect on all nodes. We can elaborate more on this offline. I'm not sure if I've answered your question.

The upshot of this design is that we can achieve compatibility with postgres
and meet user expectations for what is possible within a SQL transaction. This
proposal is not, however, without its risks. In order to regain
transactionality, in some cases, we'll need to actually perform the work of

Why is this not in all cases? Won't we always need to perform the schema change work within the user's transaction?


Andrew may be referring to something else, but here's one way to think of it: Currently all schema change work is performed in the user transaction and in a set of transactions after the user transaction commits. (A few simple schema changes like RENAME COLUMN are performed entirely in the user transaction.) The present proposal is to perform all the work in the user transaction, and (if needed) in a set of child transactions that commit before the user transaction commits.
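
Schematically, the proposed ordering looks something like the sketch below; every helper here is hypothetical, and the point is only that the child transactions commit before the user transaction does:

```
package main

import "fmt"

// Illustrative only: the point is that every child transaction commits before
// the user (parent) transaction does, so by the time the parent commits, the
// O(rows) work has already happened.
type childTxn struct{ name string }

func (c childTxn) commit() { fmt.Println("child txn committed:", c.name) }

func runSchemaChangeInUserTxn() {
	for _, step := range []childTxn{
		{"add column as DELETE_ONLY"},
		{"bump to WRITE_ONLY and wait for leases"},
		{"backfill"},
		{"validate"},
	} {
		step.commit()
	}
	fmt.Println("user txn commits last, with the change fully materialized")
}

func main() { runSchemaChangeInUserTxn() }
```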

Once the side schema changes and has evidence that all leases are up-to-date,
the statement execution can continue. At this point, a new child transaction can
write and commit a descriptor of the table descriptor which has the column in
`WRITE_ONLY` with a pending mutation to perform a backfill that it

What does WRITE_ONLY status indicate to other transactions?


For indexes, it means that transactions will keep the index updated on writes, but not use the index for reads. For columns it's a bit more complicated, but essentially it's the same: we write to the column as though it were a normal column (with computed values, defaults, etc.) but it's not available for reads by the user.
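
A minimal sketch of those semantics with hypothetical types (not the actual descriptor code): the write path includes WRITE_ONLY indexes while the read path skips them.

```
package main

import "fmt"

type indexState int

const (
	deleteOnly indexState = iota
	writeOnly
	public
)

type index struct {
	name  string
	state indexState
}

// writableIndexes: WRITE_ONLY (and PUBLIC) indexes must be kept up to date on
// every write so the backfilled data stays consistent.
func writableIndexes(idxs []index) (out []index) {
	for _, idx := range idxs {
		if idx.state >= writeOnly {
			out = append(out, idx)
		}
	}
	return out
}

// readableIndexes: only PUBLIC indexes may serve reads.
func readableIndexes(idxs []index) (out []index) {
	for _, idx := range idxs {
		if idx.state == public {
			out = append(out, idx)
		}
	}
	return out
}

func main() {
	idxs := []index{{"primary", public}, {"new_idx", writeOnly}}
	fmt.Println("write to: ", writableIndexes(idxs))
	fmt.Println("read from:", readableIndexes(idxs))
}
```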


## Non-atomic DDLs ([#42061])

Many schema changes require `O(rows)` work. Generally we've wanted to avoid

What is O(rows) work?


It means the work required is proportional to the number of rows in the table (e.g., for an index backfill, the more rows, the longer the backfill runs)

This RFC utilizes the language of
[20151014_online_schema_change.md](./20151014_online_schema_change.md).

* Child Transaction - a new concept in this RFC. A Child transaction is a

This is different than nested transactions with savepoints?


Right, this is a separate internal concept.

* Why is this backfill different from all other backfills?
* It needs to behave differently in the face of intents due to the user
transaction.
* What should it do when it encounters an intent from the user transaction?

Intents are the previous statements within the same transaction?


Yeah, these are write intents from writes we did earlier

descriptor with the constraint as `PUBLIC` using a synthetic descriptor
(but don't write descriptors until the end).
* For concurrent transactions which see the constraint in `ADDING`
* Evaluate the constraint. If failed:

If a concurrent txn inserts new rows AND the constraint still passes, the index backfill for the ALTER TABLE would ignore these additional rows?


Are you asking about a case where we're adding an index and later adding a constraint in the same transaction? At this point the index backfill has already finished, and the index would be in the write-only state, so we'd just write to that index normally if the constraint isn't violated. Does that answer your question?

* Straw man: write to the table descriptor marking the constraint as
having failed.
* One problem I see is that the side-key gave us some assumption that there were not concurrent modifications from user transactions to the table descriptor
* Straw man 2: write to some other key that the user transaction making the constraint public will need to read prior to committing.

Wouldn't all concurrent transactions need to read before committing to check for any conflicts with newly committed constraints?


We'd be getting this information off the leased/cached copy of the descriptor (as usual), so it wouldn't be a KV read. I think we'd refresh the leased version if we detected a possible violation.

How do we expect users to use this ASYNC behavior?

* This isn’t for the tools, this is for today’s users of schema changes who
want their schema changes to happen if their connection disconnects.

Is this the same behavior as bulk io operations?


Yeah, basically - we'd probably want to use jobs for this


* What should the semantics be when adding a column with a default value and using the `ASYNC` keywords?
* Seems like ASYNC operations should not take effect at all.
* How do we deal with DMLs and DDLs interleaved within a transaction to a table.

Why is this different than the interleaving that is happening in your examples above?


@thoszhang thoszhang left a comment


Taking a first pass at responding to some of @vy-ton's comments.


instead we use a separate key to coordinate changes
* Switch to pessimistic mode so that refreshes succeed (optional, see later).
* Start a child transaction to add the column
* What if we run into a failed, previous schema change?


This is if we started a schema change earlier and it never got properly cleaned up for some reason.

* Created in child txn which writes column as delete only.
* First step is make column write-only and wait for that before beginning
backfill. This also adds the indexes which will be backfilled by the job.
* Backfill all new relevant indexes.


The backfill operation is mostly non-transactional (it involves just doing a bunch of bulk writes to KV) but it's coordinated from child transactions, not the user (parent) transaction.


@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch 2 times, most recently from 086f427 to 610cf34 Compare October 1, 2020 02:02
@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch 5 times, most recently from 956f69c to 9a2bbfb Compare October 22, 2020 14:41
@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch 2 times, most recently from c789acb to 490089f Compare October 30, 2020 18:29
craig[bot] and others added 15 commits November 19, 2020 21:39
56914: sqlsmith: disallow `crdb_internal.unsafe_*` metadata mutation builtins r=lucy-zhang a=lucy-zhang

Closes cockroachdb#56890.

Release note: None

Co-authored-by: Lucy Zhang <lucy@cockroachlabs.com>
This patch removes the KeepRedactable option from the Logs and LogFile
RPCs. The option existed just for backwards compatibility with pre-20.2
clients. Even for such old clients, I don't think there are real
consequences to removing this option. This patch retains the behavior of
the option being set.
The option, when coupled with the Redact=false option, made the RPCs
keep the log entries exactly as they are in the log file - i.e.
redactable, with sensitive data markers. With this patch, the logs
returned by the RPCs are either redacted, or redactable. "Flattening"
them (i.e. not redacting, but stripping the markers) is no longer an
option.

The Logs RPC is used by the AdminUI, which doesn't use either Redact or
KeepRedactable - so, with this patch, the UI will show logs with markers
in them. I think that's OK.
The LogFile RPC is used by the cli getting a debug zip. Since 20.2, the
client is setting KeepRedactable, and so nothing changes. Older clients
will get log entries with markers in them.

This change is motivated by the fact that I'd like to replace the
LogFile RPC with a more efficient version that returns a raw file, not a
proto-encoded list of log entries. In fact we already have that RPC in
the form of GetFiles, which we may be able to use. This KeepRedactable
option was in my way.

Release note: None
56458: kv: immediately propagate WriteTooOld error from DeleteRange requests r=nvanbenschoten a=nvanbenschoten

Fixes cockroachdb#56111.

This commit fixes a bug that allowed non-serializable state to leak from
the return value of a DeleteRange request. DeleteRange requests were not
returning WriteTooOld errors on conflicting tombstones at higher
timestamps, so they were allowing conflicting DELETEs to influence their
returned keys. This, in turn, was permitting cases where a DELETE's
"rows updated" return value would disagree with previously observed state
in a transaction.

Worse than this, I think there might have even been a bug with deferring
the WriteTooOld status from a DeleteRange request, because unlike most
MVCC operations, MVCCDeleteRange does not complete before returning a
WriteTooOldError. I think this means that it could have partially
succeeded and then returned a WriteTooOldError. If the write too old
status was deferred, this would result in the DeleteRange request
failing to delete some keys. This is why "blind-write" requests must
complete entirely even if they intend to return a WriteTooOldError (see
`mvccPutInternal`). We don't particularly like this behavior, but it's
been around for years.

The fix for this issue is twofold. First, this commit fixes
MVCCDeleteRange. It uses the relatively new `FailOnMoreRecent` flag in
its initial scan to ensure that a WriteTooOld error is returned even in
the presence of tombstones at equal or higher timestamps. This has the
side-effect of letting us get rid of some strange code added in ee80e3d
which scanned at MaxTimestamp. Second, this commit fixes DeleteRange
request by giving it the `isRead` flag. This ensures that the request is
never considered a blind-write and therefore never defers a "write too
old" status.

I intend to backport this to release-20.1 and release-20.2.

Release notes (bug fix): DELETE statements no longer have a chance of
returning an incorrect number of deleted rows in transactions that will
eventually need to restart due to contention.

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
The cgroup information is constructed by looking at `/proc/self/cgroup`,
matching with the mount point in `/proc/self/mountinfo`
and then inspecting the `memory.stat` file. The matching between the
cgroup and the mount point only worked when the mount and the cgroup
were an exact match relative to the cgroup NS. This is, however,
inadequate: while true for the Docker case, it isn't true in general.
In this example, determining the cgroup memory limit isn't possible:
```
cgcreate -t $USER:$USER -a $USER:$USER -g memory:crdb_test
echo 3999997952 > /sys/fs/cgroup/memory/crdb_test/memory.limit_in_bytes
cgexec -g memory:crdb_test ./cockroach start-single-node
```
To address this, this patch instead constructs the expected
relative path of the cgroup's memory info by starting with the
mount path and then adding the cgroup.

Release note: None
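
A minimal sketch of the path construction described above (simplified; the real change parses /proc/self/cgroup and /proc/self/mountinfo rather than hard-coding paths):

```
package main

import (
	"fmt"
	"path/filepath"
)

// cgroupMemoryStatPath joins the memory controller's mount point with the
// cgroup path read from /proc/self/cgroup. Error handling and the actual
// /proc parsing are omitted; this only shows the path construction.
func cgroupMemoryStatPath(mountPoint, cgroupPath string) string {
	return filepath.Join(mountPoint, cgroupPath, "memory.stat")
}

func main() {
	// e.g. the memory controller mount and the cgroup the process was placed
	// in by `cgexec -g memory:crdb_test`.
	fmt.Println(cgroupMemoryStatPath("/sys/fs/cgroup/memory", "/crdb_test"))
	// Output: /sys/fs/cgroup/memory/crdb_test/memory.stat
}
```
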
56835: opt: constrain multi-col inverted index prefix cols by contiguous range r=mgartner a=mgartner

This commit allows for the prefix columns of multi-column inverted
indexes to be constrained to a contiguous range. For example, consider
the schema and query:

    CREATE TABLE t (
        k INT PRIMARY KEY,
        i INT,
        g GEOMETRY,
        INVERTED INDEX (i, g)
    )

    SELECT k FROM t WHERE i IN (1, 2, 3) AND ST_CoveredBy(..., g)

Prior to this commit, a scan over the inverted index would not be
generated for this query.

Release note: None

Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Release note (sql change): Introduced stubs for ALTER DATABASE ...
PRIMARY REGION and CREATE TABLE ... PRIMARY REGION
56878: server: remove KeepRedactable logs rpc option r=andreimatei a=andreimatei

This patch removes the KeepRedactable option from the Logs and LogFile
RPCs. The option existed just for backwards compatibility with pre-20.2
clients. Even for such old clients, I don't think there are real
consequences to removing this option. This patch retains the behavior of
the option being set.
The option, when coupled with the Redact=false option, made the RPCs
keep the log entries exactly as they are in the log file - i.e.
redactable, with sensitive data markers. With this patch, the logs
returned by the RPCs are either redacted, or redactable. "Flattening"
them (i.e. not redacting, but stripping the markers) is no longer an
option.

The Logs RPC is used by the AdminUI, which doesn't use either Redact or
KeepRedactable - so, with this patch, the UI will show logs with markers
in them. I think that's OK.
The LogFile RPC is used by the cli getting a debug zip. Since 20.2, the
client is setting KeepRedactable, and so nothing changes. Older clients
will get log entries with markers in them.

This change is motivated by the fact that I'd like to replace the
LogFile RPC with a more efficient version that returns a raw file, not a
proto-encoded list of log entries. In fact we already have that RPC in
the form of GetFiles, which we may be able to use. This KeepRedactable
option was in my way.

Release note: None

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
56829: cli: add \x command to toggle records display format r=knz a=erikgrinaker

`psql` has a `\x` command to toggle extended output for results. CRDB
already supports this output format via `\set display_format=records`.
This commit adds `\x [on|off]` to toggle between `records` and `table`
display formats.

The `auto` option from `psql` to automatically enable extended output
depending on the result width is not supported, only `on` and `off`.

Resolves cockroachdb#56706.

Release note (cli change): A `\x [on|off]` command has been added to
toggle the `records` display format, following `psql` behavior.

Also adds support for `yes` and `no` as boolean options for slash commands as a separate commit, following `psql`.


56840: util, server: fix memory detection with non-root cgroups r=darinpp a=darinpp

The cgroup information is constructed by looking at `/proc/self/cgroup`,
matching with the mount point in `/proc/self/mountinfo`
and then inspecting the `memory.stat` file. The matching between the
cgroup and the mount point only worked when the mount and the cgroup
were an exact match relative to the cgroup NS. This is, however,
inadequate: while true for the Docker case, it isn't true in general.
In this example, determining the cgroup memory limit isn't possible:
```
cgcreate -t $USER:$USER -a $USER:$USER -g memory:crdb_test
echo 3999997952 > /sys/fs/cgroup/memory/crdb_test/memory.limit_in_bytes
cgexec -g memory:crdb_test ./cockroach start-single-node
```
To address this, this patch instead constructs the expected
relative path of the cgroup's memory info by starting with the
mount path and then adding the cgroup.

Release note: None

Co-authored-by: Erik Grinaker <erik@grinaker.org>
Co-authored-by: Darin Peshev <darinp@gmail.com>
This commit adds a field `NonReleaseBlocker` to `testSpec` which, when
set to true, causes the `release-blocker` tag to not be applied to the
GitHub issues for test failures. Its intended purpose is to allow
roachtests to be run nightly while they're not yet stable and we still
expect non-release-blocking failures frequently.

Release note: None
We're still adding new features to the `schemachange` workload
frequently and finding minor bugs. We want this test to run nightly, but
it's not stable enough to warrant failures being marked as release
blockers.

Release note: None
56883: sql: allow PRIMARY REGION syntax for CREATE / ALTER DATABASE r=ajstorm a=otan

Release note (sql change): Introduced stubs for ALTER DATABASE ...
PRIMARY REGION and CREATE TABLE ... PRIMARY REGION

Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
55474: sql: clarify help text for `show range` r=solongordon a=jseldess

The SHOW RANGE FOR ROW statement now accepts the indexed values
of the row as a tuple. The help text representation of this was
previous (row, value, ...). The help text representation is
now (value1, value2, ...), which I believe is clearer.

Counterpart to this docs PR: cockroachdb/docs#8634

Release note: None

Co-authored-by: Jesse Seldess <j_seldess@hotmail.com>
When iterating over the names of tables to scatter, we were sharing the
loop variable between the goroutines, which is racy and caused a test
failure due to reading a garbage string value for the table name. In
general, this function may not scatter each table exactly once. This PR
fixes the bug by moving the read out of the goroutine.

Release note (bug fix): Fixed a race condition in the `tpcc` workload
with the `--scatter` flag where tables could be scattered multiple times
or not at all.
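
The bug pattern and its fix look roughly like this generic Go sketch (not the workload code itself):

```
package main

import (
	"fmt"
	"sync"
)

func scatterAll(tables []string) {
	var wg sync.WaitGroup
	for _, table := range tables {
		// Buggy version: `go func() { scatter(table) }()` captures the shared
		// loop variable, so goroutines may all observe whichever value it
		// holds when they happen to run. Fix: read the value outside the
		// goroutine (or pass it as an argument) so each goroutine gets its
		// own copy.
		table := table
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println("scattering", table)
		}()
	}
	wg.Wait()
}

func main() {
	scatterAll([]string{"warehouse", "district", "customer"})
}
```
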
Relates to cockroachdb#56851.

In investigations like cockroachdb#56851, we've seen the mutex in the Raft
scheduler collapse due to too much concurrency. To address this, we
needed to drop the scheduler's goroutine pool size to bound the amount
of contention on the mutex to ensure that the scheduler was able to
schedule any goroutines.

This commit caps this concurrency to 96, instead of letting it grow
unbounded as a function of the number of cores on the system.

Release note (performance improvement): The Raft processing goroutine
pool's size is now capped at 96. This was observed to prevent instability
on large machines (32+ vCPU) in clusters with many ranges (50k+ per node).
Relates to cockroachdb#56851.

In cockroachdb#56851, we saw that all of the Raft transport's receiving goroutines
were stuck in the Raft scheduler, attempting to enqueue Ranges in
response to coalesced heartbeats. We saw this in stacktraces like:
```
goroutine 321096 [semacquire]:
sync.runtime_SemacquireMutex(0xc00007099c, 0xc005822a00, 0x1)
	/usr/local/go/src/runtime/sema.go:71 +0x47
sync.(*Mutex).lockSlow(0xc000070998)
	/usr/local/go/src/sync/mutex.go:138 +0xfc
sync.(*Mutex).Lock(...)
	/usr/local/go/src/sync/mutex.go:81
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).enqueue1(0xc000070980, 0x4, 0x19d8cb, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/scheduler.go:261 +0xb0
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).EnqueueRaftRequest(0xc000070980, 0x19d8cb)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/scheduler.go:299 +0x3e
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).HandleRaftUncoalescedRequest(0xc001136700, 0x4becc00, 0xc019f31b60, 0xc01288e5c0, 0x4ba44c0, 0xc014ff2b40, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/store_raft.go:175 +0x201
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).uncoalesceBeats(0xc001136700, 0x4becc00, 0xc019f31b60, 0xc035790a80, 0x37, 0x43, 0x100000001, 0x29b00000000, 0x0, 0x400000004, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/store_raft.go:110 +0x33b
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).HandleRaftRequest(0xc001136700, 0x4becc00, 0xc019f31b60, 0xc02be585f0, 0x4ba44c0, 0xc014ff2b40, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/store_raft.go:130 +0x1be
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*RaftTransport).handleRaftRequest(0xc000188780, 0x4becc00, 0xc019f31b60, 0xc02be585f0, 0x4ba44c0, 0xc014ff2b40, 0x0)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/raft_transport.go:299 +0xab
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*RaftTransport).RaftMessageBatch.func1.1.1(0x4c3fac0, 0xc00d3ccdf0, 0xc000188780, 0x4becc00, 0xc019f31b60, 0x95fe98, 0x40c5720)
	/go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/raft_transport.go:370 +0x199
```

In that issue, we also saw that too much concurrency on the Raft
scheduler's Mutex had caused the mutex to collapse (get stuck in the
slow path, in the OS kernel) and hundreds of goroutines to pile up on
it.

We suspect that part of the problem here was that each of the coalesced
heartbeats was locking the Raft scheduler once per Range. So a coalesced
heartbeat that contained 10k ranges would lock the scheduler 10k times
on the receiver.

The commit attempts to alleviate this issue by batch enqueuing Ranges in
the Raft scheduler in response to coalesced heartbeats. This has a
slight fixed overhead (i.e. the need for a slice) but in response,
reduces the load that coalesced heartbeats place on the Raft scheduler's
mutex by a factor of 128 (`enqueueChunkSize`). This should reduce the
impact that a large number of Ranges have on contention in the Raft
scheduler.

Release note (performance improvement): Interactions between Raft heartbeats
and the Raft goroutine pool scheduler are now more efficient and avoid excessive
mutex contention. This was observed to prevent instability on large machines
(32+ vCPU) in clusters with many ranges (50k+ per node).
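
The batching idea, in simplified form (a toy scheduler, not the kvserver code; only the chunk size is taken from the description above):

```
package main

import (
	"fmt"
	"sync"
)

const enqueueChunkSize = 128

// scheduler is a toy stand-in: the real point is acquiring the mutex once per
// chunk of range IDs instead of once per range.
type scheduler struct {
	mu    sync.Mutex
	queue []int64
}

func (s *scheduler) enqueueBatch(rangeIDs []int64) {
	for len(rangeIDs) > 0 {
		n := len(rangeIDs)
		if n > enqueueChunkSize {
			n = enqueueChunkSize
		}
		chunk := rangeIDs[:n]
		rangeIDs = rangeIDs[n:]

		s.mu.Lock() // one lock acquisition covers up to 128 ranges
		s.queue = append(s.queue, chunk...)
		s.mu.Unlock()
	}
}

func main() {
	s := &scheduler{}
	ids := make([]int64, 1000)
	for i := range ids {
		ids[i] = int64(i)
	}
	s.enqueueBatch(ids)
	fmt.Println("enqueued", len(s.queue), "ranges")
}
```
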
craig[bot] and others added 9 commits November 30, 2020 23:21
57241: build: install tar in our docker image r=kenliu a=chrisseto

In the switch to Red Hat's UBI image, many binaries were excluded from the
container. The loss of tar breaks kubectl cp, a critical tool for
debugging cockroachdb clusters that are running in Kubernetes.
This commit includes tar in our docker images so that kubectl cp
will work out of the box.

Release note (general change): Included tar in docker images. This allows users to use kubectl cp on 20.2.x containers.

Co-authored-by: Chris Seto <chriskseto@gmail.com>
This commit removes the templating of "with-nulls" and "without-nulls"
cases of the merge joiner because the benchmarks have shown that
skipping nulls check in the latter case doesn't offer much performance
improvement, but having that templating increases the amount of
generated code by a factor of 2 or more.

Additionally, this commit adjusts the template slightly to omit some
empty lines.

Release note: None
57252: logictest: add tests for backfilling multi-column inverted indexes r=RaduBerinde a=mgartner

Release note: None

Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
This commit removes multiple arguments in favor of using the spec when
initializing the cFetcher as well as refactors several loops in order to
avoid making copies of descriptors during table reader planning.

Additionally, it moves a map from scanNode next to the map's only user
(stats job) and removes the map from the scanNode itself as well as
removes some of the unused arguments.

Release note: None
57267: sql,colexec: minor cleanup of scan-related things r=yuzefovich a=yuzefovich

This commit removes multiple arguments in favor of using the spec when
initializing the cFetcher as well as refactors several loops in order to
avoid making copies of descriptors during table reader planning.

Additionally, it moves a map from scanNode next to the map's only user
(stats job) and removes the map from the scanNode itself as well as
removes some of the unused arguments.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
This is meant to be temporary until we discover the cause of cockroachdb#55290.

Release note: None
57263: sql: raise error if opaque statement receives nil plan node r=RaduBerinde a=angelapwen

Addresses cockroachdb#56861.

Previously there was no check for a situation in which a planNode
was nil. For opaque statements, this  resulted in occasional panics
when a nil planNode was returned. This commit adds a check for
opaque statements to make sure the planNode is not nil before
moving on to execution.

Release note: None

Co-authored-by: angelapwen <angelaw@cockroachlabs.com>
57233: ccl/sqlproxyccl: lazily initialize closedCh r=jbowens a=jbowens

Lazily initialize `sqlproxyccl.Conn`'s `closedCh`. This avoids
allocating a channel when it's not needed and allows package consumers
to construct their own `sqlproxyccl.Conn`s without needing access to the
unexported `closedCh` member field.

Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
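
The lazy-initialization pattern being described is roughly the following generic sketch (not the sqlproxyccl code):

```
package main

import (
	"fmt"
	"sync"
)

// conn lazily creates closedCh the first time it's needed, so zero-value
// conns constructed by package consumers work without touching unexported
// fields.
type conn struct {
	mu       sync.Mutex
	closedCh chan struct{}
}

func (c *conn) closedChan() chan struct{} {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closedCh == nil {
		c.closedCh = make(chan struct{})
	}
	return c.closedCh
}

func (c *conn) close() { close(c.closedChan()) }

func main() {
	c := &conn{} // usable zero value; no channel allocated yet
	done := c.closedChan()
	c.close()
	<-done
	fmt.Println("connection closed")
}
```
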
57207: sql: remove pointless Clock update closure allocation r=jordanlewis a=jordanlewis

Previously, we allocated a pointless closure on every query to update the
hlc.Clock on receipt of new observed timestamps. This is unnecessary
because we can pass pre-allocated objects that update themselves
(most often the *hlc.Clock itself).

```
name                                          old time/op    new time/op    delta
FlowSetup/vectorize=true/distribute=false-24     132µs ± 2%     132µs ± 1%    ~     (p=0.661 n=10+9)

name                                          old alloc/op   new alloc/op   delta
FlowSetup/vectorize=true/distribute=false-24    28.3kB ± 1%    28.2kB ± 1%    ~     (p=0.605 n=9+9)

name                                          old allocs/op  new allocs/op  delta
FlowSetup/vectorize=true/distribute=false-24       232 ± 0%       230 ± 0%  -0.86%  (p=0.000 n=8+8)
```

Release note: None
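
The allocation being removed is the usual closure-versus-preallocated-object pattern; a generic sketch with hypothetical names:

```
package main

import "fmt"

// timestampObserver is implemented by a pre-allocated object (in the real
// change, effectively the clock itself) instead of wrapping the clock in a
// fresh closure for every query.
type timestampObserver interface {
	Observe(ts int64)
}

type clock struct{ latest int64 }

func (c *clock) Observe(ts int64) {
	if ts > c.latest {
		c.latest = ts
	}
}

// Previously the caller would allocate a fresh closure per query just to
// forward the timestamp to the clock; passing a pre-allocated object that
// implements the interface avoids that per-query allocation.
func process(obs timestampObserver) {
	obs.Observe(42)
}

func main() {
	clk := &clock{}
	process(clk)
	fmt.Println("latest observed timestamp:", clk.latest)
}
```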

57271: lease: add logging for questionable descriptors r=jordanlewis a=jordanlewis

This is meant to be temporary until we discover the cause of cockroachdb#55290.

Release note: None

Co-authored-by: Jordan Lewis <jordanthelewis@gmail.com>
@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch from 490089f to cede00d Compare December 1, 2020 04:31
jordanlewis and others added 12 commits December 1, 2020 00:55
This commit changes the VectorizedFlow to store its type resolver as a
struct directly, rather than a pointer. This prevents a pointless
allocation per flow.

Release note: None
Previously, when initializing a processor, we'd need to allocate a fresh
SemaCtx for every processor. Instead of doing this, we add a SemaCtx
value inside of ProcessorBase to share the ProcessorBase-allocated
memory.

Release note: None
Previously, resolving a table would always need to heap allocate the
TableName that is used to look up the names. This was unnecessary.

Release note: None
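
The general technique is to reuse a value held in an existing struct instead of heap-allocating a fresh one per lookup; a generic sketch (hypothetical types, not the actual resolution code):

```
package main

import "fmt"

type tableName struct {
	catalog, schema, object string
}

// resolver reuses a scratch tableName embedded in an existing struct rather
// than heap-allocating a new one per lookup (purely illustrative).
type resolver struct {
	scratch tableName
}

func (r *resolver) resolve(catalog, schema, object string) string {
	r.scratch = tableName{catalog: catalog, schema: schema, object: object}
	return fmt.Sprintf("%s.%s.%s", r.scratch.catalog, r.scratch.schema, r.scratch.object)
}

func main() {
	var r resolver
	fmt.Println(r.resolve("defaultdb", "public", "users"))
}
```
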
56587: importccl: Added row limit option for importing bundle formats r=adityamaru a=mokaixu

A row limit option allows users to specify a fixed number of rows
that they want to import. Previously, this functionality was only available
for CSV/DELIMITED/AVRO formats.

With this code change, users can limit the number of rows they want
to import for PGDUMP and MYSQLDUMP bundle data formats. That is, they can
specify: `WITH row_limit="{$num}"` to only import $num rows.

Release note (sql change): Added WITH row_limit="{$num}" option for importing
bundle formats to allow users to do a quick test run on an import of $num rows.
Ex. IMPORT ... WITH row_limit="3";

Resolves: cockroachdb#26493

Co-authored-by: Monica Xu <monicax@cockroachlabs.com>
This commit adds a simple ContentionEvent protobuf. This protobuf is not yet
emitted by KV although the idea is that it will be. In the meantime, the SQL
Execution engine can use this message to generate mock contention events in
order to build higher-level observability infrastructure around contention.

Release note: None
This commit also implements mock contention event generation when either a
testing knob or cluster setting is set. The former is useful for programmable
tests, while the latter will be useful for observing changes to the DB console
(e.g. global contention view).

Release note: None (new behavior is only enabled in tests)
56906: sql: produce mock ContentionEvents and display contention time in EXPLAIN ANALYZE r=RaduBerinde,tbg a=asubiotto

Please take a look at individual commits for details

Release note: None

Closes cockroachdb#56612 

@tbg could you take a look at the first commit which defines a `ContentionEvent` protobuf? It's close to what's described in cockroachdb#55583 (minus some fields that can be added later).

Co-authored-by: Alfonso Subiotto Marques <alfonso@cockroachlabs.com>
50869: ui: Correct responses for restricted access to non-admin users r=koorosh a=koorosh

Resolves cockroachdb#48152, cockroachdb#50360

Depends on cockroachdb/admin-ui-components#37
Depends on cockroachdb/yarn-vendored#44

Previously, request handlers for node locations and event logs didn't
check whether the user had the admin role, and proceeded to query
data from tables that only admin users have permission to read. The
resulting error was then handled as a server error.

This change validates that the user has the admin role before requesting
data from those tables and responds with a "Permission denied" error when
the user doesn't have the admin role.

UI changes:
- Add InlineAlert component to display redesigned messages;

**Screens**
Single info message (Restricted permissions)
<img width="1174" alt="Screenshot 2020-07-02 at 18 34 03" src="https://user-images.githubusercontent.com/3106437/86460281-c3d7c880-bd30-11ea-9ef4-006e436c26e8.png">

Single error message
<img width="1154" alt="Screenshot 2020-07-02 at 18 41 54" src="https://user-images.githubusercontent.com/3106437/86460294-cd613080-bd30-11ea-9e81-9009f4d47b5e.png">

Multiple error messages
<img width="1259" alt="Screenshot 2020-07-02 at 18 24 34" src="https://user-images.githubusercontent.com/3106437/86460323-d9e58900-bd30-11ea-81bf-6ff47fdadba1.png">



Co-authored-by: Andrii Vorobiov <and.vorobiov@gmail.com>
57254: cloud: Bump orchestration version to 20.2.2 r=nathanstilwell a=nathanstilwell

Release note: None

Co-authored-by: Nathan Stilwell <nathanstilwell@cockroachlabs.com>
57197: sql: enum introspection improvements r=otan a=arulajmani

See individual commit messages for details. 

Co-authored-by: arulajmani <arulajmani@gmail.com>
@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch from cede00d to c18c23e Compare December 1, 2020 15:36
56968: colexec: retune hash table constants in all of its uses r=yuzefovich a=yuzefovich

**colexec: adjust benchmarks to include cases with small number of tuples**

This commit adjusts several benchmarks in order to run them against
inputs with a small number of tuples. The reasoning for this change is
that we intend to remove the vectorize_row_count_threshold, so we now
should be paying attention to both small and large input sets.

Additionally, this commit removes the "with-nulls" cases from the
aggregation benchmarks because the speed of "no-nulls" and "with-nulls"
are roughly the same, so having both doesn't provide us more useful
info, yet slows down the benchmark by a factor of 2.

Release note: None

**colexec: retune hash table constants in all of its uses**

This commit changes the load factor and the initial number of buckets of
the hash table according to the recent micro-benchmarking and running of
TPCH queries. The numbers needed an update since we made the hash table
as well as the operators that use it more dynamic. The benchmarks show
an improvement on the inputs with small sizes, sometimes a regression on
inputs of roughly `coldata.BatchSize()` in size, and mostly improvements
on larger inputs. This trade-off seems beneficial to me.

Release note: None

57179: cli: Warn users if --locality doesn't include "region" r=otan a=ajstorm

With the new multi-region abstractions coming in 21.1, users will need
to add regions to databases (and optionally, tables).  The regions
defined at the cluster level are derived from the locality flags, but we
search explicitly for "region".  As a result, if the user wishes to
leverage the new abstractions, they must provide a "region" tier in the
locality flag when starting their nodes.

To push users in that direction, we warn on `cockroach start` if the
supplied locality flag doesn't include a region tier.

Release note (cli change): Warn users if locality flag doesn't contain a
"region" tier.

57201: sql: remove a bunch of allocations from name and type resolution paths r=jordanlewis a=jordanlewis

See individual commits. This PR shaves 3% of the total number of allocations in the FlowSetup benchmark via a variety of methods.

```
name                                          old time/op    new time/op    delta
FlowSetup/vectorize=true/distribute=false-24     144µs ± 1%     142µs ± 2%  -1.01%  (p=0.023 n=10+10)

name                                          old alloc/op   new alloc/op   delta
FlowSetup/vectorize=true/distribute=false-24    28.3kB ± 1%    27.8kB ± 1%  -1.76%  (p=0.000 n=9+9)

name                                          old allocs/op  new allocs/op  delta
FlowSetup/vectorize=true/distribute=false-24       232 ± 0%       224 ± 0%  -3.45%  (p=0.000 n=8+8)
```

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Adam Storm <storm@cockroachlabs.com>
Co-authored-by: Jordan Lewis <jordanthelewis@gmail.com>
@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch from c18c23e to d1e2d19 Compare December 1, 2020 16:48
craig[bot] and others added 2 commits December 1, 2020 16:57
57266: colexec: remove templating nulls vs no-nulls case in merge joiner r=yuzefovich a=yuzefovich

This commit removes the templating of "with-nulls" and "without-nulls"
cases of the merge joiner because the benchmarks have shown that
skipping nulls check in the latter case doesn't offer much performance
improvement, but having that templating increases the amount of
generated code by a factor of 2 or more.

Additionally, this commit adjusts the template slightly to omit some
empty lines.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
This RFC puts forth a proposal for making schema changes in transactions behave
more like PostgreSQL and meet the expectations of clients and ORMs.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/transaction-schema-changes branch from d1e2d19 to 93dbe44 Compare December 1, 2020 17:31
ajwerner pushed a commit that referenced this pull request Apr 29, 2022
…db#80762 cockroachdb#80773

79911: opt: refactor and test lookup join key column and expr generation r=mgartner a=mgartner

#### opt: simplify fetching outer column in CustomFuncs.findComputedColJoinEquality

Previously, `CustomFuncs.findComputedColJoinEquality` used
`CustomFuncs.OuterCols` to retrieve the outer columns of computed column
expressions. `CustomFuncs.OuterCols` returns the cached outer columns in
the expression if it is a `memo.ScalarPropsExpr`, and falls back to
calculating the outer columns with `memo.BuildSharedProps` otherwise.
Computed column expressions are never `memo.ScalarPropsExpr`s, so we
just use `memo.BuildSharedProps` directly.

Release note: None

#### opt: make RemapCols a method on Factory instead of CustomFuncs

Release note: None

#### opt: use partial-index-reduced filters when building lookup expressions

This commit makes a minor change to `generateLookupJoinsImpl`.
Previously, equality filters were extracted from the original `ON`
filters. Now they are extracted from filters that have been reduced by
partial index implication. This has no effect on behavior because
equality filters that reference columns in two tables cannot exist in
partial index predicates, so they will never be eliminated during
partial index implication.

Release note: None

#### opt: moves some lookup join generation logic to lookup join package

This commit adds a new `lookupjoin` package. Logic for determining the
key columns and lookup expressions for lookup joins has been moved to
`lookupjoin.ConstraintBuilder`. The code was moved with as few changes
as possible, and the behavior does not change in any way. This move will
make it easier to test this code in isolation in the future, and allow
for further refactoring.

Release note: None

#### opt: generalize lookupjoin.ConstraintBuilder API

This commit makes the lookupjoin.ConstraintBuilder API more general to
make unit testing easier in a future commit.

Release note: None

#### opt: add data-driven tests for lookupjoin.ConstraintBuilder

Release note: None

#### opt: add lookupjoin.Constraint struct

The `lookupjoin.Constraint` struct has been added to encapsulate
multiple data structures that represent a strategy for constraining a
lookup join.

Release note: None

80511: pkg/cloud/azure: Support specifying Azure environments in storage URLs r=adityamaru a=nlowe-sx

The Azure Storage cloud provider learned a new parameter, AZURE_ENVIRONMENT,
which specifies which azure environment the storage account in question
belongs to. This allows cockroach to back up and restore data to Azure
Storage Accounts outside the main Azure Public Cloud. For backwards
compatibility, this defaults to "AzurePublicCloud" if AZURE_ENVIRONMENT
is not specified.
 
Fixes cockroachdb#47163
 
## Verification Evidence
 
I spun up a single node cluster:
 
```
nlowe@nlowe-z4l:~/projects/github/cockroachdb/cockroach [feat/47163-azure-storage-support-multiple-environments L|✚ 2] [🗓  2022-04-22 08:25:49]
$ bazel run //pkg/cmd/cockroach:cockroach -- start-single-node --insecure
WARNING: Option 'host_javabase' is deprecated
WARNING: Option 'javabase' is deprecated
WARNING: Option 'host_java_toolchain' is deprecated
WARNING: Option 'java_toolchain' is deprecated
INFO: Invocation ID: 11504a98-f767-413a-8994-8f92793c2ecf
INFO: Analyzed target //pkg/cmd/cockroach:cockroach (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //pkg/cmd/cockroach:cockroach up-to-date:
  _bazel/bin/pkg/cmd/cockroach/cockroach_/cockroach
INFO: Elapsed time: 0.358s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
*
* WARNING: ALL SECURITY CONTROLS HAVE BEEN DISABLED!
*
* This mode is intended for non-production testing only.
*
* In this mode:
* - Your cluster is open to any client that can access any of your IP addresses.
* - Intruders with access to your machine or network can observe client-server traffic.
* - Intruders can log in without password and read or write any data in the cluster.
* - Intruders can consume all your server's resources and cause unavailability.
*
*
* INFO: To start a secure server without mandating TLS for clients,
* consider --accept-sql-without-tls instead. For other options, see:
*
* - https://go.crdb.dev/issue-v/53404/dev
* - https://www.cockroachlabs.com/docs/dev/secure-a-cluster.html
*
*
* WARNING: neither --listen-addr nor --advertise-addr was specified.
* The server will advertise "nlowe-z4l" to other nodes, is this routable?
*
* Consider using:
* - for local-only servers:  --listen-addr=localhost
* - for multi-node clusters: --advertise-addr=<host/IP addr>
*
*
CockroachDB node starting at 2022-04-22 15:25:55.461315977 +0000 UTC (took 2.1s)
build:               CCL unknown @  (go1.17.6)
webui:               http://nlowe-z4l:8080/
sql:                 postgresql://root@nlowe-z4l:26257/defaultdb?sslmode=disable
sql (JDBC):          jdbc:postgresql://nlowe-z4l:26257/defaultdb?sslmode=disable&user=root
RPC client flags:    /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach <client cmd> --host=nlowe-z4l:26257 --insecure
logs:                /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/logs
temp dir:            /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/cockroach-temp4100501952
external I/O path:   /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/extern
store[0]:            path=/home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data
storage engine:      pebble
clusterID:           bb3942d7-f241-4d26-aa4a-1bd0d6556e4d
status:              initialized new cluster
nodeID:              1
```
 
I was then able to view the contents of a backup hosted in an azure
government storage account:
 
```
root@:26257/defaultdb> SELECT DISTINCT object_name FROM [SHOW BACKUP 'azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account&AZURE_ACCOUNT_KEY=***&AZURE_ENVIRONMENT=AzureUSGovernmentCloud'] WHERE object_type = 'database';
               object_name
------------------------------------------
  example_database
  ...
(17 rows)
 
Time: 5.859632889s
```
 
Omitting the `AZURE_ENVIRONMENT` parameter, we can see cockroach
defaults to the public cloud where my storage account does not exist:
 
```
root@:26257/defaultdb> SELECT DISTINCT object_name FROM [SHOW BACKUP 'azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account&AZURE_ACCOUNT_KEY=***'] WHERE object_type = 'database';
ERROR: reading previous backup layers: unable to list files for specified blob: Get "https://account.blob.core.windows.net/container?comp=list&delimiter=path%2Fto%2Fbackup&restype=container&timeout=61": dial tcp: lookup account.blob.core.windows.net on 8.8.8.8:53: no such host
```
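 
For illustration, here is a minimal sketch, in the spirit of this change but
not CockroachDB's actual `pkg/cloud/azure` code, of how an environment name
can be resolved to an account's blob endpoint. It assumes the go-autorest
SDK's `azure.EnvironmentFromName` helper; the `azureenv` package and the
`MakeStorageURL` function are hypothetical names:
 
```go
// Package azureenv is a hypothetical package name used only for this sketch.
package azureenv

import (
	"fmt"

	"github.com/Azure/go-autorest/autorest/azure"
)

// MakeStorageURL is a hypothetical helper (not CockroachDB's API): it
// resolves an Azure environment name such as "AzureUSGovernmentCloud" and
// builds the account's blob-storage endpoint from that environment's
// storage suffix.
func MakeStorageURL(account, envName string) (string, error) {
	env, err := azure.EnvironmentFromName(envName)
	if err != nil {
		return "", err
	}
	// StorageEndpointSuffix is "core.windows.net" for the public cloud and
	// "core.usgovcloudapi.net" for Azure US Government.
	return fmt.Sprintf("https://%s.blob.%s", account, env.StorageEndpointSuffix), nil
}
```
 
With this shape, `MakeStorageURL("account", "AzureUSGovernmentCloud")` yields
`https://account.blob.core.usgovcloudapi.net`, while the public-cloud default
produces the `core.windows.net` endpoint seen in the error above.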
 
## Tests
 
Two new tests were added to verify that the storage account URL is correctly
built from the provided Azure environment name, and that the environment
defaults to the public cloud when unspecified, for backwards compatibility (a
rough sketch of the shape of these tests follows the log below). I also
verified that the existing tests pass against a government storage account
after specifying `AZURE_ENVIRONMENT` as `AzureUSGovernmentCloud` in the backup
URL query parameters:
 
```
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:38:26]
$ export AZURE_ACCOUNT_NAME=account
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:38:42]
$ export AZURE_ACCOUNT_KEY=***
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:39:25]
$ export AZURE_CONTAINER=container
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:39:48]
$ export AZURE_ENVIRONMENT=AzureUSGovernmentCloud
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:40:15]
$ bazel test --test_output=streamed --test_arg=-test.v --action_env=AZURE_ACCOUNT_NAME --action_env=AZURE_ACCOUNT_KEY --action_env=AZURE_CONTAINER --action_env=AZURE_ENVIRONMENT //pkg/cloud/azure:azure_test
INFO: Invocation ID: aa88a942-f3c7-4df6-bade-8f5f0e18041f
WARNING: Streamed test output requested. All tests will be run locally, without sharding, one at a time
INFO: Build option --action_env has changed, discarding analysis cache.
INFO: Analyzed target //pkg/cloud/azure:azure_test (468 packages loaded, 16382 targets configured).
INFO: Found 1 test target...
initialized metamorphic constant "span-reuse-rate" with value 28
=== RUN   TestAzure
=== RUN   TestAzure/simple_round_trip
=== RUN   TestAzure/exceeds-4mb-chunk
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#00
    cloud_test_helpers.go:226: read 3345 of file at 4778744
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#1
    cloud_test_helpers.go:226: read 7228 of file at 226589
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#2
    cloud_test_helpers.go:226: read 634 of file at 256284
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#3
    cloud_test_helpers.go:226: read 7546 of file at 3546208
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#4
    cloud_test_helpers.go:226: read 24123 of file at 4821795
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#5
    cloud_test_helpers.go:226: read 16899 of file at 403428
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#6
    cloud_test_helpers.go:226: read 29467 of file at 4886370
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#7
    cloud_test_helpers.go:226: read 11700 of file at 1876920
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#8
    cloud_test_helpers.go:226: read 2928 of file at 489781
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/cockroachdb#9
    cloud_test_helpers.go:226: read 19933 of file at 1483342
=== RUN   TestAzure/read-single-file-by-uri
=== RUN   TestAzure/write-single-file-by-uri
=== RUN   TestAzure/file-does-not-exist
=== RUN   TestAzure/List
=== RUN   TestAzure/List/root
=== RUN   TestAzure/List/file-slash-numbers-slash
=== RUN   TestAzure/List/root-slash
=== RUN   TestAzure/List/file
=== RUN   TestAzure/List/file-slash
=== RUN   TestAzure/List/slash-f
=== RUN   TestAzure/List/nothing
=== RUN   TestAzure/List/delim-slash-file-slash
=== RUN   TestAzure/List/delim-data
--- PASS: TestAzure (34.81s)
    --- PASS: TestAzure/simple_round_trip (9.66s)
    --- PASS: TestAzure/exceeds-4mb-chunk (16.45s)
        --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats (6.41s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#00 (0.15s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#1 (0.64s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#2 (0.65s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#3 (0.60s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#4 (0.75s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#5 (0.80s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#6 (0.75s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#7 (0.65s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#8 (0.65s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/cockroachdb#9 (0.77s)
    --- PASS: TestAzure/read-single-file-by-uri (0.60s)
    --- PASS: TestAzure/write-single-file-by-uri (0.60s)
    --- PASS: TestAzure/file-does-not-exist (1.05s)
    --- PASS: TestAzure/List (2.40s)
        --- PASS: TestAzure/List/root (0.30s)
        --- PASS: TestAzure/List/file-slash-numbers-slash (0.30s)
        --- PASS: TestAzure/List/root-slash (0.30s)
        --- PASS: TestAzure/List/file (0.30s)
        --- PASS: TestAzure/List/file-slash (0.30s)
        --- PASS: TestAzure/List/slash-f (0.30s)
        --- PASS: TestAzure/List/nothing (0.15s)
        --- PASS: TestAzure/List/delim-slash-file-slash (0.15s)
        --- PASS: TestAzure/List/delim-data (0.30s)
=== RUN   TestAntagonisticAzureRead
--- PASS: TestAntagonisticAzureRead (103.90s)
=== RUN   TestParseAzureURL
=== RUN   TestParseAzureURL/Defaults_to_Public_Cloud_when_AZURE_ENVIRONEMNT_unset
=== RUN   TestParseAzureURL/Can_Override_AZURE_ENVIRONMENT
--- PASS: TestParseAzureURL (0.00s)
    --- PASS: TestParseAzureURL/Defaults_to_Public_Cloud_when_AZURE_ENVIRONEMNT_unset (0.00s)
    --- PASS: TestParseAzureURL/Can_Override_AZURE_ENVIRONMENT (0.00s)
=== RUN   TestMakeAzureStorageURLFromEnvironment
=== RUN   TestMakeAzureStorageURLFromEnvironment/AzurePublicCloud
=== RUN   TestMakeAzureStorageURLFromEnvironment/AzureUSGovernmentCloud
--- PASS: TestMakeAzureStorageURLFromEnvironment (0.00s)
    --- PASS: TestMakeAzureStorageURLFromEnvironment/AzurePublicCloud (0.00s)
    --- PASS: TestMakeAzureStorageURLFromEnvironment/AzureUSGovernmentCloud (0.00s)
PASS
Target //pkg/cloud/azure:azure_test up-to-date:
  _bazel/bin/pkg/cloud/azure/azure_test_/azure_test
INFO: Elapsed time: 159.865s, Critical Path: 152.35s
INFO: 66 processes: 2 internal, 64 darwin-sandbox.
INFO: Build completed successfully, 66 total actions
//pkg/cloud/azure:azure_test                                             PASSED in 139.9s
 
INFO: Build completed successfully, 66 total actions
```
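 
For reference, a rough sketch of the shape of the new tests, assuming the
hypothetical `MakeStorageURL` helper and `azureenv` package from the earlier
sketch; this is not the actual `pkg/cloud/azure` test code:
 
```go
package azureenv

import "testing"

// TestMakeStorageURL checks the government-cloud suffix and the
// public-cloud endpoint used as the default when AZURE_ENVIRONMENT is
// not supplied.
func TestMakeStorageURL(t *testing.T) {
	cases := []struct {
		env, want string
	}{
		{"AzurePublicCloud", "https://account.blob.core.windows.net"},
		{"AzureUSGovernmentCloud", "https://account.blob.core.usgovcloudapi.net"},
	}
	for _, c := range cases {
		got, err := MakeStorageURL("account", c.env)
		if err != nil {
			t.Fatalf("%s: %v", c.env, err)
		}
		if got != c.want {
			t.Errorf("%s: got %q, want %q", c.env, got, c.want)
		}
	}
}
```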

80705: kvclient: fix gRPC stream leak in rangefeed client r=tbg,srosenberg a=erikgrinaker

When the DistSender rangefeed client received a `RangeFeedError` message
and propagated a retryable error up the stack, it would fail to close
the existing gRPC stream, causing stream/goroutine leaks.
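 
As an illustration only (hypothetical names, not the actual DistSender code),
the pattern the fix restores is to tie the stream to its own context and
cancel that context on every return path:
 
```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// eventStream stands in for a generated gRPC client stream; the real
// rangefeed client uses a RangeFeed stream type.
type eventStream interface {
	Recv() (string, error)
}

// fakeStream emits a single "RangeFeedError" event unless its context has
// already been cancelled.
type fakeStream struct{ ctx context.Context }

func (s *fakeStream) Recv() (string, error) {
	select {
	case <-s.ctx.Done():
		return "", s.ctx.Err()
	default:
		return "RangeFeedError", nil
	}
}

var errRetry = errors.New("retryable rangefeed error")

// runRangefeed ties the stream to its own context and cancels it on every
// return path, so propagating a retryable error cannot leak the stream or
// its receive goroutine.
func runRangefeed(ctx context.Context) error {
	streamCtx, cancel := context.WithCancel(ctx)
	defer cancel() // without this, the early return below would leak the stream

	var stream eventStream = &fakeStream{ctx: streamCtx}
	for {
		ev, err := stream.Recv()
		if err != nil {
			return err
		}
		if ev == "RangeFeedError" {
			return errRetry // retryable: the caller re-establishes the rangefeed
		}
	}
}

func main() {
	fmt.Println(runRangefeed(context.Background()))
}
```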

Release note (bug fix): Fixed a goroutine leak when internal rangefeed
clients received certain kinds of retriable errors.

80762: joberror: add ConnectionReset/ConnectionRefused to retryable err allow list r=miretskiy a=adityamaru

Bulk jobs will no longer treat errors matching `sysutil.IsErrConnectionReset`
and `sysutil.IsErrConnectionRefused` as permanent. IMPORT, RESTORE, and
BACKUP will treat these errors as transient and retry.
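 
A minimal sketch of that kind of classification, using only the standard
library (`isRetryableConnErr` is a hypothetical name, not the `joberror` API):
 
```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// isRetryableConnErr treats connection-reset and connection-refused errors
// as transient rather than permanent, which is the behavior this change
// gives bulk jobs.
func isRetryableConnErr(err error) bool {
	return errors.Is(err, syscall.ECONNRESET) || errors.Is(err, syscall.ECONNREFUSED)
}

func main() {
	err := fmt.Errorf("exporting ranges: %w", syscall.ECONNRESET)
	fmt.Println(isRetryableConnErr(err)) // true
}
```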

Release note: None

80773: backupccl: break dependency to testcluster r=irfansharif a=irfansharif

Noticed we were building testing library packages when building CRDB
binaries.

    $ bazel query "somepath(//pkg/cmd/cockroach-short, //pkg/testutils/testcluster)"
    //pkg/cmd/cockroach-short:cockroach-short
    //pkg/cmd/cockroach-short:cockroach-short_lib
    //pkg/ccl:ccl
    //pkg/ccl/backupccl:backupccl
    //pkg/testutils/testcluster:testcluster

Release note: None

Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Nathan Lowe <nathan.lowe@spacex.com>
Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
Co-authored-by: Aditya Maru <adityamaru@gmail.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
@ajwerner ajwerner merged commit d6c9563 into master Nov 7, 2022