Commit
79911: opt: refactor and test lookup join key column and expr generation r=mgartner a=mgartner

#### opt: simplify fetching outer column in CustomFuncs.findComputedColJoinEquality

Previously, `CustomFuncs.findComputedColJoinEquality` used `CustomFuncs.OuterCols` to retrieve the outer columns of computed column expressions. `CustomFuncs.OuterCols` returns the cached outer columns in the expression if it is a `memo.ScalarPropsExpr`, and falls back to calculating the outer columns with `memo.BuildSharedProps` otherwise. Computed column expressions are never `memo.ScalarPropsExpr`s, so we just use `memo.BuildSharedProps` directly.

Release note: None

#### opt: make RemapCols a method on Factory instead of CustomFuncs

Release note: None

#### opt: use partial-index-reduced filters when building lookup expressions

This commit makes a minor change to `generateLookupJoinsImpl`. Previously, equality filters were extracted from the original `ON` filters. Now they are extracted from filters that have been reduced by partial index implication. This has no effect on behavior because equality filters that reference columns in two tables cannot exist in partial index predicates, so they will never be eliminated during partial index implication.

Release note: None

#### opt: moves some lookup join generation logic to lookup join package

This commit adds a new `lookupjoin` package. Logic for determining the key columns and lookup expressions for lookup joins has been moved to `lookupjoin.ConstraintBuilder`. The code was moved with as few changes as possible, and the behavior does not change in any way. This move will make it easier to test this code in isolation in the future, and allow for further refactoring.

Release note: None

#### opt: generalize lookupjoin.ConstraintBuilder API

This commit makes the `lookupjoin.ConstraintBuilder` API more general to make unit testing easier in a future commit.

Release note: None

#### opt: add data-driven tests for lookupjoin.ConstraintBuilder

Release note: None

#### opt: add lookupjoin.Constraint struct

The `lookupjoin.Constraint` struct has been added to encapsulate multiple data structures that represent a strategy for constraining a lookup join.

Release note: None

80511: pkg/cloud/azure: Support specifying Azure environments in storage URLs r=adityamaru a=nlowe-sx

The Azure Storage cloud provider learned a new parameter, `AZURE_ENVIRONMENT`, which specifies which Azure environment the storage account in question belongs to. This allows cockroach to back up and restore data to Azure Storage Accounts outside the main Azure Public Cloud. For backwards compatibility, this defaults to "AzurePublicCloud" if `AZURE_ENVIRONMENT` is not specified.

Fixes #47163

## Verification Evidence

I spun up a single node cluster:

```
nlowe@nlowe-z4l:~/projects/github/cockroachdb/cockroach [feat/47163-azure-storage-support-multiple-environments L|✚ 2] [🗓 2022-04-22 08:25:49]
$ bazel run //pkg/cmd/cockroach:cockroach -- start-single-node --insecure
WARNING: Option 'host_javabase' is deprecated
WARNING: Option 'javabase' is deprecated
WARNING: Option 'host_java_toolchain' is deprecated
WARNING: Option 'java_toolchain' is deprecated
INFO: Invocation ID: 11504a98-f767-413a-8994-8f92793c2ecf
INFO: Analyzed target //pkg/cmd/cockroach:cockroach (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //pkg/cmd/cockroach:cockroach up-to-date:
_bazel/bin/pkg/cmd/cockroach/cockroach_/cockroach
INFO: Elapsed time: 0.358s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
*
* WARNING: ALL SECURITY CONTROLS HAVE BEEN DISABLED!
*
* This mode is intended for non-production testing only.
*
* In this mode:
* - Your cluster is open to any client that can access any of your IP addresses.
* - Intruders with access to your machine or network can observe client-server traffic.
* - Intruders can log in without password and read or write any data in the cluster.
* - Intruders can consume all your server's resources and cause unavailability.
*
*
* INFO: To start a secure server without mandating TLS for clients,
* consider --accept-sql-without-tls instead. For other options, see:
*
* - https://go.crdb.dev/issue-v/53404/dev
* - https://www.cockroachlabs.com/docs/dev/secure-a-cluster.html
*
*
* WARNING: neither --listen-addr nor --advertise-addr was specified.
* The server will advertise "nlowe-z4l" to other nodes, is this routable?
*
* Consider using:
* - for local-only servers: --listen-addr=localhost
* - for multi-node clusters: --advertise-addr=<host/IP addr>
*
*
CockroachDB node starting at 2022-04-22 15:25:55.461315977 +0000 UTC (took 2.1s)
build: CCL unknown @ (go1.17.6)
webui: http://nlowe-z4l:8080/
sql: postgresql://root@nlowe-z4l:26257/defaultdb?sslmode=disable
sql (JDBC): jdbc:postgresql://nlowe-z4l:26257/defaultdb?sslmode=disable&user=root
RPC client flags: /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach <client cmd> --host=nlowe-z4l:26257 --insecure
logs: /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/logs
temp dir: /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/cockroach-temp4100501952
external I/O path: /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/extern
store[0]: path=/home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data
storage engine: pebble
clusterID: bb3942d7-f241-4d26-aa4a-1bd0d6556e4d
status: initialized new cluster
nodeID: 1
```

I was then able to view the contents of a backup hosted in an Azure Government storage account:

```
root@:26257/defaultdb> SELECT DISTINCT object_name FROM [SHOW BACKUP 'azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account&AZURE_ACCOUNT_KEY=***&AZURE_ENVIRONMENT=AzureUSGovernmentCloud'] WHERE object_type = 'database';
object_name
------------------------------------------
example_database
...
(17 rows)

Time: 5.859632889s
```

Omitting the `AZURE_ENVIRONMENT` parameter, we can see cockroach defaults to the public cloud, where my storage account does not exist:

```
root@:26257/defaultdb> SELECT DISTINCT object_name FROM [SHOW BACKUP 'azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account&AZURE_ACCOUNT_KEY=***'] WHERE object_type = 'database';
ERROR: reading previous backup layers: unable to list files for specified blob: Get "https://account.blob.core.windows.net/container?comp=list&delimiter=path%2Fto%2Fbackup&restype=container&timeout=61": dial tcp: lookup account.blob.core.windows.net on 8.8.8.8:53: no such host
```

## Tests

Two new tests are added to verify that the storage account URL is correctly built from the provided Azure Environment name, and that the Environment defaults to the Public Cloud if unspecified for backwards compatibility.

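The behavior those two tests cover can be sketched roughly as follows. This is a minimal illustration, not the code in `pkg/cloud/azure`: the `storageSuffixes` table and the `blobEndpoint` helper are hypothetical names introduced here, and the sketch assumes the environment name simply selects a storage endpoint suffix (`core.windows.net` for the public cloud, `core.usgovcloudapi.net` for the US Government cloud).

```go
package main

import (
	"fmt"
	"net/url"
)

// storageSuffixes is a hypothetical lookup table from Azure environment name
// to storage endpoint suffix. Only two environments are listed here.
var storageSuffixes = map[string]string{
	"AzurePublicCloud":       "core.windows.net",
	"AzureUSGovernmentCloud": "core.usgovcloudapi.net",
}

// defaultEnvironment is used when AZURE_ENVIRONMENT is absent, which keeps
// URLs written before the parameter existed working unchanged.
const defaultEnvironment = "AzurePublicCloud"

// blobEndpoint is a hypothetical helper that derives the blob service
// endpoint from the query parameters of an azure:// storage URL.
func blobEndpoint(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	account := q.Get("AZURE_ACCOUNT_NAME")
	if account == "" {
		return "", fmt.Errorf("AZURE_ACCOUNT_NAME is required")
	}
	env := q.Get("AZURE_ENVIRONMENT")
	if env == "" {
		env = defaultEnvironment
	}
	suffix, ok := storageSuffixes[env]
	if !ok {
		return "", fmt.Errorf("unknown Azure environment %q", env)
	}
	return fmt.Sprintf("https://%s.blob.%s", account, suffix), nil
}

func main() {
	for _, raw := range []string{
		"azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account",
		"azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account&AZURE_ENVIRONMENT=AzureUSGovernmentCloud",
	} {
		endpoint, err := blobEndpoint(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(endpoint)
	}
}
```

With no `AZURE_ENVIRONMENT`, the sketch resolves to the same `account.blob.core.windows.net` host that appears in the error message above, which is why the government-cloud account could not be found there.
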
I verified the existing tests pass against a government storage account after specifying `AZURE_ENVIRONMENT` as `AzureUSGovernmentCloud` in the backup URL query parameters:

```
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓 2022-04-22 17:38:26]
$ export AZURE_ACCOUNT_NAME=account
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓 2022-04-22 17:38:42]
$ export AZURE_ACCOUNT_KEY=***
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓 2022-04-22 17:39:25]
$ export AZURE_CONTAINER=container
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓 2022-04-22 17:39:48]
$ export AZURE_ENVIRONMENT=AzureUSGovernmentCloud
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓 2022-04-22 17:40:15]
$ bazel test --test_output=streamed --test_arg=-test.v --action_env=AZURE_ACCOUNT_NAME --action_env=AZURE_ACCOUNT_KEY --action_env=AZURE_CONTAINER --action_env=AZURE_ENVIRONMENT //pkg/cloud/azure:azure_test
INFO: Invocation ID: aa88a942-f3c7-4df6-bade-8f5f0e18041f
WARNING: Streamed test output requested. All tests will be run locally, without sharding, one at a time
INFO: Build option --action_env has changed, discarding analysis cache.
INFO: Analyzed target //pkg/cloud/azure:azure_test (468 packages loaded, 16382 targets configured).
INFO: Found 1 test target...
initialized metamorphic constant "span-reuse-rate" with value 28
=== RUN TestAzure
=== RUN TestAzure/simple_round_trip
=== RUN TestAzure/exceeds-4mb-chunk
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#00
    cloud_test_helpers.go:226: read 3345 of file at 4778744
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#1
    cloud_test_helpers.go:226: read 7228 of file at 226589
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#2
    cloud_test_helpers.go:226: read 634 of file at 256284
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#3
    cloud_test_helpers.go:226: read 7546 of file at 3546208
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#4
    cloud_test_helpers.go:226: read 24123 of file at 4821795
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#5
    cloud_test_helpers.go:226: read 16899 of file at 403428
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#6
    cloud_test_helpers.go:226: read 29467 of file at 4886370
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#7
    cloud_test_helpers.go:226: read 11700 of file at 1876920
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#8
    cloud_test_helpers.go:226: read 2928 of file at 489781
=== RUN TestAzure/exceeds-4mb-chunk/rand-readats/#9
    cloud_test_helpers.go:226: read 19933 of file at 1483342
=== RUN TestAzure/read-single-file-by-uri
=== RUN TestAzure/write-single-file-by-uri
=== RUN TestAzure/file-does-not-exist
=== RUN TestAzure/List
=== RUN TestAzure/List/root
=== RUN TestAzure/List/file-slash-numbers-slash
=== RUN TestAzure/List/root-slash
=== RUN TestAzure/List/file
=== RUN TestAzure/List/file-slash
=== RUN TestAzure/List/slash-f
=== RUN TestAzure/List/nothing
=== RUN TestAzure/List/delim-slash-file-slash
=== RUN TestAzure/List/delim-data
--- PASS: TestAzure (34.81s)
--- PASS: TestAzure/simple_round_trip (9.66s)
--- PASS: TestAzure/exceeds-4mb-chunk (16.45s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats (6.41s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#00 (0.15s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#1 (0.64s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#2 (0.65s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#3 (0.60s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#4 (0.75s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#5 (0.80s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#6 (0.75s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#7 (0.65s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#8 (0.65s)
--- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#9 (0.77s)
--- PASS: TestAzure/read-single-file-by-uri (0.60s)
--- PASS: TestAzure/write-single-file-by-uri (0.60s)
--- PASS: TestAzure/file-does-not-exist (1.05s)
--- PASS: TestAzure/List (2.40s)
--- PASS: TestAzure/List/root (0.30s)
--- PASS: TestAzure/List/file-slash-numbers-slash (0.30s)
--- PASS: TestAzure/List/root-slash (0.30s)
--- PASS: TestAzure/List/file (0.30s)
--- PASS: TestAzure/List/file-slash (0.30s)
--- PASS: TestAzure/List/slash-f (0.30s)
--- PASS: TestAzure/List/nothing (0.15s)
--- PASS: TestAzure/List/delim-slash-file-slash (0.15s)
--- PASS: TestAzure/List/delim-data (0.30s)
=== RUN TestAntagonisticAzureRead
--- PASS: TestAntagonisticAzureRead (103.90s)
=== RUN TestParseAzureURL
=== RUN TestParseAzureURL/Defaults_to_Public_Cloud_when_AZURE_ENVIRONEMNT_unset
=== RUN TestParseAzureURL/Can_Override_AZURE_ENVIRONMENT
--- PASS: TestParseAzureURL (0.00s)
--- PASS: TestParseAzureURL/Defaults_to_Public_Cloud_when_AZURE_ENVIRONEMNT_unset (0.00s)
--- PASS: TestParseAzureURL/Can_Override_AZURE_ENVIRONMENT (0.00s)
=== RUN TestMakeAzureStorageURLFromEnvironment
=== RUN TestMakeAzureStorageURLFromEnvironment/AzurePublicCloud
=== RUN TestMakeAzureStorageURLFromEnvironment/AzureUSGovernmentCloud
--- PASS: TestMakeAzureStorageURLFromEnvironment (0.00s)
--- PASS: TestMakeAzureStorageURLFromEnvironment/AzurePublicCloud (0.00s)
--- PASS: TestMakeAzureStorageURLFromEnvironment/AzureUSGovernmentCloud (0.00s)
PASS
Target //pkg/cloud/azure:azure_test up-to-date:
_bazel/bin/pkg/cloud/azure/azure_test_/azure_test
INFO: Elapsed time: 159.865s, Critical Path: 152.35s
INFO: 66 processes: 2 internal, 64 darwin-sandbox.
INFO: Build completed successfully, 66 total actions
//pkg/cloud/azure:azure_test PASSED in 139.9s
INFO: Build completed successfully, 66 total actions
```

80705: kvclient: fix gRPC stream leak in rangefeed client r=tbg,srosenberg a=erikgrinaker

When the DistSender rangefeed client received a `RangeFeedError` message and propagated a retryable error up the stack, it would fail to close the existing gRPC stream, causing stream/goroutine leaks.

Release note (bug fix): Fixed a goroutine leak when internal rangefeed clients received certain kinds of retriable errors.

80762: joberror: add ConnectionReset/ConnectionRefused to retryable err allow list r=miretskiy a=adityamaru

Bulk jobs will no longer treat `sysutil.IsErrConnectionReset` and `sysutil.IsErrConnectionRefused` as permanent errors. IMPORT, RESTORE and BACKUP will treat this error as transient and retry.

Release note: None

80773: backupccl: break dependency to testcluster r=irfansharif a=irfansharif

Noticed we were building testing library packages when building CRDB binaries.

    $ bazel query "somepath(//pkg/cmd/cockroach-short, //pkg/testutils/testcluster)"
    //pkg/cmd/cockroach-short:cockroach-short
    //pkg/cmd/cockroach-short:cockroach-short_lib
    //pkg/ccl:ccl
    //pkg/ccl/backupccl:backupccl
    //pkg/testutils/testcluster:testcluster

Release note: None

Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Nathan Lowe <nathan.lowe@spacex.com>
Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
Co-authored-by: Aditya Maru <adityamaru@gmail.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>