diff --git a/_includes/sidebar-data-v21.1.json b/_includes/sidebar-data-v21.1.json index 10c896d37e0..e1d55a85f10 100644 --- a/_includes/sidebar-data-v21.1.json +++ b/_includes/sidebar-data-v21.1.json @@ -627,10 +627,10 @@ ] }, { - "title": "Multi-region Clusters", + "title": "Multi-region Capabilities", "items": [ { - "title": "Multi-region Overview", + "title": "Overview", "urls": [ "/${VERSION}/multiregion-overview.html" ] @@ -677,21 +677,15 @@ ] }, { - "title": "Geo-Partitioned Replicas", - "urls": [ - "/${VERSION}/topology-geo-partitioned-replicas.html" - ] - }, - { - "title": "Geo-Partitioned Leaseholders", + "title": "Global Tables", "urls": [ - "/${VERSION}/topology-geo-partitioned-leaseholders.html" + "/${VERSION}/topology-global-tables.html" ] }, { - "title": "Duplicate Indexes", + "title": "Regional Tables", "urls": [ - "/${VERSION}/topology-duplicate-indexes.html" + "/${VERSION}/topology-regional-tables.html" ] }, { diff --git a/_includes/v21.1/sql/global-table-description.md b/_includes/v21.1/sql/global-table-description.md new file mode 100644 index 00000000000..acd3b0be0c1 --- /dev/null +++ b/_includes/v21.1/sql/global-table-description.md @@ -0,0 +1,7 @@ + _Global_ tables are optimized for low-latency reads from every region in the database. The tradeoff is that writes will incur higher latencies from any given region, since writes have to be replicated across every region to make the global low-latency reads possible. + +Use global tables when your application has a "read-mostly" table of reference data that is rarely updated, and needs to be available to all regions. + +For an example of a table that can benefit from the _global_ table locality setting in a multi-region deployment, see the `promo_codes` table from the [MovR application](movr.html). + +For instructions showing how to set a table's locality to `GLOBAL`, see [`ALTER TABLE ... SET LOCALITY`](set-locality.html#global) diff --git a/_includes/v21.1/sql/regional-by-row-table-description.md b/_includes/v21.1/sql/regional-by-row-table-description.md new file mode 100644 index 00000000000..fa2f4d98f6e --- /dev/null +++ b/_includes/v21.1/sql/regional-by-row-table-description.md @@ -0,0 +1,7 @@ +In _regional by row_ tables, individual rows are optimized for access from different regions. This setting divides a table and all of [its indexes](multiregion-overview.html#indexes-on-regional-by-row-tables) into [partitions](partitioning.html), with each partition optimized for access from a different region. Like [regional tables](multiregion-overview.html#regional-tables), _regional by row_ tables are optimized for access from a single region. However, that region is specified at the row level instead of applying to the whole table. + +Use regional by row tables when your application requires low-latency reads and writes at a row level where individual rows are primarily accessed from a single region. For example, a users table in a global application may need to keep some users' data in specific regions for better performance. + +For an example of a table that can benefit from the _regional by row_ setting in a multi-region deployment, see the `users` table from the [MovR application](movr.html). + +For instructions showing how to set a table's locality to `REGIONAL BY ROW`, see [`ALTER TABLE ... 
SET LOCALITY`](set-locality.html#regional-by-row) diff --git a/_includes/v21.1/sql/regional-table-description.md b/_includes/v21.1/sql/regional-table-description.md new file mode 100644 index 00000000000..de037be58ae --- /dev/null +++ b/_includes/v21.1/sql/regional-table-description.md @@ -0,0 +1,9 @@ +Regional tables work well when your application requires low-latency reads and writes for an entire table from a single region. + +For _regional_ tables, access to the table will be fast in the table's "home region" and slower in other regions. In other words, CockroachDB optimizes access to data in regional tables from a single region. By default, a regional table's home region is the [database's primary region](multiregion-overview.html#database-regions), but that can be changed to use any region added to the database. + +For instructions showing how to set a table's locality to `REGIONAL BY TABLE`, see [`ALTER TABLE ... SET LOCALITY`](set-locality.html#regional-by-table) + +{{site.data.alerts.callout_info}} +By default, all tables in a multi-region database are _regional_ tables that use the database's primary region. Unless you know your application needs different performance characteristics than regional tables provide, there is no need to change this setting. +{{site.data.alerts.end}} diff --git a/_includes/v21.1/sql/use-multiregion-instead-of-partitioning.md b/_includes/v21.1/sql/use-multiregion-instead-of-partitioning.md new file mode 100644 index 00000000000..611588178eb --- /dev/null +++ b/_includes/v21.1/sql/use-multiregion-instead-of-partitioning.md @@ -0,0 +1,3 @@ +{{site.data.alerts.callout_success}} +New in v21.1: Most users should not need to use partitioning directly. Instead, they should use CockroachDB's built-in [multi-region capabilities](multiregion-overview.html), which automatically handle geo-partitioning and other low-level details. +{{site.data.alerts.end}} diff --git a/_includes/v21.1/topology-patterns/multi-region-cluster-setup.md b/_includes/v21.1/topology-patterns/multi-region-cluster-setup.md index 020db4a7ab0..76fd4edb0ab 100644 --- a/_includes/v21.1/topology-patterns/multi-region-cluster-setup.md +++ b/_includes/v21.1/topology-patterns/multi-region-cluster-setup.md @@ -1,20 +1,18 @@ -Each [multi-region topology pattern](topology-patterns.html#multi-region-patterns) assumes the following setup: +Each [multi-region pattern](topology-patterns.html#multi-region-patterns) assumes the following setup: Multi-region hardware setup #### Hardware - 3 regions - - Per region, 3+ AZs with 3+ VMs evenly distributed across them - - Region-specific app instances and load balancers - Each load balancer redirects to CockroachDB nodes in its region. - When CockroachDB nodes are unavailable in a region, the load balancer redirects to nodes in other regions. #### Cluster -Each node is started with the [`--locality`](cockroach-start.html#locality) flag specifying its region and AZ combination. For example, the following command starts a node in the west1 AZ of the us-west region: +Each node is started with the [`--locality`](cockroach-start.html#locality) flag specifying its region and AZ combination. 
For example, the following command starts a node in the `west1` AZ of the `us-west` region: {% include copy-clipboard.html %} ~~~ shell @@ -22,7 +20,7 @@ $ cockroach start \ --locality=region=us-west,zone=west1 \ --certs-dir=certs \ --advertise-addr= \ ---join=:26257,:26257,:26257 \ +--join=:26257,:26257,:26257 \ --cache=.25 \ --max-sql-memory=.25 \ --background diff --git a/_includes/v21.1/topology-patterns/multiregion-db-setup.md b/_includes/v21.1/topology-patterns/multiregion-db-setup.md new file mode 100644 index 00000000000..c3307c79ee7 --- /dev/null +++ b/_includes/v21.1/topology-patterns/multiregion-db-setup.md @@ -0,0 +1,36 @@ +First, create a database and use it: + +{% include copy-clipboard.html %} +~~~ sql +CREATE DATABASE test; +~~~ + +{% include copy-clipboard.html %} +~~~ sql +USE test; +~~~ + +[This cluster is already deployed across three regions](#cluster-setup). Therefore, to make this database a "multi-region database", you need to issue the following SQL statement that [sets the primary region](add-region.html#set-the-primary-region): + +{% include copy-clipboard.html %} +~~~ sql +ALTER DATABASE test PRIMARY REGION "us-east"; +~~~ + +{{site.data.alerts.callout_info}} +Every multi-region database must have a primary region. For more information, see [Database regions](multiregion-overview.html#database-regions). +{{site.data.alerts.end}} + +Next, issue the following [`ADD REGION`](add-region.html) statements to add the remaining regions to the database. + +{% include copy-clipboard.html %} +~~~ sql +ALTER DATABASE test ADD REGION "us-west"; +~~~ + +{% include copy-clipboard.html %} +~~~ sql +ALTER DATABASE test ADD REGION "us-central"; +~~~ + +Congratulations, `test` is now a multi-region database! diff --git a/_includes/v21.1/topology-patterns/multiregion-fundamentals.md b/_includes/v21.1/topology-patterns/multiregion-fundamentals.md new file mode 100644 index 00000000000..9c65c09d735 --- /dev/null +++ b/_includes/v21.1/topology-patterns/multiregion-fundamentals.md @@ -0,0 +1,20 @@ +Multi-region patterns require thinking about the following questions: + +- What are my [survival goals](multiregion-overview.html#survival-goals)? Do I need to survive a [zone failure](multiregion-overview.html#surviving-zone-failures)? A [region failure](multiregion-overview.html#surviving-region-failures)? +- Given the constraints provided by my survival goals, what are the [table localities](multiregion-overview.html#table-locality) that will provide the performance characteristics I need for each table's data? + - Do I need low-latency reads and writes from a single region? Do I need them at the [row level](multiregion-overview.html#regional-by-row-tables)? Or will the [table level](multiregion-overview.html#regional-tables) suffice? + - Do I have a "read-mostly" [table of reference data that is rarely updated](multiregion-overview.html#global-tables), but that needs to be available to all regions? + +For more information about our multi-region capabilities, review the following pages: + +- [Multi-region overview](multiregion-overview.html) +- [Choosing a multi-region configuration](choosing-a-multi-region-configuration.html) +- [When to use `ZONE` vs. `REGION` Survival Goals](when-to-use-zone-vs-region-survival-goals.html) +- [When to use `REGIONAL` vs. 
`GLOBAL` Tables](when-to-use-regional-vs-global-tables.html) + +In addition, reviewing the following information will be helpful: + +- The concept of [locality](cockroach-start.html#locality), which makes CockroachDB aware of the location of nodes and able to intelligently place and balance data based on how you define [survival goals](multiregion-overview.html#survival-goals) and [table localities](multiregion-overview.html#table-locality). +- The recommendations in our [Production Checklist](recommended-production-settings.html). +- This page doesn't account for hardware specifications, so be sure to follow our [hardware recommendations](recommended-production-settings.html#hardware) and perform a POC to size hardware for your use case. +- Finally, adopt these [SQL Best Practices](performance-best-practices-overview.html) to get good performance. diff --git a/_includes/v21.1/topology-patterns/see-also.md b/_includes/v21.1/topology-patterns/see-also.md index 03844ca34fd..92317b9f8e3 100644 --- a/_includes/v21.1/topology-patterns/see-also.md +++ b/_includes/v21.1/topology-patterns/see-also.md @@ -1,11 +1,14 @@ +- [Multi-Region Capabilities Overview](multiregion-overview.html) +- [Choosing a multi-region configuration](choosing-a-multi-region-configuration.html) +- [When to use `ZONE` vs. `REGION` survival goals](when-to-use-zone-vs-region-survival-goals.html) +- [When to use `GLOBAL` vs. `REGIONAL` tables](when-to-use-regional-vs-global-tables.html) +- [`ALTER DATABASE ... SURVIVE {ZONE,REGION} FAILURE`](survive-failure.html) +- [`ALTER TABLE ... SET LOCALITY ...`](set-locality.html) - [Topology Patterns Overview](topology-patterns.html) - - - Single-region - - [Development](topology-development.html) - - [Basic Production](topology-basic-production.html) - - - Multi-region - - [Geo-Partitioned Replicas](topology-geo-partitioned-replicas.html) - - [Geo-Partitioned Leaseholders](topology-geo-partitioned-leaseholders.html) - - [Duplicate Indexes](topology-duplicate-indexes.html) - - [Follow-the-Workload](topology-follow-the-workload.html) + - Single-region + - [Development](topology-development.html) + - [Basic Production](topology-basic-production.html) + - Multi-region + - [`REGIONAL` Table Locality Pattern](topology-regional-tables.html) + - [`GLOBAL` Table Locality Pattern](topology-global-tables.html) + - [Follow-the-Workload](topology-follow-the-workload.html) diff --git a/v21.1/alter-primary-key.md b/v21.1/alter-primary-key.md index 460ed98ae3c..72e4df3325f 100644 --- a/v21.1/alter-primary-key.md +++ b/v21.1/alter-primary-key.md @@ -100,141 +100,6 @@ You can add a column and change the primary key with a couple of `ALTER TABLE` s Note that the old primary key index becomes a secondary index, in this case, `users_name_key`. If you do not want the old primary key to become a secondary index when changing a primary key, you can use [`DROP CONSTRAINT`](drop-constraint.html)/[`ADD CONSTRAINT`](add-constraint.html) instead. -### Make a single-column primary key composite for geo-partitioning - -Suppose that you are storing the data for users of your application in a table called `users`, defined by the following `CREATE TABLE` statement: - -{% include copy-clipboard.html %} -~~~ sql -> CREATE TABLE users ( - id UUID PRIMARY KEY DEFAULT gen_random_uuid(), - email STRING, - name STRING, - INDEX users_name_idx (name) -); -~~~ - -Now suppose that you want to expand your business from a single region into multiple regions. 
After you [deploy your application in multiple regions](topology-patterns.html), you consider [geo-partitioning your data](topology-geo-partitioned-replicas.html) to minimize latency and optimize performance. In order to geo-partition the `user` database, you need to add a column specifying the location of the data (e.g., `region`): - -{% include copy-clipboard.html %} -~~~ sql -> ALTER TABLE users ADD COLUMN region STRING NOT NULL; -~~~ - -When you geo-partition a database, you [partition the database on a primary key column](partitioning.html#partition-using-primary-key). The primary key of this table is still on `id`. Change the primary key to be composite, on `region` and `id`: - -{% include copy-clipboard.html %} -~~~ sql -> ALTER TABLE users ALTER PRIMARY KEY USING COLUMNS (region, id); -~~~ -{{site.data.alerts.callout_info}} -The order of the primary key columns is important when geo-partitioning. For performance, always place the partition column first. -{{site.data.alerts.end}} - -{% include copy-clipboard.html %} -~~~ sql -> SHOW CREATE TABLE users; -~~~ - -~~~ - table_name | create_statement --------------+------------------------------------------------------------- - users | CREATE TABLE users ( - | id UUID NOT NULL DEFAULT gen_random_uuid(), - | email STRING NULL, - | name STRING NULL, - | region STRING NOT NULL, - | CONSTRAINT "primary" PRIMARY KEY (region ASC, id ASC), - | UNIQUE INDEX users_id_key (id ASC), - | INDEX users_name_idx (name ASC), - | FAMILY "primary" (id, email, name, region) - | ) -(1 row) -~~~ - -Note that the old primary key index on `id` is now the secondary index `users_id_key`. - -With the new primary key on `region` and `id`, the table is ready to be [geo-partitioned](topology-geo-partitioned-replicas.html): - -{% include copy-clipboard.html %} -~~~ sql -> ALTER TABLE users PARTITION BY LIST (region) ( - PARTITION us_west VALUES IN ('us_west'), - PARTITION us_east VALUES IN ('us_east') - ); -~~~ - -{% include copy-clipboard.html %} -~~~ sql -> ALTER PARTITION us_west OF INDEX users@primary - CONFIGURE ZONE USING constraints = '[+region=us-west1]'; - ALTER PARTITION us_east OF INDEX users@primary - CONFIGURE ZONE USING constraints = '[+region=us-east1]'; -~~~ - -{% include copy-clipboard.html %} -~~~ sql -> SHOW PARTITIONS FROM TABLE users; -~~~ - -~~~ - database_name | table_name | partition_name | parent_partition | column_names | index_name | partition_value | zone_config | full_zone_config -----------------+------------+----------------+------------------+--------------+---------------+-----------------+------------------------------------+-------------------------------------- - movr | users | us_west | NULL | region | users@primary | ('us_west') | constraints = '[+region=us-west1]' | range_min_bytes = 134217728, - | | | | | | | | range_max_bytes = 536870912, - | | | | | | | | gc.ttlseconds = 90000, - | | | | | | | | num_replicas = 3, - | | | | | | | | constraints = '[+region=us-west1]', - | | | | | | | | lease_preferences = '[]' - movr | users | us_east | NULL | region | users@primary | ('us_east') | constraints = '[+region=us-east1]' | range_min_bytes = 134217728, - | | | | | | | | range_max_bytes = 536870912, - | | | | | | | | gc.ttlseconds = 90000, - | | | | | | | | num_replicas = 3, - | | | | | | | | constraints = '[+region=us-east1]', - | | | | | | | | lease_preferences = '[]' -(2 rows) -~~~ - -The table is now geo-partitioned on the `region` column. - -You now need to geo-partition any secondary indexes in the table. 
In order to geo-partition an index, the index must be prefixed by a column that can be used as a partitioning identifier (in this case, `region`). Currently, neither of the secondary indexes (i.e., `users_id_key` and `users_name_idx`) are prefixed by the `region` column, so they can't be meaningfully geo-partitioned. Any secondary indexes that you want to keep must be dropped, recreated, and then partitioned. - -Start by dropping both indexes: - -{% include copy-clipboard.html %} -~~~ sql -> DROP INDEX users_id_key CASCADE; - DROP INDEX users_name_idx CASCADE; -~~~ - -You don't need to recreate the index on `id` with `region`. Both columns are already indexed by the new primary key. - -Add `region` to the index on `name`: - -{% include copy-clipboard.html %} -~~~ sql -> CREATE INDEX ON users(region, name); -~~~ - -Then geo-partition the index: - -{% include copy-clipboard.html %} -~~~ sql -> ALTER INDEX users_region_name_idx PARTITION BY LIST (region) ( - PARTITION us_west VALUES IN ('us_west'), - PARTITION us_east VALUES IN ('us_east') - ); -~~~ - -{% include copy-clipboard.html %} -~~~ sql -> ALTER PARTITION us_west OF INDEX users@users_region_name_idx - CONFIGURE ZONE USING constraints = '[+region=us-west1]'; - ALTER PARTITION us_east OF INDEX users@users_region_name_idx - CONFIGURE ZONE USING constraints = '[+region=us-east1]'; -~~~ - - ## See also - [Constraints](constraints.html) diff --git a/v21.1/cockroach-demo.md b/v21.1/cockroach-demo.md index 8a1fd44d8ac..96658176ff9 100644 --- a/v21.1/cockroach-demo.md +++ b/v21.1/cockroach-demo.md @@ -52,7 +52,7 @@ Start a multi-node demo cluster: $ cockroach demo --nodes= ~~~ -Start a multi-region demo cluster with automatic geo-partitioning +Start a multi-region demo cluster with automatic [geo-partitioning](#start-a-multi-region-demo-cluster-with-automatic-geo-partitioning) ~~~ shell $ cockroach demo --geo-partitioned-replicas @@ -115,7 +115,7 @@ Flag | Description `--empty` | Start the demo cluster without a pre-loaded dataset. `--execute`
`-e` | Execute SQL statements directly from the command line, without opening a shell. This flag can be set multiple times, and each instance can contain one or more statements separated by semicolons.<br><br>If an error occurs in any statement, the command exits with a non-zero status code and further statements are not executed. The results of each statement are printed to the standard output (see `--format` for formatting options).
`--format` | How to display table rows printed to the standard output. Possible values: `tsv`, `csv`, `table`, `raw`, `records`, `sql`, `html`.<br><br>**Default:** `table` for sessions that [output on a terminal](cockroach-sql.html#session-and-output-types); `tsv` otherwise<br><br>This flag corresponds to the `display_format` [client-side option](#client-side-options) for use in interactive sessions.
-`--geo-partitioned-replicas` | Start a 9-node demo cluster with the [Geo-Partitioned Replicas](topology-geo-partitioned-replicas.html) topology pattern applied to the [`movr`](movr.html) database.
+`--geo-partitioned-replicas` | Start a 9-node demo cluster with [geo-partitioning](#start-a-multi-region-demo-cluster-with-automatic-geo-partitioning) applied to the [`movr`](movr.html) database.
`--global` | This experimental flag is used to simulate a [multi-region cluster](simulate-a-multi-region-cluster-on-localhost.html), which sets the [`--locality` flag on node startup](cockroach-start.html#locality) to three different regions. It also simulates the network latency that would occur between them given the specified localities. In order for this to operate as expected, with 3 nodes in each of 3 regions, you must also pass the `--nodes 9` argument.
`--insecure` | Set this to `false` to start the demo cluster in secure mode using TLS certificates to encrypt network communication. `--insecure=false` gives you an easy way to test out CockroachDB [authorization features](authorization.html) and also creates a password (`admin`) for the `root` user for logging into the DB Console.<br><br>**Env Variable:** `COCKROACH_INSECURE`<br>**Default:** `false`
`--max-sql-memory` | For each demo node, the maximum in-memory storage capacity for temporary SQL data, including prepared queries and intermediate data rows during query execution. This can be a percentage (notated as a decimal or with `%`) or any bytes-based unit, for example:<br><br>`--max-sql-memory=.25`<br>`--max-sql-memory=25%`<br>`--max-sql-memory=10000000000 ----> 10000000000 bytes`<br>`--max-sql-memory=1GB ----> 1000000000 bytes`<br>`--max-sql-memory=1GiB ----> 1073741824 bytes`<br><br>
**Default:** `128MiB` @@ -446,7 +446,9 @@ You can also use this URL to connect an application to the demo cluster. $ cockroach demo --geo-partitioned-replicas ~~~ -This command starts a 9-node demo cluster with the `movr` database preloaded, and [partitions](partitioning.html) and [zone constraints](configure-replication-zones.html) applied to the primary and secondary indexes. For more information, see the [Geo-Partitioned Replicas](topology-geo-partitioned-replicas.html) topology pattern. +This command starts a 9-node demo cluster with the `movr` database preloaded, and [partitions](partitioning.html) and [zone constraints](configure-replication-zones.html) applied to the primary and secondary indexes. + +{% include {{page.version.version}}/sql/use-multiregion-instead-of-partitioning.md %} ### Shut down and restart nodes in a multi-node demo cluster diff --git a/v21.1/computed-columns.md b/v21.1/computed-columns.md index 8ac36a164a9..e65ddb1777b 100644 --- a/v21.1/computed-columns.md +++ b/v21.1/computed-columns.md @@ -6,17 +6,18 @@ toc: true A computed column exposes data generated from other columns by a [scalar expression](scalar-expressions.html) included in the column definition. A stored computed column (set with the `STORED` SQL keyword) is calculated when a row is inserted or updated, and stores the resulting value of the scalar expression in the primary index similar to a regular column. A virtual computed column (set with the `VIRTUAL` SQL keyword) is not stored, and the value of the scalar expression is computed during queries as needed. - ## Why use computed columns? -Computed columns are especially useful when used with [partitioning](partitioning.html), [`JSONB`](jsonb.html) columns, or [secondary indexes](indexes.html). - -- **Partitioning** requires that partitions are defined using columns that are a prefix of the [primary key](primary-key.html). In the case of geo-partitioning, some applications will want to collapse the number of possible values in this column, to make certain classes of queries more performant. For example, if a users table has a country and state column, then you can make a stored computed column locality with a reduced domain for use in partitioning. For more information, see the [partitioning example](#create-a-table-with-geo-partitions-and-a-computed-column) below. +Computed columns are especially useful when used with [`JSONB`](jsonb.html) columns, [secondary indexes](indexes.html), or [partitioning](partitioning.html). - **JSONB** columns are used for storing semi-structured `JSONB` data. When the table's primary information is stored in `JSONB`, it's useful to index a particular field of the `JSONB` document. In particular, computed columns allow for the following use case: a two-column table with a `PRIMARY KEY` column and a `payload` column, whose primary key is computed as some field from the `payload` column. This alleviates the need to manually separate your primary keys from your JSON blobs. For more information, see the [`JSONB` example](#create-a-table-with-a-jsonb-column-and-a-stored-computed-column) below. - **Secondary indexes** can be created on computed columns, which is especially useful when a table is frequently sorted. See the [secondary indexes example](#create-a-table-with-a-secondary-index-on-a-computed-column) below. +- **Partitioning** requires that partitions are defined using columns that are a prefix of the [primary key](primary-key.html). 
In the case of geo-partitioning, some applications will want to collapse the number of possible values in this column, to make certain classes of queries more performant. For example, if a users table has a country and state column, then you can make a stored computed column locality with a reduced domain for use in partitioning. + +{% include {{page.version.version}}/sql/use-multiregion-instead-of-partitioning.md %} + ## Considerations Computed columns: @@ -72,10 +73,6 @@ Parameter | Description {% include {{ page.version.version }}/computed-columns/virtual.md %} -### Create a table with geo-partitions and a computed column - -{% include {{ page.version.version }}/computed-columns/partitioning.md %} The `locality` values can then be used for geo-partitioning. - ### Create a table with a secondary index on a computed column {% include {{ page.version.version }}/computed-columns/secondary-index.md %} @@ -96,4 +93,3 @@ For more information, see [`ADD COLUMN`](add-column.html). - [Information Schema](information-schema.html) - [`CREATE TABLE`](create-table.html) - [`JSONB`](jsonb.html) -- [Define Table Partitions (Enterprise)](partitioning.html) diff --git a/v21.1/cost-based-optimizer.md b/v21.1/cost-based-optimizer.md index 3850ab0ff3a..803533c9f90 100644 --- a/v21.1/cost-based-optimizer.md +++ b/v21.1/cost-based-optimizer.md @@ -198,7 +198,7 @@ To make the optimizer prefer lookup joins to merge joins when performing foreign Given multiple identical [indexes](indexes.html) that have different locality constraints using [replication zones](configure-replication-zones.html), the optimizer will prefer the index that is closest to the gateway node that is planning the query. In a properly configured geo-distributed cluster, this can lead to performance improvements due to improved data locality and reduced network traffic. {{site.data.alerts.callout_info}} -This feature is only available to users with an [enterprise license](enterprise-licensing.html). For insight into how to use this feature to get low latency, consistent reads in multi-region deployments, see the [Duplicate Indexes](topology-follower-reads.html) topology pattern. +This feature is only available to users with an [enterprise license](enterprise-licensing.html). For insight into how to use this feature to get low latency, consistent reads in multi-region deployments, see the [Global Table Locality Pattern](topology-global-tables.html) topology pattern. {{site.data.alerts.end}} This feature enables scenarios such as: diff --git a/v21.1/create-table-as.md b/v21.1/create-table-as.md index 5082fdfa634..a69554a3ebf 100644 --- a/v21.1/create-table-as.md +++ b/v21.1/create-table-as.md @@ -92,7 +92,7 @@ table td:first-child { The default rules for [column families](column-families.html) apply. -The [primary key](primary-key.html) of tables created with `CREATE TABLE ... AS` is not automatically derived from the query results. You must specify new primary keys at table creation. For examples, see [Specify a primary key](create-table-as.html#specify-a-primary-key) and [Specify a primary key for partitioning](create-table-as.html#specify-a-primary-key-for-partitioning). +The [primary key](primary-key.html) of tables created with `CREATE TABLE ... AS` is not automatically derived from the query results. You must specify new primary keys at table creation. For examples, see [Specify a primary key](create-table-as.html#specify-a-primary-key). 
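+For instance, here is a minimal sketch of supplying a composite primary key at creation time, assuming a `drivers` table with `id`, `city`, and `name` columns:
+
+{% include copy-clipboard.html %}
+~~~ sql
+> CREATE TABLE drivers_pk (id, city, name, PRIMARY KEY (id, city)) AS SELECT id, city, name FROM drivers;
+~~~
+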
## Examples @@ -296,57 +296,6 @@ You can define the [column families](column-families.html) of a new table create (1 row) ~~~ -### Specify a primary key for partitioning - -If you are [partitioning](partitioning.html) a table based on a [primary key](primary-key.html), the primary key must be properly defined. To change the primary key after table creation, you can use an [`ALTER TABLE ... ALTER PRIMARY KEY`](alter-primary-key.html) statement. - -Suppose that you want to [geo-partition](demo-low-latency-multi-region-deployment.html) the `drivers` table that you created with the following statement: - -{% include copy-clipboard.html %} -~~~ sql -> CREATE TABLE drivers (id, city, name) AS VALUES (gen_random_uuid(), 'new york', 'Harry Potter'), (gen_random_uuid(), 'seattle', 'Evelyn Martin'); -~~~ - -{% include copy-clipboard.html %} -~~~ sql -> SHOW CREATE TABLE drivers; -~~~ -~~~ - table_name | create_statement -+------------+----------------------------------------------+ - drivers | CREATE TABLE drivers ( - | id UUID NULL, - | city STRING NULL, - | name STRING NULL, - | FAMILY "primary" (id, city, name, rowid) - | ) -(1 row) -~~~ - -In order for this table to be properly geo-partitioned with the other tables in the `movr` dataset, the table must have a composite primary key defined that includes the unique row identifier (`id`, in this case) and the row locality identifier (`city`). Use the following statement to change the primary key to a composite primary key: - -{% include copy-clipboard.html %} -~~~ sql -> CREATE TABLE drivers_pk (id, city, name, PRIMARY KEY (id, city)) AS SELECT id, city, name FROM drivers; -~~~ - -{% include copy-clipboard.html %} -~~~ sql -> SHOW CREATE TABLE drivers_pk; -~~~ -~~~ - table_name | create_statement -+------------+----------------------------------------------------------+ - drivers_pk | CREATE TABLE drivers_pk ( - | id UUID NOT NULL, - | city STRING NOT NULL, - | name STRING NULL, - | CONSTRAINT "primary" PRIMARY KEY (id ASC, city ASC), - | FAMILY "primary" (id, city, name) - | ) -(1 row) -~~~ - ## See also - [Selection Queries](selection-queries.html) diff --git a/v21.1/disaster-recovery.md b/v21.1/disaster-recovery.md index e2803f7927b..d913806ff3a 100644 --- a/v21.1/disaster-recovery.md +++ b/v21.1/disaster-recovery.md @@ -160,6 +160,10 @@ When using Kubernetes, recovery actions happen automatically in many cases and n ### Multi-region survivability planning +{{site.data.alerts.callout_success}} +New in v21.1: By default, every [multi-region database](multiregion-overview.html) has a [survival goal](multiregion-overview.html#survival-goals) associated with it. The survival goal setting provides an abstraction that handles the low-level details of replica placement to ensure your desired fault tolerance. The information below is still useful for legacy deployments. +{{site.data.alerts.end}} + The table below shows the replication factor (RF) needed to achieve the listed fault tolerance (e.g., survive 1 failed node) for a multi-region, cloud-deployed cluster with 3 availability zones (AZ) per region and one node in each AZ: {{site.data.alerts.callout_danger}} diff --git a/v21.1/get-started-with-enterprise-trial.md b/v21.1/get-started-with-enterprise-trial.md index 97bb89ac980..620869f00d1 100644 --- a/v21.1/get-started-with-enterprise-trial.md +++ b/v21.1/get-started-with-enterprise-trial.md @@ -5,7 +5,7 @@ toc: true license: true --- -Congratulations on starting your CockroachDB Enterprise Trial! 
With it, you'll not only get access to CockroachDB's core capabilities like [high availability](frequently-asked-questions.html#how-does-cockroachdb-survive-failures) and [`SERIALIZABLE` isolation](frequently-asked-questions.html#how-is-cockroachdb-strongly-consistent), but also our Enterprise-only features like distributed [`BACKUP`](backup.html) & [`RESTORE`](restore.html), [geo-partitioning](partitioning.html), and [cluster visualization](enable-node-map.html). +Congratulations on starting your CockroachDB Enterprise Trial! With it, you'll not only get access to CockroachDB's core capabilities like [high availability](frequently-asked-questions.html#how-does-cockroachdb-survive-failures) and [`SERIALIZABLE` isolation](frequently-asked-questions.html#how-is-cockroachdb-strongly-consistent), but also our Enterprise-only features like distributed [`BACKUP`](backup.html) & [`RESTORE`](restore.html), [multi-region capabilities](multiregion-overview.html), and [cluster visualization](enable-node-map.html). ## Install CockroachDB diff --git a/v21.1/licensing-faqs.md b/v21.1/licensing-faqs.md index 9bad9eace9d..fb26c964b04 100644 --- a/v21.1/licensing-faqs.md +++ b/v21.1/licensing-faqs.md @@ -57,7 +57,7 @@ Feature | BSL | CCL (free) | CCL (paid) **[Core changefeed](stream-data-out-of-cockroachdb-using-changefeeds.html#create-a-core-changefeed)** | | ✓ | **[Enterprise changefeed](stream-data-out-of-cockroachdb-using-changefeeds.html#configure-a-changefeed-enterprise)** | | | ✓ **[Table-level zone configuration](configure-replication-zones.html#replication-zone-levels)** | ✓ | | -**[Geo-partitioning](topology-geo-partitioned-replicas.html)** | | | ✓ +**[Multi-Region Capabilities](multiregion-overview.html)** | | | ✓ **[Follower reads](follower-reads.html)** | | | ✓ **[Node map](enable-node-map.html)** | | | ✓ **[Locality-aware index selection](cost-based-optimizer.html#preferring-the-nearest-index)** | | | ✓ diff --git a/v21.1/multi-region-use-case.md b/v21.1/multi-region-use-case.md index c13affd2585..e98563dab6e 100644 --- a/v21.1/multi-region-use-case.md +++ b/v21.1/multi-region-use-case.md @@ -34,7 +34,7 @@ Limiting latency improves the user experience, and it can also help avoid proble To reduce database latency in a distributed CockroachDB deployment, data can be [geo-partitioned](topology-geo-partitioned-replicas.html). Geo-partitioning enables you to control where specific rows of data are stored. Limiting database operations to specific partitions can reduce the distance requests need to travel between the client and the database. {{site.data.alerts.callout_info}} -Geo-partitioned replicas can dramatically improve latency in multi-region deployments, but at the [cost of resiliency](topology-geo-partitioned-replicas.html#resiliency). Geo-partitioned replicas are resilient to availability zone failures, but not regional failures. +Geo-partitioned replicas can dramatically improve latency in multi-region deployments, but at the [cost of resiliency](topology-geo-partitioned-replicas.html). Geo-partitioned replicas are resilient to availability zone failures, but not regional failures. {{site.data.alerts.end}} If you are building an application, it's likely that the end user will not be making requests to the database directly. Instead, the user makes requests to the application, and the application makes requests to the database on behalf of the user. 
To limit the latency between the application and the database, you need to design and deploy your application such that: diff --git a/v21.1/multiregion-overview.md b/v21.1/multiregion-overview.md index cdbe0ebc0ac..6be79df5fc1 100644 --- a/v21.1/multiregion-overview.md +++ b/v21.1/multiregion-overview.md @@ -1,6 +1,6 @@ --- -title: Multi-region Overview -summary: Learn how to use CockroachDB's improved multi-region user experience. +title: Multi-region Capabilities Overview +summary: Learn how to use CockroachDB's improved multi-region capabilities. toc: true --- @@ -134,18 +134,12 @@ Table locality settings are used for optimizing latency under different read/wri ### Regional tables -Regional tables work well when your application requires low-latency reads and writes for an entire table from a single region. - -For _regional_ tables, access to the table will be fast in the table's "home region" and slower in other regions. In other words, CockroachDB optimizes access to data in regional tables from a single region. By default, a regional table's home region is the [database's primary region](#database-regions), but that can be changed to use any region added to the database. - -For instructions showing how to set a table's locality to `REGIONAL BY TABLE`, see [`ALTER TABLE ... SET LOCALITY`](set-locality.html#regional-by-table) - -{{site.data.alerts.callout_info}} -By default, all tables in a multi-region database are _regional_ tables that use the database's primary region. Unless you know your application needs different performance characteristics than regional tables provide, there is no need to change this setting. -{{site.data.alerts.end}} +{% include {{page.version.version}}/sql/regional-table-description.md %} ### Regional by row tables +{% include {{page.version.version}}/sql/regional-by-row-table-description.md %} + In _regional by row_ tables, individual rows are optimized for access from different regions. This setting divides a table and all of [its indexes](#indexes-on-regional-by-row-tables) into [partitions](partitioning.html), with each partition optimized for access from a different region. Like [regional tables](#regional-tables), _regional by row_ tables are optimized for access from a single region. However, that region is specified at the row level instead of applying to the whole table. Use regional by row tables when your application requires low-latency reads and writes at a row level where individual rows are primarily accessed from a single region. For example, a users table in a global application may need to keep some users' data in specific regions due to regulations (such as GDPR), for better performance, or both. @@ -156,13 +150,7 @@ For instructions showing how to set a table's locality to `REGIONAL BY ROW`, see ### Global tables - _Global_ tables are optimized for low-latency reads from every region in the database. The tradeoff is that writes will incur higher latencies from any given region, since writes have to be replicated across every region to make the global low-latency reads possible. - -Use global tables when your application has a "read-mostly" table of reference data that is rarely updated, and needs to be available to all regions. - -For an example of a table that can benefit from the _global_ table locality setting in a multi-region deployment, see the `promo_codes` table from the [MovR application](movr.html). - -For instructions showing how to set a table's locality to `GLOBAL`, see [`ALTER TABLE ... 
SET LOCALITY`](set-locality.html#global) +{% include {{page.version.version}}/sql/global-table-description.md %} ## Additional Features diff --git a/v21.1/partition-by.md b/v21.1/partition-by.md index cddbff7c87d..bb8b93afcc4 100644 --- a/v21.1/partition-by.md +++ b/v21.1/partition-by.md @@ -6,6 +6,8 @@ toc: true `PARTITION BY` is a subcommand of [`ALTER TABLE`](alter-table.html) and [`ALTER INDEX`](alter-index.html) that is used to partition, re-partition, or un-partition a table or secondary index. After defining partitions, [`CONFIGURE ZONE`](configure-zone.html) is used to control the replication and placement of partitions. +{% include {{page.version.version}}/sql/use-multiregion-instead-of-partitioning.md %} + {{site.data.alerts.callout_info}} [Partitioning](partitioning.html) is an [enterprise-only](enterprise-licensing.html) feature. If you are looking for the `PARTITION BY` used in SQL window functions, see [Window Functions](window-functions.html). {{site.data.alerts.end}} diff --git a/v21.1/partitioning.md b/v21.1/partitioning.md index b8a9d4d9937..d067ff759b0 100644 --- a/v21.1/partitioning.md +++ b/v21.1/partitioning.md @@ -12,6 +12,8 @@ Table partitioning is an [enterprise-only](enterprise-licensing.html) feature. F ## Why use table partitioning +{% include {{page.version.version}}/sql/use-multiregion-instead-of-partitioning.md %} + Table partitioning helps you reduce latency and cost: - **Geo-partitioning** allows you to keep user data close to the user, which reduces the distance that the data needs to travel, thereby **reducing latency**. To geo-partition a table, define location-based partitions while creating a table, create location-specific zone configurations, and apply the zone configurations to the corresponding partitions. diff --git a/v21.1/query-replication-reports.md b/v21.1/query-replication-reports.md index f43e2ca9e9b..a1ae1b2376a 100644 --- a/v21.1/query-replication-reports.md +++ b/v21.1/query-replication-reports.md @@ -417,7 +417,9 @@ The `system.replication_critical_localities` report contains which of your local That said, a locality being critical is not necessarily a bad thing as long as you are aware of it. What matters is that [you configure the topology of your cluster to get the resiliency you expect](topology-patterns.html). -By default, the [movr demo cluster](cockroach-demo.html#start-a-multi-region-demo-cluster-with-automatic-geo-partitioning) has some ranges in critical localities. This is expected because it follows the [geo-partitioned replicas topology pattern](topology-geo-partitioned-replicas.html), which ties data for latency-sensitive queries to specific geographies at the cost of data unavailability during a region-wide failure. +By default, the [movr demo cluster](cockroach-demo.html#start-a-multi-region-demo-cluster-with-automatic-geo-partitioning) has some ranges in critical localities. This is expected because it ties data for latency-sensitive queries to specific geographies at the cost of data unavailability during a region-wide failure. + +{% include {{page.version.version}}/sql/use-multiregion-instead-of-partitioning.md %} {% include copy-clipboard.html %} ~~~ sql diff --git a/v21.1/secure-a-cluster.md b/v21.1/secure-a-cluster.md index 99bbf28abdc..0e467d80c86 100644 --- a/v21.1/secure-a-cluster.md +++ b/v21.1/secure-a-cluster.md @@ -446,7 +446,7 @@ Adding capacity is as simple as starting more nodes with `cockroach start`. 
- [Install the client driver](install-client-drivers.html) for your preferred language - Learn more about [CockroachDB SQL](learn-cockroachdb-sql.html) and the [built-in SQL client](cockroach-sql.html) -- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [geo-partitioning](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) +- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [multi-region performance](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) You might also be interested in the following pages: diff --git a/v21.1/show-locality.md b/v21.1/show-locality.md index cdf635c7b87..a3643c3f273 100644 --- a/v21.1/show-locality.md +++ b/v21.1/show-locality.md @@ -78,7 +78,7 @@ For a more extensive example, see [Create a table with node locality information ## See also -- [Geo-Partitioning](demo-low-latency-multi-region-deployment.html) +- [Multi-Region Performance](demo-low-latency-multi-region-deployment.html) - [Locality](cockroach-start.html#locality) - [Orchestrated Deployment](orchestration.html) - [Manual Deployment](manual-deployment.html) diff --git a/v21.1/show-partitions.md b/v21.1/show-partitions.md index f519a2bef88..180cbb4e01f 100644 --- a/v21.1/show-partitions.md +++ b/v21.1/show-partitions.md @@ -6,6 +6,8 @@ toc: true Use the `SHOW PARTITIONS` [statement](sql-statements.html) to view details about existing [partitions](partitioning.html). +{% include {{page.version.version}}/sql/use-multiregion-instead-of-partitioning.md %} + {% include enterprise-feature.md %} {% include {{page.version.version}}/sql/crdb-internal-partitions.md %} @@ -234,4 +236,4 @@ If a partitioned table has no zones configured, the `SHOW CREATE TABLE` output i - [Define Table Partitions](partitioning.html) - [SQL Statements](sql-statements.html) -- [Geo-Partitioning](demo-low-latency-multi-region-deployment.html) +- [Multi-Region Performance](demo-low-latency-multi-region-deployment.html) diff --git a/v21.1/show-zone-configurations.md b/v21.1/show-zone-configurations.md index 9d2d8f4c4f7..4c68e6bae6f 100644 --- a/v21.1/show-zone-configurations.md +++ b/v21.1/show-zone-configurations.md @@ -81,6 +81,8 @@ CONFIGURE ZONE 1 ### View the replication zone for a partition +{% include {{page.version.version}}/sql/use-multiregion-instead-of-partitioning.md %} + {% include {{ page.version.version }}/zone-configs/create-a-replication-zone-for-a-table-partition.md %} ## See also diff --git a/v21.1/start-a-local-cluster-in-docker-linux.md b/v21.1/start-a-local-cluster-in-docker-linux.md index 0b30fd6dbda..289b8ffb17f 100644 --- a/v21.1/start-a-local-cluster-in-docker-linux.md +++ b/v21.1/start-a-local-cluster-in-docker-linux.md @@ -36,4 +36,4 @@ Also, feel free to watch this process in action before going through the steps y - Learn more about [CockroachDB SQL](learn-cockroachdb-sql.html) and the [built-in SQL client](cockroach-sql.html) - [Install the client driver](install-client-drivers.html) for your preferred language - [Build an app with CockroachDB](build-an-app-with-cockroachdb.html) -- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [geo-partitioning](demo-low-latency-multi-region-deployment.html), 
[serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) +- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [multi-region performance](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) diff --git a/v21.1/start-a-local-cluster-in-docker-mac.md b/v21.1/start-a-local-cluster-in-docker-mac.md index b8afd3fce7e..333a5ed2ab6 100644 --- a/v21.1/start-a-local-cluster-in-docker-mac.md +++ b/v21.1/start-a-local-cluster-in-docker-mac.md @@ -35,4 +35,4 @@ Also, feel free to watch this process in action before going through the steps y - Learn more about [CockroachDB SQL](learn-cockroachdb-sql.html) and the [built-in SQL client](cockroach-sql.html) - [Install the client driver](install-client-drivers.html) for your preferred language - [Build an app with CockroachDB](build-an-app-with-cockroachdb.html) -- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [geo-partitioning](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) +- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [multi-region performance](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) diff --git a/v21.1/start-a-local-cluster-in-docker-windows.md b/v21.1/start-a-local-cluster-in-docker-windows.md index 4ad2b8b1395..a260aa19910 100644 --- a/v21.1/start-a-local-cluster-in-docker-windows.md +++ b/v21.1/start-a-local-cluster-in-docker-windows.md @@ -236,4 +236,4 @@ PS C:\Users\username> Remove-Item cockroach-data -recurse - Learn more about [CockroachDB SQL](learn-cockroachdb-sql.html) and the [built-in SQL client](cockroach-sql.html) - [Install the client driver](install-client-drivers.html) for your preferred language - [Build an app with CockroachDB](build-an-app-with-cockroachdb.html) -- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [geo-partitioning](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) +- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [multi-region performance](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) diff --git a/v21.1/start-a-local-cluster.md b/v21.1/start-a-local-cluster.md index 3772e45ad2e..6f0070c5640 100644 --- a/v21.1/start-a-local-cluster.md +++ b/v21.1/start-a-local-cluster.md @@ -376,4 +376,4 @@ Adding capacity is as simple as starting more nodes with `cockroach start`. 
- [Install the client driver](install-client-drivers.html) for your preferred language - Learn more about [CockroachDB SQL](learn-cockroachdb-sql.html) and the [built-in SQL client](cockroach-sql.html) - [Build an app with CockroachDB](build-an-app-with-cockroachdb.html) -- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [geo-partitioning](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) +- Further explore CockroachDB capabilities like [fault tolerance and automated repair](demo-fault-tolerance-and-recovery.html), [multi-region performance](demo-low-latency-multi-region-deployment.html), [serializable transactions](demo-serializable.html), and [JSON support](demo-json-support.html) diff --git a/v21.1/topology-duplicate-indexes.md b/v21.1/topology-duplicate-indexes.md deleted file mode 100644 index 2b7df847900..00000000000 --- a/v21.1/topology-duplicate-indexes.md +++ /dev/null @@ -1,191 +0,0 @@ ---- -title: Duplicate Indexes Topology -summary: Guidance on using the duplicate indexes topology in a multi-region deployment. -toc: true ---- - -In a multi-region deployment, the duplicate indexes pattern is a good choice for tables with the following requirements: - -- Read latency must be low, but write latency can be much higher. -- Reads must be up-to-date for business reasons or because the table is referenced by [foreign keys](foreign-key.html). -- Rows in the table, and all latency-sensitive queries, **cannot** be tied to specific geographies. -- Table data must remain available during a region failure. - -In general, this pattern is suited well for immutable/reference tables that are rarely or never updated. - - - -{{site.data.alerts.callout_success}} -**See It In Action** - Read about how a [financial software company](https://www.cockroachlabs.com/case-studies/top-u-s-financial-software-company-turns-to-cockroachdb-to-improve-its-application-login-experience/) is using the Duplicate Indexes topology for low latency reads in their identity access management layer. -{{site.data.alerts.end}} - -## Prerequisites - -### Fundamentals - -{% include {{ page.version.version }}/topology-patterns/fundamentals.md %} - -### Cluster setup - -{% include {{ page.version.version }}/topology-patterns/multi-region-cluster-setup.md %} - -## Configuration - -{{site.data.alerts.callout_info}} -Pinning secondary indexes requires an [Enterprise license](https://www.cockroachlabs.com/get-cockroachdb). -{{site.data.alerts.end}} - -### Summary - -Using this pattern, you tell CockroachDB to put the leaseholder for the table itself (also called the primary index) in one region, create 2 secondary indexes on the table, and tell CockroachDB to put the leaseholder for each secondary index in one of the other regions. This means that reads will access the local leaseholder (either for the table itself or for one of the secondary indexes). Writes, however, will still leave the region to get consensus for the table and its secondary indexes. - -Duplicate Indexes topology - -### Steps - -Assuming you have a [cluster deployed across three regions](#cluster-setup) and a table like the following: - -{% include copy-clipboard.html %} -~~~ sql -> CREATE TABLE postal_codes ( - id INT PRIMARY KEY, - code STRING -); -~~~ - -1. If you do not already have one, [request a trial Enterprise license](https://www.cockroachlabs.com/get-cockroachdb). - -2. 
[Create a replication zone](configure-zone.html) for the table and set a leaseholder preference telling CockroachDB to put the leaseholder for the table in one of the regions, for example `us-west`: - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER TABLE postal_codes - CONFIGURE ZONE USING - num_replicas = 3, - constraints = '{"+region=us-west":1}', - lease_preferences = '[[+region=us-west]]'; - ~~~ - -3. [Create secondary indexes](create-index.html) on the table for each of your other regions, including all of the columns you wish to read either in the key or in the key and a [`STORING`](create-index.html#store-columns) clause: - - {% include copy-clipboard.html %} - ~~~ sql - > CREATE INDEX idx_central ON postal_codes (id) - STORING (code); - ~~~ - - {% include copy-clipboard.html %} - ~~~ sql - > CREATE INDEX idx_east ON postal_codes (id) - STORING (code); - ~~~ - -4. [Create a replication zone](configure-zone.html) for each secondary index, in each case setting a leaseholder preference telling CockroachDB to put the leaseholder for the index in a distinct region: - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER INDEX postal_codes@idx_central - CONFIGURE ZONE USING - num_replicas = 3, - constraints = '{"+region=us-central":1}', - lease_preferences = '[[+region=us-central]]'; - ~~~ - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER INDEX postal_codes@idx_east - CONFIGURE ZONE USING - num_replicas = 3, - constraints = '{"+region=us-east":1}', - lease_preferences = '[[+region=us-east]]'; - ~~~ - -5. To confirm that replication zones are in effect, you can use the [`SHOW CREATE TABLE`](show-create.html): - - {% include copy-clipboard.html %} - ~~~ sql - > SHOW CREATE TABLE postal_codes; - ~~~ - - ~~~ - table_name | create_statement - +--------------+----------------------------------------------------------------------------+ - postal_codes | CREATE TABLE postal_codes ( - | id INT8 NOT NULL, - | code STRING NULL, - | CONSTRAINT "primary" PRIMARY KEY (id ASC), - | INDEX idx_central (id ASC) STORING (code), - | INDEX idx_east (id ASC) STORING (code), - | FAMILY "primary" (id, code) - | ); - | ALTER TABLE defaultdb.public.postal_codes CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-west: 1}', - | lease_preferences = '[[+region=us-west]]'; - | ALTER INDEX defaultdb.public.postal_codes@idx_central CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-central: 1}', - | lease_preferences = '[[+region=us-central]]'; - | ALTER INDEX defaultdb.public.postal_codes@idx_east CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-east: 1}', - | lease_preferences = '[[+region=us-east]]' - (1 row) - ~~~ - -## Characteristics - -### Latency - -#### Reads - -Reads access the local leaseholder and, therefore, never leave the region. This makes read latency very low. - -For example, in the animation below: - -1. The read request in `us-central` reaches the regional load balancer. -2. The load balancer routes the request to a gateway node. -3. The gateway node routes the request to the relevant leaseholder. In `us-west`, the leaseholder is for the table itself. In the other regions, the leaseholder is for the relevant index, which the [cost-based optimizer](cost-based-optimizer.html) uses due to the leaseholder preferences. -4. The leaseholder retrieves the results and returns to the gateway node. -5. The gateway node returns the results to the client. 
- -Pinned secondary indexes topology - -#### Writes - -The replicas for the table and its secondary indexes are spread across all 3 regions, so writes involve multiple network hops across regions to achieve consensus. This increases write latency significantly. It's also important to understand that the replication of extra indexes can reduce throughput and increase storage cost. - -For example, in the animation below: - -1. The write request in `us-central` reaches the regional load balancer. -2. The load balancer routes the request to a gateway node. -3. The gateway node routes the request to the leaseholder replicas for the table and its secondary indexes. -4. While each leaseholder appends the write to its Raft log, it notifies its follower replicas. -5. In each case, as soon as one follower has appended the write to its Raft log (and thus a majority of replicas agree based on identical Raft logs), it notifies the leaseholder and the write is committed on the agreeing replicas. -6. The leaseholders then return acknowledgement of the commit to the gateway node. -7. The gateway node returns the acknowledgement to the client. - -Duplicate Indexes topology - -### Resiliency - -Because this pattern balances the replicas for the table and its secondary indexes across regions, one entire region can fail without interrupting access to the table: - -Pinned Secondary Indexes topology - - - -## Alternatives - -- If reads from a table can be historical (4.8 seconds or more in the past), consider the [Follower Reads](topology-follower-reads.html) pattern. -- If rows in the table, and all latency-sensitive queries, can be tied to specific geographies, consider the [Geo-Partitioned Leaseholders](topology-geo-partitioned-leaseholders.html) pattern. Both patterns avoid extra secondary indexes, which increase data replication and, therefore, higher throughput and less storage. - -## Tutorial - -For a step-by-step demonstration of how this pattern gets you low-latency reads in a broadly distributed cluster, see the [Low Latency Multi-Region Deployment](demo-low-latency-multi-region-deployment.html) tutorial. - -## See also - -{% include {{ page.version.version }}/topology-patterns/see-also.md %} diff --git a/v21.1/topology-follow-the-workload.md b/v21.1/topology-follow-the-workload.md index 193b95ce296..3d70499e865 100644 --- a/v21.1/topology-follow-the-workload.md +++ b/v21.1/topology-follow-the-workload.md @@ -12,7 +12,7 @@ In a multi-region deployment, follow-the-workload is the default pattern for tab - Table data must remain available during a region failure. {{site.data.alerts.callout_success}} -If read performance is your main focus for a table, but you want low-latency reads everywhere instead of just in the most active region, consider the [Duplicate Indexes](topology-duplicate-indexes.html) or [Follower Reads](topology-follower-reads.html) pattern. +If read performance is your main focus for a table, but you want low-latency reads everywhere instead of just in the most active region, consider the [Global Table Locality Pattern](topology-global-tables.html) or [Follower Reads](topology-follower-reads.html) pattern. 
{{site.data.alerts.end}} ## Prerequisites diff --git a/v21.1/topology-follower-reads.md b/v21.1/topology-follower-reads.md index de9445311fe..a2d35d85900 100644 --- a/v21.1/topology-follower-reads.md +++ b/v21.1/topology-follower-reads.md @@ -12,7 +12,7 @@ In a multi-region deployment, the follower reads pattern is a good choice for ta - Table data must remain available during a region failure. {{site.data.alerts.callout_success}} -This pattern is compatible with all of the other multi-region patterns except [Geo-Partitioned Replicas](topology-geo-partitioned-replicas.html). However, if reads from a table must be exactly up-to-date, use the [Duplicate Indexes](topology-duplicate-indexes.html) or [Geo-Partitioned Leaseholders](topology-geo-partitioned-leaseholders.html) pattern instead. Up-to-date reads are required by tables referenced by [foreign keys](foreign-key.html), for example. +If reads from a table must be exactly up-to-date, use the [Global Table Locality Pattern](topology-global-tables.html) or [Regional Table Locality Pattern](topology-regional-tables.html) instead. Up-to-date reads are required by tables referenced by [foreign keys](foreign-key.html), for example. {{site.data.alerts.end}} ## Prerequisites diff --git a/v21.1/topology-geo-partitioned-leaseholders.md b/v21.1/topology-geo-partitioned-leaseholders.md deleted file mode 100644 index ce813ca9032..00000000000 --- a/v21.1/topology-geo-partitioned-leaseholders.md +++ /dev/null @@ -1,251 +0,0 @@ ---- -title: Geo-Partitioned Leaseholders Topology -summary: Common cluster topology patterns with setup examples and performance considerations. -toc: true ---- - -In a multi-region deployment, the geo-partitioned [leaseholders](architecture/replication-layer.html#leases) topology is a good choice for tables with the following requirements: - -- Read latency must be low, but write latency can be higher. -- Reads must be up-to-date for business reasons or because the table is referenced by [foreign keys](foreign-key.html). -- Rows in the table, and all latency-sensitive queries, can be tied to specific geographies, e.g., city, state, region. -- Table data must remain available during a region failure. - -{{site.data.alerts.callout_success}} -**See It In Action** - Read about how a [large telecom provider](https://www.cockroachlabs.com/case-studies/telecom-provider-replaces-amazon-aurora-with-cockroachdb-to-attain-analways-on-customer-experience/) with millions of customers across the United States is using the Geo-Partitioned Leaseholders topology in production for strong resiliency and performance. -{{site.data.alerts.end}} - -## Prerequisites - -### Fundamentals - -{% include {{ page.version.version }}/topology-patterns/fundamentals.md %} - -### Cluster setup - -{% include {{ page.version.version }}/topology-patterns/multi-region-cluster-setup.md %} - -## Configuration - -{{site.data.alerts.callout_info}} -Geo-partitioning requires an [Enterprise license](https://www.cockroachlabs.com/get-cockroachdb). -{{site.data.alerts.end}} - -### Summary - -Using this pattern, you design your table schema to allow for [partitioning](partitioning.html#table-creation), with a column identifying geography as the first column in the table's compound primary key (e.g., city/id). You tell CockroachDB to partition the table and all of its secondary indexes by that geography column, each partition becoming its own range of 3 replicas. 
You then tell CockroachDB to put the leaseholder for each partition in the relevant region (e.g., LA partitions in `us-west`, NY partitions in `us-east`). The other replicas of a partition remain balanced across the other regions. This means that reads in each region will access local leaseholders and, therefore, will have low, intra-region latencies. Writes, however, will leave the region to get consensus and, therefore, will have higher, cross-region latencies. - -Geo-partitioned leaseholders topology - -### Steps - -Assuming you have a [cluster deployed across three regions](#cluster-setup) and a table and secondary index like the following: - -{% include copy-clipboard.html %} -~~~ sql -> CREATE TABLE users ( - id UUID NOT NULL DEFAULT gen_random_uuid(), - city STRING NOT NULL, - first_name STRING NOT NULL, - last_name STRING NOT NULL, - address STRING NOT NULL, - PRIMARY KEY (city ASC, id ASC) -); -~~~ - -{% include copy-clipboard.html %} -~~~ sql -> CREATE INDEX users_last_name_index ON users (city, last_name); -~~~ - -1. If you do not already have one, [request a trial Enterprise license](https://www.cockroachlabs.com/get-cockroachdb). - -2. Partition the table by `city`. For example, assuming there are three possible `city` values, `los angeles`, `chicago`, and `new york`: - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER TABLE users PARTITION BY LIST (city) ( - PARTITION la VALUES IN ('los angeles'), - PARTITION chicago VALUES IN ('chicago'), - PARTITION ny VALUES IN ('new york') - ); - ~~~ - - This creates distinct ranges for each partition of the table. - -3. Partition the secondary index by `city` as well: - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER INDEX users_last_name_index PARTITION BY LIST (city) ( - PARTITION la VALUES IN ('los angeles'), - PARTITION chicago VALUES IN ('chicago'), - PARTITION ny VALUES IN ('new york') - ); - ~~~ - - This creates distinct ranges for each partition of the secondary index. - -4. For each partition of the table and its secondary index, [create a replication zone](configure-zone.html) that tells CockroachDB to put the partition's leaseholder in the relevant region: - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER PARTITION la OF INDEX users@* - CONFIGURE ZONE USING - num_replicas = 3, - constraints = '{"+region=us-west":1}', - lease_preferences = '[[+region=us-west]]'; - ALTER PARTITION chicago OF INDEX users@* - CONFIGURE ZONE USING - num_replicas = 3, - constraints = '{"+region=us-central":1}', - lease_preferences = '[[+region=us-central]]'; - ALTER PARTITION ny OF INDEX users@* - CONFIGURE ZONE USING - num_replicas = 3, - constraints = '{"+region=us-east":1}', - lease_preferences = '[[+region=us-east]]'; - ~~~ - -5. 
To confirm that partitions are in effect, you can use the [`SHOW CREATE TABLE`](show-create.html) or [`SHOW PARTITIONS`](show-partitions.html) statement: - - {% include copy-clipboard.html %} - ~~~ sql - > SHOW CREATE TABLE users; - ~~~ - - ~~~ - table_name | create_statement - +------------+----------------------------------------------------------------------------------------------------+ - users | CREATE TABLE users ( - | id UUID NOT NULL DEFAULT gen_random_uuid(), - | city STRING NOT NULL, - | first_name STRING NOT NULL, - | last_name STRING NOT NULL, - | address STRING NOT NULL, - | CONSTRAINT "primary" PRIMARY KEY (city ASC, id ASC), - | INDEX users_last_name_index (city ASC, last_name ASC) PARTITION BY LIST (city) ( - | PARTITION la VALUES IN (('los angeles')), - | PARTITION chicago VALUES IN (('chicago')), - | PARTITION ny VALUES IN (('new york')) - | ), - | FAMILY "primary" (id, city, first_name, last_name, address) - | ) PARTITION BY LIST (city) ( - | PARTITION la VALUES IN (('los angeles')), - | PARTITION chicago VALUES IN (('chicago')), - | PARTITION ny VALUES IN (('new york')) - | ); - | ALTER PARTITION chicago OF INDEX defaultdb.public.users@primary CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-central: 1}', - | lease_preferences = '[[+region=us-central]]'; - | ALTER PARTITION la OF INDEX defaultdb.public.users@primary CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-west: 1}', - | lease_preferences = '[[+region=us-west]]'; - | ALTER PARTITION ny OF INDEX defaultdb.public.users@primary CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-east: 1}', - | lease_preferences = '[[+region=us-east]]'; - | ALTER PARTITION chicago OF INDEX defaultdb.public.users@users_last_name_index CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-central1: 1}', - | lease_preferences = '[[+region=us-central1]]'; - | ALTER PARTITION la OF INDEX defaultdb.public.users@users_last_name_index CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-west: 1}', - | lease_preferences = '[[+region=us-west]]'; - | ALTER PARTITION ny OF INDEX defaultdb.public.users@users_last_name_index CONFIGURE ZONE USING - | num_replicas = 3, - | constraints = '{+region=us-east: 1}', - | lease_preferences = '[[+region=us-east]]' - (1 row) - ~~~ - - {% include copy-clipboard.html %} - ~~~ sql - > SHOW PARTITIONS FROM TABLE users; - ~~~ - - ~~~ - database_name | table_name | partition_name | parent_partition | column_names | index_name | partition_value | zone_config - +---------------+------------+----------------+------------------+--------------+-----------------------------+-----------------+-----------------------------------------------+ - defaultdb | users | la | NULL | city | users@primary | ('los angeles') | num_replicas = 3, - | | | | | | | constraints = '{+region=us-west1: 1}', - | | | | | | | lease_preferences = '[[+region=us-west1]]' - defaultdb | users | chicago | NULL | city | users@primary | ('chicago') | num_replicas = 3, - | | | | | | | constraints = '{+region=us-central1: 1}', - | | | | | | | lease_preferences = '[[+region=us-central1]]' - defaultdb | users | ny | NULL | city | users@primary | ('new york') | num_replicas = 3, - | | | | | | | constraints = '{+region=us-east1: 1}', - | | | | | | | lease_preferences = '[[+region=us-east1]]' - defaultdb | users | la | NULL | city | users@users_last_name_index | ('los angeles') | num_replicas = 3, - | | | | | | | constraints = '{+region=us-west1: 
1}', - | | | | | | | lease_preferences = '[[+region=us-west1]]' - defaultdb | users | chicago | NULL | city | users@users_last_name_index | ('chicago') | num_replicas = 3, - | | | | | | | constraints = '{+region=us-central1: 1}', - | | | | | | | lease_preferences = '[[+region=us-central1]]' - defaultdb | users | ny | NULL | city | users@users_last_name_index | ('new york') | num_replicas = 3, - | | | | | | | constraints = '{+region=us-east1: 1}', - | | | | | | | lease_preferences = '[[+region=us-east1]]' - (6 rows) - ~~~ - -{{site.data.alerts.callout_success}} -As you scale and add more cities, you can repeat steps 2 and 3 with the new complete list of cities to re-partition the table and its secondary indexes, and then repeat step 4 to create replication zones for the new partitions. -{{site.data.alerts.end}} - -{% include {{page.version.version}}/sql/crdb-internal-partitions.md %} - -## Characteristics - -### Latency - -#### Reads - -Because each partition's leaseholder is constrained to the relevant region (e.g., the `la` partitions' leaseholders are located in the `us-west` region), reads that specify the local region key access the relevant leaseholder locally. This makes read latency very low, with the exception of reads that do not specify a region key or that refer to a partition in another region. - -For example, in the animation below: - -1. The read request in `us-west` reaches the regional load balancer. -2. The load balancer routes the request to a gateway node. -3. The gateway node routes the request to the leaseholder for the relevant partition. -4. The leaseholder retrieves the results and returns to the gateway node. -5. The gateway node returns the results to the client. - -Geo-partitioned leaseholders topology - -#### Writes - -Just like for reads, because each partition's leaseholder is constrained to the relevant region (e.g., the `la` partitions' leaseholders are located in the `us-west` region), writes that specify the local region key access the relevant leaseholder replicas locally. However, a partition's other replicas are spread across the other regions, so writes involve multiple network hops across regions to achieve consensus. This increases write latency significantly. - -For example, in the animation below: - -1. The write request in `us-west` reaches the regional load balancer. -2. The load balancer routes the request to a gateway node. -3. The gateway node routes the request to the leaseholder replicas for the relevant table and secondary index partitions. -4. While each leaseholder appends the write to its Raft log, it notifies its follower replicas, which are in the other regions. -5. In each case, as soon as one follower has appended the write to its Raft log (and thus a majority of replicas agree based on identical Raft logs), it notifies the leaseholder and the write is committed on the agreeing replicas. -6. The leaseholders then return acknowledgement of the commit to the gateway node. -7. The gateway node returns the acknowledgement to the client. - -Geo-partitioned leaseholders topology - -### Resiliency - -Because this pattern balances the replicas for each partition across regions, one entire region can fail without interrupting access to any partitions. 
In this case, if any range loses its leaseholder in the region-wide outage, CockroachDB makes one of the range's other replicas the leaseholder: - -Geo-partitioning topology - - - -## Alternatives - -- If reads from a table can be historical (4.8 seconds or more in the past), consider the [Follower Reads](topology-follower-reads.html) pattern. -- If rows in the table, and all latency-sensitive queries, **cannot** be tied to specific geographies, consider the [Duplicate Indexes](topology-duplicate-indexes.html) pattern. - -## See also - -{% include {{ page.version.version }}/topology-patterns/see-also.md %} diff --git a/v21.1/topology-geo-partitioned-replicas.md b/v21.1/topology-geo-partitioned-replicas.md deleted file mode 100644 index 2749650cc0b..00000000000 --- a/v21.1/topology-geo-partitioned-replicas.md +++ /dev/null @@ -1,224 +0,0 @@ ---- -title: Geo-Partitioned Replicas Topology -summary: Guidance on using the geo-partitioned replicas topology in a multi-region deployment. -toc: true ---- - -In a multi-region deployment, the geo-partitioned replicas topology is a good choice for tables with the following requirements: - -- Read and write latency must be low. -- Rows in the table, and all latency-sensitive queries, can be tied to specific geographies, e.g., city, state, region. -- Regional data must remain available during an AZ failure, but it's OK for regional data to become unavailable during a region-wide failure. - -{{site.data.alerts.callout_success}} -**See It In Action** - Read about how an [electronic lock manufacturer](https://www.cockroachlabs.com/case-studies/european-electronic-lock-manufacturer-modernizes-iam-system-with-managed-cockroachdb/) and [multi-national bank](https://www.cockroachlabs.com/case-studies/top-five-multinational-bank-modernizes-its-european-core-banking-services-migrating-from-oracle-to-cockroachdb/) are using the Geo-Partitioned Replicas topology in production for improved performance and regulatory compliance. -{{site.data.alerts.end}} - -## Prerequisites - -### Fundamentals - -{% include {{ page.version.version }}/topology-patterns/fundamentals.md %} - -### Cluster setup - -{% include {{ page.version.version }}/topology-patterns/multi-region-cluster-setup.md %} - -## Configuration - -{{site.data.alerts.callout_info}} -Geo-partitioning requires an [Enterprise license](https://www.cockroachlabs.com/get-cockroachdb). -{{site.data.alerts.end}} - -### Summary - -Using this pattern, you design your table schema to allow for [partitioning](partitioning.html#table-creation), with a column identifying geography as the first column in the table's compound primary key (e.g., city/id). You tell CockroachDB to partition the table and all of its secondary indexes by that geography column, each partition becoming its own range of 3 replicas. You then tell CockroachDB to pin each partition (all of its replicas) to the relevant region (e.g., LA partitions in `us-west`, NY partitions in `us-east`). This means that reads and writes in each region will always have access to the relevant replicas and, therefore, will have low, intra-region latencies. 
- -Geo-partitioning topology - -### Steps - -Assuming you have a [cluster deployed across three regions](#cluster-setup) and a table and secondary index like the following: - -{% include copy-clipboard.html %} -~~~ sql -> CREATE TABLE users ( - id UUID NOT NULL DEFAULT gen_random_uuid(), - city STRING NOT NULL, - first_name STRING NOT NULL, - last_name STRING NOT NULL, - address STRING NOT NULL, - PRIMARY KEY (city ASC, id ASC) -); -~~~ - -{% include copy-clipboard.html %} -~~~ sql -> CREATE INDEX users_last_name_index ON users (city, last_name); -~~~ - -{{site.data.alerts.callout_info}} -A geo-partitioned table does not require a secondary index. However, if the table does have one or more secondary indexes, each index must be partitioned as well. This means that the indexes must start with the column identifying geography, like the table itself, which impacts the queries they'll be useful for. If you can't partition all secondary indexes on a table you want to geo-partition, consider the [Geo-Partitioned Leaseholders](topology-geo-partitioned-leaseholders.html) pattern instead. -{{site.data.alerts.end}} - -1. If you do not already have one, [request a trial Enterprise license](https://www.cockroachlabs.com/get-cockroachdb). - -2. Partition the table by `city`. For example, assuming there are three possible `city` values, `los angeles`, `chicago`, and `new york`: - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER TABLE users PARTITION BY LIST (city) ( - PARTITION la VALUES IN ('los angeles'), - PARTITION chicago VALUES IN ('chicago'), - PARTITION ny VALUES IN ('new york') - ); - ~~~ - - This creates distinct ranges for each partition of the table. - -3. Partition the secondary index by `city` as well: - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER INDEX users_last_name_index PARTITION BY LIST (city) ( - PARTITION la VALUES IN ('los angeles'), - PARTITION chicago VALUES IN ('chicago'), - PARTITION ny VALUES IN ('new york') - ); - ~~~ - - This creates distinct ranges for each partition of the secondary index. - -4. For each partition of the table and its secondary index, [create a replication zone](configure-zone.html) that constrains the partition's replicas to nodes in the relevant region: - - {{site.data.alerts.callout_success}} - The `@*` syntax lets you create zone configurations for all identically named partitions of a table, saving you multiple steps. - {{site.data.alerts.end}} - - {% include copy-clipboard.html %} - ~~~ sql - > ALTER PARTITION la OF INDEX users@* - CONFIGURE ZONE USING constraints = '[+region=us-west]'; - ALTER PARTITION chicago OF INDEX users@* - CONFIGURE ZONE USING constraints = '[+region=us-central]'; - ALTER PARTITION ny OF INDEX users@* - CONFIGURE ZONE USING constraints = '[+region=us-east]'; - ~~~ - -5. 
To confirm that partitions are in effect, you can use the [`SHOW CREATE TABLE`](show-create.html) or [`SHOW PARTITIONS`](show-partitions.html) statement: - - {% include copy-clipboard.html %} - ~~~ sql - > SHOW CREATE TABLE users; - ~~~ - - ~~~ - table_name | create_statement - +------------+----------------------------------------------------------------------------------------------------+ - users | CREATE TABLE users ( - | id UUID NOT NULL DEFAULT gen_random_uuid(), - | city STRING NOT NULL, - | first_name STRING NOT NULL, - | last_name STRING NOT NULL, - | address STRING NOT NULL, - | CONSTRAINT "primary" PRIMARY KEY (city ASC, id ASC), - | INDEX users_last_name_index (city ASC, last_name ASC) PARTITION BY LIST (city) ( - | PARTITION la VALUES IN (('los angeles')), - | PARTITION chicago VALUES IN (('chicago')), - | PARTITION ny VALUES IN (('new york')) - | ), - | FAMILY "primary" (id, city, first_name, last_name, address) - | ) PARTITION BY LIST (city) ( - | PARTITION la VALUES IN (('los angeles')), - | PARTITION chicago VALUES IN (('chicago')), - | PARTITION ny VALUES IN (('new york')) - | ); - | ALTER PARTITION chicago OF INDEX defaultdb.public.users@primary CONFIGURE ZONE USING - | constraints = '[+region=us-central]'; - | ALTER PARTITION la OF INDEX defaultdb.public.users@primary CONFIGURE ZONE USING - | constraints = '[+region=us-west]'; - | ALTER PARTITION ny OF INDEX defaultdb.public.users@primary CONFIGURE ZONE USING - | constraints = '[+region=us-east]'; - | ALTER PARTITION chicago OF INDEX defaultdb.public.users@users_last_name_index CONFIGURE ZONE USING - | constraints = '[+region=us-central]'; - | ALTER PARTITION la OF INDEX defaultdb.public.users@users_last_name_index CONFIGURE ZONE USING - | constraints = '[+region=us-west]'; - | ALTER PARTITION ny OF INDEX defaultdb.public.users@users_last_name_index CONFIGURE ZONE USING - | constraints = '[+region=us-east]' - (1 row) - ~~~ - - {% include copy-clipboard.html %} - ~~~ sql - > SHOW PARTITIONS FROM TABLE users; - ~~~ - - ~~~ - database_name | table_name | partition_name | parent_partition | column_names | index_name | partition_value | zone_config - +---------------+------------+----------------+------------------+--------------+-----------------------------+-----------------+---------------------------------------+ - defaultdb | users | la | NULL | city | users@primary | ('los angeles') | constraints = '[+region=us-west]' - defaultdb | users | chicago | NULL | city | users@primary | ('chicago') | constraints = '[+region=us-central]' - defaultdb | users | ny | NULL | city | users@primary | ('new york') | constraints = '[+region=us-east]' - defaultdb | users | la | NULL | city | users@users_last_name_index | ('los angeles') | constraints = '[+region=us-west]' - defaultdb | users | chicago | NULL | city | users@users_last_name_index | ('chicago') | constraints = '[+region=us-central]' - defaultdb | users | ny | NULL | city | users@users_last_name_index | ('new york') | constraints = '[+region=us-east]' - (6 rows) - ~~~ - -{{site.data.alerts.callout_success}} -As you scale and add more cities, you can repeat steps 2 and 3 with the new complete list of cities to re-partition the table and its secondary indexes, and then repeat step 4 to create replication zones for the new partitions. 
-{{site.data.alerts.end}} - -{% include {{page.version.version}}/sql/crdb-internal-partitions.md %} - -## Characteristics - -### Latency - -#### Reads - -Because each partition is constrained to the relevant region (e.g., the `la` partitions are located in the `us-west` region), reads that specify the local region key access the relevant leaseholder locally. This makes read latency very low, with the exception of reads that do not specify a region key or that refer to a partition in another region; such reads will be transactionally consistent but won't have local latencies. - -For example, in the animation below: - -1. The read request in `us-central` reaches the regional load balancer. -2. The load balancer routes the request to a gateway node. -3. The gateway node routes the request to the leaseholder for the relevant partition. -4. The leaseholder retrieves the results and returns to the gateway node. -5. The gateway node returns the results to the client. - -Geo-partitioning topology - -#### Writes - -Just like for reads, because each partition is constrained to the relevant region (e.g., the `la` partitions are located in the `us-west` region), writes that specify the local region key access the relevant replicas without leaving the region. This makes write latency very low, with the exception of writes that do not specify a region key or that refer to a partition in another region; such writes will be transactionally consistent but won't have local latencies. - -For example, in the animation below: - -1. The write request in `us-central` reaches the regional load balancer. -2. The load balancer routes the request to a gateway node. -3. The gateway node routes the request to the leaseholder replicas for the relevant table and secondary index partitions. -4. While each leaseholder appends the write to its Raft log, it notifies its follower replicas, which are in the same region. -5. In each case, as soon as one follower has appended the write to its Raft log (and thus a majority of replicas agree based on identical Raft logs), it notifies the leaseholder and the write is committed on the agreeing replicas. -6. The leaseholders then return acknowledgement of the commit to the gateway node. -7. The gateway node returns the acknowledgement to the client. - -Geo-partitioning topology - -### Resiliency - -Because each partition is constrained to the relevant region and balanced across the 3 AZs in the region, one AZ can fail per region without interrupting access to the partitions in that region: - -Geo-partitioning topology - -However, if an entire region fails, the partitions in that region become unavailable for reads and writes, even if your load balancer can redirect requests to a different region: - -Geo-partitioning topology - -## Tutorial - -For a step-by-step demonstration of how this pattern gets you low-latency reads and writes in a broadly distributed cluster, see the [Low Latency Multi-Region Deployment](demo-low-latency-multi-region-deployment.html) tutorial. - -## See also - -{% include {{ page.version.version }}/topology-patterns/see-also.md %} diff --git a/v21.1/topology-global-tables.md b/v21.1/topology-global-tables.md new file mode 100644 index 00000000000..a271ba2b00b --- /dev/null +++ b/v21.1/topology-global-tables.md @@ -0,0 +1,116 @@ +--- +title: The Global Table Locality Pattern +summary: Guidance on using the GLOBAL table locality pattern in a multi-region deployment. 
+toc: true
+redirect_from: topology-duplicate-indexes.html
+---
+
+In a [multi-region deployment](multiregion-overview.html), the [`GLOBAL` table locality](multiregion-overview.html#global-tables) is a good choice for tables with the following requirements:
+
+- Read latency must be low, but write latency can be much higher.
+- Reads must be up-to-date for business reasons or because the table is referenced by [foreign keys](foreign-key.html).
+- Rows in the table, and all latency-sensitive queries, **cannot** be tied to specific geographies.
+
+In general, this pattern is well suited for reference tables that are rarely updated.
+
+{{site.data.alerts.callout_info}}
+Tables with the `GLOBAL` locality can survive zone or region failures, depending on the database-level [survival goal](multiregion-overview.html#survival-goals) setting.
+{{site.data.alerts.end}}
+
+## Prerequisites
+
+### Fundamentals
+
+{% include {{ page.version.version }}/topology-patterns/multiregion-fundamentals.md %}
+
+### Cluster setup
+
+{% include {{ page.version.version }}/topology-patterns/multi-region-cluster-setup.md %}
+
+## Configuration
+
+{{site.data.alerts.callout_info}}
+`GLOBAL` tables (and the other [multi-region capabilities](multiregion-overview.html)) require an [Enterprise license](https://www.cockroachlabs.com/get-cockroachdb).
+{{site.data.alerts.end}}
+
+### Summary
+
+Using this pattern, you tell CockroachDB to set the [table locality](multiregion-overview.html#table-locality) to `GLOBAL`, and CockroachDB handles the details.
+
+{% include {{page.version.version}}/sql/global-table-description.md %}
+
+### Steps
+
+{% include {{page.version.version}}/topology-patterns/multiregion-db-setup.md %}
+
+Next, create a [`GLOBAL` table](multiregion-overview.html#global-tables) by issuing the following statement:
+
+{% include copy-clipboard.html %}
+~~~ sql
+CREATE TABLE postal_codes (
+    id INT PRIMARY KEY,
+    code STRING
+) LOCALITY GLOBAL;
+~~~
+
+Alternatively, you can set an existing table's locality to `GLOBAL` using [`ALTER TABLE ... SET LOCALITY`](set-locality.html):
+
+{% include copy-clipboard.html %}
+~~~ sql
+ALTER TABLE postal_codes SET LOCALITY GLOBAL;
+~~~
+
+To confirm that your `postal_codes` table data is replicated across the cluster in accordance with the `GLOBAL` table locality, check the **Data Distribution** [debug page](ui-debug-pages.html) in the [DB Console](ui-overview.html). It will look something like the output below (which is edited for length). Translating from zone configurations into human language, this output says:
+
+- Make the database resilient to zone failures, with replicas in each region (this is the default [`ZONE` survival goal](multiregion-overview.html#survival-goals)).
+- Put the leaseholders in `us-east`, since it's the [primary database region](multiregion-overview.html#database-regions).
+- Finally, make the `postal_codes` table a [global table](multiregion-overview.html#global-tables).
+
+~~~
+ALTER DATABASE test CONFIGURE ZONE USING
+    num_replicas = 5,
+    num_voters = 3,
+    constraints = '{+region=us-central: 1, +region=us-east: 1, +region=us-west: 1}',
+    voter_constraints = '[+region=us-east]',
+    lease_preferences = '[[+region=us-east]]'
+...
+ALTER TABLE test.public.postal_codes CONFIGURE ZONE USING
+    global_reads = true
+~~~
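+
+You can also confirm the locality at the SQL level. As a quick sketch (the exact output formatting varies by version), [`SHOW CREATE TABLE`](show-create.html) reports the locality as part of the `CREATE` statement:
+
+{% include copy-clipboard.html %}
+~~~ sql
+SHOW CREATE TABLE postal_codes;
+~~~
+
+The `create_statement` column should end with `LOCALITY GLOBAL`.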
+
+{{site.data.alerts.callout_success}}
+A better way to check that your [table locality settings](multiregion-overview.html#table-locality) are having the expected effect is to monitor how the performance metrics of a workload change as the settings are applied to a running cluster. For a tutorial showing how to use table localities to improve performance metrics across a multi-region cluster, see [Low Latency Reads and Writes in a Multi-Region Cluster](demo-low-latency-multi-region-deployment.html).
+{{site.data.alerts.end}}
+
+## Characteristics
+
+### Latency
+
+Global tables support low-latency, global reads of read-mostly data using an extension to CockroachDB's standard transaction protocol called [non-blocking transactions](architecture/transaction-layer.html#non-blocking-transactions).
+
+#### Reads
+
+Thanks to the [non-blocking transaction](architecture/transaction-layer.html#non-blocking-transactions) protocol extension, reads against `GLOBAL` tables access a consistent local replica and therefore never leave the region. This keeps read latency low.
+
+#### Writes
+
+Writes incur higher latencies than reads, since they require a "commit-wait" step to ensure consistency. For more information about how this works, see [non-blocking transactions](architecture/transaction-layer.html#non-blocking-transactions).
+
+### Resiliency
+
+Because the `test` database does not specify a [survival goal](multiregion-overview.html#survival-goals), it uses the default [`ZONE` survival goal](multiregion-overview.html#surviving-zone-failures). With the default settings, an entire AZ can fail without interrupting access to the database.
+
+For more information about how to choose a database survival goal, see [When to use `ZONE` vs. `REGION` survival goals](when-to-use-zone-vs-region-survival-goals.html).
+
+## Alternatives
+
+- If your application can tolerate historical reads in some cases, consider the [Follower Reads](topology-follower-reads.html) pattern.
+- If rows in the table, and all latency-sensitive queries, can be tied to specific geographies, consider the [`REGIONAL` Table Locality Pattern](topology-regional-tables.html).
+
+## Tutorial
+
+For a step-by-step demonstration showing how CockroachDB's multi-region capabilities (including `GLOBAL` and `REGIONAL` tables) give you low-latency reads in a distributed cluster, see the tutorial on [Low Latency Reads and Writes in a Multi-Region Cluster](demo-low-latency-multi-region-deployment.html).
+
+## See also
+
+{% include {{ page.version.version }}/topology-patterns/see-also.md %}
diff --git a/v21.1/topology-patterns.md b/v21.1/topology-patterns.md
index 7e9b4489fd7..2242591e908 100644
--- a/v21.1/topology-patterns.md
+++ b/v21.1/topology-patterns.md
@@ -6,7 +6,7 @@ redirect_from: cluster-topology-patterns.html
 key: cluster-topology-patterns.html
 ---
 
-This section provides recommended topology patterns for running CockroachDB in a cloud environment, each with required configurations and latency and resiliency characteristics.
+This section provides recommended topologies for running CockroachDB in a cloud environment, each with required configurations and latency and resiliency characteristics.
 
 {{site.data.alerts.callout_info}}
 You can observe latency patterns for your cluster on the [Network Latency page](ui-network-latency-page.html) of the DB Console.
@@ -23,19 +23,23 @@ Pattern | Latency | Resiliency | Configuration
 
 ## Multi-region patterns
 
-When your clients are in multiple geographic regions, it is important to deploy your cluster across regions properly and then carefully choose the right topology for each of your tables. Not doing so can result in unexpected latency and resiliency.
+When your clients are in multiple geographic regions, it is important to deploy your cluster across regions properly and then carefully choose:
+
+1. The right [survival goal](multiregion-overview.html#survival-goals) for each database.
+1. The right [table locality](multiregion-overview.html#table-locality) for each of your tables.
+
+Not doing so can result in unexpected latency and reduced resiliency. For more information, see the [Multi-Region Capabilities Overview](multiregion-overview.html).
 
 {{site.data.alerts.callout_info}}
-Multi-region patterns are almost always table-specific. For example, you might use the [Geo-Partitioning Replicas](topology-geo-partitioned-replicas.html) pattern for frequently updated tables that are geographically specific and the [Duplicate Indexes](topology-duplicate-indexes.html) pattern for reference tables that are not tied to geography and that are read frequently but updated infrequently.
+The multi-region patterns described below are almost always table-specific. For example, you might use [Regional Tables](topology-regional-tables.html) for frequently updated tables that are geographically specific, and [Global Tables](topology-global-tables.html) for reference tables that are not tied to geography and that are read frequently but updated infrequently.
 {{site.data.alerts.end}}
 
-Pattern | Latency | Resiliency | Configuration
---------|---------|------------|--------------
-[Geo-Partitioned Replicas](topology-geo-partitioned-replicas.html) | • Fast regional reads and writes | • 1 AZ failure per partition | • Geo-partitioned table<br>• Partition replicas pinned to regions
-[Geo-Partitioned Leaseholders](topology-geo-partitioned-leaseholders.html) | • Fast regional reads<br>• Slower cross-region writes | • 1 region failure | • Geo-partitioned table<br>• Partition replicas spread across regions<br>• Partition leaseholders pinned to regions
-[Duplicate Indexes](topology-duplicate-indexes.html) | • Fast regional reads (current)<br>• Much slower cross-region writes | • 1 region failure | • Multiple identical indexes<br>• Index replicas spread across regions<br>• Index leaseholders pinned to regions
-[Follower Reads](topology-follower-reads.html) | • Fast regional reads (historical)<br>• Slower cross-region writes | • 1 region failure | • App configured to use follower reads
-[Follow-the-Workload](topology-follow-the-workload.html) | • Fast regional reads (active region)<br>• Slower cross-region reads (elsewhere)<br>• Slower cross-region writes | • 1 region failure | • None
    +| Pattern | Latency | Resiliency | +|----------------------------------------------------------+------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------| +| [Regional Tables](topology-regional-tables.html) | Low latency for single-region writes and multi-region stale reads. | Depends on your [survival goals](multiregion-overview.html#survival-goals). | +| [Global Tables](topology-global-tables.html) | Low-latency multi-region reads; writes are higher latency than reads. | Depends on your [survival goals](multiregion-overview.html#survival-goals). | +| [Follower Reads](topology-follower-reads.html) | Fast regional (historical) reads, slower cross-region writes. | Depends on your [survival goals](multiregion-overview.html#survival-goals). | +| [Follow-the-Workload](topology-follow-the-workload.html) | Fast regional reads in the active region; slower cross-region reads elsewhere. Slower cross-region writes. | Depends on your [survival goals](multiregion-overview.html#survival-goals). | ## Anti-patterns @@ -43,4 +47,3 @@ The following anti-patterns are ineffective or risky: - Single-region deployments using 2 AZs, or multi-region deployments using 2 regions. In these cases, the cluster would be unable to survive the loss of a single AZ or a single region, respectively. - Broadly distributed multi-region deployments (e.g., `us-west`, `asia`, and `europe`) using only the default [Follow-the-Workload](topology-follow-the-workload.html) pattern. In this case, latency will likely be unacceptably high. -- [Geo-partitioned tables](topology-geo-partitioned-replicas.html) with non-partitioned secondary indexes. In this case, writes will incur cross-region latency to achieve consensus on the non-partitioned indexes. diff --git a/v21.1/topology-regional-tables.md b/v21.1/topology-regional-tables.md new file mode 100644 index 00000000000..2e3f8edf252 --- /dev/null +++ b/v21.1/topology-regional-tables.md @@ -0,0 +1,141 @@ +--- +title: The Regional Table Locality Pattern +summary: Guidance on using the Regional Table Locality Pattern in a multi-region deployment. +toc: true +redirect_from: +- topology-geo-partitioned-replicas.html +- topology-geo-partitioned-leaseholders.html +--- + +In a [multi-region deployment](multiregion-overview.html), the [Regional Table Locality Pattern](multiregion-overview.html#table-locality) is a good choice for tables with the following requirements: + +- Read and write latency must be low. +- Rows in the table, and all latency-sensitive queries, can be tied to specific geographies, e.g., city, state, region. + +{{site.data.alerts.callout_info}} +Tables with the Regional Table Locality Pattern can survive zone or region failures, depending on the database-level [survival goal](multiregion-overview.html#survival-goals) setting. +{{site.data.alerts.end}} + +## Prerequisites + +### Fundamentals + +{% include {{ page.version.version }}/topology-patterns/multiregion-fundamentals.md %} + +### Cluster setup + +{% include {{ page.version.version }}/topology-patterns/multi-region-cluster-setup.md %} + +## Configuration + +{{site.data.alerts.callout_info}} +Regional tables (and the other [multi-region capabilities](multiregion-overview.html)) require an [Enterprise license](https://www.cockroachlabs.com/get-cockroachdb). 
+{{site.data.alerts.end}}
+
+### Summary
+
+Using this pattern, you tell CockroachDB to set the [table locality](multiregion-overview.html#table-locality) to either [`REGIONAL BY TABLE`](#regional-tables) or [`REGIONAL BY ROW`](#regional-by-row-tables), and CockroachDB handles the details.
+
+#### Regional tables
+
+{% include {{page.version.version}}/sql/regional-table-description.md %}
+
+#### Regional by row tables
+
+{% include {{page.version.version}}/sql/regional-by-row-table-description.md %}
+
+### Steps
+
+{% include {{page.version.version}}/topology-patterns/multiregion-db-setup.md %}
+
+Next, create a `users` table:
+
+{% include copy-clipboard.html %}
+~~~ sql
+CREATE TABLE users (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    city STRING NOT NULL,
+    first_name STRING NOT NULL,
+    last_name STRING NOT NULL,
+    address STRING NOT NULL
+);
+~~~
+
+Since all tables in a multi-region database default to the [`REGIONAL BY TABLE`](#regional-tables) locality setting, let's set the table's locality to [`REGIONAL BY ROW`](#regional-by-row-tables) using the following [`ALTER TABLE`](alter-table.html) statements: [`ADD COLUMN`](add-column.html), [`ALTER COLUMN`](alter-column.html), and [`SET LOCALITY`](set-locality.html).
+
+{% include copy-clipboard.html %}
+~~~ sql
+ALTER TABLE users ADD COLUMN region crdb_internal_region AS (
+    CASE WHEN city = 'milwaukee' THEN 'us-central'
+         WHEN city = 'chicago' THEN 'us-central'
+         WHEN city = 'dallas' THEN 'us-central'
+         WHEN city = 'new york' THEN 'us-east'
+         WHEN city = 'boston' THEN 'us-east'
+         WHEN city = 'washington dc' THEN 'us-east'
+         WHEN city = 'san francisco' THEN 'us-west'
+         WHEN city = 'seattle' THEN 'us-west'
+         WHEN city = 'los angeles' THEN 'us-west'
+    END
+) STORED;
+ALTER TABLE users ALTER COLUMN region SET NOT NULL;
+ALTER TABLE users SET LOCALITY REGIONAL BY ROW AS "region";
+~~~
+
+To confirm that your `users` table data is replicated across the cluster in accordance with the `REGIONAL BY ROW` table locality, check the **Data Distribution** [debug page](ui-debug-pages.html) in the [DB Console](ui-overview.html). It will look something like the output below (which is edited for length). Translating from zone configurations into human language, this output says:
+
+- Make the database resilient to zone failures, with replicas in each region (this is the default [`ZONE` survival goal](multiregion-overview.html#survival-goals)).
+- Put the leaseholders in `us-east`, since it's the [primary database region](multiregion-overview.html#database-regions).
+- Make the `users` table a [regional by row table](#regional-by-row-tables) by [partitioning](partitioning.html) the [primary key index](primary-key.html) by region. When a row is inserted or updated, the region the row is associated with is set as part of that write. For details, see the instructions for [updating a row's home region](set-locality.html#crdb_region). Thanks to CockroachDB's [multi-region capabilities](multiregion-overview.html), you do not need to do any partitioning "by hand"; the database does it for you based on your desired [table locality setting](multiregion-overview.html#table-locality).
+
+~~~
+ALTER DATABASE test CONFIGURE ZONE USING
+    num_replicas = 5,
+    num_voters = 3,
+    constraints = '{+region=us-central: 1, +region=us-east: 1, +region=us-west: 1}',
+    voter_constraints = '[+region=us-east]',
+    lease_preferences = '[[+region=us-east]]'
+...
+ALTER PARTITION "us-central" OF INDEX test.public.users@primary CONFIGURE ZONE USING
+    num_voters = 3,
+    voter_constraints = '[+region=us-central]',
+    lease_preferences = '[[+region=us-central]]'
+ALTER PARTITION "us-east" OF INDEX test.public.users@primary CONFIGURE ZONE USING
+    num_voters = 3,
+    voter_constraints = '[+region=us-east]',
+    lease_preferences = '[[+region=us-east]]'
+ALTER PARTITION "us-west" OF INDEX test.public.users@primary CONFIGURE ZONE USING
+    num_voters = 3,
+    voter_constraints = '[+region=us-west]',
+    lease_preferences = '[[+region=us-west]]'
+~~~
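+
+As a quick sanity check (a sketch that assumes the city-to-region mappings above; the names are illustrative), you can insert a few rows and confirm that the `region` column is computed from `city` without being supplied explicitly:
+
+{% include copy-clipboard.html %}
+~~~ sql
+INSERT INTO users (city, first_name, last_name, address)
+    VALUES ('chicago', 'Imogen', 'Quintero', '100 Main St'),
+           ('seattle', 'Rafael', 'Nunez', '200 Pine St');
+
+SELECT region, city, first_name FROM users;
+~~~
+
+The first row should report a `region` of `us-central` and the second `us-west`, since `region` is a stored computed column derived from `city`.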
+ALTER PARTITION "europe-west1" OF INDEX test.public.users@primary CONFIGURE ZONE USING + num_voters = 3, + voter_constraints = '[+region=europe-west1]', + lease_preferences = '[[+region=europe-west1]]' +ALTER PARTITION "us-east1" OF INDEX test.public.users@primary CONFIGURE ZONE USING + num_voters = 3, + voter_constraints = '[+region=us-east1]', + lease_preferences = '[[+region=us-east1]]' +ALTER PARTITION "us-west1" OF INDEX test.public.users@primary CONFIGURE ZONE USING + num_voters = 3, + voter_constraints = '[+region=us-west1]', + lease_preferences = '[[+region=us-west1]]' +~~~ + +{{site.data.alerts.callout_success}} +A better way to check that your [table locality settings](multiregion-overview.html#table-locality) are having the expected effect is by monitoring how the performance metrics of a workload change as the settings are applied to a running cluster. For a tutorial showing how to use table localities to improve performance metrics across a multi-region cluster, see [Low Latency Reads and Writes in a Multi-Region Cluster](demo-low-latency-multi-region-deployment.html). +{{site.data.alerts.end}} + +## Characteristics + +### Latency + +For [`REGIONAL BY TABLE` tables](#regional-tables), you get low latency for single-region writes and multi-region stale reads. + +For [`REGIONAL BY ROW` tables](#regional-by-row-tables), you get low-latency consistent multi-region reads & writes for rows which are homed in specific regions. + +### Resiliency + +Because the `test` database does not specify a [survival goal](multiregion-overview.html#survival-goals), it uses the default [`ZONE` survival goal](multiregion-overview.html#surviving-zone-failures). With the default settings, an entire AZ can fail without interrupting access to the database. + +For more information about how to choose a database survival goal, see [When to use `ZONE` vs. `REGION` survival goals](when-to-use-zone-vs-region-survival-goals.html). + +## Alternatives + +- If rows in the table **cannot** be tied to specific geographies, and reads must be up-to-date for business reasons or because the table is referenced by [foreign keys](foreign-key.html), consider the [`GLOBAL` Table Locality Pattern](topology-global-tables.html). +- If your application can tolerate historical reads in some cases, consider the [Follower Reads](topology-follower-reads.html) pattern. + +## Tutorial + +For a step-by-step demonstration showing how CockroachDB's multi-region capabilities (including [`REGIONAL BY ROW` tables](#regional-by-row-tables)) give you low-latency reads in a distributed cluster, see the tutorial on [Low Latency Reads and Writes in a Multi-Region Cluster](demo-low-latency-multi-region-deployment.html). + +## See also + +{% include {{ page.version.version }}/topology-patterns/see-also.md %}