Add global and regional tables to patterns docs

Fixes #9268
Fixes #9266
Fixes #9265
Fixes #9264
Fixes #10425
Fixes #10319
Addresses #10051
Addresses #10401
Addresses #10461
Addresses #10463

Summary of changes:

- Update multi-region "topology patterns" as follows:
  - Deemphasize "topology" for these patterns, since it really isn't - it's per-table
  - Add "global tables" as a pattern
  - Add "regional tables" as a pattern
  - Remove "duplicate indexes" and "geo-partitioned foos" pages and have them redirect to the above, as appropriate
- Fix up ~all links in the docs to point to the new things
- Update mentions of partitioning throughout the docs to note that most users should use new v21.1+ multi-region capabilities instead of explicit partitioning
  - Also removed mention of partitioning from several places as part of deemphasizing explicit partitioning generally, since most use cases are covered by MR abstractions
- Update `cockroach demo` to deemphasize partitioning and point to new MR things
- Update disaster recovery pages to mention most users of new deployments should just use multi-region survival goals
- Update various licensing docs to point to the new multi-region things
- Update replication reports to mention the new MR things, in lieu of further updates
- Renamed a bunch of links that used the phrase "geo-partitioning" to refer to the multi-region latency tutorial, which now goes by the name "Multi-Region Performance"
- Note: we explicitly did *not* touch anything related to the multi-region Flask app, since that work is happening via #10394
cockroachdb · Apr 29, 2021 · e702ee9
1 parent e90979f
commit e702ee9

Showing 38 changed files with 420 additions and 935 deletions.
diff --git a/_includes/sidebar-data-v21.1.json b/_includes/sidebar-data-v21.1.json
@@ -627,10 +627,10 @@
             ]
           },
           {
-            "title": "Multi-region Clusters",
+            "title": "Multi-region Capabilities",
             "items": [
               {
-                "title": "Multi-region Overview",
+                "title": "Overview",
                 "urls": [
                   "/${VERSION}/multiregion-overview.html"
                 ]
@@ -677,21 +677,15 @@
                 ]
               },
               {
-                "title": "Geo-Partitioned Replicas",
-                "urls": [
-                  "/${VERSION}/topology-geo-partitioned-replicas.html"
-                ]
-              },
-              {
-                "title": "Geo-Partitioned Leaseholders",
+                "title": "Global Tables",
                 "urls": [
-                  "/${VERSION}/topology-geo-partitioned-leaseholders.html"
+                  "/${VERSION}/topology-global-tables.html"
                 ]
               },
               {
-                "title": "Duplicate Indexes",
+                "title": "Regional Tables",
                 "urls": [
-                  "/${VERSION}/topology-duplicate-indexes.html"
+                  "/${VERSION}/topology-regional-tables.html"
                 ]
               },
               {

diff --git a/_includes/v21.1/sql/global-table-description.md b/_includes/v21.1/sql/global-table-description.md
@@ -0,0 +1,7 @@
+ _Global_ tables are optimized for low-latency reads from every region in the database. The tradeoff is that writes will incur higher latencies from any given region, since writes have to be replicated across every region to make the global low-latency reads possible.
+
+Use global tables when your application has a "read-mostly" table of reference data that is rarely updated, and needs to be available to all regions.
+
+For an example of a table that can benefit from the _global_ table locality setting in a multi-region deployment, see the `promo_codes` table from the [MovR application](movr.html).
+
+For instructions showing how to set a table's locality to `GLOBAL`, see [`ALTER TABLE ... SET LOCALITY`](set-locality.html#global)
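
Illustrative note, not part of this commit: the include above defers to `ALTER TABLE ... SET LOCALITY` for the actual statement. A minimal sketch of what that might look like for the MovR `promo_codes` table, assuming the table already exists in a database that has a primary region set:

```sql
/* Assumption: promo_codes exists in a multi-region database (primary region already set). */
ALTER TABLE promo_codes SET LOCALITY GLOBAL;
```

Per the description above, reads of this table should then be low-latency from every region, at the cost of slower writes.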
diff --git a/_includes/v21.1/sql/regional-by-row-table-description.md b/_includes/v21.1/sql/regional-by-row-table-description.md
@@ -0,0 +1,7 @@
+In _regional by row_ tables, individual rows are optimized for access from different regions. This setting divides a table and all of [its indexes](multiregion-overview.html#indexes-on-regional-by-row-tables) into [partitions](partitioning.html), with each partition optimized for access from a different region. Like [regional tables](multiregion-overview.html#regional-tables), _regional by row_ tables are optimized for access from a single region. However, that region is specified at the row level instead of applying to the whole table.
+
+Use regional by row tables when your application requires low-latency reads and writes at a row level where individual rows are primarily accessed from a single region. For example, a users table in a global application may need to keep some users' data in specific regions for better performance.
+
+For an example of a table that can benefit from the _regional by row_ setting in a multi-region deployment, see the `users` table from the [MovR application](movr.html).
+
+For instructions showing how to set a table's locality to `REGIONAL BY ROW`, see [`ALTER TABLE ... SET LOCALITY`](set-locality.html#regional-by-row)
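
Illustrative note, not part of this commit: a minimal sketch of the statement the include above links to, applied to the MovR `users` table. It assumes the table already exists in a multi-region database.

```sql
/* Assumption: users exists in a multi-region database (primary region already set). */
ALTER TABLE users SET LOCALITY REGIONAL BY ROW;
```

In v21.1 this is expected to add a hidden `crdb_region` column that records each row's home region, which is what lets the region be specified per row rather than per table.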
diff --git a/_includes/v21.1/sql/regional-table-description.md b/_includes/v21.1/sql/regional-table-description.md
@@ -0,0 +1,9 @@
+Regional tables work well when your application requires low-latency reads and writes for an entire table from a single region.
+
+For _regional_ tables, access to the table will be fast in the table's "home region" and slower in other regions. In other words, CockroachDB optimizes access to data in regional tables from a single region. By default, a regional table's home region is the [database's primary region](multiregion-overview.html#database-regions), but that can be changed to use any region added to the database.
+
+For instructions showing how to set a table's locality to `REGIONAL BY TABLE`, see [`ALTER TABLE ... SET LOCALITY`](set-locality.html#regional-by-table)
+
+{{site.data.alerts.callout_info}}
+By default, all tables in a multi-region database are _regional_ tables that use the database's primary region. Unless you know your application needs different performance characteristics than regional tables provide, there is no need to change this setting.
+{{site.data.alerts.end}}
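
Illustrative note, not part of this commit: a minimal sketch of changing a regional table's home region, as the description above says is possible. The table name `vehicles` is hypothetical, and `"us-west"` is assumed to have already been added to the database.

```sql
/* Assumption: "us-west" is a region of the current database; vehicles is a hypothetical table. */
ALTER TABLE vehicles SET LOCALITY REGIONAL BY TABLE IN "us-west";

/* Return the table to the default: homed in the database's primary region. */
ALTER TABLE vehicles SET LOCALITY REGIONAL BY TABLE IN PRIMARY REGION;
```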
diff --git a/_includes/v21.1/sql/use-multiregion-instead-of-partitioning.md b/_includes/v21.1/sql/use-multiregion-instead-of-partitioning.md
@@ -0,0 +1,3 @@
+{{site.data.alerts.callout_success}}
+<span class="version-tag">New in v21.1:</span> Most users should not need to use partitioning directly.  Instead, they should use CockroachDB's built-in [multi-region capabilities](multiregion-overview.html), which automatically handle geo-partitioning and other low-level details.
+{{site.data.alerts.end}}
diff --git a/_includes/v21.1/topology-patterns/multi-region-cluster-setup.md b/_includes/v21.1/topology-patterns/multi-region-cluster-setup.md
@@ -1,28 +1,26 @@
-Each [multi-region topology pattern](topology-patterns.html#multi-region-patterns) assumes the following setup:
+Each [multi-region pattern](topology-patterns.html#multi-region-patterns) assumes the following setup:
 
 <img src="{{ 'images/v21.1/topology-patterns/topology_multi-region_hardware.png' | relative_url }}" alt="Multi-region hardware setup" style="max-width:100%" />
 
 #### Hardware
 
 - 3 regions
-
 - Per region, 3+ AZs with 3+ VMs evenly distributed across them
-
 - Region-specific app instances and load balancers
     - Each load balancer redirects to CockroachDB nodes in its region.
     - When CockroachDB nodes are unavailable in a region, the load balancer redirects to nodes in other regions.
 
 #### Cluster
 
-Each node is started with the [`--locality`](cockroach-start.html#locality) flag specifying its region and AZ combination. For example, the following command starts a node in the west1 AZ of the us-west region:
+Each node is started with the [`--locality`](cockroach-start.html#locality) flag specifying its region and AZ combination. For example, the following command starts a node in the `west1` AZ of the `us-west` region:
 
 {% include copy-clipboard.html %}
 ~~~ shell
 $ cockroach start \
 --locality=region=us-west,zone=west1 \
 --certs-dir=certs \
 --advertise-addr=<node1 internal address> \
---join=<node1 internal address>:26257,<node2 internal address>:26257,<node3 internal address>:26257 \        
+--join=<node1 internal address>:26257,<node2 internal address>:26257,<node3 internal address>:26257 \
 --cache=.25 \
 --max-sql-memory=.25 \
 --background

diff --git a/_includes/v21.1/topology-patterns/multiregion-db-setup.md b/_includes/v21.1/topology-patterns/multiregion-db-setup.md
@@ -0,0 +1,36 @@
+First, create a database and use it:
+
+{% include copy-clipboard.html %}
+~~~ sql
+CREATE DATABASE test;
+~~~
+
+{% include copy-clipboard.html %}
+~~~ sql
+USE test;
+~~~
+
+[This cluster is already deployed across three regions](#cluster-setup).  Therefore, to make this database a "multi-region database", you need to issue the following SQL statement that [sets the primary region](add-region.html#set-the-primary-region):
+
+{% include copy-clipboard.html %}
+~~~ sql
+ALTER DATABASE test PRIMARY REGION "us-east";
+~~~
+
+{{site.data.alerts.callout_info}}
+Every multi-region database must have a primary region.  For more information, see [Database regions](multiregion-overview.html#database-regions).
+{{site.data.alerts.end}}
+
+Next, issue the following [`ADD REGION`](add-region.html) statements to add the remaining regions to the database.
+
+{% include copy-clipboard.html %}
+~~~ sql
+ALTER DATABASE test ADD REGION "us-west";
+~~~
+
+{% include copy-clipboard.html %}
+~~~ sql
+ALTER DATABASE test ADD REGION "us-central";
+~~~
+
+Congratulations, `test` is now a multi-region database!
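
Illustrative note, not part of this commit: after the setup steps above, one way to check the result and optionally raise the survival goal might look like the following. It assumes the `test` database and the three regions named in the include.

```sql
/* List the regions attached to the database; "us-east" should be reported as primary. */
SHOW REGIONS FROM DATABASE test;

/* Optional: the default survival goal is zone failure. Surviving a region failure needs at least three database regions. */
ALTER DATABASE test SURVIVE REGION FAILURE;
```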
diff --git a/_includes/v21.1/topology-patterns/multiregion-fundamentals.md b/_includes/v21.1/topology-patterns/multiregion-fundamentals.md
@@ -0,0 +1,20 @@
+Multi-region patterns require thinking about the following questions:
+
+- What are my [survival goals](multiregion-overview.html#survival-goals)?  Do I need to survive a [zone failure](multiregion-overview.html#surviving-zone-failures)?  A [region failure](multiregion-overview.html#surviving-region-failures)?
+- Given the constraints provided by my survival goals, what are the [table localities](multiregion-overview.html#table-locality) that will provide the performance characteristics I need for each table's data?
+  - Do I need low-latency reads and writes from a single region? Do I need them at the [row level](multiregion-overview.html#regional-by-row-tables)?  Or will the [table level](multiregion-overview.html#regional-tables) suffice?
+  - Do I have a "read-mostly" [table of reference data that is rarely updated](multiregion-overview.html#global-tables), but that needs to be available to all regions?
+
+For more information about our multi-region capabilities, review the following pages:
+
+- [Multi-region overview](multiregion-overview.html)
+- [Choosing a multi-region configuration](choosing-a-multi-region-configuration.html)
+- [When to use `ZONE` vs. `REGION` Survival Goals](when-to-use-zone-vs-region-survival-goals.html)
+- [When to use `REGIONAL` vs. `GLOBAL` Tables](when-to-use-regional-vs-global-tables.html)
+
+In addition, reviewing the following information will be helpful:
+
+- The concept of [locality](cockroach-start.html#locality), which makes CockroachDB aware of the location of nodes and able to intelligently place and balance data based on how you define [survival goals](multiregion-overview.html#survival-goals) and [table localities](multiregion-overview.html#table-locality).
+- The recommendations in our [Production Checklist](recommended-production-settings.html).
+- This page doesn't account for hardware specifications, so be sure to follow our [hardware recommendations](recommended-production-settings.html#hardware) and perform a POC to size hardware for your use case.
+- Finally, adopt these [SQL Best Practices](performance-best-practices-overview.html) to get good performance.
diff --git a/_includes/v21.1/topology-patterns/see-also.md b/_includes/v21.1/topology-patterns/see-also.md
@@ -1,11 +1,14 @@
+- [Multi-Region Capabilities Overview](multiregion-overview.html)
+- [Choosing a multi-region configuration](choosing-a-multi-region-configuration.html)
+- [When to use `ZONE` vs. `REGION` survival goals](when-to-use-zone-vs-region-survival-goals.html)
+- [When to use `GLOBAL` vs. `REGIONAL` tables](when-to-use-regional-vs-global-tables.html)
+- [`ALTER DATABASE ... SURVIVE {ZONE,REGION} FAILURE`](survive-failure.html)
+- [`ALTER TABLE ... SET LOCALITY ...`](set-locality.html)
 - [Topology Patterns Overview](topology-patterns.html)
-
-    - Single-region
-        - [Development](topology-development.html)
-        - [Basic Production](topology-basic-production.html)
-
-    - Multi-region
-        - [Geo-Partitioned Replicas](topology-geo-partitioned-replicas.html)
-        - [Geo-Partitioned Leaseholders](topology-geo-partitioned-leaseholders.html)
-        - [Duplicate Indexes](topology-duplicate-indexes.html)
-        - [Follow-the-Workload](topology-follow-the-workload.html)
+  - Single-region
+      - [Development](topology-development.html)
+      - [Basic Production](topology-basic-production.html)
+  - Multi-region
+      - [`REGIONAL` Table Locality Pattern](topology-regional-tables.html)
+      - [`GLOBAL` Table Locality Pattern](topology-global-tables.html)
+      - [Follow-the-Workload](topology-follow-the-workload.html)
diff --git a/v21.1/alter-primary-key.md b/v21.1/alter-primary-key.md
@@ -100,141 +100,6 @@ You can add a column and change the primary key with a couple of `ALTER TABLE` s
 
 Note that the old primary key index becomes a secondary index, in this case, `users_name_key`. If you do not want the old primary key to become a secondary index when changing a primary key, you can use [`DROP CONSTRAINT`](drop-constraint.html)/[`ADD CONSTRAINT`](add-constraint.html) instead.
 
-### Make a single-column primary key composite for geo-partitioning
-
-Suppose that you are storing the data for users of your application in a table called `users`, defined by the following `CREATE TABLE` statement:
-
-{% include copy-clipboard.html %}
-~~~ sql
-> CREATE TABLE users (
-  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-  email STRING,
-  name STRING,
-  INDEX users_name_idx (name)
-);
-~~~
-
-Now suppose that you want to expand your business from a single region into multiple regions. After you [deploy your application in multiple regions](topology-patterns.html), you consider [geo-partitioning your data](topology-geo-partitioned-replicas.html) to minimize latency and optimize performance. In order to geo-partition the `user` database, you need to add a column specifying the location of the data (e.g., `region`):
-
-{% include copy-clipboard.html %}
-~~~ sql
-> ALTER TABLE users ADD COLUMN region STRING NOT NULL;
-~~~
-
-When you geo-partition a database, you [partition the database on a primary key column](partitioning.html#partition-using-primary-key). The primary key of this table is still on `id`. Change the primary key to be composite, on `region` and `id`:
-
-{% include copy-clipboard.html %}
-~~~ sql
-> ALTER TABLE users ALTER PRIMARY KEY USING COLUMNS (region, id);
-~~~
-{{site.data.alerts.callout_info}}
-The order of the primary key columns is important when geo-partitioning. For performance, always place the partition column first.
-{{site.data.alerts.end}}
-
-{% include copy-clipboard.html %}
-~~~ sql
-> SHOW CREATE TABLE users;
-~~~
-
-~~~
-  table_name |                      create_statement
--------------+-------------------------------------------------------------
-  users      | CREATE TABLE users (
-             |     id UUID NOT NULL DEFAULT gen_random_uuid(),
-             |     email STRING NULL,
-             |     name STRING NULL,
-             |     region STRING NOT NULL,
-             |     CONSTRAINT "primary" PRIMARY KEY (region ASC, id ASC),
-             |     UNIQUE INDEX users_id_key (id ASC),
-             |     INDEX users_name_idx (name ASC),
-             |     FAMILY "primary" (id, email, name, region)
-             | )
-(1 row)
-~~~
-
-Note that the old primary key index on `id` is now the secondary index `users_id_key`.
-
-With the new primary key on `region` and `id`, the table is ready to be [geo-partitioned](topology-geo-partitioned-replicas.html):
-
-{% include copy-clipboard.html %}
-~~~ sql
-> ALTER TABLE users PARTITION BY LIST (region) (
-    PARTITION us_west VALUES IN ('us_west'),
-    PARTITION us_east VALUES IN ('us_east')
-  );
-~~~
-
-{% include copy-clipboard.html %}
-~~~ sql
-> ALTER PARTITION us_west OF INDEX users@primary
-    CONFIGURE ZONE USING constraints = '[+region=us-west1]';
-  ALTER PARTITION us_east OF INDEX users@primary
-    CONFIGURE ZONE USING constraints = '[+region=us-east1]';
-~~~
-
-{% include copy-clipboard.html %}
-~~~ sql
-> SHOW PARTITIONS FROM TABLE users;
-~~~
-
-~~~
-  database_name | table_name | partition_name | parent_partition | column_names |  index_name   | partition_value |            zone_config             |          full_zone_config
-----------------+------------+----------------+------------------+--------------+---------------+-----------------+------------------------------------+--------------------------------------
-  movr          | users      | us_west        | NULL             | region       | users@primary | ('us_west')     | constraints = '[+region=us-west1]' | range_min_bytes = 134217728,
-                |            |                |                  |              |               |                 |                                    | range_max_bytes = 536870912,
-                |            |                |                  |              |               |                 |                                    | gc.ttlseconds = 90000,
-                |            |                |                  |              |               |                 |                                    | num_replicas = 3,
-                |            |                |                  |              |               |                 |                                    | constraints = '[+region=us-west1]',
-                |            |                |                  |              |               |                 |                                    | lease_preferences = '[]'
-  movr          | users      | us_east        | NULL             | region       | users@primary | ('us_east')     | constraints = '[+region=us-east1]' | range_min_bytes = 134217728,
-                |            |                |                  |              |               |                 |                                    | range_max_bytes = 536870912,
-                |            |                |                  |              |               |                 |                                    | gc.ttlseconds = 90000,
-                |            |                |                  |              |               |                 |                                    | num_replicas = 3,
-                |            |                |                  |              |               |                 |                                    | constraints = '[+region=us-east1]',
-                |            |                |                  |              |               |                 |                                    | lease_preferences = '[]'
-(2 rows)
-~~~
-
-The table is now geo-partitioned on the `region` column.
-
-You now need to geo-partition any secondary indexes in the table. In order to geo-partition an index, the index must be prefixed by a column that can be used as a partitioning identifier (in this case, `region`). Currently, neither of the secondary indexes (i.e., `users_id_key` and `users_name_idx`) are prefixed by the `region` column, so they can't be meaningfully geo-partitioned. Any secondary indexes that you want to keep must be dropped, recreated, and then partitioned.
-
-Start by dropping both indexes:
-
-{% include copy-clipboard.html %}
-~~~ sql
-> DROP INDEX users_id_key CASCADE;
-  DROP INDEX users_name_idx CASCADE;
-~~~
-
-You don't need to recreate the index on `id` with `region`. Both columns are already indexed by the new primary key.
-
-Add `region` to the index on `name`:
-
-{% include copy-clipboard.html %}
-~~~ sql
-> CREATE INDEX ON users(region, name);
-~~~
-
-Then geo-partition the index:
-
-{% include copy-clipboard.html %}
-~~~ sql
-> ALTER INDEX users_region_name_idx PARTITION BY LIST (region) (
-    PARTITION us_west VALUES IN ('us_west'),
-    PARTITION us_east VALUES IN ('us_east')
-  );
-~~~
-
-{% include copy-clipboard.html %}
-~~~ sql
-> ALTER PARTITION us_west OF INDEX users@users_region_name_idx
-    CONFIGURE ZONE USING constraints = '[+region=us-west1]';
-  ALTER PARTITION us_east OF INDEX users@users_region_name_idx
-    CONFIGURE ZONE USING constraints = '[+region=us-east1]';
-~~~
-
-
 ## See also
 
 - [Constraints](constraints.html)