diff --git a/_includes/sidebar-data-v19.2.json b/_includes/sidebar-data-v19.2.json index 27b3e5696ca..b901859c9c2 100644 --- a/_includes/sidebar-data-v19.2.json +++ b/_includes/sidebar-data-v19.2.json @@ -496,6 +496,12 @@ "urls": [ "/${VERSION}/rotate-certificates.html" ] + }, + { + "title": "Run Replication Reports", + "urls": [ + "/${VERSION}/run-replication-reports.html" + ] } ] }, diff --git a/_includes/v19.2/misc/basic-terms.md b/_includes/v19.2/misc/basic-terms.md index d0b512762aa..7efca149553 100644 --- a/_includes/v19.2/misc/basic-terms.md +++ b/_includes/v19.2/misc/basic-terms.md @@ -2,7 +2,7 @@ Term | Definition -----|------------ **Cluster** | Your CockroachDB deployment, which acts as a single logical application. **Node** | An individual machine running CockroachDB. Many nodes join together to create your cluster. -**Range** | CockroachDB stores all user data (tables, indexes, etc.) and almost all system data in a giant sorted map of key-value pairs. This keyspace is divided into "ranges", contiguous chunks of the keyspace, so that every key can always be found in a single range.

From a SQL perspective, a table and its secondary indexes initially map to a single range, where each key-value pair in the range represents a single row in the table (also called the primary index because the table is sorted by the primary key) or a single row in a secondary index. As soon as that range reaches 64 MiB in size, it splits into two ranges. This process continues for these new ranges as the table and its indexes continue growing. +**Range** | CockroachDB stores all user data (tables, indexes, etc.) and almost all system data in a giant sorted map of key-value pairs. This keyspace is divided into "ranges", contiguous chunks of the keyspace, so that every key can always be found in a single range.

From a SQL perspective, a table and its secondary indexes initially map to a single range, where each key-value pair in the range represents a single row in the table (also called the primary index because the table is sorted by the primary key) or a single row in a secondary index. As soon as that range reaches 64 MiB in size, it splits into two ranges. This process continues for these new ranges as the table and its indexes continue growing. **Replica** | CockroachDB replicates each range (3 times by default) and stores each replica on a different node. **Leaseholder** | For each range, one of the replicas holds the "range lease". This replica, referred to as the "leaseholder", is the one that receives and coordinates all read and write requests for the range.

Unlike writes, read requests access the leaseholder and send the results to the client without needing to coordinate with any of the other range replicas. This reduces the network round trips involved and is possible because the leaseholder is guaranteed to be up-to-date due to the fact that all write requests also go to the leaseholder. **Raft Leader** | For each range, one of the replicas is the "leader" for write requests. Via the [Raft consensus protocol](https://www.cockroachlabs.com/docs/{{ page.version.version }}/architecture/replication-layer.html#raft), this replica ensures that a majority of replicas (the leader and enough followers) agree, based on their Raft logs, before committing the write. The Raft leader is almost always the same replica as the leaseholder. diff --git a/_includes/v19.2/sql/settings/settings.md b/_includes/v19.2/sql/settings/settings.md index 2d0f0e7b8cd..7876f063ac0 100644 --- a/_includes/v19.2/sql/settings/settings.md +++ b/_includes/v19.2/sql/settings/settings.md @@ -55,7 +55,7 @@ kv.range_split.load_qps_thresholdinteger2500the QPS over which, the range becomes a candidate for load based splitting kv.rangefeed.concurrent_catchup_iteratorsinteger64number of rangefeeds catchup iterators a store will allow concurrently before queueing kv.rangefeed.enabledbooleanfalseif set, rangefeed registration is enabled -kv.replication_reports.intervalduration1m0sthe frequency for generating the replication_constraint_stats, replication_stats_report and replication_critical_localities reports (set to 0 to disable) +kv.replication_reports.intervalduration1m0sthe frequency for generating the replication_constraint_stats, replication_stats_report and replication_critical_localities reports (set to 0 to disable) kv.snapshot_rebalance.max_ratebyte size8.0 MiBthe rate limit (bytes/sec) to use for rebalance and upreplication snapshots kv.snapshot_recovery.max_ratebyte size8.0 MiBthe rate limit (bytes/sec) to use for recovery snapshots kv.snapshot_sst.sync_sizebyte size2.0 MiBthreshold after which snapshot SST writes must fsync diff --git a/_includes/v19.2/zone-configs/variables.md b/_includes/v19.2/zone-configs/variables.md index 268e9d88f8b..921532e579d 100644 --- a/_includes/v19.2/zone-configs/variables.md +++ b/_includes/v19.2/zone-configs/variables.md @@ -3,7 +3,7 @@ Variable | Description `range_min_bytes` | The minimum size, in bytes, for a range of data in the zone. When a range is less than this size, CockroachDB will merge it with an adjacent range.

**Default:** `16777216` (16MiB) `range_max_bytes` | The maximum size, in bytes, for a range of data in the zone. When a range reaches this size, CockroachDB will split it into two ranges.

**Default:** `67108864` (64MiB) `gc.ttlseconds` | The number of seconds overwritten values will be retained before garbage collection. Smaller values can save disk space if values are frequently overwritten; larger values increase the range allowed for `AS OF SYSTEM TIME` queries, also known as [Time Travel Queries](select-clause.html#select-historical-data-time-travel).

It is not recommended to set this below `600` (10 minutes); doing so will cause problems for long-running queries. Also, since all versions of a row are stored in a single range that never splits, it is not recommended to set this so high that all the changes to a row in that time period could add up to more than 64MiB; such oversized ranges could contribute to the server running out of memory or other problems.

**Default:** `90000` (25 hours) -`num_replicas` | The number of replicas in the zone.

**Default:** `3`

For the `system` database and `.meta`, `.liveness`, and `.system` ranges, the default value is `5`. +`num_replicas` | The number of replicas in the zone.

**Default:** `3`

For the `system` database and `.meta`, `.liveness`, and `.system` ranges, the default value is `5`. `constraints` | An array of required (`+`) and/or prohibited (`-`) constraints influencing the location of replicas. See [Types of Constraints](configure-replication-zones.html#types-of-constraints) and [Scope of Constraints](configure-replication-zones.html#scope-of-constraints) for more details.

To prevent hard-to-detect typos, constraints placed on [store attributes and node localities](configure-replication-zones.html#descriptive-attributes-assigned-to-nodes) must match the values passed to at least one node in the cluster. If not, an error is signalled.

**Default:** No constraints, with CockroachDB locating each replica on a unique node and attempting to spread replicas evenly across localities. `lease_preferences` | An ordered list of required and/or prohibited constraints influencing the location of [leaseholders](architecture/overview.html#glossary). Whether each constraint is required or prohibited is expressed with a leading `+` or `-`, respectively. Note that lease preference constraints do not have to be shared with the `constraints` field. For example, it's valid for your configuration to define a `lease_preferences` field that does not reference any values from the `constraints` field. It's also valid to define a `lease_preferences` field with no `constraints` field at all.

If the first preference cannot be satisfied, CockroachDB will attempt to satisfy the second preference, and so on. If none of the preferences can be met, the lease is placed using the default lease placement algorithm, which balances leases based on how many each node already holds, aiming to give every node roughly the same number.

Each value in the list can include multiple constraints. For example, the list `[[+zone=us-east-1b, +ssd], [+zone=us-east-1a], [+zone=us-east-1c, +ssd]]` means "prefer nodes with an SSD in `us-east-1b`, then any nodes in `us-east-1a`, then nodes in `us-east-1c` with an SSD."

For a usage example, see [Constrain leaseholders to specific datacenters](configure-replication-zones.html#constrain-leaseholders-to-specific-datacenters).

**Default**: No lease location preferences are applied if this field is not specified. diff --git a/v19.2/admin-ui-replication-dashboard.md b/v19.2/admin-ui-replication-dashboard.md index 1bb047a59d7..46e78093151 100644 --- a/v19.2/admin-ui-replication-dashboard.md +++ b/v19.2/admin-ui-replication-dashboard.md @@ -12,8 +12,8 @@ The **Replication** dashboard in the CockroachDB Admin UI enables you to monitor - **Range**: CockroachDB stores all user data and almost all system data in a giant sorted map of key-value pairs. This keyspace is divided into "ranges", contiguous chunks of the keyspace, so that every key can always be found in a single range. - **Range Replica:** CockroachDB replicates each range (3 times by default) and stores each replica on a different node. - **Range Lease:** For each range, one of the replicas holds the "range lease". This replica, referred to as the "leaseholder", is the one that receives and coordinates all read and write requests for the range. -- **Under-replicated Ranges:** When a cluster is first initialized, the few default starting ranges will only have a single replica, but as soon as other nodes are available, they will replicate to them until they've reached their desired replication factor, the default being 3. If a range does not have enough replicas, the range is said to be "under-replicated". -- **Unavailable Ranges:** If a majority of a range's replicas are on nodes that are unavailable, then the entire range is unavailable and will be unable to process queries. +- **Under-replicated Ranges:** When a cluster is first initialized, the few default starting ranges will only have a single replica, but as soon as other nodes are available, they will replicate to them until they've reached their desired replication factor, the default being 3. If a range does not have enough replicas, the range is said to be "under-replicated". +- **Unavailable Ranges:** If a majority of a range's replicas are on nodes that are unavailable, then the entire range is unavailable and will be unable to process queries. For more details, see [Scalable SQL Made Easy: How CockroachDB Automates Operations](https://www.cockroachlabs.com/blog/automated-rebalance-and-repair/) diff --git a/v19.2/architecture/overview.md b/v19.2/architecture/overview.md index 702b8b87206..d71535ceeb8 100644 --- a/v19.2/architecture/overview.md +++ b/v19.2/architecture/overview.md @@ -45,7 +45,7 @@ CockroachDB heavily relies on the following concepts, so being familiar with the Term | Definition -----|----------- **Consistency** | CockroachDB uses "consistency" in both the sense of [ACID semantics](https://en.wikipedia.org/wiki/Consistency_(database_systems)) and the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem), albeit less formally than either definition. What we try to express with this term is that your data should be anomaly-free. -**Consensus** | When a range receives a write, a quorum of nodes containing replicas of the range acknowledge the write. This means your data is safely stored and a majority of nodes agree on the database's current state, even if some of the nodes are offline.

When a write *doesn't* achieve consensus, forward progress halts to maintain consistency within the cluster. +**Consensus** | When a range receives a write, a quorum of nodes containing replicas of the range acknowledge the write. This means your data is safely stored and a majority of nodes agree on the database's current state, even if some of the nodes are offline.

When a write *doesn't* achieve consensus, forward progress halts to maintain consistency within the cluster. **Replication** | Replication involves creating and distributing copies of data, as well as ensuring copies remain consistent. However, there are multiple types of replication: namely, synchronous and asynchronous.

Synchronous replication requires all writes to propagate to a quorum of copies of the data before being considered committed. To ensure consistency with your data, this is the kind of replication CockroachDB uses.

Asynchronous replication only requires a single node to receive the write to be considered committed; it's propagated to each copy of the data after the fact. This is more or less equivalent to "eventual consistency", which was popularized by NoSQL databases. This method of replication is likely to cause anomalies and loss of data. **Transactions** | A set of operations performed on your database that satisfy the requirements of [ACID semantics](https://en.wikipedia.org/wiki/Database_transaction). This is a crucial component for a consistent system to ensure developers can trust the data in their database. **Multi-Active Availability** | Our consensus-based notion of high availability that lets each node in the cluster handle reads and writes for a subset of the stored data (on a per-range basis). This is in contrast to active-passive replication, in which the active node receives 100% of request traffic, as well as active-active replication, in which all nodes accept requests but typically cannot guarantee that reads are both up-to-date and fast. diff --git a/v19.2/configure-replication-zones.md b/v19.2/configure-replication-zones.md index 6711e15f3c7..f62d5312dbe 100644 --- a/v19.2/configure-replication-zones.md +++ b/v19.2/configure-replication-zones.md @@ -98,7 +98,7 @@ When starting a node with the [`cockroach start`](start-a-node.html) command, yo Attribute Type | Description ---------------|------------ -**Node Locality** | Using the [`--locality`](start-a-node.html#locality) flag, you can assign arbitrary key-value pairs that describe the locality of the node. Locality might include country, region, datacenter, rack, etc. The key-value pairs should be ordered from most inclusive to least inclusive (e.g., country before datacenter before rack), and the keys and the order of key-value pairs must be the same on all nodes. It's typically better to include more pairs than fewer. For example:

`--locality=region=east,datacenter=us-east-1`
`--locality=region=east,datacenter=us-east-2`
`--locality=region=west,datacenter=us-west-1`

CockroachDB attempts to spread replicas evenly across the cluster based on locality, with the order determining the priority. However, locality can be used to influence the location of data replicas in various ways using replication zones.

When there is high latency between nodes, CockroachDB also uses locality to move range leases closer to the current workload, reducing network round trips and improving read performance. See [Follow-the-workload](demo-follow-the-workload.html) for more details. +**Node Locality** | Using the [`--locality`](start-a-node.html#locality) flag, you can assign arbitrary key-value pairs that describe the location of the node. Locality might include region, country, datacenter, rack, etc. The key-value pairs should be ordered into _locality tiers_ that range from most inclusive to least inclusive (e.g., region before datacenter as in `region=eu,dc=paris`), and the keys and the order of key-value pairs must be the same on all nodes. It's typically better to include more pairs than fewer. For example:

`--locality=region=east,datacenter=us-east-1`
`--locality=region=east,datacenter=us-east-2`
`--locality=region=west,datacenter=us-west-1`

CockroachDB attempts to spread replicas evenly across the cluster based on locality, with the order of locality tiers determining the priority. Locality can also be used to influence the location of data replicas in various ways using replication zones.

When there is high latency between nodes, CockroachDB uses locality to move range leases closer to the current workload, reducing network round trips and improving read performance. See [Follow-the-workload](demo-follow-the-workload.html) for more details. **Node Capability** | Using the `--attrs` flag, you can specify node capability, which might include specialized hardware or number of cores, for example:

`--attrs=ram:64gb` **Store Type/Capability** | Using the `attrs` field of the `--store` flag, you can specify disk type or capability, for example:

`--store=path=/mnt/ssd01,attrs=ssd`
`--store=path=/mnt/hda1,attrs=hdd:7200rpm` diff --git a/v19.2/run-replication-reports.md b/v19.2/run-replication-reports.md new file mode 100644 index 00000000000..e174b1b31ff --- /dev/null +++ b/v19.2/run-replication-reports.md @@ -0,0 +1,526 @@ +--- +title: Run Replication Reports +summary: Verify that your cluster's data replication, data placement, and zone configurations are working as expected. +keywords: availability zone, zone config, zone configs, zone configuration, constraint, constraints +toc: true +--- + +New in v19.2: Several new and updated tables (listed below) are available to help you query the status of your cluster's data replication, data placement, and zone constraint conformance. For example, you can: + +- See what data is under-replicated or unavailable. +- Show which of your localities (if any) are critical. A locality is "critical" for a range if all of the nodes in that locality becoming unreachable would cause the range to become unavailable. In other words, the locality contains a majority of the range's replicas. +- See if any of your cluster's data placement constraints are being violated. + +{{site.data.alerts.callout_info}} +The information on this page assumes you are familiar with [replication zones](configure-replication-zones.html) and [partitioning](partitioning.html). +{{site.data.alerts.end}} + +{{site.data.alerts.callout_danger}} +**This is an experimental feature.** The interface and output is subject to change. + +In particular, the direct access to `system` tables shown here will not be a supported way to inspect CockroachDB in future versions. We're committed to add stable ways to inspect these replication reports in the future, likely via `SHOW` statements and/or [views](views.html) and [built-in functions](functions-and-operators.html) in the `crdb_internal` schema. +{{site.data.alerts.end}} + +## Conformance reporting tables + +The following new and updated tables are available for verifying constraint conformance. + +- [`system.replication_stats`](#system-replication_stats) shows information about whether data is under-replicated, over-replicated, or unavailable. +- [`system.replication_constraint_stats`](#system-replication_constraint_stats) shows a list of violations to any data placement requirements you've configured. +- [`system.replication_critical_localities`](#system-replication_critical_localities) shows which localities in your cluster (if any) are critical. A locality is "critical" for a range if all of the nodes in that locality becoming unreachable would cause the range to become unavailable. In other words, the locality contains a majority of the range's replicas. +- [`system.reports_meta`](#system-reports_meta) lists the IDs and times when the replication reports were produced that generated the data in the `system.replication_*` tables. +- [`crdb_internal.zones`](#crdb_internal-zones) can be used with the tables above to figure out the databases and table names where the non-conforming or at-risk data is located. + +To configure how often the conformance reports are run, adjust the `kv.replication_reports.interval` [cluster setting](cluster-settings.html#kv-replication-reports-interval), which accepts an [`INTERVAL`](interval.html). For example, to run it every five minutes: + +{% include copy-clipboard.html %} +~~~ sql +SET CLUSTER setting kv.replication_reports.interval = '5m'; +~~~ + +Only members of the `admin` role can access these tables. By default, the `root` user belongs to the `admin` role. 
For more information about users and roles, see [Manage Users](authorization.html#create-and-manage-users) and [Manage Roles](roles.html). + + + +{{site.data.alerts.callout_info}} +The replication reports are only generated for zones that meet the following criteria: + +- The zone overrides some replication attributes compared to their parent zone. Ranges in zones for which a report is not generated are counted in the report of the first ancestor zone for which a report is generated. +- The zone's parent is the default zone. + +The attributes that must be overridden to trigger each report to run are: + +| Report | Field(s) | +|-----------------------------------+--------------------------------| +| `replication_constraint_stats` | `constraints` | +| `replication_critical_localities` | `constraints`, `num_replicas` | +| `replication_stats` | `constraints`, `num_replicas` | +{{site.data.alerts.end}} + +### system.replication_stats + +The `system.replication_stats` table shows information about whether data is under-replicated, over-replicated, or unavailable. + +For an example using this table, see [Find out which databases and tables have under-replicated ranges](#find-out-which-databases-and-tables-have-under-replicated-ranges). + +{% include copy-clipboard.html %} +~~~ sql +SHOW COLUMNS FROM system.replication_stats; +~~~ + +| Column name | Data type | Description | +|-------------------------+--------------------+---------------------------------------------------------------------------------------------------------------------------------------| +| zone_id | [`INT8`](int.html) | The ID of the [replication zone](configure-zone.html). | +| subzone_id | [`INT8`](int.html) | The ID of the subzone (i.e., [partition](partition-by.html)). | +| report_id | [`INT8`](int.html) | The ID of the [report](#system-reports_meta) that generated this data. | +| total_ranges | [`INT8`](int.html) | Total [ranges](architecture/overview.html#architecture-range) in the zone this report entry is referring to. | +| unavailable_ranges | [`INT8`](int.html) | Unavailable ranges in the zone this report entry is referring to. | +| under_replicated_ranges | [`INT8`](int.html) | [Under-replicated ranges](admin-ui-replication-dashboard.html#under-replicated-ranges) in the zone this report entry is referring to. | +| over_replicated_ranges | [`INT8`](int.html) | Over-replicated ranges in the zone this report entry is referring to. | + +### system.replication_critical_localities + +The `system.replication_critical_localities` table shows which of your localities (if any) are critical. A locality is "critical" for a range if all of the nodes in that locality becoming unreachable would cause the range to become unavailable. In other words, the locality contains a majority of the range's replicas. + +That said, a locality being critical is not necessarily a bad thing as long as you are aware of it. What matters is that [you configure the topology of your cluster to get the resiliency you expect](topology-patterns.html). + +As described in [Configure Replication Zones](configure-replication-zones.html#descriptive-attributes-assigned-to-nodes), localities are key-value pairs defined at [node startup time](start-a-node.html#locality), and are ordered into _locality tiers_ that range from most inclusive to least inclusive (e.g. region before datacenter as in `region=eu,dc=paris`). 
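+
+As a quick way to check which locality tiers a node was started with, you can run `SHOW LOCALITY` from a SQL shell connected to that node (a minimal sketch; the output depends entirely on the `--locality` flags passed at startup):
+
+{% include copy-clipboard.html %}
+~~~ sql
+SHOW LOCALITY;
+~~~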
+ +For an example using this table, see [Find out which databases and tables have ranges in critical localities](#find-out-which-databases-and-tables-have-ranges-in-critical-localities). + +{% include copy-clipboard.html %} +~~~ sql +SHOW COLUMNS FROM system.replication_critical_localities; +~~~ + +| Column name | Data type | Description | +|----------------+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------| +| zone_id | [`INT8`](int.html) | The ID of the [replication zone](configure-zone.html). | +| subzone_id | [`INT8`](int.html) | The ID of the subzone (i.e., [partition](partition-by.html)). | +| locality | [`STRING`](string.html) | The name of the critical [locality](configure-replication-zones.html#zone-config-node-locality). | +| report_id | [`INT8`](int.html) | The ID of the [report](#system-reports_meta) that generated this data. | +| at_risk_ranges | [`INT8`](int.html) | The [ranges](architecture/overview.html#architecture-range) that are at risk of becoming unavailable as of the time of this report. | + +{{site.data.alerts.callout_info}} +If you have not [defined any localities](configure-replication-zones.html#zone-config-node-locality), this report will not return any results. It only reports on localities that have been explicitly defined. +{{site.data.alerts.end}} + +### system.replication_constraint_stats + +The `system.replication_constraint_stats` table lists violations to any data placement requirements you've configured. + +For an example using this table, see [Find out which of your tables have a constraint violation](#find-out-which-of-your-tables-have-a-constraint-violation). + +{% include copy-clipboard.html %} +~~~ sql +SHOW COLUMNS FROM system.replication_constraint_stats; +~~~ + +| Column name | Data type | Description | +|------------------+---------------------------------+---------------------------------------------------------------------------------------------------------| +| zone_id | [`INT8`](int.html) | The ID of the [replication zone](configure-zone.html). | +| subzone_id | [`INT8`](int.html) | The ID of the subzone (i.e., [partition](partition-by.html)). | +| type | [`STRING`](string.html) | The type of zone configuration that was violated, e.g., `constraint`. | +| config | [`STRING`](string.html) | The YAML key-value pair used to configure the zone, e.g., `+region=europe-west1`. | +| report_id | [`INT8`](int.html) | The ID of the [report](#system-reports_meta) that generated this data. | +| violation_start | [`TIMESTAMPTZ`](timestamp.html) | The time when the violation was detected. Will return `NULL` if the number of `violating_ranges` is 0. | +| violating_ranges | [`INT8`](int.html) | The [ranges](architecture/overview.html#architecture-range) that are in violation of the configuration. | + +### system.reports_meta + +The `system.reports_meta` table contains metadata about when the replication reports were last run. Each report contains a number of report entries, one per zone. + +Replication reports are run at the interval specified by the `kv.replication_reports.interval` [cluster setting](cluster-settings.html#kv-replication-reports-interval). 
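+
+To check the interval currently in effect, you can read the cluster setting directly (a quick sketch; `1m0s` is the default):
+
+{% include copy-clipboard.html %}
+~~~ sql
+SHOW CLUSTER SETTING kv.replication_reports.interval;
+~~~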
+ +{% include copy-clipboard.html %} +~~~ sql +SHOW COLUMNS FROM system.reports_meta; +~~~ + +| Column name | Data type | Description | +|-------------+---------------------------------+---------------------------------------------------------| +| id | [`INT8`](int.html) | The ID of the report that this report entry is part of. | +| generated | [`TIMESTAMPTZ`](timestamp.html) | When the report was generated. | + +### crdb_internal.zones + +The `crdb_internal.zones` table is useful for: + +- Viewing your cluster's zone configurations in various formats: YAML, SQL, etc. +- Matching up data returned from the various replication reports with the names of the databases and tables, indexes, and partitions where that data lives. + +{% include copy-clipboard.html %} +~~~ sql +SHOW COLUMNS FROM crdb_internal.zones; +~~~ + +| column_name | data_type | description | +|---------------------+-------------------------+---------------------------------------------------------------------------------------------------------------------------------| +| zone_id | [`INT8`](int.html) | The ID of the [replication zone](configure-zone.html). | +| subzone_id | [`INT8`](int.html) | The ID of the subzone (i.e., [partition](partition-by.html)). | +| target | [`STRING`](string.html) | The "object" that the constraint is being applied to, e.g., `PARTITION us_west OF INDEX movr.public.users@primary`. | +| range_name | [`STRING`](string.html) | The zone's name. | +| database_name | [`STRING`](string.html) | The [database](show-databases.html) where the `target`'s data is located. | +| table_name | [`STRING`](string.html) | The [table](show-tables.html) where the `target`'s data is located. | +| index_name | [`STRING`](string.html) | The [index](show-index.html) where the `target`'s data is located. | +| partition_name | [`STRING`](string.html) | The [partition](show-partitions.html) where the `target`'s data is located. | +| full_config_yaml | [`STRING`](string.html) | The YAML you used to [configure this replication zone](configure-replication-zones.html). | +| full_config_sql | [`STRING`](string.html) | The SQL you used to [configure this replication zone](configure-replication-zones.html). | +| raw_config_yaml | [`STRING`](string.html) | The YAML for this [replication zone](configure-replication-zones.html), showing only values the user changed from the defaults. | +| raw_config_sql | [`STRING`](string.html) | The SQL for this [replication zone](configure-replication-zones.html), showing only values the user changed from the defaults. | +| raw_config_protobuf | [`BYTES`](bytes.html) | A protobuf representation of the configuration for this [replication zone](configure-replication-zones.html). | + +## Examples + +The examples shown below are each using a [geo-partitioned demo cluster](cockroach-demo.html#run-cockroach-demo-with-geo-partitioned-replicas) started with the following command: + +{% include copy-clipboard.html %} +~~~ shell +cockroach demo movr --geo-partitioned-replicas +~~~ + +### Find out which of your tables have a constraint violation + +By default, this geo-distributed demo cluster will not have any constraint violations. + +To introduce a violation that we can then query for, we'll modify the zone configuration of the `users` table. + +First, let's see what existing zone configurations are attached to the `users` table, so we know what to modify. 
+ +{% include copy-clipboard.html %} +~~~ sql +SHOW CREATE TABLE users; +~~~ + +~~~ + table_name | create_statement ++------------+-------------------------------------------------------------------------------------+ + users | CREATE TABLE users ( + | id UUID NOT NULL, + | city VARCHAR NOT NULL, + | name VARCHAR NULL, + | address VARCHAR NULL, + | credit_card VARCHAR NULL, + | CONSTRAINT "primary" PRIMARY KEY (city ASC, id ASC), + | FAMILY "primary" (id, city, name, address, credit_card) + | ) PARTITION BY LIST (city) ( + | PARTITION us_west VALUES IN (('seattle'), ('san francisco'), ('los angeles')), + | PARTITION us_east VALUES IN (('new york'), ('boston'), ('washington dc')), + | PARTITION europe_west VALUES IN (('amsterdam'), ('paris'), ('rome')) + | ); + | ALTER PARTITION europe_west OF INDEX movr.public.users@primary CONFIGURE ZONE USING + | constraints = '[+region=europe-west1]'; + | ALTER PARTITION us_east OF INDEX movr.public.users@primary CONFIGURE ZONE USING + | constraints = '[+region=us-east1]'; + | ALTER PARTITION us_west OF INDEX movr.public.users@primary CONFIGURE ZONE USING + | constraints = '[+region=us-west1]' +(1 row) +~~~ + +To create a constraint violation, let's tell the ranges in the `europe_west` partition that they are explicitly supposed to *not* be in the `region=europe-west1` locality by issuing the following statement: + +{% include copy-clipboard.html %} +~~~ sql +ALTER PARTITION europe_west of INDEX movr.public.users@primary CONFIGURE ZONE USING constraints = '[-region=europe-west1]'; +~~~ + +Once the statement above executes, the ranges currently stored in that locality will now be in a state where they are explicitly not supposed to be in that locality, and are thus in violation of a constraint. + +In other words, we are telling the ranges "where you are now is exactly where you are *not* supposed to be". This will cause the cluster to rebalance the ranges, which will take some time. During the time it takes for the rebalancing to occur, the ranges will be in violation. + +By default, the system constraint conformance report runs once every minute. You can change that interval by modifying the `kv.replication_reports.interval` [cluster setting](cluster-settings.html#kv-replication-reports-interval). 
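+
+For example, if you'd rather not wait a full minute while experimenting, you could temporarily lower the interval (a sketch; the setting accepts any [`INTERVAL`](interval.html) value):
+
+{% include copy-clipboard.html %}
+~~~ sql
+SET CLUSTER SETTING kv.replication_reports.interval = '10s';
+~~~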
+ +After the internal constraint conformance report has run again, the following query should report a violation: + +{% include copy-clipboard.html %} +~~~ sql +SELECT * FROM system.replication_constraint_stats WHERE violating_ranges > 0; +~~~ + +~~~ + zone_id | subzone_id | type | config | report_id | violation_start | violating_ranges ++---------+------------+------------+----------------------+-----------+---------------------------------+------------------+ + 53 | 2 | constraint | +region=us-east1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 2 + 53 | 3 | constraint | -region=europe-west1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 2 + 54 | 2 | constraint | +region=us-west1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 54 | 4 | constraint | +region=us-east1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 2 + 55 | 6 | constraint | +region=us-east1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 4 + 55 | 9 | constraint | +region=europe-west1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 6 + 56 | 2 | constraint | +region=us-east1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 2 + 56 | 3 | constraint | +region=europe-west1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 58 | 2 | constraint | +region=us-east1 | 1 | 2019-10-21 20:28:40.79508+00:00 | 2 +(9 rows) +~~~ + +To be more useful, we'd like to find out the database and table names where these constraint-violating ranges live. To get that information we'll need to join the output of `system.replication_constraint_stats` table with the `crdb_internal.zones` table using a query like the following: + +{% include copy-clipboard.html %} +~~~ sql +WITH + partition_violations + AS ( + SELECT + * + FROM + system.replication_constraint_stats + WHERE + violating_ranges > 0 + ), + report + AS ( + SELECT + crdb_internal.zones.zone_id, + crdb_internal.zones.subzone_id, + target, + database_name, + table_name, + index_name, + partition_violations.type, + partition_violations.config, + partition_violations.violation_start, + partition_violations.violating_ranges + FROM + crdb_internal.zones, partition_violations + WHERE + crdb_internal.zones.zone_id + = partition_violations.zone_id + ) +SELECT * FROM report; +~~~ + +~~~ + zone_id | subzone_id | target | database_name | table_name | index_name | type | config | violation_start | violating_ranges ++---------+------------+------------------------------------------------------------------------------------------------+---------------+------------+-----------------------------------------------+------------+----------------------+---------------------------------+------------------+ + 53 | 1 | PARTITION us_west OF INDEX movr.public.users@primary | movr | users | primary | constraint | -region=europe-west1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 53 | 2 | PARTITION us_east OF INDEX movr.public.users@primary | movr | users | primary | constraint | -region=europe-west1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 53 | 3 | PARTITION europe_west OF INDEX movr.public.users@primary | movr | users | primary | constraint | -region=europe-west1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 54 | 1 | PARTITION us_west OF INDEX movr.public.vehicles@primary | movr | vehicles | primary | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 54 | 2 | PARTITION us_west OF INDEX movr.public.vehicles@vehicles_auto_index_fk_city_ref_users | movr | vehicles | vehicles_auto_index_fk_city_ref_users | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 54 | 3 | PARTITION us_east OF INDEX movr.public.vehicles@primary | movr | vehicles | primary | 
constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 54 | 4 | PARTITION us_east OF INDEX movr.public.vehicles@vehicles_auto_index_fk_city_ref_users | movr | vehicles | vehicles_auto_index_fk_city_ref_users | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 54 | 5 | PARTITION europe_west OF INDEX movr.public.vehicles@primary | movr | vehicles | primary | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 54 | 6 | PARTITION europe_west OF INDEX movr.public.vehicles@vehicles_auto_index_fk_city_ref_users | movr | vehicles | vehicles_auto_index_fk_city_ref_users | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 1 | PARTITION us_west OF INDEX movr.public.rides@primary | movr | rides | primary | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 2 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | movr | rides | rides_auto_index_fk_city_ref_users | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 3 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 4 | PARTITION us_east OF INDEX movr.public.rides@primary | movr | rides | primary | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 5 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | movr | rides | rides_auto_index_fk_city_ref_users | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 6 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 7 | PARTITION europe_west OF INDEX movr.public.rides@primary | movr | rides | primary | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 8 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | movr | rides | rides_auto_index_fk_city_ref_users | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 + 55 | 9 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | constraint | +region=us-east1 | 2019-10-21 20:28:40.79508+00:00 | 1 +(18 rows) +~~~ + +If you were to repeat this query at 60-second intervals, you would see that the number of results returned decreases and eventually falls to zero as the cluster rebalances the ranges to their new homes. Eventually you will see this output, which will tell you that the rebalancing has finished. + +~~~ + zone_id | subzone_id | target | database_name | table_name | index_name | type | config | violation_start | violating_ranges ++---------+------------+--------+---------------+------------+------------+------+--------+-----------------+------------------+ +(0 rows) +~~~ + +### Find out which databases and tables have under-replicated ranges + +By default, this geo-distributed demo cluster will not have any [under-replicated ranges](admin-ui-replication-dashboard.html#under-replicated-ranges). 
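+
+You can confirm this before making any changes; the same query used later in this section should come back empty at this point:
+
+{% include copy-clipboard.html %}
+~~~ sql
+SELECT * FROM system.replication_stats WHERE under_replicated_ranges > 0;
+~~~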
+ +To force it into a state where some ranges are under-replicated, issue the following statement, which tells it to store 9 copies of each range underlying the `rides` table (by default [it stores 3](configure-replication-zones.html#num_replicas)). + +{% include copy-clipboard.html %} +~~~ sql +ALTER TABLE rides CONFIGURE ZONE USING num_replicas=9; +~~~ + +Once the statement above executes, the cluster will rebalance so that it's storing 9 copies of each range underlying the `rides` table. During the time it takes for the rebalancing to occur, these ranges will be considered "under-replicated", since there are not yet as many copies (9) of each range as you have just specified. + +By default, the internal constraint conformance report runs once every minute. You can change that interval by modifying the `kv.replication_reports.interval` [cluster setting](cluster-settings.html#kv-replication-reports-interval). + +After the system constraint conformance report has run again, the following query should report under-replicated ranges: + +{% include copy-clipboard.html %} +~~~ sql +SELECT * FROM system.replication_stats WHERE under_replicated_ranges > 0; +~~~ + +~~~ + zone_id | subzone_id | report_id | total_ranges | unavailable_ranges | under_replicated_ranges | over_replicated_ranges ++---------+------------+-----------+--------------+--------------------+-------------------------+------------------------+ + 55 | 0 | 3 | 28 | 0 | 6 | 0 + 55 | 3 | 3 | 9 | 0 | 9 | 0 + 55 | 6 | 3 | 9 | 0 | 9 | 0 + 55 | 9 | 3 | 9 | 0 | 9 | 0 +(4 rows) +~~~ + +To be more useful, we'd like to find out the database and table names where these under-replicated ranges live. To get that information we'll need to join the output of `system.replication_stats` table with the `crdb_internal.zones` table using a query like the following: + +{% include copy-clipboard.html %} +~~~ sql +WITH + under_replicated_zones + AS ( + SELECT + zone_id, under_replicated_ranges + FROM + system.replication_stats + WHERE + under_replicated_ranges > 0 + ), + report + AS ( + SELECT + crdb_internal.zones.zone_id, + target, + range_name, + database_name, + table_name, + index_name, + under_replicated_zones.under_replicated_ranges + FROM + crdb_internal.zones, under_replicated_zones + WHERE + crdb_internal.zones.zone_id + = under_replicated_zones.zone_id + ) +SELECT * FROM report; +~~~ + +~~~ + zone_id | target | range_name | database_name | table_name | index_name | under_replicated_ranges ++---------+------------------------------------------------------------------------------------------------+------------+---------------+------------+-----------------------------------------------+-------------------------+ + 55 | TABLE movr.public.rides | NULL | movr | rides | NULL | 9 + 55 | TABLE movr.public.rides | NULL | movr | rides | NULL | 9 + 55 | TABLE movr.public.rides | NULL | movr | rides | NULL | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_west OF INDEX 
movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@primary | NULL | movr | rides | primary | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | NULL | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | NULL | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 +(30 rows) +~~~ + +### Find out which databases and tables have ranges in critical localities + +The `system.replication_critical_localities` table shows which of your localities (if any) are critical. A locality is "critical" for a range if all of the nodes in that locality becoming unreachable would cause the range to become unavailable. 
In other words, the locality contains a majority of the range's replicas. + +That said, a locality being critical is not necessarily a bad thing as long as you are aware of it. What matters is that [you configure the topology of your cluster to get the resiliency you expect](topology-patterns.html). + +By default, the [movr demo cluster](cockroach-demo.html#run-cockroach-demo-with-geo-partitioned-replicas) has some ranges in critical localities. This is expected because it follows the [geo-partitioned replicas topology pattern](topology-geo-partitioned-replicas.html), which ties data for latency-sensitive queries to specific geographies at the cost of data unavailability during a region-wide failure. + +{% include copy-clipboard.html %} +~~~ sql +SELECT * FROM system.replication_critical_localities WHERE at_risk_ranges > 0; +~~~ + +~~~ + zone_id | subzone_id | locality | report_id | at_risk_ranges ++---------+------------+---------------------+-----------+----------------+ + 53 | 1 | region=us-west1 | 2 | 3 + 53 | 2 | region=us-east1 | 2 | 3 + 53 | 3 | region=europe-west1 | 2 | 3 + 54 | 2 | region=us-west1 | 2 | 6 + 54 | 4 | region=us-east1 | 2 | 6 + 54 | 6 | region=europe-west1 | 2 | 6 + 55 | 3 | region=us-west1 | 2 | 9 + 55 | 6 | region=us-east1 | 2 | 9 + 55 | 9 | region=europe-west1 | 2 | 9 + 56 | 1 | region=us-west1 | 2 | 3 + 56 | 2 | region=us-east1 | 2 | 3 + 56 | 3 | region=europe-west1 | 2 | 3 + 58 | 1 | region=us-west1 | 2 | 3 + 58 | 2 | region=us-east1 | 2 | 3 + 58 | 3 | region=europe-west1 | 2 | 3 +(15 rows) +~~~ + +To be more useful, we'd like to find out the database and table names where these ranges live that are at risk of unavailability in the event of a locality becoming unreachable. To get that information we'll need to join the output of `system.replication_critical_localities` table with the `crdb_internal.zones` table using a query like the following: + +{% include copy-clipboard.html %} +~~~ sql +WITH + at_risk_zones AS ( + SELECT + zone_id, locality, at_risk_ranges + FROM + system.replication_critical_localities + WHERE + at_risk_ranges > 0 + ), + report AS ( + SELECT + crdb_internal.zones.zone_id, + target, + database_name, + table_name, + index_name, + at_risk_zones.at_risk_ranges + FROM + crdb_internal.zones, at_risk_zones + WHERE + crdb_internal.zones.zone_id + = at_risk_zones.zone_id + ) +SELECT DISTINCT * FROM report; +~~~ + +~~~ + zone_id | target | database_name | table_name | index_name | at_risk_ranges ++---------+------------------------------------------------------------------------------------------------+---------------+----------------------------+-----------------------------------------------+----------------+ + 53 | PARTITION us_west OF INDEX movr.public.users@primary | movr | users | primary | 3 + 53 | PARTITION us_east OF INDEX movr.public.users@primary | movr | users | primary | 3 + 53 | PARTITION europe_west OF INDEX movr.public.users@primary | movr | users | primary | 3 + 54 | PARTITION us_west OF INDEX movr.public.vehicles@primary | movr | vehicles | primary | 6 + 54 | PARTITION us_west OF INDEX movr.public.vehicles@vehicles_auto_index_fk_city_ref_users | movr | vehicles | vehicles_auto_index_fk_city_ref_users | 6 + 54 | PARTITION us_east OF INDEX movr.public.vehicles@primary | movr | vehicles | primary | 6 + 54 | PARTITION us_east OF INDEX movr.public.vehicles@vehicles_auto_index_fk_city_ref_users | movr | vehicles | vehicles_auto_index_fk_city_ref_users | 6 + 54 | PARTITION europe_west OF INDEX movr.public.vehicles@primary | movr | vehicles 
| primary | 6 + 54 | PARTITION europe_west OF INDEX movr.public.vehicles@vehicles_auto_index_fk_city_ref_users | movr | vehicles | vehicles_auto_index_fk_city_ref_users | 6 + 55 | PARTITION us_west OF INDEX movr.public.rides@primary | movr | rides | primary | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@primary | movr | rides | primary | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION us_east OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@primary | movr | rides | primary | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_city_ref_users | movr | rides | rides_auto_index_fk_city_ref_users | 9 + 55 | PARTITION europe_west OF INDEX movr.public.rides@rides_auto_index_fk_vehicle_city_ref_vehicles | movr | rides | rides_auto_index_fk_vehicle_city_ref_vehicles | 9 + 56 | PARTITION us_west OF INDEX movr.public.vehicle_location_histories@primary | movr | vehicle_location_histories | primary | 3 + 56 | PARTITION us_east OF INDEX movr.public.vehicle_location_histories@primary | movr | vehicle_location_histories | primary | 3 + 56 | PARTITION europe_west OF INDEX movr.public.vehicle_location_histories@primary | movr | vehicle_location_histories | primary | 3 + 58 | PARTITION us_west OF INDEX movr.public.user_promo_codes@primary | movr | user_promo_codes | primary | 3 + 58 | PARTITION us_east OF INDEX movr.public.user_promo_codes@primary | movr | user_promo_codes | primary | 3 + 58 | PARTITION europe_west OF INDEX movr.public.user_promo_codes@primary | movr | user_promo_codes | primary | 3 +(24 rows) +~~~ + +To give another example, let's say your cluster were similar to the one shown above, but configured with [tiered localities](start-a-node.html#locality) such that you had split `us-east1` into `{region=us-east1,dc=dc1, region=us-east1,dc=dc2, region=us-east1,dc=dc3}`. In that case, you wouldn't expect any DC to be critical, because the cluster would "diversify" each range's location as much as possible across data centers. In such a situation, if you were to see a DC identified as a critical locality, you'd be surprised and you'd take some action. For example, perhaps the diversification process is failing because some localities are filled to capacity. If there is no disk space free in a locality, your cluster can't move replicas there. + +## See also + +- [Configure Replication Zones](configure-replication-zones.html) +- [Partitioning](partitioning.html) +- [`PARTITION BY`](partition-by.html) +- [`CONFIGURE ZONE`](configure-zone.html) +- [Start a node](start-a-node.html) diff --git a/v19.2/start-a-node.md b/v19.2/start-a-node.md index 8669aae7395..faab737fe94 100644 --- a/v19.2/start-a-node.md +++ b/v19.2/start-a-node.md @@ -88,9 +88,9 @@ Flag | Description ### Locality -The `--locality` flag accepts arbitrary key-value pairs that describe the location of the node. Locality might include country, region, datacenter, rack, etc. 
The key-value pairs should be ordered from most to least inclusive, and the keys and order of key-value pairs must be the same on all nodes. It's typically better to include more pairs than fewer. +The `--locality` flag accepts arbitrary key-value pairs that describe the location of the node. Locality might include region, country, datacenter, rack, etc. The key-value pairs should be ordered into _locality tiers_ from most inclusive to least inclusive (e.g., region before datacenter as in `region=eu,dc=paris`), and the keys and order of key-value pairs must be the same on all nodes. It's typically better to include more pairs than fewer. -- CockroachDB spreads the replicas of each piece of data across as diverse a set of localities as possible, with the order determining the priority. However, locality can also be used to influence the location of data replicas in various ways using [replication zones](configure-replication-zones.html#replication-constraints). +- CockroachDB spreads the replicas of each piece of data across as diverse a set of localities as possible, with the order determining the priority. Locality can also be used to influence the location of data replicas in various ways using [replication zones](configure-replication-zones.html#replication-constraints). - When there is high latency between nodes (e.g., cross-datacenter deployments), CockroachDB uses locality to move range leases closer to the current workload, reducing network round trips and improving read performance, also known as ["follow-the-workload"](demo-follow-the-workload.html). In a deployment across more than 3 datacenters, however, to ensure that all data benefits from "follow-the-workload", you must increase your replication factor to match the total number of datacenters.
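+
+For example, in a deployment across five datacenters, you might raise the default replication factor to five (a sketch; choose the zone and value that match your own topology):
+
+{% include copy-clipboard.html %}
+~~~ sql
+ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5;
+~~~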