Document the 'SELECT ... FOR UPDATE' statement #6671

Merged 1 commit on Mar 5, 2020
6 changes: 6 additions & 0 deletions _includes/sidebar-data-v20.1.json
@@ -1200,6 +1200,12 @@
"/${VERSION}/select-clause.html"
]
},
{
"title": "<code>SELECT FOR UPDATE</code>",
"urls": [
"/${VERSION}/select-for-update.html"
]
},
{
"title": "<code>SET</code> &lt;session variable&gt;",
"urls": [
5 changes: 5 additions & 0 deletions _includes/v20.1/misc/mitigate-contention-note.md
@@ -0,0 +1,5 @@
{{site.data.alerts.callout_info}}
It's possible to mitigate read-write contention and reduce transaction retries using the following techniques:
1. By performing reads using [`AS OF SYSTEM TIME`](performance-best-practices-overview.html#use-as-of-system-time-to-decrease-conflicts-with-long-running-queries).
2. By using [`SELECT FOR UPDATE`](select-for-update.html) to order transactions by controlling concurrent access to one or more rows of a table. This reduces retries in scenarios where a transaction performs a read and then updates the same row it just read.
{{site.data.alerts.end}}
12 changes: 12 additions & 0 deletions _includes/v20.1/misc/session-vars.html
@@ -95,6 +95,18 @@
<td>Yes</td>
</tr>

<tr>
<td>
<code>enable_implicit_select_for_update</code>
</td>
<td><span class="version-tag">New in v20.1</span>: Indicates whether <a href="update.html"><code>UPDATE</code></a> statements acquire locks using the <code>FOR UPDATE</code> locking mode during their initial row scan, which improves performance for contended workloads. For more information about how <code>FOR UPDATE</code> locking works, see the documentation for <a href="select-for-update.html"><code>SELECT FOR UPDATE</code></a>.</td>
<td>
<code>on</code>
</td>
<td>Yes</td>
<td>Yes</td>
</tr>

<tr>
<td>
<code>enable_zig_zag_join</code>
12 changes: 12 additions & 0 deletions _includes/v20.1/sql/select-for-update-overview.md
@@ -0,0 +1,12 @@
<span class="version-tag">New in v20.1</span>: The `SELECT ... FOR UPDATE` statement is used to order transactions by controlling concurrent access to one or more rows of a table.

It works by locking the rows returned by a [selection query][selection], such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.

Because this queueing happens during the read operation, it prevents the thrashing that would otherwise occur when multiple concurrently executing transactions `SELECT` the same data and then `UPDATE` the results of that selection. By preventing this thrashing, CockroachDB also prevents the [transaction retries][retries] that would otherwise occur.

As a result, using `SELECT FOR UPDATE` leads to increased throughput and decreased tail latency for contended operations.

<!-- Reference Links -->

[retries]: transactions.html#transaction-retries
[selection]: selection-queries.html
1 change: 1 addition & 0 deletions _includes/v20.1/sql/settings/settings.md
@@ -38,6 +38,7 @@
<tr><td><code>server.user_login.timeout</code></td><td>duration</td><td><code>10s</code></td><td>timeout after which client authentication times out if some system range is unavailable (0 = no timeout)</td></tr>
<tr><td><code>server.web_session_timeout</code></td><td>duration</td><td><code>168h0m0s</code></td><td>the duration that a newly created web session will be valid</td></tr>
<tr><td><code>sql.defaults.default_int_size</code></td><td>integer</td><td><code>8</code></td><td>the size, in bytes, of an INT type</td></tr>
<tr><td><code>sql.defaults.implicit_select_for_update.enabled</code></td><td>boolean</td><td><code>true</code></td><td>default value for enable_implicit_select_for_update session setting; enables FOR UPDATE locking during the row-fetch phase of mutation statements</td></tr>
<tr><td><code>sql.defaults.results_buffer.size</code></td><td>byte size</td><td><code>16 KiB</code></td><td>default size of the buffer that accumulates results for a statement or a batch of statements before they are sent to the client. This can be overridden on an individual connection with the 'results_buffer_size' parameter. Note that auto-retries generally only happen while no results have been delivered to the client, so reducing this size can increase the number of retriable errors a client receives. On the other hand, increasing the buffer size can increase the delay until the client receives the first result row. Updating the setting only affects new connections. Setting to 0 disables any buffering.</td></tr>
<tr><td><code>sql.defaults.serial_normalization</code></td><td>enumeration</td><td><code>rowid</code></td><td>default handling of SERIAL in table definitions [rowid = 0, virtual_sequence = 1, sql_sequence = 2]</td></tr>
<tr><td><code>sql.distsql.max_running_flows</code></td><td>integer</td><td><code>500</code></td><td>maximum number of concurrent flows that can be run on a node</td></tr>
2 changes: 2 additions & 0 deletions v20.1/performance-best-practices-overview.md
@@ -383,6 +383,8 @@ To avoid contention, multiple strategies can be applied:
[`INSERT`](insert.html)/[`UPDATE`](update.html)/[`DELETE`](delete.html)/[`UPSERT`](upsert.html)
clauses together in a single SQL statement.

- Use the [`SELECT FOR UPDATE`](select-for-update.html) statement in scenarios where a transaction performs a read and then updates the row(s) it just read. It orders transactions by controlling concurrent access to one or more rows of a table. It works by locking the rows returned by a [selection query](selection-queries.html), such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue that is ordered based on when they try to read the value of the locked row(s).

- When replacing values in a row, use [`UPSERT`](upsert.html) and
specify values for all columns in the inserted rows. This will
usually have the best performance under contention, compared to
7 changes: 7 additions & 0 deletions v20.1/select-clause.md
@@ -525,10 +525,17 @@ Results from two or more queries can be combined together as follows:
- Using [set operations](selection-queries.html#set-operations) to combine rows
using inclusion/exclusion rules.

### Row-level locking for concurrency control with `SELECT FOR UPDATE`

{% include {{page.version.version}}/sql/select-for-update-overview.md %}

For an example showing how to use it, see [`SELECT FOR UPDATE`](select-for-update.html).

## See also

- [Scalar Expressions](scalar-expressions.html)
- [Selection Clauses](selection-queries.html#selection-clauses)
- [`SELECT FOR UPDATE`](select-for-update.html)
- [Set Operations](selection-queries.html#set-operations)
- [Table Expressions](table-expressions.html)
- [Ordering Query Results](query-order.html)
129 changes: 129 additions & 0 deletions v20.1/select-for-update.md
@@ -0,0 +1,129 @@
---
title: SELECT FOR UPDATE
summary: The SELECT FOR UPDATE statement is used to order transactions under contention.
keywords: concurrency control, locking, transactions, update locking, update, contention
toc: true
---

{% include {{page.version.version}}/sql/select-for-update-overview.md %}

## Required privileges

The user must have the `SELECT` and `UPDATE` [privileges](authorization.html#assign-privileges) on the tables used as operands.

## Parameters

The parameters are the same as for other [selection queries](selection-queries.html).

## Examples

### Enforce transaction order when updating the same rows

In this example, we'll use `SELECT ... FOR UPDATE` to lock a row inside a transaction, forcing other transactions that want to update the same row to wait for the first transaction to complete. The other transactions that want to update the same row are effectively put into a queue based on when they first try to read the value of the row.

This example assumes you are running a [local unsecured cluster](start-a-local-cluster.html).

First, let's connect to the running cluster (we'll call this Terminal 1):

{% include copy-clipboard.html %}
~~~ shell
cockroach sql --insecure
~~~

Next, let's create a table and insert some rows:

{% include copy-clipboard.html %}
~~~ sql
CREATE TABLE kv (k INT PRIMARY KEY, v INT);
INSERT INTO kv (k, v) VALUES (1, 5), (2, 10), (3, 15);
~~~

Next, we'll start a [transaction](transactions.html) and lock the row we want to operate on:

{% include copy-clipboard.html %}
~~~ sql
BEGIN;
SELECT * FROM kv WHERE k = 1 FOR UPDATE;
~~~

Hit enter twice in the [SQL client](cockroach-sql.html) to send the input so far to be evaluated. This will result in the following output:

~~~
k | v
+---+----+
1 | 5
(1 row)
~~~

Now let's open another terminal and connect to the database from a second client (we'll call this Terminal 2):

{% include copy-clipboard.html %}
~~~ shell
cockroach sql --insecure
~~~

From Terminal 2, start a transaction and try to lock the same row that is already locked by the transaction we opened in Terminal 1:

{% include copy-clipboard.html %}
~~~ sql
BEGIN;
SELECT * FROM kv WHERE k = 1 FOR UPDATE;
~~~

Hit enter twice to send the input so far to be evaluated. Because Terminal 1 has already locked this row, the `SELECT ... FOR UPDATE` statement from Terminal 2 will appear to "wait".

Back in Terminal 1, let's update the row and commit the transaction:

{% include copy-clipboard.html %}
~~~ sql
UPDATE kv SET v = v + 5 WHERE k = 1;
COMMIT;
~~~

~~~
COMMIT
~~~

Now that the transaction in Terminal 1 has committed, the transaction in Terminal 2 will be "unblocked", generating the following output, which shows the value left by the transaction in Terminal 1:

~~~
k | v
+---+----+
1 | 10
(1 row)
~~~

The transaction in Terminal 2 can now receive input, so let's update the row in question again:

{% include copy-clipboard.html %}
~~~ sql
UPDATE kv SET v = v + 5 WHERE k = 1;
~~~

~~~
UPDATE 1
~~~

Finally, we commit the transaction in Terminal 2:

{% include copy-clipboard.html %}
~~~ sql
COMMIT;
~~~

~~~
COMMIT
~~~
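
To see why queueing at read time avoids retries, here is a minimal sketch in Python of a toy, in-memory model. It is not CockroachDB code and does not connect to a cluster. The function names `run_optimistic` and `run_select_for_update`, the single-row counter, and the version check used to detect a stale read are all assumptions made for illustration. Each simulated transaction reads `v` and then writes `v + 5`, as in the example above.

```python
from collections import deque


def run_optimistic(n_txns, start=5, delta=5):
    """Toy model WITHOUT read locks (hypothetical, for illustration only).

    Every transaction reads the row up front. A transaction can commit only
    if the row is unchanged since its read; otherwise it retries by
    re-reading and going to the back of the line.
    """
    value, version = start, 0
    retries = 0
    # All transactions read the same initial snapshot of the row.
    pending = deque((txn, value, version) for txn in range(n_txns))
    while pending:
        txn, read_value, read_version = pending.popleft()
        if read_version != version:
            # Stale read: another transaction committed first, so retry.
            retries += 1
            pending.append((txn, value, version))
            continue
        value, version = read_value + delta, version + 1
    return value, retries


def run_select_for_update(n_txns, start=5, delta=5):
    """Toy model WITH a lock taken at read time (hypothetical).

    Transactions wait in a queue ordered by when they tried to read. Only
    the head of the queue reads and updates the row, so no read goes stale.
    """
    value = start
    retries = 0
    waiters = deque(range(n_txns))
    while waiters:
        waiters.popleft()           # head of the queue acquires the row lock
        read_value = value          # locked read sees the latest committed value
        value = read_value + delta  # update and commit, releasing the lock
    return value, retries


if __name__ == "__main__":
    print("without locking:", run_optimistic(3))
    print("with FOR UPDATE:", run_select_for_update(3))
```

In this model, three contending transactions finish with `v = 20` either way, but the version without locking retries three times while the queued version never retries. Real workloads will behave differently; the point is only that ordering transactions at the read removes the retries caused by stale reads.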

## See also

- [`SELECT`](select-clause.html)
- [Selection Queries](selection-queries.html)
- [Understanding and avoiding transaction contention][transaction_contention]

<!-- Reference links -->

[transaction_contention]: performance-best-practices-overview.html#understanding-and-avoiding-transaction-contention
[retries]: transactions.html#client-side-intervention
[select]: select-clause.html
7 changes: 7 additions & 0 deletions v20.1/selection-queries.md
@@ -504,6 +504,12 @@ EXPLAIN SELECT * FROM employees WHERE emp_no > 300025 ORDER BY emp_no LIMIT 25;
Using a sequential (i.e., non-[UUID](uuid.html)) primary key creates hot spots in the database for write-heavy workloads, since concurrent [`INSERT`](insert.html)s to the table will attempt to write to the same (or nearby) underlying [ranges](architecture/overview.html#architecture-range). This can be mitigated by designing your schema with [multi-column primary keys which include a monotonically increasing column](performance-best-practices-overview.html#use-multi-column-primary-keys).
{{site.data.alerts.end}}

## Row-level locking for concurrency control with `SELECT FOR UPDATE`

{% include {{page.version.version}}/sql/select-for-update-overview.md %}

For an example showing how to use it, see [`SELECT FOR UPDATE`](select-for-update.html).

## Composability

[Selection clauses](#selection-clauses) are defined in the context of selection queries. [Table expressions](table-expressions.html) are defined in the context of the `FROM` sub-clause of [`SELECT`](select-clause.html). Nevertheless, they can be integrated with one another to form more complex queries or statements.
@@ -612,6 +618,7 @@ For example:
## See also

- [Simple `SELECT` Clause](select-clause.html)
- [`SELECT FOR UPDATE`](select-for-update.html)
- [Table Expressions](table-expressions.html)
- [Ordering Query Results](query-order.html)
- [Limiting Query Results](limit-offset.html)
1 change: 1 addition & 0 deletions v20.1/sql-feature-support.md
@@ -106,6 +106,7 @@ table tr td:nth-child(2) {
`UPSERT` | ✓ | PostgreSQL, MSSQL Extension | [`UPSERT` documentation](upsert.html)
`EXPLAIN` | ✓ | Common Extension | [`EXPLAIN` documentation](explain.html)
`SELECT INTO` | Alternative | Common Extension | You can replicate similar functionality using [`CREATE TABLE`](create-table.html) and then `INSERT INTO ... SELECT ...`.
`SELECT ... FOR UPDATE` | ✓ | Common Extension | [`SELECT FOR UPDATE` documentation](select-for-update.html)

### Clauses

1 change: 1 addition & 0 deletions v20.1/sql-statements.md
@@ -21,6 +21,7 @@ Statement | Usage
[`IMPORT INTO`](import-into.html) | Bulk-insert CSV data into an existing table.
[`INSERT`](insert.html) | Insert rows into a table.
[`SELECT`](select-clause.html) | Select specific rows and columns from a table and optionally compute derived values.
[`SELECT ... FOR UPDATE`](select-for-update.html) | Order transactions by controlling concurrent access to one or more rows of a table.
[`TABLE`](selection-queries.html#table-clause) | Select all rows and columns from a table.
[`TRUNCATE`](truncate.html) | Delete all rows from specified tables.
[`UPDATE`](update.html) | Update rows in a table.
6 changes: 5 additions & 1 deletion v20.1/transactions.md
@@ -66,7 +66,9 @@ Type | Description

## Transaction retries

Transactions may require retries if they experience deadlock or [read/write contention](performance-best-practices-overview.html#understanding-and-avoiding-transaction-contention) with other concurrent transactions which cannot be resolved without allowing potential [serializable anomalies](https://en.wikipedia.org/wiki/Serializability). (However, it's possible to mitigate read-write conflicts by performing reads using [`AS OF SYSTEM TIME`](performance-best-practices-overview.html#use-as-of-system-time-to-decrease-conflicts-with-long-running-queries).)
Transactions may require retries if they experience deadlock or [read/write contention](performance-best-practices-overview.html#understanding-and-avoiding-transaction-contention) with other concurrent transactions which cannot be resolved without allowing potential [serializable anomalies](https://en.wikipedia.org/wiki/Serializability).

{% include {{page.version.version}}/misc/mitigate-contention-note.md %}

There are two cases in which transaction retries occur:

@@ -156,6 +158,8 @@ To handle these types of errors you have the following options:
2. **Most users, such as application authors**: Abort the transaction using the [`ROLLBACK`](rollback-transaction.html) statement, and then reissue all of the statements in the transaction. For an example, see the [Client-side intervention example](#client-side-intervention-example).
3. **Advanced users, such as library authors**: Use the [`SAVEPOINT`](savepoint.html) statement to create retryable transactions. Retryable transactions can improve performance because their priority is increased each time they are retried, making them more likely to succeed the longer they're in your system. For instructions showing how to do this, see [Advanced Client-Side Transaction Retries](advanced-client-side-transaction-retries.html).

{% include {{page.version.version}}/misc/mitigate-contention-note.md %}

#### Client-side intervention example

{% include {{page.version.version}}/misc/client-side-intervention-example.md %}