Document the 'SELECT ... FOR UPDATE' statement #6671

rmloveland · 2020-02-24T22:15:38Z

Fixes #6000.

Summary of changes:

Add docs for the SELECT ... FOR UPDATE statement, including an
example showing its intended use.
Update other pages re: selection queries to mention the use of SELECT FOR UPDATE to reduce contention/retries and improve
throughput/latency.
Update Transactions page and Perf Best Practices page to mention
SELECT FOR UPDATE as a possible solution to contention/retry issues.
Add SELECT FOR UPDATE to SQL Statements and SQL Feature Support
pages.
Add docs for new enable_implicit_select_for_update session and
cluster setting.

cockroach-teamcity · 2020-02-24T22:15:44Z

This change is

rmloveland · 2020-02-24T22:16:42Z

Related issue to document KV layer changes: #6571

rmloveland · 2020-02-24T22:21:20Z

Nathan this PR is meant to document the user-facing changes related to SELECT FOR UPDATE. We will document the KV-layer changes separately via #6571.

Please take a look at your convenience and let me know if this is missing any pertinent info that SQL users should have re: SFU, or if anything is incorrect. Also, if you have suggestions re: how to better explain this feature, I'm happy to update.

nvanbenschoten

Thanks for publishing this Rich!

Do we no longer get previews for these changes?

Reviewed 12 of 12 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rmloveland)

_includes/v20.1/misc/session-vars.html, line 102 at r1 (raw file):

    <code>enable_implicit_select_for_update</code>
   </td>
   <td><span class="version-tag">New in v20.1</span>: Indicates whether <a href="update.html"><code>UPDATE</code></a> statements acquire locks using the <code>FOR UPDATE</code> locking mode during their initial row scan, which improves performance for contended workloads.  For more information about how <code>FOR UPDATE</code> locking works, see the documentation for <a href="select-for-update.html"><code>SELECT FOR UPDATE</code></a>.</td>

This may also include DELETE and UPSERT statements by the 20.1 release. I'm not sure whether you want to mention that now or not.

_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):

Benefits of using `SELECT FOR UPDATE` include:

- Reduced [transaction contention][transaction_contention] when your workload requires concurrent access to the same rows.

Transaction contention is not reduced by SELECT FOR UPDATE. In fact, it's increased.

_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):

- Reduced [transaction contention][transaction_contention] when your workload requires concurrent access to the same rows.
- A substantial reduction the number of [transaction retries][retries] in scenarios where a transaction performs a read and then updates the same row it just read.
- Increased throughput.

We should qualify this further. Neither of these statements is true for uncontended workloads. I think we should remove this list and expand upon the second point more, as it's the one that gets at the heart of why users would want to use SELECT FOR UPDATE.

SELECT FOR UPDATE allows users to acquire pessimistic locks during reads so that they can't be modified between the time that they are read and the time that they are updated. This prevents retries, which occur when operations interleave like this. This also prevents thrashing during these read-modify-write attempts because transactions are forced to queue up during the read operation instead of during the write operation.

All of this is what leads to increased throughput and decreased tail latency for contended operations.

rmloveland

Do we no longer get previews

AIUI the security team changed perms on our S3 buckets recently, which affected docs builds. According to some discussion in the #education channel, this should be fixed now.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

_includes/v20.1/misc/session-vars.html, line 102 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This may also include DELETE and UPSERT statements by the 20.1 release. I'm not sure whether you want to mention that now or not.

For now I have filed #6699 to track adding DELETE and UPSERT here later as needed.

_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Transaction contention is not reduced by SELECT FOR UPDATE. In fact, it's increased.

Interesting. Is that because the read-based queueing you describe below causes more transactions to "stack up" behind the one that holds the lock?

_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We should qualify this further. Neither of these statements is true for uncontended workloads. I think we should remove this list and expand upon the second point more, as it's the one that gets at the heart of why users would want to use SELECT FOR UPDATE.

SELECT FOR UPDATE allows users to acquire pessimistic locks during reads so that they can't be modified between the time that they are read and the time that they are updated. This prevents retries, which occur when operations interleave like this. This also prevents thrashing during these read-modify-write attempts because transactions are forced to queue up during the read operation instead of during the write operation.

All of this is what leads to increased throughput and decreased tail latency for contended operations.

Ah, OK. Thanks for this information.

I have updated this section to remove the bulleted list and rewrite things a bit. I tried to simplify the language slightly while still also hitting the points you mentioned. PTAL.

rmloveland · 2020-02-27T17:28:22Z

Also, thanks for the review Nathan! (Since I forgot to say that :-} )

cockroach-teamcity · 2020-02-27T17:31:33Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/abf80ac55e9aee2338d22b5e751d7331a4024de1/

Edited pages:

nvanbenschoten

Reviewed 1 of 1 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rmloveland)

_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):

Previously, rmloveland (Rich Loveland) wrote…

Interesting. Is that because the read-based queueing you describe below causes more transactions to "stack up" behind the one that holds the lock?

Exactly, this is called "pessimistic" locking, as opposed to "optimistic" locking. With SELECT FOR UPDATE, the reads grab locks and queue behind one another instead of executing concurrently. By itself, this would actually hurt performance. However, when performing an UPDATE of a row that was just read, we need to make sure that the row hasn't changed since the read. If we don't grab locks during the read and we're performing contended UPDATEs then there's a good chance that the row has changed and we'll need to restart the entire transaction (ie. we were optimistically hoping the row wouldn't change but doing nothing to prevent that, and it cost us). So SELECT FOR UPDATE is more aggressive (read: pessimistic) with lock, which has a small negative performance impact on its own but can prevent these full txn restarts, which hurt performance significantly more.

_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):

read-modify-write workflow

Consider mentioning what this means in the context of SQL (e.g. a SELECT and then an UPDATE on that select).

Also, this should probably mention that the thrashing only happens with contended read-modify-write operations.

In addition,

The thrashing you refer to in the previous sentence is exactly what causes these retries, so this should be reworded a little bit.

v20.1/select-for-update.md, line 13 at r2 (raw file):

<div>
  {% include {{ page.version.version }}/sql/diagrams/simple_select_clause.html %}

Is this the right diagram to include here? It doesn't actually include any mention of "FOR UPDATE".

rmloveland

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

_includes/v20.1/sql/select-for-update-overview.md, line 5 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Exactly, this is called "pessimistic" locking, as opposed to "optimistic" locking. With SELECT FOR UPDATE, the reads grab locks and queue behind one another instead of executing concurrently. By itself, this would actually hurt performance. However, when performing an UPDATE of a row that was just read, we need to make sure that the row hasn't changed since the read. If we don't grab locks during the read and we're performing contended UPDATEs then there's a good chance that the row has changed and we'll need to restart the entire transaction (ie. we were optimistically hoping the row wouldn't change but doing nothing to prevent that, and it cost us). So SELECT FOR UPDATE is more aggressive (read: pessimistic) with lock, which has a small negative performance impact on its own but can prevent these full txn restarts, which hurt performance significantly more.

Thanks for explaining this.

_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):
Re: translating "read-modify-write" to SQL, and specifying that the thrashing only happens with contended read-modify-write operations, I updated that part to say the following:

Because this queueing happens during the read operation, the thrashing that would otherwise occur (when multiple concurrently executing transactions attempt to SELECT the same data and then UPDATE the results of that selection) is prevented.

Re: the thrashing causing the retries, I updated that sentence to say the following:

By preventing this thrashing, CockroachDB also prevents the [transaction retries][retries] that would otherwise occur.

PTAL.

v20.1/select-for-update.md, line 13 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this the right diagram to include here? It doesn't actually include any mention of "FOR UPDATE".

It's not the right diagram. For now, I have deleted this section and diagram from this page (for now) and filed #6720 to get help figuring out how to generate the right diagram.

I notice that the CRDB grammar contains a for_locking_strength section which has what we want (FOR UPDATE), but I'm not sure how to get from there to a railroad diagram. The pkg/cmd/docgen tool is something of a black box (to me); I'll try to figure this out / get help via that other issue.

cockroach-teamcity · 2020-03-02T20:00:28Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/cb165b38b0852f6dbd4b0f845f602c047b574569/

Edited pages:

nvanbenschoten

Reviewed 3 of 4 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):

Previously, rmloveland (Rich Loveland) wrote…

Re: translating "read-modify-write" to SQL, and specifying that the thrashing only happens with contended read-modify-write operations, I updated that part to say the following:

Because this queueing happens during the read operation, the thrashing that would otherwise occur (when multiple concurrently executing transactions attempt to SELECT the same data and then UPDATE the results of that selection) is prevented.

Re: the thrashing causing the retries, I updated that sentence to say the following:

By preventing this thrashing, CockroachDB also prevents the [transaction retries][retries] that would otherwise occur.

PTAL.

LGTM now, though you might s/when/if/ and pull that out of the parens. If the UPDATEs aren't contended then it's perfectly fine to not use SELECT FOR UPDATE.

rmloveland

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

_includes/v20.1/sql/select-for-update-overview.md, line 7 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

LGTM now, though you might s/when/if/ and pull that out of the parens. If the UPDATEs aren't contended then it's perfectly fine to not use SELECT FOR UPDATE.

Done and done (s/when/if/ and paren removal). Thanks!

cockroach-teamcity · 2020-03-03T19:57:43Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/8d91d3650fa0486b8ffde5bdd6672dbf6b585d31/

Edited pages:

nvanbenschoten

Reviewed 1 of 4 files at r3, 1 of 1 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lnhsingh)

lnhsingh

One language suggestion + clean up, but otherwise LGTM

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rmloveland)

_includes/v20.1/sql/select-for-update-overview.md, line 3 at r4 (raw file):

<span class="version-tag">New in v20.1</span>: The `SELECT ... FOR UPDATE` statement is used to order transactions by controlling concurrent access to one or more rows of a table.

It works by causing the rows returned by a [selection query][selection] to be locked, such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.

It works by causing the rows returned by a selection query to be locked > It works by locking the rows returned by a selection query,

v20.1/performance-best-practices-overview.md, line 386 at r4 (raw file):

It works by causing the rows returned by a [selection query][selection] to be locked, such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.

I think this description should be the same as your select-for-update include above. Something like:

It works by locking the rows returned by a [selection query][selection], such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.

v20.1/select-for-update.md, line 14 at r4 (raw file):

The user must have the `SELECT` and `UPDATE` [privileges](authorization.html#assign-privileges) on the tables used as operands.

## Parameters

Should there also be a Synopsis section?

v20.1/selection-queries.md, line 25 at r4 (raw file):

<div>
  {% include {{ page.version.version }}/sql/diagrams/select.html %}

This is rendering with what looks like merge conflict messages - <<<<<<< HEAD and ====== and >>>>>>> Recursive CTE docs. I think there's also two diagrams showing when there should be one.

Fixes #6000. Summary of changes: - Add docs for the `SELECT ... FOR UPDATE` statement, including an example showing its intended use. - Update other pages re: selection queries to mention the use of `SELECT FOR UPDATE` to reduce retries and improve throughput/latency. - Update Transactions page and Perf Best Practices page to mention `SELECT FOR UPDATE` as a possible solution to thrashing/retry issues. - Add `SELECT FOR UPDATE` to SQL Statements and SQL Feature Support pages. - Add docs for new `enable_implicit_select_for_update` session and cluster setting.

rmloveland

Thanks Lauren!

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lnhsingh)

_includes/v20.1/sql/select-for-update-overview.md, line 3 at r4 (raw file):

Previously, lnhsingh (Lauren Hirata Singh) wrote…

It works by causing the rows returned by a selection query to be locked > It works by locking the rows returned by a selection query,

Updated! Much better, thanks.

v20.1/performance-best-practices-overview.md, line 386 at r4 (raw file):

Previously, lnhsingh (Lauren Hirata Singh) wrote…

It works by causing the rows returned by a [selection query][selection] to be locked, such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.
I think this description should be the same as your select-for-update include above. Something like:

It works by locking the rows returned by a [selection query][selection], such that other transactions trying to access those rows are forced to wait for the transaction that locked the rows to finish. These other transactions are effectively put into a queue based on when they tried to read the value of the locked rows.

Fixed.

v20.1/select-for-update.md, line 14 at r4 (raw file):

Previously, lnhsingh (Lauren Hirata Singh) wrote…

Should there also be a Synopsis section?

Yes! Unfortunately, I have not been able to figure out how to generate a diagram correctly for this variant of SELECT. Filed #6720 - details about the pain are listed there.

v20.1/selection-queries.md, line 25 at r4 (raw file):

Previously, lnhsingh (Lauren Hirata Singh) wrote…

This is rendering with what looks like merge conflict messages - <<<<<<< HEAD and ====== and >>>>>>> Recursive CTE docs. I think there's also two diagrams showing when there should be one.

Weird! I am not seeing the merge conflict markers in my local repo. I just rebased everything onto latest master with no conflicts. Will check to see if the conflict markers are there after I push this update.

cockroach-teamcity · 2020-03-05T16:42:39Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/978cf68494785809963e2ee5d59946f1160bb861/

Edited pages:

rmloveland · 2020-03-05T17:00:15Z

This is rendering with what looks like merge conflict messages - <<<<<<< HEAD and ====== and >>>>>>> Recursive CTE docs. I think there's also two diagrams showing when there should be one.

This IS there, my bad. I went and looked at the include. It was introduced in a different PR :-/

I will make the fix in a separate PR since I want to drive some discussion of one of my pet issues #2836

rmloveland requested a review from nvanbenschoten February 24, 2020 22:20

nvanbenschoten reviewed Feb 26, 2020

View reviewed changes

rmloveland mentioned this pull request Feb 27, 2020

Update docs to mention that DELETE and UPSERT also use the locking mechanism created for SELECT FOR UPDATE #6699

Closed

rmloveland commented Feb 27, 2020

View reviewed changes

rmloveland force-pushed the 20200213-select-for-update branch from 166e278 to abf80ac Compare February 27, 2020 17:27

nvanbenschoten reviewed Feb 28, 2020

View reviewed changes

rmloveland force-pushed the 20200213-select-for-update branch from abf80ac to cb165b3 Compare March 2, 2020 19:56

rmloveland commented Mar 2, 2020

View reviewed changes

nvanbenschoten reviewed Mar 2, 2020

View reviewed changes

rmloveland force-pushed the 20200213-select-for-update branch from cb165b3 to 8d91d36 Compare March 3, 2020 19:54

rmloveland commented Mar 3, 2020

View reviewed changes

rmloveland requested a review from lnhsingh March 3, 2020 19:55

nvanbenschoten approved these changes Mar 3, 2020

View reviewed changes

lnhsingh approved these changes Mar 4, 2020

View reviewed changes

rmloveland commented Mar 5, 2020

View reviewed changes

rmloveland force-pushed the 20200213-select-for-update branch from 8d91d36 to 978cf68 Compare March 5, 2020 16:39

rmloveland merged commit b3a7005 into master Mar 5, 2020

rmloveland deleted the 20200213-select-for-update branch March 5, 2020 17:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document the 'SELECT ... FOR UPDATE' statement #6671

Document the 'SELECT ... FOR UPDATE' statement #6671

rmloveland commented Feb 24, 2020

cockroach-teamcity commented Feb 24, 2020

rmloveland commented Feb 24, 2020

rmloveland commented Feb 24, 2020

nvanbenschoten left a comment

rmloveland left a comment

rmloveland commented Feb 27, 2020

cockroach-teamcity commented Feb 27, 2020

nvanbenschoten left a comment

rmloveland left a comment

cockroach-teamcity commented Mar 2, 2020

nvanbenschoten left a comment

rmloveland left a comment

cockroach-teamcity commented Mar 3, 2020

nvanbenschoten left a comment

lnhsingh left a comment

rmloveland left a comment

cockroach-teamcity commented Mar 5, 2020

rmloveland commented Mar 5, 2020

Document the 'SELECT ... FOR UPDATE' statement #6671

Document the 'SELECT ... FOR UPDATE' statement #6671

Conversation

rmloveland commented Feb 24, 2020

cockroach-teamcity commented Feb 24, 2020

rmloveland commented Feb 24, 2020

rmloveland commented Feb 24, 2020

nvanbenschoten left a comment

Choose a reason for hiding this comment

rmloveland left a comment

Choose a reason for hiding this comment

rmloveland commented Feb 27, 2020

cockroach-teamcity commented Feb 27, 2020

nvanbenschoten left a comment

Choose a reason for hiding this comment

rmloveland left a comment

Choose a reason for hiding this comment

cockroach-teamcity commented Mar 2, 2020

nvanbenschoten left a comment

Choose a reason for hiding this comment

rmloveland left a comment

Choose a reason for hiding this comment

cockroach-teamcity commented Mar 3, 2020

nvanbenschoten left a comment

Choose a reason for hiding this comment

lnhsingh left a comment

Choose a reason for hiding this comment

rmloveland left a comment

Choose a reason for hiding this comment

cockroach-teamcity commented Mar 5, 2020

rmloveland commented Mar 5, 2020