Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: simplify connection state machine - stop tracking retry intent #45484

Merged

Conversation

andreimatei
Copy link
Contributor

Before this patch, the SQL connection state machine had an optimization:
if a transaction that hadn't used "SAVEPOINT cockroach_restart"
encountered a retriable error that we can't auto-retry, then we'd
release the txn's locks eagerly and enter the Aborted state. As opposed
to transactions that had used the "SAVEPOINT cockroach_restart", which
go to RestartWait.
This optimization is a significant complication for the state machine,
so this patch is removing it. All transactions now go to RestartWait,
and wait for a ROLLBACK to release the locks.

On the flip side, doing "RELEASE SAVEPOINT cockroach_restart" and
"ROLLBACK SAVEPOINT cockroach_restart" now works even for transactions
that haven't explicitly declared that savepoint, which is nice. Although
I don't promise I'll keep it working.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@andreimatei
Copy link
Contributor Author

andreimatei commented Feb 27, 2020

cc @knz

before:
txnstatetransitions_diagram

after:
txnstatetransitions_diagram (1)

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we'd release the txn's locks eagerly and enter the Aborted state.

I did some testing with PG and it's actually pretty smart. It seems to eagerly release all locks acquired in the current savepoint scope when an error is hit, but none of the locks that were acquired in other scopes. For instance, the following would allow concurrent transactions that want to write to "b" to proceed but not those that want to write to "a":

BEGIN;
SAVEPOINT x;
INSERT INTO kv VALUES ('a', 1);
SAVEPOINT y;
INSERT INTO kv VALUES ('b', 2);
INSYNTAXERROR;

Is this something you'd like to support? If so then we'd actually want something like what we had before, right?

Other than this question though, the code changes here LGTM.

Reviewed 9 of 9 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)

@andreimatei andreimatei force-pushed the savepoint.remove-retry-intent branch from d2cdc00 to 8d0c357 Compare March 2, 2020 19:26
Before this patch, the SQL connection state machine had an optimization:
if a transaction that hadn't used "SAVEPOINT cockroach_restart"
encountered a retriable error that we can't auto-retry, then we'd
release the txn's locks eagerly and enter the Aborted state. As opposed
to transactions that had used the "SAVEPOINT cockroach_restart", which
go to RestartWait.
This optimization is a significant complication for the state machine,
so this patch is removing it. All transactions now go to RestartWait,
and wait for a ROLLBACK to release the locks.

On the flip side, doing "RELEASE SAVEPOINT cockroach_restart" and
"ROLLBACK SAVEPOINT cockroach_restart" now works even for transactions
that haven't explicitly declared that savepoint, which is nice. Although
I don't promise I'll keep it working.

Release note: None
@andreimatei andreimatei force-pushed the savepoint.remove-retry-intent branch from 8d0c357 to 198e4e7 Compare March 2, 2020 21:18
Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something you'd like to support?

If anything, the case that I'd be most interested in improving is the case in which no savepoint has been used. But even there I don't care too much; it's optimizing for errors after all. The code is definitely simpler without caring about this. But, relatedly, allowing for partial retries after a serializability failure - I think there's money in that.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @nvanbenschoten)

@craig
Copy link
Contributor

craig bot commented Mar 2, 2020

Build failed (retrying...)

@craig
Copy link
Contributor

craig bot commented Mar 3, 2020

Build succeeded

@craig craig bot merged commit cf8cd92 into cockroachdb:master Mar 3, 2020
@andreimatei andreimatei deleted the savepoint.remove-retry-intent branch March 10, 2020 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants