Refactor sql/executor to support retries of batches of statements. #4576
I will add a test for restarts in the current PR. It turns out that deterministically triggering restarts of txns whose statements are all received in one batch is not so easy.
Looks like a number of changes are combo'd up into one commit here.
Reviewed 16 of 18 files at r1.
You can try using the new parallel test. Having multiple clients run transactions that modify the same data in parallel will trigger restarts. I am assuming you can have multiple statements on a single line with …
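To illustrate why parallel clients touching the same data produce restarts, here is a minimal in-memory sketch. It is not the parallel test harness referred to above and does not talk to a real database: `store`, `commit`, and `runClients` are hypothetical names, and the version check is a plain optimistic-concurrency stand-in for whatever conflict detection the KV layer actually performs.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical in-memory stand-in for a single contended row.
type store struct {
	mu      sync.Mutex
	value   int
	version int
}

// read returns the current value together with the version it was read at.
func (s *store) read() (value, version int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.value, s.version
}

// commit applies the write only if nobody else committed since readVersion.
// A false return models a transaction that has to restart.
func (s *store) commit(newValue, readVersion int) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.version != readVersion {
		return false
	}
	s.value = newValue
	s.version++
	return true
}

// runClients starts n concurrent clients that each increment the shared
// value perClient times, restarting whenever a commit loses the race.
// It returns the final value and the total number of restarts observed.
func runClients(n, perClient int) (final int, restarts int) {
	s := &store{}
	var wg sync.WaitGroup
	var mu sync.Mutex // guards restarts
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perClient; j++ {
				for {
					v, ver := s.read()
					if s.commit(v+1, ver) {
						break
					}
					mu.Lock()
					restarts++
					mu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	final, _ = s.read()
	return final, restarts
}

func main() {
	final, restarts := runClients(8, 1000)
	fmt.Println("final value:", final, "restarts:", restarts)
}
```

The final value is always the number of clients times the increments per client, but the restart count varies from run to run, which is the same non-determinism that makes a controlled restart test hard to write.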
Review status: 16 of 18 files reviewed at latest revision, 46 unresolved discussions.

sql/executor.go, line 306 [r1]:

    if len(schemaChangers.schemaChangers) == 0 || disableSyncSchemaChangeExec {
        return
    }
    // saved indentation everywhere

sql/executor.go, line 365 [r1]:

    type sqlTransactionState int

    const (
        noTransaction sqlTransactionState = iota
        openTransaction
        failingTransaction   // waiting for COMMIT/ROLLBACK
        completedTransaction // is this needed? Could just be noTransaction?
    )

    func (s txnState) state() sqlTransactionState {
        // return corresponding state
    }
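The state enum suggested in the review comment on sql/executor.go, line 365 could be fleshed out roughly as follows. This is only a sketch under assumptions: the `txnState` fields (`inTxn`, `aborted`, `done`) and the `String` method are hypothetical and do not come from the PR, and `completedTransaction` is kept even though the reviewer questions whether it is needed.

```go
package main

import "fmt"

// sqlTransactionState mirrors the enum proposed in the review comment.
type sqlTransactionState int

const (
	noTransaction sqlTransactionState = iota
	openTransaction
	failingTransaction   // waiting for COMMIT/ROLLBACK
	completedTransaction // possibly redundant with noTransaction
)

// String makes the states readable when printed.
func (s sqlTransactionState) String() string {
	switch s {
	case noTransaction:
		return "noTransaction"
	case openTransaction:
		return "openTransaction"
	case failingTransaction:
		return "failingTransaction"
	case completedTransaction:
		return "completedTransaction"
	default:
		return "unknown"
	}
}

// txnState is a hypothetical stand-in for the executor's per-session
// transaction bookkeeping; the real struct has different fields.
type txnState struct {
	inTxn   bool // a transaction has been started
	aborted bool // a statement failed; only COMMIT/ROLLBACK is accepted
	done    bool // COMMIT or ROLLBACK has been processed
}

// state maps the bookkeeping flags onto a single enum value.
func (s txnState) state() sqlTransactionState {
	switch {
	case !s.inTxn:
		return noTransaction
	case s.done:
		return completedTransaction
	case s.aborted:
		return failingTransaction
	default:
		return openTransaction
	}
}

func main() {
	open := txnState{inTxn: true}
	failing := txnState{inTxn: true, aborted: true}
	fmt.Println(open.state(), failing.state())
}
```

Callers such as pgwire could then switch on the returned value instead of inspecting individual flags.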
@radu I'd like to see some more controlled unit tests. Randomized testing is welcome too, but only as a next step.
Reviewed 18 of 18 files at r1.
a1943e2 to 3328b6f
Review status: 10 of 18 files reviewed at latest revision, 46 unresolved discussions, some commit checks failed.
sql/executor.go, line 643 [r1]: Btw, we use @cockroach names in TODOs, not GitHub names, right?
Reviewed 1 of 18 files at r1, 8 of 9 files at r2.
Pretty sweet change! I like the core idea of having …
Reviewed 10 of 18 files at r1, 8 of 9 files at r2.
Review status: all files reviewed at latest revision, 55 unresolved discussions, some commit checks failed.
sql/executor.go, line 225 [r1]: Performing a shallow-copy of the …
3328b6f to 6673920
Review status: 11 of 20 files reviewed at latest revision, 54 unresolved discussions, some commit checks failed.
sql/executor.go, line 225 [r1]: Principally, I think the session should be shared by everybody. A shallow copy is not doing anybody any favors.
sql/session.proto, line 31 [r1]: Even without this, I've thought about it again. Introducing this struct makes code cleaner. pgwire, for example, should not be looking into a KV txn proto; it should know nothing about KV.
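As a reminder of why a shallow copy of a session-like struct can be surprising in Go, here is a small sketch. The `session` type below is a hypothetical, much-simplified stand-in (the real session is defined in sql/session.proto and has different fields); the point is only that scalar fields diverge after the copy while reference-typed fields such as maps stay shared.

```go
package main

import "fmt"

// session is a hypothetical stand-in for the SQL session, not the real type.
type session struct {
	database string            // value-typed: each copy gets its own
	vars     map[string]string // reference-typed: shared by shallow copies
}

// shallowCopyDemo mutates a shallow copy and reports what the original
// sees afterwards.
func shallowCopyDemo() (database, tz string) {
	orig := session{database: "test", vars: map[string]string{"tz": "UTC"}}
	cp := orig            // shallow copy: struct fields are copied by value
	cp.database = "other" // does not affect orig
	cp.vars["tz"] = "EST" // affects orig, because the map is shared
	return orig.database, orig.vars["tz"]
}

func main() {
	db, tz := shallowCopyDemo()
	fmt.Println(db, tz) // prints "test EST"
}
```

Sharing a single `*session` avoids this split view, at the cost of requiring care once concurrent users of the session exist.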
6673920 to 43de10f
Brought back …
Review status: 10 of 23 files reviewed at latest revision, 43 unresolved discussions, some commit checks failed.
LGTM
Review status: 10 of 23 files reviewed at latest revision, 45 unresolved discussions, some commit checks failed.
003bd06 to b10ab02
Review status: 10 of 24 files reviewed at latest revision, 45 unresolved discussions, some commit checks failed.
b0c1e2e to 194df70
@tamird I had to relax that …
Reviewed 1 of 21 files at r5.
Previously the only SQL txns that we automatically retried were implicit txns (single statements outside of a user txn). With this change, we're retrying batches of statements. In a future change, we'll bubble up retriable errors for txns that we don't auto-retry to the user, and allow her to manually retry.
- unify the handling of implicit and explicit sql txns
- automatically retry the batch of statements that contain BEGIN (the prefix of a txn sent as one batch)
- made the executor and pgwire not look inside the kv txn proto any more to determine txn state. This was ugly. Instead, the executor and planner cooperate to maintain their own view of the sql txn status
- made schema change errors from async processing insert errors into the Results at the appropriate position (the position of the statement that queued up the schema change)
194df70 to 26b7453
Reviewed 1 of 9 files at r2, 1 of 9 files at r3, 19 of 21 files at r5, 1 of 1 files at r7, 1 of 1 files at r8.
sql/executor.go, line 225 [r1]: Having a global which is shared among potentially concurrent (maybe not now, but soon?) actors can also be surprising and/or dangerous. Passing it around forces you to join up potentially conflicting information, which is something that's easy to just not do when sharing. I think it's ok so far, but we should keep an eye on it.