-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: implement XA transaction SQL statements #129448
sql: implement XA transaction SQL statements #129448
Conversation
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
bd66628
to
c722328
Compare
c722328
to
353d041
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice work! i reviewed the final 3 commits of this PR.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @nvanbenschoten)
pkg/sql/two_phase_commit.go
line 227 at r13 (raw file):
nit: in most of our user facing docs we use the term "admin role" instead of "superuser" so let's make this say something like
Must have the admin role or be the same user as the one that prepared the transaction.
pkg/sql/two_phase_commit.go
line 49 at r13 (raw file):
if prepareErr != nil { if rbErr := ex.state.mu.txn.Rollback(ctx); rbErr != nil { log.Warningf(ctx, "txn rollback failed: err=%s", rbErr)
just confirming; is it correct to not short-circuit here? the state machine will advance to the NoTxn
state, but I'm not sure what that means if the transaction never rolled back.
pkg/sql/two_phase_commit.go
line 66 at r13 (raw file):
// TODO(nvanbenschoten): why are these needed here (and in the equivalent // functions for commit and rollback)? Shouldn't they be handled by // connExecutor.resetExtraTxnState?
i'm curious what happened when these close
s weren't added here?
pkg/sql/two_phase_commit.go
line 81 at r13 (raw file):
// which would prevent it from being prepared. if ex.extraTxnState.descCollection.HasUncommittedDescriptors() { return errors.Errorf("cannot prepare transaction: uncommitted descriptors")
nit: could the error say something more user facing, like "cannot prepare a transaction that has already performed schema changes"
and should it have an error code like InvalidTransactionState
(25000)?
pkg/sql/two_phase_commit.go
line 86 at r13 (raw file):
// TODO(nvanbenschoten): add a Txn.Key method to get the transaction's key, // instead of using TestingCloneTxn. txnProto := ex.state.mu.txn.TestingCloneTxn()
nit: just curious if there are any known gotchas with using this testing function here. if so, including them in the comment would be useful.
pkg/sql/two_phase_commit.go
line 97 at r13 (raw file):
ctx, ex.server.cfg.InternalDB.Executor(), nil, /* sqlTxn */
is it correct to use a separate transaction for this?
as long as it is, then the more conventional approach would be to pass in the InternalDB
, and then use internalDB.DescsTxn(...)
or internalDB.Txn(...)
in the callee. these are useful since they take care of some setup/teardown of the transaction.
pkg/sql/two_phase_commit.go
line 110 at r13 (raw file):
// v24.3 is no longer necessary. if pgerror.GetPGCode(err) == pgcode.UndefinedTable { return pgerror.Newf(pgcode.FeatureNotSupported, "PREPARE TRANSACTION unsupported in mixed-version cluster")
nit: instead of having this check here, let's make the first step of execPrepareTransactionInOpenStateInternal
be something like:
ex.planner.EvalContext().Settings.Version.IsActive(...)
the planner should already be set by this point, since that happens in the beginning of execStmtInOpenState
pkg/sql/two_phase_commit.go
line 170 at r13 (raw file):
err = deletePreparedTransaction( ctx, ex.server.cfg.InternalDB.Executor(),
similar comment as above, let's pass in InternalDB
here instead.
pkg/sql/two_phase_commit.go
line 278 at r13 (raw file):
// TODO(nvanbenschoten): Remove this logic when mixed-version support with // v24.3 is no longer necessary. if pgerror.GetPGCode(err) == pgcode.UndefinedTable {
similar comment as above; we can eagerly check for the mixed version state and short-circuit
pkg/internal/sqlsmith/sqlsmith.go
line 371 at r13 (raw file):
{2, makeCommit}, {2, makeRollback}, // TODO(nvanbenschoten): add two-phase commit statements.
do we already have tracking issues for the testing work that we are postponing for later?
pkg/sql/pg_catalog.go
line 2426 at r14 (raw file):
schema: vtable.PGCatalogPreparedXacts, populate: func(ctx context.Context, p *planner, dbContext catalog.DatabaseDescriptor, addRow func(...tree.Datum) error) error { rows, err := p.InternalSQLTxn().QueryBufferedEx(
this virtual table could be implemented as a view on system.prepared_transactions. those are generally easier to maintain, so we prefer that approach. take a look at pg_shdescription for example:
cockroach/pkg/sql/pg_catalog.go
Lines 1824 to 1833 in e235962
var pgCatalogSharedDescriptionView = virtualSchemaView{ | |
schema: vtable.PGCatalogSharedDescription, | |
resultColumns: colinfo.ResultColumns{ | |
{Name: "objoid", Typ: types.Oid}, | |
{Name: "classoid", Typ: types.Oid}, | |
{Name: "description", Typ: types.String}, | |
}, | |
comment: `shared object comments | |
https://www.postgresql.org/docs/9.5/catalog-pg-shdescription.html`, | |
} |
to implement the mixed version check, the view can additionally execute the crdb_internal.is_at_least_version
function.
353d041
to
f2acda2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! I've rebased this on master now that the KV PR has landed.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @rafiss)
pkg/internal/sqlsmith/sqlsmith.go
line 371 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
do we already have tracking issues for the testing work that we are postponing for later?
I opened #137636 to track this additional testing work. We can add to it if we think of anything more.
pkg/sql/pg_catalog.go
line 2426 at r14 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
this virtual table could be implemented as a view on system.prepared_transactions. those are generally easier to maintain, so we prefer that approach. take a look at pg_shdescription for example:
cockroach/pkg/sql/pg_catalog.go
Lines 1824 to 1833 in e235962
var pgCatalogSharedDescriptionView = virtualSchemaView{ schema: vtable.PGCatalogSharedDescription, resultColumns: colinfo.ResultColumns{ {Name: "objoid", Typ: types.Oid}, {Name: "classoid", Typ: types.Oid}, {Name: "description", Typ: types.String}, }, comment: `shared object comments https://www.postgresql.org/docs/9.5/catalog-pg-shdescription.html`, } to implement the mixed version check, the view can additionally execute the
crdb_internal.is_at_least_version
function.
This almost works, but I'm struggling to get the handling of the non-existent table in the mixed-version case working correctly? Any suggestions for how to express that in SQL?
pkg/sql/two_phase_commit.go
line 49 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
just confirming; is it correct to not short-circuit here? the state machine will advance to the
NoTxn
state, but I'm not sure what that means if the transaction never rolled back.
I don't think there's anything better we can do here. If the rollback fails then I guess the kv transaction is going to need to time out. Failing to advance the sql state machine to NoTxn doesn't really accomplish anything.
This is the same kind of logic that we have in connExecutor.rollbackSQLTransaction
.
pkg/sql/two_phase_commit.go
line 66 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
i'm curious what happened when these
close
s weren't added here?
Nothing bad happened, IIRC. I just noticed this when copying over some of this logic from commitSQLTransactionInternal
. I kept the logic for symmetry, but I don't understand why it's needed in either place. Probably not necessary to bottom out on this here.
pkg/sql/two_phase_commit.go
line 81 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: could the error say something more user facing, like "cannot prepare a transaction that has already performed schema changes"
and should it have an error code like
InvalidTransactionState
(25000)?
Done and done.
pkg/sql/two_phase_commit.go
line 86 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: just curious if there are any known gotchas with using this testing function here. if so, including them in the comment would be useful.
No gotchas. I think it's just named like this because we at one point decided that exposing the raw proto from the transaction was a "bad API" which should only be used in tests.
I addressed the TODO in a new commit.
pkg/sql/two_phase_commit.go
line 97 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
is it correct to use a separate transaction for this?
as long as it is, then the more conventional approach would be to pass in the
InternalDB
, and then useinternalDB.DescsTxn(...)
orinternalDB.Txn(...)
in the callee. these are useful since they take care of some setup/teardown of the transaction.
These need to use a separate transaction, because we don't want the bookkeeping in the system.prepared_transactions
table to be part of the prepared transaction.
The internalDB.DescsTxn(...)
or internalDB.Txn(...)
approaches seem like a lot more boilerplate for kicking off a single-statement, implicit transaction. Is there a benefit to using them over passing a nil transaction?
The other benefit of the approach here is that I can call it with a nil transaction (and unbound Executor) in cleanupAfterFailedPrepareTransaction
and the SQL transaction (and bound Executor) in deletePreparedTxn
.
Happy to go with whatever approach you think is best.
pkg/sql/two_phase_commit.go
line 110 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: instead of having this check here, let's make the first step of
execPrepareTransactionInOpenStateInternal
be something like:ex.planner.EvalContext().Settings.Version.IsActive(...)
the planner should already be set by this point, since that happens in the beginning of
execStmtInOpenState
Done.
pkg/sql/two_phase_commit.go
line 227 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: in most of our user facing docs we use the term "admin role" instead of "superuser" so let's make this say something like
Must have the admin role or be the same user as the one that prepared the transaction.
This hint message comes straight from PG, so I figured I'd keep it identical. See https://github.com/postgres/postgres/blob/7a80e381d16c642d00ec6146ccdf1262a159c69e/src/backend/access/transam/twophase.c#L587.
Happy to change it if you feel strongly.
pkg/sql/two_phase_commit.go
line 278 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
similar comment as above; we can eagerly check for the mixed version state and short-circuit
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @nvanbenschoten)
pkg/sql/pg_catalog.go
line 2426 at r14 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This almost works, but I'm struggling to get the handling of the non-existent table in the mixed-version case working correctly? Any suggestions for how to express that in SQL?
ah good point. i don't know a way around resolving the table name.
but since we do have a planner here, let's have this populate function short-circuit with a p.IsActive
check rather than using error codes to learn if it's a mixed version state.
pkg/sql/two_phase_commit.go
line 49 at r13 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I don't think there's anything better we can do here. If the rollback fails then I guess the kv transaction is going to need to time out. Failing to advance the sql state machine to NoTxn doesn't really accomplish anything.
This is the same kind of logic that we have in
connExecutor.rollbackSQLTransaction
.
i believe there is one state transition failure case where we simply end the session. i can't seem to find it now though, so i'm fine with leaving this as is.
pkg/sql/two_phase_commit.go
line 66 at r13 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Nothing bad happened, IIRC. I just noticed this when copying over some of this logic from
commitSQLTransactionInternal
. I kept the logic for symmetry, but I don't understand why it's needed in either place. Probably not necessary to bottom out on this here.
i see, thanks for explaining! i will take a cursory look into the history here separate from this PR.
pkg/sql/two_phase_commit.go
line 97 at r13 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
These need to use a separate transaction, because we don't want the bookkeeping in the
system.prepared_transactions
table to be part of the prepared transaction.The
internalDB.DescsTxn(...)
orinternalDB.Txn(...)
approaches seem like a lot more boilerplate for kicking off a single-statement, implicit transaction. Is there a benefit to using them over passing a nil transaction?The other benefit of the approach here is that I can call it with a nil transaction (and unbound Executor) in
cleanupAfterFailedPrepareTransaction
and the SQL transaction (and bound Executor) indeletePreparedTxn
.Happy to go with whatever approach you think is best.
i do think we should stick with internalDB
here. IMO it makes it more clear to readers that a separate transaction is going to be initiated.
it also takes care of releasing descriptor leases, which isn't super relevant right now, but it's very easy to forget about that if the code is extended more later, or if somebody looks at this as inspiration for a different function.
also IMO it's not a huge amount of boilerplate. it mostly comes down to a bit of added code indentation:
cockroach/pkg/sql/show_cluster_setting.go
Lines 58 to 64 in 9f61a38
return p.execCfg.InternalDB.Txn(ctx, func(ctx context.Context, txn isql.Txn) error { | |
datums, err := txn.QueryRowEx( | |
ctx, "read-setting", | |
txn.KV(), | |
sessiondata.NodeUserSessionDataOverride, | |
"SELECT value FROM system.settings WHERE name = $1", key, | |
) |
pkg/sql/two_phase_commit.go
line 227 at r13 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This hint message comes straight from PG, so I figured I'd keep it identical. See https://github.com/postgres/postgres/blob/7a80e381d16c642d00ec6146ccdf1262a159c69e/src/backend/access/transam/twophase.c#L587.
Happy to change it if you feel strongly.
thanks, i'm fine with using the term superuser
pkg/sql/two_phase_commit.go
line 96 at r19 (raw file):
txnKey := txn.Key() if !txn.IsOpen() { return errors.AssertionFailedf("cannot prepare a transaction that is not open")
do we need one more check to make sure we're in an explicit transaction? if we're not in one, then it would be a user error.
f2acda2
to
9ecfba9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @rafiss)
pkg/sql/pg_catalog.go
line 2426 at r14 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
ah good point. i don't know a way around resolving the table name.
but since we do have a planner here, let's have this populate function short-circuit with a
p.IsActive
check rather than using error codes to learn if it's a mixed version state.
Done.
pkg/sql/two_phase_commit.go
line 97 at r13 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
i do think we should stick with
internalDB
here. IMO it makes it more clear to readers that a separate transaction is going to be initiated.it also takes care of releasing descriptor leases, which isn't super relevant right now, but it's very easy to forget about that if the code is extended more later, or if somebody looks at this as inspiration for a different function.
also IMO it's not a huge amount of boilerplate. it mostly comes down to a bit of added code indentation:
cockroach/pkg/sql/show_cluster_setting.go
Lines 58 to 64 in 9f61a38
return p.execCfg.InternalDB.Txn(ctx, func(ctx context.Context, txn isql.Txn) error { datums, err := txn.QueryRowEx( ctx, "read-setting", txn.KV(), sessiondata.NodeUserSessionDataOverride, "SELECT value FROM system.settings WHERE name = $1", key, )
Got it, I think I see how you wanted this to look. The nice part is that I now only need to pass one param (an isql.Txn
) to these functions, instead of two (a isql.Executor
and a *kv.Txn
).
I put this cleanup in a separate commit to make it easier for you to review, to confirm that I'm understanding the suggestion correctly.
If you 👍, I'll squash the commit into the first one in this PR.
pkg/sql/two_phase_commit.go
line 96 at r19 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
do we need one more check to make sure we're in an explicit transaction? if we're not in one, then it would be a user error.
That check is in connExecutor.execStmtInNoTxnState
. There's a test for this in two_phase_commit
, if you search for there is no transaction in progress
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @nvanbenschoten)
pkg/sql/two_phase_commit.go
line 97 at r13 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Got it, I think I see how you wanted this to look. The nice part is that I now only need to pass one param (an
isql.Txn
) to these functions, instead of two (aisql.Executor
and a*kv.Txn
).I put this cleanup in a separate commit to make it easier for you to review, to confirm that I'm understanding the suggestion correctly.
If you 👍, I'll squash the commit into the first one in this PR.
this looks great, thanks.
Informs cockroachdb#22329. This commit implements the statement handling for `PREPARE TRANSACTION <gid>`, `COMMIT PREPARED <gid>`, and `ROLLBACK PREPARED <gid>`, for parity with Postgres support. The implementation is based on the Postgres documentation and the Postgres source code. Release note (sql change): XA transaction support allows CockroachDB to participate in distributed transaction with other resources (e.g. databases, message queues, etc) using a two phase commit protocol.
Addresses a TODO. Epic: None Release note: None
Informs cockroachdb#22329. `pg_catalog.pg_prepared_xacts` reads from `system.prepared_transactions`. Release note: None
Informs cockroachdb#22329. This commit fixes the pgjdbc blocklist to reflect the current state of support for XA transactions in CockroachDB. With the newly added support for two-phase commit transactions, all tests in the XADataSourceTest test suite now pass except for `mappingOfConstraintViolations`, which is blocked by cockroachdb#31632 (DEFERRABLE foreign keys). Release note: None
9ecfba9
to
6e2aea5
Compare
TFTR! bors r=rafiss |
Informs #22329.
Closes #137325.
This PR implements the statement handling for
PREPARE TRANSACTION <gid>
,COMMIT PREPARED <gid>
, andROLLBACK PREPARED <gid>
, for parity with Postgres support. The implementation is based on the Postgres documentation and the Postgres source code.The PR then implements the
pg_catalog.pg_prepared_xacts
system catalog table, which reads fromsystem.prepared_transactions
. This adds the remaining PG compatibility, so that XA in CockroachDB looks near-identical to XA in PostgreSQL.Finally, the PR fixes the
pgjdbc
blocklist to reflect the current state of support for XA transactions in CockroachDB. With the newly added support for two-phase commit transactions, all tests in theXADataSourceTest
test suite now pass except formappingOfConstraintViolations
, which is blocked by #31632 (DEFERRABLE foreign keys).Release note (sql change): XA transaction support allows CockroachDB to participate in distributed transaction with other resources (e.g. databases, message queues, etc) using a two phase commit protocol.