sql: add SHOW COMMIT TIMESTAMP to retrieve a causality token #80848
pkg/sql/conn_executor.go
    if err == nil {
        res.SetColumns(ctx, colinfo.ResultColumns{
            {
                Name: "causality_token",
One area where a new user to the system can get hung up is all of the different ways that these timestamps are referred to: `crdb_internal_mvcc_timestamp`, `cluster_logical_timestamp`, "hybrid-logical clock timestamps", etc. While this result column name is specific and correct, it does add another descriptor to this concept and requires that the user know how to connect the dots.
Fair, let's see what other folks say regarding names. I feel no strong attachment towards this one.
I'm thinking I'd like to extend this to work also in the case where you issue an implicit transaction with the special statement as the last statement in the simple protocol. This should be relatively simple to detect, but it probably means pushing the logic to detect our special statement into the
Could it also be extended to an implicit transaction for a statement like This is cool!
I think this is hard to reason about. Instead I'm thinking of supporting a final statement that looks like the one we support in explicit transactions which can be used to regain the autocommit behavior in simple protocol multi-statement batches.
Would it be possible to spell this /cc @jakedt
It gets tricky when you think about the wire protocol. I don't think it's valid to send
I haven't played with it, but I suspect that there are drivers out there which would be unhappy with this behavior. It's also hard to imagine how to work this into drivers which manage transaction lifecycles themselves. Is the idea that if you had a Like if instead it sent back the below, which is data followed by saying there was a
In summary, I guess I'm okay with different syntax but it's a little weird to treat something with a
Maybe it's useful to explain the motivation - in the current proposal where the
The
What is a write-only transaction in SQL? It's not a concept in cockroach as far as I know. Is there something special about that concept? Most sql write commands do reads internally. The only way to avoid a read (i.e. do a "blind write") in SQL syntax is an I think if we were to introduce such a concept, it wouldn't be safe to just look at the top-level kinds. Note also that top-level select statements can also happily do writes both via CTE and, eventually, UDFs. Even more wild is that a select statement can call certain special builtins which do schema changes.
The motivation is to be able to provide a client with the commit timestamp of a transaction which they've issued and successfully committed. A design goal is to make it such that you don't need to be particularly careful with your connection handling in order to associate the timestamp with the right connection. This constraint leads towards a design whereby the transaction is internally in cockroach committed before the wire protocol or client driver thinks it is so that as far as the client driver is concerned, the transaction is still open when the timestamp is communicated. Any approach which attempts to retrieve the timestamp after the driver issues
Yeah, this is true, but the COMMIT is there just to move the connection back to a state of being usable. Nothing meaningful is allowed to happen between the two (except the other special cockroach thing:
I guess revisiting this, I'm far from wedded to the actual statement we use. The current thing is pretty gross. I think using a
I think using Interestingly, it looks like there is
FWIW another way to do this is to have COMMIT store the causality token in a session var, and have the client use
I think having something that looks like a
My vote is for a SELECT/SHOW statement that returns the token for the last transaction committed on this connection. It should be supported both after a plain commit for connection pooling strategies that permit this, and after RELEASE SAVEPOINT when returning connections to the pool makes this a problem.
(removing sql-experience whilst this is in debate, feel free to re-add when it's time for another look)
I'm cool with this. The major problem I have with this is that existing libraries which exist and utilize In the current implementation, one can do something like:

    var hybridLogicalTimestamp apd.Decimal
    if err := crdbpgx.ExecuteTx(ctx, client, pgx.TxOptions{}, func(tx pgx.Tx) error {
        if _, err := tx.Exec(
            ctx,
            `INSERT INTO kv (key, value) VALUES ($1, $2);`,
            uuid.New().String(), strings.Repeat("x", size),
        ); err != nil {
            return err
        }
        return tx.QueryRow(
            ctx, `SELECT crdb_internal.commit_with_causality_token()`,
        ).Scan(&hybridLogicalTimestamp)
    }); err != nil {
        return err
    }

One easy answer is just to allow either order. I'm happy to be convinced that it's not worth it. I think I understand the feedback and will do something that I hope will please the crowd.
Maybe worth mentioning in the release note that this behavior is idempotent.
Done. I'll file an issue to have the CLI warn if it's going to be issuing queries unbeknownst to the user.
@rafiss this seems like a PR you might be well suited to review. If not, please advise on who else might be.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @stevekuznetsov)
pkg/sql/conn_executor_exec.go
line 1825 at r3 (raw file):
    // CommitWait state because such a savepoint had been installed when
    // the transaction was committed via SHOW COMMIT TIMESTAMP. The act of
    // checking whether the statement should be accepted also prevents any

this seems like a confusing side effect for a caller to keep track of (i.e. the act of checking being a thing that toggles the flag). would it be more natural if "accepting the statement" is the thing that toggles the flag?
pkg/sql/conn_executor_show_commit_timestamp.go
line 104 at r3 (raw file):
    }

    // execShowCommitTimestampInOpenState deals with the special statement

nit: wrong function name in the comment
pkg/sql/conn_io.go
line 143 at r3 (raw file):
    // simple protocol Query message that contains a batch of 1 or more queries.
    LastInBatch bool

    // LastInBatch indicates that this command contains the second-to-last query

nit: variable name in the comment
pkg/sql/sem/builtins/builtins.go
line 195 at r3 (raw file):
    // committed. The implementation of this function here is a shim; the function
    // is processed explicitly by the connExecutor.
    const CommitWithCausalityTokenName = "commit_with_causality_token"

is this needed?
pkg/sql/logictest/testdata/logic_test/show_commit_timestamp
line 211 at r3 (raw file):
    subtest prepare

    # You cannot prepare SHOW COMMIT TIMESTAMP because it is not preparable.

could you add a test in pgwire/testdata/ and `Parse` a SHOW COMMIT TIMESTAMP statement?
Fixes cockroachdb#79591 Relates to cockroachdb#7945 Release note (sql change): A new sql statement `SHOW COMMIT TIMESTAMP` has been added. This statement can be used to retrieve the commit timestamp of the current explicit transaction, current multi-statement implicit transaction, or previous transaction. The statement may be used in a variety of settings to maximize its utility in the face of connection pooling. When used as a part of an explicit transaction, the statement implicitly commits the transaction internally before being able to return a causality token. This is similar to the `RELEASE cockroach_restart` behavior; after issuing this statement, commands to the transaction will be rejected until `COMMIT` is issued. When used as part of a multi-statement implicit transaction, the statement must be the final statement. If it occurs in the middle of a multi-statement implicit transaction, it will be rejected with an error. When sent as a stand-alone single-statement implicit transaction, it will return the commit timestamp of the previously committed transaction. If there was no transaction on the connection, or the previous transaction did not commit (either rolled back or encountered an error), the command will fail with an error code 25000 (`InvalidTransactionState`).
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @stevekuznetsov)
pkg/sql/conn_executor_exec.go
line 1825 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
this seems like a confusing side effect for a caller to keep track of (i.e. the act of checking being a thing that toggles the flag). would it be more natural if "accepting the statement" is the thing that toggles the flag?
Done.
pkg/sql/conn_executor_show_commit_timestamp.go
line 104 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: wrong function name in the comment
Done.
pkg/sql/conn_io.go
line 143 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
nit: variable name in the comment
Done.
pkg/sql/logictest/testdata/logic_test/show_commit_timestamp
line 211 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
could you add a test in pgwire/testdata/ and `Parse` a SHOW COMMIT TIMESTAMP statement?
Done. To be clear, you can `Parse` a `SHOW COMMIT TIMESTAMP` statement and in that way you can prepare it using the extended protocol, you just can't prepare it using the simple protocol. The test does exercise this via the pgx code. I've added also the
pkg/sql/sem/builtins/builtins.go
line 195 at r3 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
is this needed?
Done.
I filed #92260 to track an edge case. The intention is not to do anything about it but to have it on the books for some future soul.
lgtm!
Reviewed 15 of 40 files at r3, 27 of 27 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @stevekuznetsov)
TFTR! bors r+
Reviewable seemed to be upset that I had an open review - very excited to see these changes @ajwerner !
bors r+
Build succeeded:
While developing the test case for `EmitImmediatelyStrategy` with CRDB, I detected something I wasn't expecting: when comparing the revision returned by the datastore write with the one that came out of the Watch API, the revisions were different.

We use `SHOW COMMIT TIMESTAMP` to retrieve a CRDB [transaction timestamp](cockroachdb/cockroach#80848). The value coming out of the `update` field in change streams had the same value, but the notation differed: one included the logical clock and the other didn't, even though the logical clocks were the same (zero):

- The revision generated from the transaction in `readTransactionCommitRev` uses `NewForHLC(hlcNow)`, which takes a `Decimal`. This in turn calls `decimal.String()`, which returns the number with the decimal part stripped if it is zero.
- The revision obtained from the changefeeds uses `revisions.HLCRevisionFromString(details.Updated)`, and `details.Updated` always comes with the decimal part, even when it is zero.

Both timestamps were the same but had different string representations, which led to a different `HLCRevision` logical clock and hence to the `.Equal()` method failing.
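The mismatch described in that commit message can be illustrated with a small, standalone parser that maps both notations to the same value. This is only a sketch under stated assumptions: `hlcTimestamp`, `parseHLC`, and `logicalDigits` are hypothetical names, not the SpiceDB or CockroachDB API, and the layout (wall-clock nanoseconds before the point, a 10-digit logical counter after it) is assumed rather than taken from the thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// logicalDigits is the ASSUMED width of the logical counter in the decimal
// rendering of an HLC timestamp (e.g. "1700000000000000000.0000000000").
const logicalDigits = 10

// hlcTimestamp is a hypothetical parsed form of such a decimal.
type hlcTimestamp struct {
	WallNanos int64  // digits before the decimal point
	Logical   uint32 // digits after the decimal point
}

// parseHLC accepts a timestamp with or without a fractional part and
// normalizes both spellings to the same struct value.
func parseHLC(s string) (hlcTimestamp, error) {
	whole, frac, _ := strings.Cut(s, ".")
	wall, err := strconv.ParseInt(whole, 10, 64)
	if err != nil {
		return hlcTimestamp{}, fmt.Errorf("invalid wall time in %q: %w", s, err)
	}
	if len(frac) > logicalDigits {
		return hlcTimestamp{}, fmt.Errorf("logical part of %q has more than %d digits", s, logicalDigits)
	}
	// Right-pad so that a missing or shortened fractional part keeps its
	// decimal meaning (an empty string becomes "0000000000").
	frac += strings.Repeat("0", logicalDigits-len(frac))
	logical, err := strconv.ParseUint(frac, 10, 32)
	if err != nil {
		return hlcTimestamp{}, fmt.Errorf("invalid logical part in %q: %w", s, err)
	}
	return hlcTimestamp{WallNanos: wall, Logical: uint32(logical)}, nil
}

func main() {
	a, _ := parseHLC("1700000000000000000")            // decimal part stripped
	b, _ := parseHLC("1700000000000000000.0000000000") // decimal part kept
	fmt.Println(a == b)
}
```

Comparing the parsed structs rather than the raw strings sidesteps the notation difference; whether normalizing on parse or on formatting is preferable depends on where the revisions are compared.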
Fixes #79591
Relates to #7945
Release note (sql change): A new sql statement `SHOW COMMIT TIMESTAMP` has been
added. This statement can be used to retrieve the commit timestamp of the
current explicit transaction, current multi-statement implicit transaction, or
previous transaction. The statement may be used in a variety of settings to
maximize its utility in the face of connection pooling.

When used as a part of an explicit transaction, the statement implicitly
commits the transaction internally before being able to return a causality
token. This is similar to the `RELEASE cockroach_restart` behavior; after
issuing this statement, commands to the transaction will be rejected until
`COMMIT` is issued.

When used as part of a multi-statement implicit transaction, the statement
must be the final statement. If it occurs in the middle of a multi-statement
implicit transaction, it will be rejected with an error.

When sent as a stand-alone single-statement implicit transaction, it will
return the commit timestamp of the previously committed transaction. If there
was no transaction on the connection, or the previous transaction did not
commit (either rolled back or encountered an error), the command will fail with
an error code 25000 (`InvalidTransactionState`).

The `SHOW COMMIT TIMESTAMP` statement is idempotent; it can be issued
multiple times both inside of and after transactions and will return the same result.

One rough edge is that the `cockroach sql` cli command will, by default, send
statements on behalf of the user which will lead to repeated issuances of
`SHOW COMMIT TIMESTAMP` from the CLI returning different values. If one
disables syntax checking with `\set check_syntax=false` and one changes their
prompt1 to not require a query, perhaps with `\set prompt1=%n@%M>;`, the
command will become idempotent from the CLI.