-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: persist minimum transaction timestamps in intents #38782
storage: persist minimum transaction timestamps in intents #38782
Conversation
b394c85
to
8807f7a
Compare
828fd26
to
a27bd89
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r1, 1 of 1 files at r2, 6 of 6 files at r3, 3 of 3 files at r4, 22 of 22 files at r5, 7 of 7 files at r6, 27 of 27 files at r7, 3 of 3 files at r8, 22 of 22 files at r9, 7 of 7 files at r10, 2 of 2 files at r11, 1 of 1 files at r12.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/replica_tscache.go, line 297 at r5 (raw file):
// CanCreateTxnRecord determines whether a transaction record can be created // for the provided transaction information. Callers should provide the
s/should/must/
pkg/storage/replica_tscache.go, line 299 at r5 (raw file):
, that is, the epoch zero OrigTimestamp.
pkg/storage/replica_tscache.go, line 305 at r5 (raw file):
// it returns the reason that transaction record was rejected. If the method // ever determines that a transaction record must be rejected, it will continue // to reject that transaction going forwards.
While you're here, you could also point out that if this method returns true, we may still fail to create the record because there's no locking here. Maybe this is clear already.
pkg/storage/replica_tscache.go, line 309 at r5 (raw file):
// provisional commit timestamp because it may have moved forward over the // course of a single epoch and they can't provide their (current epoch's) // OrigTimestamp because it may have moved forward over a series of prior
Feel like I've asked this before, but wasn't EpochZeroTimestamp
the thing folks wanted to pass here? Why not?
I also assume callers that had only seen an intent sometimes need this and could only provide something larger than optimal, but this isn't in the old comment.
Nice that this all simplifies with this PR.
pkg/storage/batcheval/transaction.go, line 165 at r5 (raw file):
// that we can remove it in v20.1 and replace it with: // // synthTxnRecord.Status = roachpb.ABORTED
👍
pkg/storage/rangefeed/processor.go, line 272 at r5 (raw file):
// harmless, especially because the txnPushAttempt performs a // high-priority push. MinTimestamp: txn.timestamp,
We can't have the real minTimestamp here?
a27bd89
to
d011683
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/replica_tscache.go, line 297 at r5 (raw file):
Previously, tbg (Tobias Grieger) wrote…
s/should/must/
Done.
pkg/storage/replica_tscache.go, line 299 at r5 (raw file):
Previously, tbg (Tobias Grieger) wrote…
, that is, the epoch zero OrigTimestamp.
The notion of an "epoch zero OrigTimestamp" is part of what I'm trying to get rid of here. The contract we want from this timestamp is that it's the minimum timestamp that a transaction could have ever written. Mixing this up with epochs and the OrigTimestamp (which I'd like to rename to the read timestamp) just confuses all of this.
pkg/storage/replica_tscache.go, line 305 at r5 (raw file):
because there's no locking here
Latching provides this locking. No request that would update the timestamp cache to prevent a txn record from being created could be evaluating concurrently.
pkg/storage/replica_tscache.go, line 309 at r5 (raw file):
but wasn't EpochZeroTimestamp the thing folks wanted to pass here
Yes, but that field is only populated after a restart. We were passing the result of InclusiveTimeBounds
here before, which accounted for this.
I also assume callers that had only seen an intent sometimes need this and could only provide something larger than optimal
Yes, that's correct
but this isn't in the old comment.
The old comment discussed the false negatives that could occur if this timestamp wasn't the real min timestamp. The (now removed) comment in cmd_push_txn.go
discussed the actual implications of this.
pkg/storage/rangefeed/processor.go, line 272 at r5 (raw file):
Previously, tbg (Tobias Grieger) wrote…
We can't have the real minTimestamp here?
We'd need to ship in through the LogicalOpLog
as an MVCCLogicalOp
. That doesn't seem worth it and would also cause compatibility concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 28 of 28 files at r13, 1 of 1 files at r14, 8 of 8 files at r15, 3 of 3 files at r16, 22 of 22 files at r17, 7 of 7 files at r18, 2 of 2 files at r19, 1 of 1 files at r20.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/replica_tscache.go, line 299 at r5 (raw file):
That sounds good, but then why not
, that is, the epoch zero value of the
Timestamp
field (the epoch zero write timestamp).
Were you also planning to rename Timestamp
to WriteTimestamp
? That would make sense to me, the role of the two timestamps is confusing enough to those who haven't internalized it over the years.
pkg/storage/replica_tscache.go, line 305 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
because there's no locking here
Latching provides this locking. No request that would update the timestamp cache to prevent a txn record from being created could be evaluating concurrently.
But the low water mark could rise over due to unrelated requests rolling over the page and change the outcome?
Now that I looked up this method I'm seeing that it's really just part of the eval context, and there it makes more sense because that restricts who calls it (command evaluation), so latches are assumed. This reminds me that I was hoping to take off the eval context methods from the main replica, but it was a giant yak shave. I should get back to that at some point though.
pkg/storage/rangefeed/processor.go, line 272 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We'd need to ship in through the
LogicalOpLog
as anMVCCLogicalOp
. That doesn't seem worth it and would also cause compatibility concerns.
That is unfortunate, but on the other hand this line keeps all of the magic in the comments that you removed alive, but in a more subtle way. It directly contradicts the contract of the MinTimestamp
field. I feel that this is problematic enough to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/replica_tscache.go, line 299 at r5 (raw file):
Previously, tbg (Tobias Grieger) wrote…
That sounds good, but then why not
, that is, the epoch zero value of the
Timestamp
field (the epoch zero write timestamp).Were you also planning to rename
Timestamp
toWriteTimestamp
? That would make sense to me, the role of the two timestamps is confusing enough to those who haven't internalized it over the years.
Yes, exactly. The goal is to get the long list of transaction timestamps down to read_timestamp
, write_timestamp
, min_timestamp
, and max_timestamp
, last_heartbeat
.
max_timestamp
might also be due for a rename though, as it's not quite clear from the name what it's actual meaning is. Perhaps max_uncertain_timestamp
.
Also, hopefully, last_heartbeat
can also go away when we stop heartbeating individual transactions.
pkg/storage/replica_tscache.go, line 305 at r5 (raw file):
But the low water mark could rise over due to unrelated requests rolling over the page and change the outcome?
Sure, but that wouldn't prevent the request from creating the transaction record. It's no different than any other write that only checks the timestamp cache once. Timestamp cache pressure could create artificial conflicts, but nothing real, so the "race" is immaterial.
pkg/storage/rangefeed/processor.go, line 272 at r5 (raw file):
Previously, tbg (Tobias Grieger) wrote…
That is unfortunate, but on the other hand this line keeps all of the magic in the comments that you removed alive, but in a more subtle way. It directly contradicts the contract of the
MinTimestamp
field. I feel that this is problematic enough to fix.
I'm curious what your suggested fix is. It seems to me like it could go one of a few ways:
- ship the transaction's min timestamp around through the
LogicalOpLog
with the rest of the transaction metadata. This would have compatibility issues with v19.1, so at some level we'll still need this, although I guess we can let the missing min timestamp case fall through toSynthesizeTxnFromMeta
, which already needs to handle it. Interestingly, this is essentially what we'd get for free if the transaction ID included its minimum timestamp, which is the long term goal for this series of changes. - don't set any min timestamp here and formally support txn pushes without one specified by falling back to the old approach of using the write timestamp as the min timestamp when one is not specified, which essentially makes the current compatibility hack set in stone.
- don't set any min timestamp here and formally support txn pushes without one specified by considering missing records to be aborted. I'm not sure if this works in practice, but we could probably make it work.
- something else I haven't yet considered.
9655167
to
8135219
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/rangefeed/processor.go, line 272 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm curious what your suggested fix is. It seems to me like it could go one of a few ways:
- ship the transaction's min timestamp around through the
LogicalOpLog
with the rest of the transaction metadata. This would have compatibility issues with v19.1, so at some level we'll still need this, although I guess we can let the missing min timestamp case fall through toSynthesizeTxnFromMeta
, which already needs to handle it. Interestingly, this is essentially what we'd get for free if the transaction ID included its minimum timestamp, which is the long term goal for this series of changes.- don't set any min timestamp here and formally support txn pushes without one specified by falling back to the old approach of using the write timestamp as the min timestamp when one is not specified, which essentially makes the current compatibility hack set in stone.
- don't set any min timestamp here and formally support txn pushes without one specified by considering missing records to be aborted. I'm not sure if this works in practice, but we could probably make it work.
- something else I haven't yet considered.
I decided to go with the first option. The migration story ends up being almost the same as it is for normal PushTxn requests and the fact that this will come for free with a change to the transaction ID format was decently compelling. The new field also parallels the existing TxnKey
field, which is nice. PTAL.
Release note: None
…nsactions This was only needed for compatibility with 2.1 nodes. Release note: None
This isn't needed by v19.1 or v19.2 nodes. It was only added to avoid causing v2.1 nodes to panic. See 833a827. Release note: None
Before this change, zerofields only traversed the top-level struct. This change adjusts the library to traverse all the way down nested structs when performing its verification. Release note: None
Fixes cockroachdb#36089. Informs cockroachdb#37199. This commit addresses the second concern from cockroachdb#37199 (comment) by implementing its suggestion #1. It augments the TxnMeta struct to begin storing the transaction's minimum timestamp. This allows pushers to have perfect accuracy into whether an intent is part of a transaction that can eventually be committed or whether it has been aborted by any other pusher and uncommittable. This allows us to get rid of the false positive cases where a pusher incorrectly detects that a transaction is still active and begins waiting on it. In this worst case, this could lead to transactions waiting for the entire transaction expiration for a contending txn. Release note: None
This cleans up the handling of minimum timestamp in the rangefeed processor. The migration story here is relatively straightforward. If an MVCCWriteIntentOp doesn't have a MinTimestamp then we know that we're in one of the following three cases: 1. the coordinator for the transaction is a v19.1 node and did not assign a MinTimestamp to its transaction 2. an initResolvedTSScan found an intent without a MinTimestamp 3. the leaseholder that created the operation is a v19.1 node and did not propagate the transaction's MinTimestamp. The coordinator may be a v19.1 or v19.2 node The first and second case imply that the transaction coordinator will never interact with a v20.1 node. This means that the migration story discussed in SynthesizeTxnFromMeta will work well. The third case is more complicated because the leaseholder who proposed the MVCCWriteIntentOp might not be around any more. This means that by the time the MVCCWriteIntentOp results in a PushTxn request, the request may go to a v20.1 node even when the transaction's coordinator is still alive. In that case, the migration in SynthesizeTxnFromMeta isn't exactly what we want, but it also won't cause correctness issues. The worst thing that can happen is that we'll abort the transaction, which we were likely already going to do anyway. Release note: None
…tedMinTimestamp This also adds a comment about the migration plan.
We can't address this until v20.1. Release note: None
The change also adjusts the COCKROACH_TXN_LIVENESS_HEARTBEAT_MULTIPLIER and the minQPS to levels that more easily triggered stalls before the rest of this PR. Release note: None
8135219
to
0d91411
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 28 of 28 files at r21, 1 of 1 files at r22, 8 of 8 files at r23, 3 of 3 files at r24, 22 of 22 files at r25, 11 of 11 files at r26, 7 of 7 files at r27, 2 of 2 files at r28, 1 of 1 files at r29.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/storage/rangefeed/processor.go, line 272 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I decided to go with the first option. The migration story ends up being almost the same as it is for normal PushTxn requests and the fact that this will come for free with a change to the transaction ID format was decently compelling. The new field also parallels the existing
TxnKey
field, which is nice. PTAL.
I was leaning towards option 1 as well when I skimmed this comment last night. Thanks for making the change.
bors r+ |
38782: storage: persist minimum transaction timestamps in intents r=nvanbenschoten a=nvanbenschoten Fixes #36089. Informs #37199. This commit addresses the second concern from #37199 (comment) by implementing its suggestion #1. It augments the TxnMeta struct to begin storing the transaction's minimum timestamp. This allows pushers to have perfect accuracy into whether an intent is part of a transaction that can eventually be committed or whether it has been aborted by any other pusher and uncommittable. This allows us to get rid of the false positive cases where a pusher incorrectly detects that a transaction is still active and begins waiting on it. In this worst case, this could lead to transactions waiting for the entire transaction expiration for a contending txn. @tbg I'm assigning you because you reviewed most of the lazy transaction record stuff (especially #33523), but let me know if you'd like me to find a different reviewer. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Build succeeded |
…mestamp Fixes cockroachdb#39008. This was missed in cockroachdb#38782. I must have been thinking about the behavior we'll want to switch to in v20.1. Luckily, the issue was easily caught by `tpcc/mixed-headroom/n5cpu16`. Release note: None
39055: storage: remain compatible with transactions missing their minimum timestamp r=tbg a=nvanbenschoten Fixes #39008. This was missed in #38782. I must have been thinking about the behavior we'll want to switch to in v20.1. Luckily, the issue was easily caught by `tpcc/mixed-headroom/n5cpu16`. Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
…mestamp Fixes cockroachdb#39008. This was missed in cockroachdb#38782. I must have been thinking about the behavior we'll want to switch to in v20.1. Luckily, the issue was easily caught by `tpcc/mixed-headroom/n5cpu16`. Release note: None
Fixes #36089.
Informs #37199.
This commit addresses the second concern from #37199 (comment) by implementing its suggestion #1.
It augments the TxnMeta struct to begin storing the transaction's minimum timestamp. This allows pushers to have perfect accuracy into whether an intent is part of a transaction that can eventually be committed or whether it has been aborted by any other pusher and uncommittable.
This allows us to get rid of the false positive cases where a pusher incorrectly detects that a transaction is still active and begins waiting on it. In this worst case, this could lead to transactions waiting for the entire transaction expiration for a contending txn.
@tbg I'm assigning you because you reviewed most of the lazy transaction record stuff (especially #33523), but let me know if you'd like me to find a different reviewer.