-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvclient/kvcoord: inhibit parallel commit when retrying EndTxn request #46596
kvclient/kvcoord: inhibit parallel commit when retrying EndTxn request #46596
Conversation
8da3e4a
to
af92767
Compare
@nvanbenschoten was there any saving grace for not having to backport this to 19.2? I still need to check if this actually fixes kv/contention (#45700). |
Reviewing now.
Yes, we still had the MixedSuccessError back in 19.2, which is why we didn't hit this issue back then. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks for putting this together so fast. I feel good about how small this is, given where we are in the release cycle.
Reviewed 1 of 1 files at r1, 3 of 3 files at r2, 4 of 4 files at r3, 3 of 3 files at r4, 3 of 3 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2616 at r4 (raw file):
} // Test that, even though at the kvserver level requests are not idempotent
Does this test fail without your change?
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2682 at r4 (raw file):
require.NoError(t, err) // After the txn started, do a conflicting write. This will cause the txn's
"a conflicting read"
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2685 at r4 (raw file):
// upcoming left half to return a WriteTooOldError on the first attempt, // causing in turn a refresh and a retry. Note that the upcoming write is a // CPut, so it can't return the WriteTooOld flag instead of a
Use the word "defer" somewhere in here, since that's what we seem to be calling this.
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2582 at r5 (raw file):
// pushee txn record. expectPusheeTxnRecordNotFound dontExpectNothing
Give this a comment.
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2658 at r5 (raw file):
// previous partial success). // 5. The batch is split again. b's sub-batch executed first. Say a's batch is // delayed.
What's up with the strange spacing in these comments (and below in the next test)?
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2662 at r5 (raw file):
// The point of this test is to check that the txn is not considered to be // implicitly committed at this point. Handling this scenario requires special // care. If we wouldn't do anything, then we'd end up with a STAGING txn record
s/wouldn't/didn't/
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2674 at r5 (raw file):
// attempt. Thus, the EndTxn will be split from all the other requests, and the // txn record is never written if anything fails. func TestTxnCoordSenderRetriesAcrossEndTxn_EndTxnSucceedsLate(t *testing.T) {
I'm surprised you're not able to combine these two tests into a single test with two subtests. Don't they only differ in three places?
- where the intermediate read is
- which write is eventually rejected
- whether we expect to find the txn record
This indicates that we should test all four combinations of the first two differences.
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 154 at r5 (raw file):
// could technically enter the "implicitly committed" state before all the // sub-batches are evaluated and this is problematic: there's a race between // evaluating those requests randos coming along and transitioning the txn
"requests randos" is missing a word.
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 158 at r5 (raw file):
// evaluations of the outstanding sub-batches. If the randos win, then the // re-evaluations will fail because we don't have idempotency of evaluations // across a txn commit.
Also mention that the re-evaluations might notice that their txn was committed and throw the error we see.
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 164 at r5 (raw file):
// disable the parallel commit. etCpy := *et ba.Requests[len(ba.Requests)-1].SetInner(&etCpy)
Do we need to copy the entire ba.Requests
slice to keep the race detector happy?
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 301 at r5 (raw file):
} type endTxnAttempt int
We should either give this a comment or just replace it with a boolean. The enum seems somewhat overkill.
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 315 at r5 (raw file):
ctx context.Context, ba roachpb.BatchRequest, et *roachpb.EndTxnRequest, etAttempt endTxnAttempt, ) bool { if etAttempt == endTxnRetry {
nit: move this down below if !et.Commit {
. There's an order of precedence here.
pkg/kv/kvserver/batcheval/cmd_push_txn.go, line 143 at r1 (raw file):
ok, err := storage.MVCCGetProto( ctx, readWriter, key, hlc.Timestamp{}, // the txn record is an "inline value", not MVCC
👎 we do this all over the place, so adding a comment to this one place doesn't help. If you want to do something here, consider instead addin a comment on MVCCGet
that discusses inline values like we have on MVCCPut
.
pkg/kv/kvserver/batcheval/cmd_push_txn.go, line 266 at r4 (raw file):
if reply.PusheeTxn.Status == roachpb.STAGING && (pusherWins || recoverOnFailedPush) { err := roachpb.NewIndeterminateCommitError(reply.PusheeTxn) log.VEventf(ctx, 1, "%v", err)
If you're going to do that here, do the same down below for the txn push error.
pkg/util/tracing/recorded_span.go, line 18 at r3 (raw file):
) const LogMessageField string = "event"
No need for "string"
pkg/util/tracing/recorded_span.go, line 37 at r3 (raw file):
return f.Value } if key == "error" {
Consider pulling "error" into a constant as well.
pkg/util/tracing/tracer_span.go, line 352 at r4 (raw file):
// FindLogMessage returns the first log message in the recording that matches // the given regexp. If no such messages exist, returns "". func (r Recording) FindLogMessage(pattern string) string {
Consider also returning a bool so we're not relying on string comparisons.
af92767
to
279b550
Compare
Release note: None Release justification: Comment only.
Release note: None Release justification: Minor new log messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2616 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does this test fail without your change?
TestTxnCoordSenderRetriesAcrossEndTxn_EndTxnSucceedsLate
fails, yes.
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2682 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"a conflicting read"
done
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2685 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Use the word "defer" somewhere in here, since that's what we seem to be calling this.
done
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2582 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Give this a comment.
done
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2658 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What's up with the strange spacing in these comments (and below in the next test)?
done
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2662 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/wouldn't/didn't/
done
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2674 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm surprised you're not able to combine these two tests into a single test with two subtests. Don't they only differ in three places?
- where the intermediate read is
- which write is eventually rejected
- whether we expect to find the txn record
This indicates that we should test all four combinations of the first two differences.
done, except I'm only testing two of the 4 combinations. The others seem to trivial to me / they covered in more mundane tests.
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 154 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"requests randos" is missing a word.
done
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 158 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Also mention that the re-evaluations might notice that their txn was committed and throw the error we see.
done
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 164 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do we need to copy the entire
ba.Requests
slice to keep the race detector happy?
I don't think so. The race transport claims that the client is already mutating the request slices of requests.
// Make a copy of the requests slice, and shallow copies of the requests. |
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 301 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We should either give this a comment or just replace it with a boolean. The enum seems somewhat overkill.
added a comment
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 315 at r5 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: move this down below
if !et.Commit {
. There's an order of precedence here.
done
pkg/kv/kvserver/batcheval/cmd_push_txn.go, line 143 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
👎 we do this all over the place, so adding a comment to this one place doesn't help. If you want to do something here, consider instead addin a comment on
MVCCGet
that discusses inline values like we have onMVCCPut
.
done
pkg/kv/kvserver/batcheval/cmd_push_txn.go, line 266 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
If you're going to do that here, do the same down below for the txn push error.
done
pkg/util/tracing/recorded_span.go, line 37 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Consider pulling "error" into a constant as well.
I couldn't tell how used this "error" key is, that's why I left it alone.
pkg/util/tracing/tracer_span.go, line 352 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Consider also returning a bool so we're not relying on string comparisons.
done
279b550
to
1c8fc6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/util/tracing/recorded_span.go, line 18 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
No need for "string"
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r6, 3 of 3 files at r7, 4 of 4 files at r8, 1 of 1 files at r9, 1 of 1 files at r10, 4 of 4 files at r11.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2648 at r11 (raw file):
) type expectedTxnResolution int
Why do we need the enum? Can't we just use a roachpb.TransactionStatus
?
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2749 at r11 (raw file):
// intent written in the 2nd attempt satisfying a STAGING record written on the // first attempt). See subtests for more details. func TestTxnCoordSenderRetriesAcrossEndTxn(t *testing.T) {
Nice, this test came out well.
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2771 at r11 (raw file):
// The left side is CPut(a), the right side is CPut(b)+EndTxn(STAGING).
Consider moving this up to the side
type definition.
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 211 at r10 (raw file):
// DistSender. if pErr := needTxnRetryAfterStaging(br); pErr != nil { log.VEventf(ctx, 2, "parallel commit failed since some writes were pushed")
Maybe we should include the error in here.
pkg/util/tracing/recorded_span.go, line 37 at r3 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I couldn't tell how used this "error" key is, that's why I left it alone.
Ack.
pkg/util/tracing/tracer_span.go, line 264 at r9 (raw file):
// log 1 // start span B // duration relative to "log 1" // log 2
Give this a comment.
pkg/util/tracing/tracer_span.go, line 290 at r9 (raw file):
// TODO(andrei): this should be unified with // SessionTracing.GenerateSessionTraceVTable. func (r Recording) String() string {
Can we write a test for this? Mostly I'm interested in seeing what effect it had.
pkg/util/tracing/tracer_span.go, line 302 at r9 (raw file):
} var buf bytes.Buffer
strings.Builder
pkg/util/tracing/tracer_span.go, line 334 at r9 (raw file):
} func (r Recording) visitSpan(sp RecordedSpan, depth int) []traceLogData {
Give this a comment. Also, to be honest, I'm having a really hard time understanding what changed in this commit and why. This code is dense and a few diagrams and inline comments might help. I'll just assume it's all good I guess if this is only used in testing.
pkg/util/tracing/tracer_span.go, line 391 at r9 (raw file):
// Merge ownLogs with childSpans. mergedLogs := make([]traceLogData, 0, len(ownLogs)) timeMax := time.Date(2100, 0, 0, 0, 0, 0, 0, time.UTC)
This has a gross feel to it. And it doesn't seem very optimistic. What can we do to make this 2200?
This function was used in sql, but I want to start using it elsewhere when dealing with traces. Release note: None Release justification: Minor code move, in support of the next commit.
Before this patch, a trace (recording) was String()ed by sorting log messages by timestamp, across spans. This proved to be a terrible idea. This patch moves to keeping requests from the same span together, but still interspersing children spans in the parent. So messages from siblings are no longer mixed together. Release note: None Release justification: testing only
Release note: None Release justification: low risk, just logging
The scenario that this patch addresses is the following (from cockroachdb#46431): 1. txn1 sends Put(a) + Put(b) + EndTxn 2. DistSender splits the Put(a) from the rest. 3. Put(a) succeeds, but the rest catches some retriable error. 4. TxnCoordSender gets the retriable error. The fact that a sub-batch succeeded is lost. We used to care about that fact, but we've successively gotten rid of that tracking across cockroachdb#35140 and cockroachdb#44661. 5. we refresh everything that came before this batch. The refresh succeeds. 6. we re-send the batch. It gets split again. The part with the EndTxn executes first. The transaction is now STAGING. More than that, the txn is in fact implicitly committed - the intent on a is already there since the previous attempt and, because it's at a lower timestamp than the txn record, it counts as golden for the purposes of verifying the implicit commit condition. 7. some other transaction wonders in, sees that txn1 is in its way, and transitions it to explicitly committed. 8. the Put(a) now tries to evaluate. It gets really confused. I guess that different things can happen; none of them good. One thing that I believe we've observed in cockroachdb#46299 is that, if there's another txn's intent there already, the Put will try to push it, enter the txnWaitQueue, eventually observe that its own txn is committed and return an error. The client thus gets an error (and a non-ambiguous one to boot) although the txn is committed. Even worse perhaps, I think it's possible for a request to return wrong results instead of an error. This patch fixes it by inhibiting the parallel commit when the EndTxn batch is retried. This way, there's never a STAGING record. Release note (bug fix): A rare bug causing errors to be returned for successfully committed transactions was fixed. The most common error message was "TransactionStatusError: already committed". Release justification: serious bug fix Fixes cockroachdb#46341
1c8fc6f
to
7c6be57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2648 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why do we need the enum? Can't we just use a
roachpb.TransactionStatus
?
because I only want a subset of the TransactionStatus
here, and also I like reading the "expect" part.
pkg/kv/kvclient/kvcoord/dist_sender_server_test.go, line 2771 at r11 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
// The left side is CPut(a), the right side is CPut(b)+EndTxn(STAGING).
Consider moving this up to the
side
type definition.
done
pkg/kv/kvclient/kvcoord/txn_interceptor_committer.go, line 211 at r10 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Maybe we should include the error in here.
done
pkg/util/tracing/tracer_span.go, line 264 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Give this a comment.
done
pkg/util/tracing/tracer_span.go, line 290 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can we write a test for this? Mostly I'm interested in seeing what effect it had.
I'm gonna leave the test for next time :).
It looks like this. Before, messages from siblings were interspersed, sorted by their timestamp. Now children are inserted into parents, but otherwise spans are kept together.
0.000ms 0.000ms === operation:test trace
0.002ms 0.002ms === operation:dist sender send n:1
0.009ms 0.007ms event:[n1] querying next range at "b1"
0.023ms 0.014ms event:[n1] r33: sending batch 1 PushTxn to (n1,s1):1
0.024ms 0.001ms event:[n1] sending request to local client
0.026ms 0.002ms === operation:/cockroach.roachpb.Internal/Batch n:1
0.028ms 0.002ms event:[n1] node received request: 1 PushTxn
0.035ms 0.007ms event:[n1,s1] executing PushTxn(00000000->48691554) ["b1",/Min)
0.039ms 0.004ms event:[n1,s1,r33/1:{b-/Table/Syste…}] read-write path
0.044ms 0.005ms event:[n1,s1,r33/1:{b-/Table/Syste…}] sequencing request
0.046ms 0.002ms event:[n1,s1,r33/1:{b-/Table/Syste…}] acquiring latches
0.050ms 0.004ms event:[n1,s1,r33/1:{b-/Table/Syste…}] applied timestamp cache
0.101ms 0.051ms event:[n1,s1,r33/1:{b-/Table/Syste…}] found txn in indeterminate STAGING state meta={id=48691554 key="b1" pri=0.02894318 epo=0 ts=1585694049.825806000,0 min=1585694049.825806000,0 seq=4} lock=true stat=STAGING rts=0,0 wto=false max=0,0 int=1 ifw=2
0.120ms 0.019ms event:[n1,s1,r33/1:{b-/Table/Syste…}] evaluated request
0.125ms 0.005ms event:[n1,s1,r33/1:{b-/Table/Syste…}] recovering txn 48691554 from indeterminate commit
0.483ms 0.358ms event:[n1,s1,r33/1:{b-/Table/Syste…}] recovered txn 48691554 with status: ABORTED
0.486ms 0.003ms event:[n1,s1,r33/1:{b-/Table/Syste…}] sequencing request
0.487ms 0.001ms event:[n1,s1,r33/1:{b-/Table/Syste…}] acquiring latches
0.491ms 0.004ms event:[n1,s1,r33/1:{b-/Table/Syste…}] applied timestamp cache
0.513ms 0.022ms event:[n1,s1,r33/1:{b-/Table/Syste…}] evaluated request
pkg/util/tracing/tracer_span.go, line 302 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
strings.Builder
done
pkg/util/tracing/tracer_span.go, line 334 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Give this a comment. Also, to be honest, I'm having a really hard time understanding what changed in this commit and why. This code is dense and a few diagrams and inline comments might help. I'll just assume it's all good I guess if this is only used in testing.
Added a comment
pkg/util/tracing/tracer_span.go, line 391 at r9 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This has a gross feel to it. And it doesn't seem very optimistic. What can we do to make this 2200?
changed to 2200 :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 6 files at r12, 1 of 1 files at r13, 4 of 4 files at r15.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/util/tracing/tracer_span.go, line 391 at r9 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
changed to 2200 :P
Now we're talking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)
Build failed |
bors r+
…On Wed, Apr 1, 2020 at 10:46 AM craig[bot] ***@***.***> wrote:
Build failed
- GitHub CI (Cockroach)
<https://teamcity.cockroachdb.com/viewLog.html?buildId=1843304&buildTypeId=Cockroach_UnitTests>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#46596 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAC4C4NPWZ7MBIHMKPXGCETRKNHULANCNFSM4LT2EFHQ>
.
|
Build failed (retrying...) |
Build succeeded |
The scenario that this patch addresses is the following (from #46431):
succeeded is lost. We used to care about that fact, but we've
successively gotten rid of that tracking across storage: employ transactional idempotency to refresh mixed-success batches #35140 and kv: refresh less and retry more #44661.
succeeds.
executes first. The transaction is now STAGING. More than that, the txn
is in fact implicitly committed - the intent on a is already there since
the previous attempt and, because it's at a lower timestamp than the txn
record, it counts as golden for the purposes of verifying the implicit
commit condition.
transitions it to explicitly committed.
that different things can happen; none of them good. One thing that I
believe we've observed in roachtest: kv/contention/nodes=4 failed #46299 is that, if there's another txn's
intent there already, the Put will try to push it, enter the
txnWaitQueue, eventually observe that its own txn is committed and
return an error. The client thus gets an error (and a non-ambiguous one
to boot) although the txn is committed. Even worse perhaps, I think it's
possible for a request to return wrong results instead of an error.
This patch fixes it by inhibiting the parallel commit when the EndTxn
batch is retried. This way, there's never a STAGING record.
Release note (bug fix): A rare bug causing errors to be returned for
successfully committed transactions was fixed. The most common error
message was "TransactionStatusError: already committed".
Release justification: serious bug fix
Fixes #46341