kv: refresh less and retry more #44661
Conversation
Nice! Are there any tests to add here? Can we demonstrate cases where this avoids a retry?
Reviewed 3 of 3 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @tbg)
pkg/internal/client/sender.go, line 56 at r1 (raw file):
```go
type Sender interface {
	// Send sends a batch for evaluation.
	// The contract about whether both a response and an error can be
```
Can we restore the old contract that exactly one of the response or error will be returned? It should be in the diff that added this comment.
pkg/kv/dist_sender.go, line 728 at r1 (raw file):
```go
	pErr.Index.Index += int32(errIdxOffset)
}
// Break out of loop to collate batch responses received so far to
```
I think we can pull the `return` up here.
pkg/kv/txn_interceptor_span_refresher.go, line 241 at r1 (raw file):
```go
}
// If a prefix of the batch was executed, collect refresh spans for
```
👋
Force-pushed a57b729 to 4f356d8
Added a test.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)
pkg/internal/client/sender.go, line 56 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can we restore the old contract that exactly one of the response or error will be returned? It should be in the diff that added this comment.
done
pkg/kv/dist_sender.go, line 728 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I think we can pull the `return` up here.
done
Reviewed 5 of 5 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @tbg)
pkg/kv/dist_sender_server_test.go, line 2237 at r2 (raw file):
```go
}, {
	// This test checks the behavior of batches that were split by the
```
So just to confirm, this would have failed without the rest of the PR, right?
pkg/kv/dist_sender_server_test.go, line 2249 at r2 (raw file):
```go
	return err
}
// This will cause a WriteTooOldError on the 2nd sub-batch, which will
```
Make a note that "b" is on a different range than "a" so that's why the batches are split. Or is it because reads and writes are split?
pkg/kv/dist_sender_server_test.go, line 2251 at r2 (raw file):
```go
// This will cause a WriteTooOldError on the 2nd sub-batch, which will
// cause a refresh.
return db.Put(ctx, "b", "newval")
```
nit: "newval2", just so there's no possibility of confusion.
Force-pushed 4f356d8 to 826efa2
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)
pkg/kv/dist_sender_server_test.go, line 2237 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
So just to confirm, this would have failed without the rest of the PR, right?
correct
pkg/kv/dist_sender_server_test.go, line 2249 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Make a note that "b" is on a different range than "a" so that's why the batches are split. Or is it because reads and writes are split?
It's because of both. But I made it a multi-range test so that things are clear. I added a comment about "b" being on a different range.
pkg/kv/dist_sender_server_test.go, line 2251 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: "newval2", just so there's no possibility of confusion.
done
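The multi-range scenario this test exercises can be sketched in a self-contained way. Everything below is hypothetical (`kvStore`, `put`, integer timestamps stand in for the real KV layer); it only models the one rule that matters here: a write below another txn's write timestamp fails with a WriteTooOldError until the txn refreshes past it, at which point the whole batch can be retried.

```go
package main

import (
	"errors"
	"fmt"
)

// errWriteTooOld models the retriable error raised when writing below a
// newer committed value.
var errWriteTooOld = errors.New("WriteTooOldError")

// kvStore is a toy store tracking the highest write timestamp per key.
type kvStore struct {
	ts map[string]int
}

// put fails with errWriteTooOld if the key was already written at a
// timestamp above the transaction's.
func (s *kvStore) put(key string, txnTS int) error {
	if s.ts[key] > txnTS {
		return errWriteTooOld
	}
	s.ts[key] = txnTS
	return nil
}

func main() {
	s := &kvStore{ts: map[string]int{}}
	txnTS := 1
	// Another txn wrote "b" (on a different range than "a") at a higher
	// timestamp -- the "newval2" conflict from the review discussion.
	s.ts["b"] = 5

	// Sub-batch 1 (range containing "a") succeeds.
	fmt.Println(s.put("a", txnTS)) // <nil>
	// Sub-batch 2 (range containing "b") hits the WriteTooOldError.
	fmt.Println(s.put("b", txnTS)) // WriteTooOldError

	// Refresh: bump the txn timestamp past the conflict, then retry the
	// whole batch (idempotent writes make re-sending "a" safe).
	txnTS = 6
	fmt.Println(s.put("a", txnTS), s.put("b", txnTS)) // <nil> <nil>
}
```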
Force-pushed 826efa2 to c20504f
Very nice! Basically every line of code you removed here bothered me while it existed. I thought I also had a TODO about this being gross, but now I can't find it.
Reviewed 3 of 5 files at r2, 3 of 3 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)
pkg/kv/dist_sender_server_test.go, line 2249 at r3 (raw file):
In particular, this verifies that the get is not refreshed, for this would fail (and lead to a client-side retry instead of one at the txn coord sender).
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei)
Force-pushed c20504f to eb113a4
Before this patch, when the DistSender would split a batch into multiple sub-batches and one of the sub-batches failed, it would collect responses for the successful ones and return them together with the error. This used to be pretty important before we had write idempotency, because it allowed the span refresher to only retry an EndTxn without also retrying other writes in that batch (which would have failed).

Since we've gotten idempotency in the meantime, we can retry those other writes. In fact, it's arguably better to do it: there's a tradeoff between refreshing and retrying. Currently the span refresher needs to refresh the read spans of the successful sub-batches, and that refresh is at risk of failing under contention.

This patch makes the span refresher retry the whole batch without considering partial successes. With this patch, refreshing the partial successes is no longer needed because we'll retry those requests. In other words, we'll refresh less and retry more.

The existing policy of refreshing more and retrying less will start to be applied inconsistently with cockroachdb#44654, where we start refreshing when the client sees a WriteTooOld flag - but we're forced to refresh the whole batch.

Besides the rationalizations above, this patch allows us to simplify code by not having to deal with both responses and errors. We can thus get rid of the enthralling comment on the client.Sender.Send() stating:
"
// The contract about whether both a response and an error can be
// returned varies between layers.
"

Release note: None
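The refresh-vs-retry tradeoff described in the commit message can be illustrated with a small, self-contained sketch. This is not CockroachDB code: `request`, `response`, `sendSubBatches`, and `retryWholeBatch` are all hypothetical stand-ins for the DistSender/span-refresher machinery, and the error injection is hard-coded.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the kv batch machinery; none of these are
// real CockroachDB types.
type request struct{ key string }
type response struct{ key string }

var errRetriable = errors.New("retriable error on sub-batch")

// sendSubBatches simulates a DistSender splitting a batch and hitting a
// retriable error on sub-batch failAt (-1 means no failure). Per this
// patch, partial successes are dropped on error instead of being
// returned alongside it.
func sendSubBatches(reqs []request, failAt int) ([]response, error) {
	var resps []response
	for i, r := range reqs {
		if i == failAt {
			return nil, errRetriable
		}
		resps = append(resps, response{key: r.key})
	}
	return resps, nil
}

// retryWholeBatch models the span refresher's new policy: on a retriable
// error, re-send the entire batch (safe because writes are idempotent)
// rather than refreshing the read spans of the successful sub-batches.
func retryWholeBatch(reqs []request) ([]response, error) {
	resps, err := sendSubBatches(reqs, 1) // first attempt: sub-batch 1 fails
	if err != nil {
		resps, err = sendSubBatches(reqs, -1) // retry everything
	}
	return resps, err
}

func main() {
	resps, err := retryWholeBatch([]request{{"a"}, {"b"}, {"c"}})
	fmt.Println(len(resps), err) // 3 <nil>
}
```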
Force-pushed eb113a4 to dc3e0b3
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @tbg)
pkg/kv/dist_sender_server_test.go, line 2249 at r3 (raw file):
Previously, tbg (Tobias Grieger) wrote…
In particular, this verifies that the get is not refreshed, for this would fail (and lead to a client-side retry instead of one at the txn coord sender).
done
44661: kv: refresh less and retry more r=andreimatei a=andreimatei

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Build succeeded
46596: kvclient/kvcoord: inhibit parallel commit when retrying EndTxn request r=andreimatei a=andreimatei

The scenario that this patch addresses is the following (from #46431):

1. txn1 sends Put(a) + Put(b) + EndTxn
2. DistSender splits the Put(a) from the rest.
3. Put(a) succeeds, but the rest catches some retriable error.
4. TxnCoordSender gets the retriable error. The fact that a sub-batch succeeded is lost. We used to care about that fact, but we've successively gotten rid of that tracking across #35140 and #44661.
5. We refresh everything that came before this batch. The refresh succeeds.
6. We re-send the batch. It gets split again. The part with the EndTxn executes first. The transaction is now STAGING. More than that, the txn is in fact implicitly committed - the intent on a is already there since the previous attempt and, because it's at a lower timestamp than the txn record, it counts as golden for the purposes of verifying the implicit commit condition.
7. Some other transaction wanders in, sees that txn1 is in its way, and transitions it to explicitly committed.
8. The Put(a) now tries to evaluate. It gets really confused. I guess that different things can happen; none of them good. One thing that I believe we've observed in #46299 is that, if there's another txn's intent there already, the Put will try to push it, enter the txnWaitQueue, eventually observe that its own txn is committed, and return an error. The client thus gets an error (and a non-ambiguous one to boot) although the txn is committed. Even worse perhaps, I think it's possible for a request to return wrong results instead of an error.

This patch fixes it by inhibiting the parallel commit when the EndTxn batch is retried. This way, there's never a STAGING record.

Release note (bug fix): A rare bug causing errors to be returned for successfully committed transactions was fixed. The most common error message was "TransactionStatusError: already committed".

Release justification: serious bug fix

Fixes #46341

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
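The fix above reduces to one decision: a retried EndTxn must never take the parallel-commit fast path, so a STAGING record can never coexist with intents left over from a previous attempt. A minimal sketch, with hypothetical names (`txnStatus`, `endTxnAttempt`) standing in for the real txn coord sender logic:

```go
package main

import "fmt"

// txnStatus models the transaction-record states relevant here.
type txnStatus string

const (
	staging   txnStatus = "STAGING"   // parallel commit in flight
	committed txnStatus = "COMMITTED" // explicit, non-parallel commit
)

// endTxnAttempt decides how a commit proceeds. On the first attempt the
// EndTxn may run in parallel with the txn's writes, writing a STAGING
// record. On a retry, parallel commit is inhibited: the earlier attempt
// may already have laid down intents, and a STAGING record on top of
// those could make the txn implicitly committed before all writes of
// this attempt have evaluated.
func endTxnAttempt(isRetry bool) txnStatus {
	if isRetry {
		return committed
	}
	return staging
}

func main() {
	fmt.Println(endTxnAttempt(false)) // STAGING
	fmt.Println(endTxnAttempt(true))  // COMMITTED
}
```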