
kv: refresh less and retry more #44661

Merged
merged 1 commit into cockroachdb:master from andreimatei:txn.no-partial-response on Feb 10, 2020

Conversation

@andreimatei (Contributor)

Before this patch, when the DistSender split a batch into multiple
sub-batches and one of the sub-batches failed, it would collect responses
for the successful ones and return them together with the error. This
used to be pretty important before we had write idempotency, because it
allowed the span refresher to retry only the EndTxn without also retrying
the other writes in that batch (which would have failed).

Since we've gained idempotency in the meantime, we can retry those other
writes. In fact, it's arguably better to do so: there's a tradeoff
between refreshing and retrying. Currently the span refresher needs to
refresh the read spans of the successful sub-batches, and that refresh is
at risk of failing under contention.

This patch makes the span refresher retry the whole batch without
considering partial successes. Refreshing the partial successes is no
longer needed because those requests will simply be retried. In other
words, we'll refresh less and retry more.
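
To make the tradeoff concrete, here is a minimal sketch of the new policy (hypothetical, simplified types and names; not the actual txn_interceptor_span_refresher.go code): on a retriable error, refresh only the read spans recorded from earlier batches, then re-send the entire current batch instead of splicing partial successes back together.

package sketch

import "context"

// Hypothetical stand-ins for the real kv types.
type batchRequest struct{ reqs []string }
type batchResponse struct{ resps []string }

type sender interface {
	send(ctx context.Context, ba batchRequest) (*batchResponse, error)
}

// spanRefresher sketches "refresh less, retry more": on a retriable error it
// refreshes the read spans collected from earlier batches and then re-sends
// the whole current batch. Partial successes from the failed attempt are
// ignored; write idempotency makes re-issuing them safe.
type spanRefresher struct {
	wrapped      sender
	earlierSpans []string // read spans accumulated from previous batches
	maxAttempts  int
}

func (sr *spanRefresher) send(ctx context.Context, ba batchRequest) (*batchResponse, error) {
	var err error
	for i := 0; i < sr.maxAttempts; i++ {
		var br *batchResponse
		if br, err = sr.wrapped.send(ctx, ba); err == nil {
			return br, nil
		}
		if !isRetriable(err) || !sr.tryRefresh(ctx, sr.earlierSpans) {
			return nil, err // exactly one of response and error is returned
		}
		// Loop around and retry the whole batch at the refreshed timestamp.
	}
	return nil, err
}

func isRetriable(error) bool                                        { return true } // placeholder
func (sr *spanRefresher) tryRefresh(context.Context, []string) bool { return true } // placeholder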

The existing policy of refreshing more and retrying less would start to be
applied inconsistently with #44654, where we start refreshing when the
client sees a WriteTooOld flag but are forced to refresh the whole batch.

Besides the rationale above, this patch lets us simplify the code by no
longer having to deal with both responses and errors. We can thus get rid
of the enthralling comment on client.Sender.Send() stating:
"
// The contract about whether both a response and an error can be
// returned varies between layers.
"

Release note: None

@cockroach-teamcity (Member)

This change is Reviewable

@nvanbenschoten (Member) left a comment


Nice! Are there any tests to add here? Can we demonstrate cases where this avoids a retry?

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @tbg)


pkg/internal/client/sender.go, line 56 at r1 (raw file):

type Sender interface {
	// Send sends a batch for evaluation.
	// The contract about whether both a response and an error can be

Can we restore the old contract that exactly one of the response or error will be returned? It should be in the diff that added this comment.
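
For reference, the restored contract could read roughly like this (a paraphrased sketch, not the literal wording from the original diff):

// Sender is the interface for sending a batch of requests for evaluation.
type Sender interface {
	// Send sends a batch for evaluation.
	//
	// Exactly one of the returned BatchResponse or Error is non-nil;
	// callers never receive partial results alongside an error.
	Send(context.Context, roachpb.BatchRequest) (*roachpb.BatchResponse, *roachpb.Error)
}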


pkg/kv/dist_sender.go, line 728 at r1 (raw file):

				pErr.Index.Index += int32(errIdxOffset)
			}
			// Break out of loop to collate batch responses received so far to

I think we can pull the return up here.
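
I.e., something like the following (a hypothetical sketch of the surrounding code, not the actual diff): with partial responses no longer propagated alongside errors, the error can be returned directly instead of breaking out of the loop to collate responses.

if pErr != nil {
	if pErr.Index != nil && pErr.Index.Index != -1 {
		pErr.Index.Index += int32(errIdxOffset)
	}
	// No partial responses to collate anymore: return the error directly.
	return nil, pErr
}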


pkg/kv/txn_interceptor_span_refresher.go, line 241 at r1 (raw file):

	}

	// If a prefix of the batch was executed, collect refresh spans for

👋

@andreimatei (Contributor, Author) left a comment


Added a test.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @tbg)


pkg/internal/client/sender.go, line 56 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can we restore the old contract that exactly one of the response or error will be returned? It should be in the diff that added this comment.

done


pkg/kv/dist_sender.go, line 728 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I think we can pull the return up here.

done

@nvanbenschoten (Member) left a comment


:lgtm:

Reviewed 5 of 5 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @tbg)


pkg/kv/dist_sender_server_test.go, line 2237 at r2 (raw file):

		},
		{
			// This test checks the behavior of batches that were split by the

So just to confirm, this would have failed without the rest of the PR, right?


pkg/kv/dist_sender_server_test.go, line 2249 at r2 (raw file):

					return err
				}
				// This will cause a WriteTooOldError on the 2nd sub-batch, which will

Make a note that "b" is on a different range than "a", which is why the batch gets split. Or is it because reads and writes are split?


pkg/kv/dist_sender_server_test.go, line 2251 at r2 (raw file):

				// This will cause a WriteTooOldError on the 2nd sub-batch, which will
				// cause a refresh.
				return db.Put(ctx, "b", "newval")

nit: "newval2", just so there's no possibility of confusion.

@andreimatei (Contributor, Author) left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @tbg)


pkg/kv/dist_sender_server_test.go, line 2237 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

So just to confirm, this would have failed without the rest of the PR, right?

correct


pkg/kv/dist_sender_server_test.go, line 2249 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Make a note that "b" is on a different range than "a" so that's why the batches are split. Or is it because reads and writes are split?

It's because of both. But I made it a multi-range test so that things are clear. I added a comment about "b" being on a different range.
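
Roughly, the scenario the test exercises is the following (a hypothetical, simplified sketch against the client API; the real test is table-driven in dist_sender_server_test.go and its setup details differ):

if err := db.Txn(ctx, func(ctx context.Context, txn *client.Txn) error {
	// After the txn has started, an outside writer bumps "b" to a higher
	// timestamp, so the txn's write to "b" below will hit a WriteTooOldError.
	if err := db.Put(ctx, "b", "newval2"); err != nil {
		return err
	}
	// "a" and "b" are on different ranges and the batch mixes reads and
	// writes, so the DistSender splits it into sub-batches: the read and the
	// write to "a" succeed, while the write to "b" (with the commit) fails.
	b := txn.NewBatch()
	b.Get("b")
	b.Put("a", "newval")
	b.Put("b", "newval")
	// With this patch, the whole batch is retried at the TxnCoordSender; the
	// Get's span from the successful sub-batch is never refreshed (such a
	// refresh would fail and force a client-side restart instead).
	return txn.CommitInBatch(ctx, b)
}); err != nil {
	t.Fatal(err)
}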


pkg/kv/dist_sender_server_test.go, line 2251 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: "newval2", just so there's no possibility of confusion.

done

@tbg (Member) left a comment


:lgtm: very nice! Basically every line of code you removed here bothered me while it existed. I thought I also had a TODO about this being gross but now I can't find it.

Reviewed 3 of 5 files at r2, 3 of 3 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/dist_sender_server_test.go, line 2249 at r3 (raw file):

In particular, this verifies that the get is not refreshed, for this would fail (and lead to a client-side retry instead of one at the txn coord sender).

@nvanbenschoten (Member) left a comment


Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei)

@andreimatei (Contributor, Author) left a comment


bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @tbg)


pkg/kv/dist_sender_server_test.go, line 2249 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

In particular, this verifies that the get is not refreshed, for this would fail (and lead to a client-side retry instead of one at the txn coord sender).

done

craig bot pushed a commit that referenced this pull request Feb 10, 2020
44661: kv: refresh less and retry more r=andreimatei a=andreimatei

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>

craig bot commented Feb 10, 2020

Build succeeded

@craig craig bot merged commit dc3e0b3 into cockroachdb:master Feb 10, 2020
@andreimatei andreimatei deleted the txn.no-partial-response branch March 2, 2020 19:27
andreimatei added a commit to andreimatei/cockroach that referenced this pull request Mar 25, 2020
The scenario that this patch addresses is the following (from cockroachdb#46431):
1. txn1 sends Put(a) + Put(b) + EndTxn
2. DistSender splits the Put(a) from the rest.
3. Put(a) succeeds, but the rest catches some retriable error.
4. TxnCoordSender gets the retriable error. The fact that a sub-batch
  succeeded is lost. We used to care about that fact, but we've
  successively gotten rid of that tracking across cockroachdb#35140 and cockroachdb#44661.
5. we refresh everything that came before this batch. The refresh
  succeeds.
6. we re-send the batch. It gets split again. The part with the EndTxn
  executes first. The transaction is now STAGING. More than that, the txn
  is in fact implicitly committed - the intent on a is already there since
  the previous attempt and, because it's at a lower timestamp than the txn
  record, it counts as golden for the purposes of verifying the implicit
  commit condition.
7. some other transaction wanders in, sees that txn1 is in its way, and
  transitions it to explicitly committed.
8. the Put(a) now tries to evaluate. It gets really confused. I guess
  that different things can happen; none of them good. One thing that I
  believe we've observed in cockroachdb#46299 is that, if there's another txn's
  intent there already, the Put will try to push it, enter the
  txnWaitQueue, eventually observe that its own txn is committed and
  return an error. The client thus gets an error (and a non-ambiguous one
  to boot) although the txn is committed. Even worse perhaps, I think it's
  possible for a request to return wrong results instead of an error.

This patch fixes it by inhibiting the parallel commit when the EndTxn
batch is retried. This way, there's never a STAGING record.

Release note (bug fix): A rare bug causing errors to be returned for
successfully committed transactions was fixed. The most common error
message was "TransactionStatusError: already committed".

Release justification: serious bug fix

Fixes cockroachdb#46341
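
A minimal sketch of the shape of the fix (hypothetical, heavily simplified names; not the actual kvcoord change):

// Hypothetical, simplified types; not the real kvcoord structures.
type endTxnRequest struct {
	inFlightWrites []string // writes the parallel-commit path would track
}

type batchRequest struct {
	endTxn  *endTxnRequest
	isRetry bool // set when the TxnCoordSender re-sends this batch
}

// maybeInhibitParallelCommit sketches the fix: on a retried batch carrying
// the EndTxn, drop the in-flight writes so the commit cannot go through the
// parallel (STAGING) path. A previous attempt may already have laid down
// intents, and a STAGING record could make the txn implicitly committed
// before the retried writes re-evaluate.
func maybeInhibitParallelCommit(ba *batchRequest) {
	if ba.endTxn == nil || !ba.isRetry {
		return
	}
	ba.endTxn.inFlightWrites = nil
}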
andreimatei added a commit to andreimatei/cockroach that referenced this pull request Mar 31, 2020
andreimatei added a commit to andreimatei/cockroach that referenced this pull request Mar 31, 2020
andreimatei added a commit to andreimatei/cockroach that referenced this pull request Apr 1, 2020
craig bot pushed a commit that referenced this pull request Apr 1, 2020
46596: kvclient/kvcoord: inhibit parallel commit when retrying EndTxn request r=andreimatei a=andreimatei

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>