-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: don't perform one-phase commit transactions after restarts #40518
Merged
craig
merged 5 commits into
cockroachdb:master
from
nvanbenschoten:nvanbenschoten/1pcEpoch
Sep 6, 2019
Merged
storage: don't perform one-phase commit transactions after restarts #40518
craig
merged 5 commits into
cockroachdb:master
from
nvanbenschoten:nvanbenschoten/1pcEpoch
Sep 6, 2019
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This was no longer in use, as of 374413f. Release note: None
Fixes cockroachdb#40466. This commit adjusts the Replica-side logic around executing one-phase commit transactions. It prevents batches that could be executed on the 1PC fast-path from doing so if they contain transactions in any epoch other than their first epoch. We saw in cockroachdb#40466 that this could lead to intent abandonment if the txn had written intents in earlier epochs. I believe that this is a new issue that was introduced when we moved from using the presence of a BeginTxn and EndTxn in the same batch to detect 1PC transactions to using the sequence numbers of writes and the presence of an EndTxn in a batch to detect 1PC transactions. That change was necessary for parallel commits. Before that change, I think logic in DistSender was preventing us from hitting this case because it would always split EndTxn requests from the rest of its batch when a txn had been restarted: https://github.com/cockroachdb/cockroach/blob/18bdfe1a691fa6d785d510e86d27cecdac9c436e/pkg/kv/dist_sender.go#L692-L697 Release note: None
This change correctly sets sequence numbers in TestIsOnePhaseCommit so that we can test the detection of one-phase commit transactions whose batches do not contain BeginTransaction requests. Release note: None
As of cockroachdb#35284, this was already checked by IsEndTransactionTriggeringRetryError. Release note: None
We can then use this to centralize its logic and avoid leaking it around 1PC transaction handling. Release note: None
tbg
approved these changes
Sep 6, 2019
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, 3 of 3 files at r2, 1 of 1 files at r3, 2 of 2 files at r4, 3 of 3 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained
bors r+ |
craig bot
pushed a commit
that referenced
this pull request
Sep 6, 2019
40518: storage: don't perform one-phase commit transactions after restarts r=nvanbenschoten a=nvanbenschoten Fixes #40466. This commit adjusts the Replica-side logic around executing one-phase commit transactions. It prevents batches that could be executed on the 1PC fast-path from doing so if they contain transactions in any epoch other than their first epoch. We saw in #40466 that this could lead to intent abandonment if the txn had written intents in earlier epochs. I believe that this is a new issue that was introduced when we moved from using the presence of a BeginTxn and EndTxn in the same batch to detect 1PC transactions to using the sequence numbers of writes and the presence of an EndTxn in a batch to detect 1PC transactions. That change was necessary for parallel commits. Before that change, I think logic in DistSender was preventing us from hitting this case because it would always split EndTxn requests from the rest of its batch when a txn had been restarted: https://github.com/cockroachdb/cockroach/blob/18bdfe1a691fa6d785d510e86d27cecdac9c436e/pkg/kv/dist_sender.go#L692-L697 Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Build succeeded |
nvanbenschoten
added a commit
to nvanbenschoten/cockroach
that referenced
this pull request
Sep 26, 2019
Found while verifying that cockroachdb#40518 didn't negatively impact YCSB performance. As it turns out, most txns that restart hit the check in DistSender instead, and that ends up being critical for forward progress. Without it, transactions appear to starve because they never write intents. The commit doesn't make any changes there, but it does fix an issue where 1PC transactions were being incorrectly detected. The detection was ignoring the fact that a 1PC attempt could be rejected by DistSender or a Replica. We now tie the metric to whether an EndTransaction actually evaluated as a 1PC txn instead of tying it to whether the EndTransaction _wanted_ to be evaluated as a 1PC txn. Release justification: low risk and improves metric reporting Release note: None
craig bot
pushed a commit
that referenced
this pull request
Sep 26, 2019
41098: roachtest/jepsen: don't fail test if retrieving invoke.log fails r=nvanbenschoten a=nvanbenschoten Fixes #41062. Release justification: Testing only. Release note: None 41104: kv: check for 1-phase commit after request, not before r=nvanbenschoten a=nvanbenschoten Found while verifying that #40518 didn't negatively impact YCSB performance. As it turns out, most txns that restart hit the check in DistSender instead, and that ends up being critical for forward progress. Without it, transactions appear to starve because they never write intents. The commit doesn't make any changes there, but it does fix an issue where 1PC transactions were being incorrectly detected. The detection was ignoring the fact that a 1PC attempt could be rejected by DistSender or a Replica. We now tie the metric to whether an EndTransaction actually evaluated as a 1PC txn instead of tying it to whether the EndTransaction _wanted_ to be evaluated as a 1PC txn. Release justification: low risk and improves metric reporting Release note: None 41127: Add myself to AUTHORS r=miretskiy a=miretskiy Authors += Myself Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fixes #40466.
This commit adjusts the Replica-side logic around executing one-phase commit transactions. It prevents batches that could be executed on the 1PC fast-path from doing so if they contain transactions in any epoch other than their first epoch. We saw in #40466 that this could lead to intent abandonment if the txn had written intents in earlier epochs.
I believe that this is a new issue that was introduced when we moved from using the presence of a BeginTxn and EndTxn in the same batch to detect 1PC transactions to using the sequence numbers of writes and the presence of an EndTxn in a batch to detect 1PC transactions. That change was necessary for parallel commits. Before that change, I think logic in DistSender was preventing us from hitting this case because it would always split EndTxn requests from the rest of its batch when a txn had been restarted:
cockroach/pkg/kv/dist_sender.go
Lines 692 to 697 in 18bdfe1