Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tikv: eliminate unnecessary pessimistic rollback #12561

Merged
merged 7 commits into from
Oct 15, 2019

Conversation

youjiali1995
Copy link
Contributor

Signed-off-by: youjiali1995 zlwgx1023@gmail.com

What problem does this PR solve?

When acquiring pessimistic locks fails, we do async pessimistic rollback to clean up all locks even some of them are not locked. These rollback operations will conflict with the following lock operations which may reduce performance.

What is changed and how it works?

If there is only 1 key and lock failed, no need to do pessimistic rollback.

sysbench oltp_write_only, tables: 32*100, threads: 128

Before

SQL statistics:
    queries performed:
        read:                            0
        write:                           28523
        other:                           14900
        total:                           43423
    transactions:                        5859   (47.05 per sec.)
    queries:                             43423  (348.71 per sec.)
    ignored errors:                      3182   (25.55 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          124.5217s
    total number of events:              5859

Latency (ms):
         min:                                    1.77
         avg:                                 2694.71
         max:                                33254.30
         95th percentile:                     9624.59
         sum:                             15788288.77

This PR

SQL statistics:
    queries performed:
        read:                            0
        write:                           342892
        other:                           179245
        total:                           522137
    transactions:                        70701  (235.34 per sec.)
    queries:                             522137 (1737.99 per sec.)
    ignored errors:                      37843  (125.96 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          300.4248s
    total number of events:              70701

Latency (ms):
         min:                                    1.88
         avg:                                  543.52
         max:                                 8555.44
         95th percentile:                     1561.52
         sum:                             38427743.18

Check List

Tests

  • No code

Code changes

Side effects

Related changes

  • Need to cherry-pick to the release branch

Release note

  • Improve the performance of the pessimistic transaction in some situations with serious conflicts.

Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>
// Sleep a little, wait for the other transaction that blocked by this transaction to acquire the lock.
time.Sleep(time.Millisecond * 5)
// If there is only 1 key and lock failed, no need to do pessimistic rollback.
if len(keys) > 1 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make sure the err indicates the lock must not be locked to skip pessimistic rollback.

Signed-off-by: youjiali1995 <zlwgx1023@gmail.com>
@codecov
Copy link

codecov bot commented Oct 9, 2019

Codecov Report

Merging #12561 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #12561   +/-   ##
===========================================
  Coverage   79.9092%   79.9092%           
===========================================
  Files           462        462           
  Lines        104934     104934           
===========================================
  Hits          83852      83852           
  Misses        14892      14892           
  Partials       6190       6190

@youjiali1995
Copy link
Contributor Author

@coocood PTAL again. Thanks.

@coocood
Copy link
Member

coocood commented Oct 9, 2019

LGTM

@coocood
Copy link
Member

coocood commented Oct 9, 2019

@tiancaiamao PTAL

@MyonKeminta MyonKeminta self-requested a review October 10, 2019 07:38
Copy link
Contributor

@MyonKeminta MyonKeminta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM.

wg.Wait()
// Sleep a little, wait for the other transaction that blocked by this transaction to acquire the lock.
time.Sleep(time.Millisecond * 5)
keyMayBeLocked := terror.ErrorNotEqual(kv.ErrWriteConflict, err) && terror.ErrorNotEqual(kv.ErrKeyExists, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid if we add more error types or changes our usage of these error types in the future, this may fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is why it use ErrorNotEqual here. As long as we make sure ErrWriteConflict and ErrKeyExists won't lock the key.

@tiancaiamao
Copy link
Contributor

LGTM

@tiancaiamao tiancaiamao added the status/LGT2 Indicates that a PR has LGTM 2. label Oct 10, 2019
@coocood coocood added the status/can-merge Indicates a PR has been approved by a committer. label Oct 10, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Oct 10, 2019

Your auto merge job has been accepted, waiting for 12322

@sre-bot
Copy link
Contributor

sre-bot commented Oct 10, 2019

/run-all-tests

@ngaut
Copy link
Member

ngaut commented Oct 12, 2019

/bench

@sre-bot
Copy link
Contributor

sre-bot commented Oct 12, 2019

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 394505b53aa747f2a4dce8a735cebb65ccf97cb9
+++ tidb: 179bacf3159875d3f3983f44e15d8e986be4eecc
tikv: 8cfeb1a1c1bf1194d10ebab3c9e98e084b2d2d46
pd: 2550168c27878a33a8465a72bafdc534e409b60a
================================================================================
test-1: < oltp_point_select >
    * QPS : 79077.94 ± 0.3341% (std=210.47) delta: 2.74% (p=0.198)
    * AvgMs : 3.24 ± 0.3086% (std=0.01) delta: -2.56%
    * PercentileMs99 : 6.62 ± 2.5370% (std=0.10) delta: -1.78%
            
test-2: < oltp_read_write >
    * QPS : 38381.48 ± 0.5853% (std=153.58) delta: 0.41% (p=0.142)
    * AvgMs : 133.96 ± 0.5987% (std=0.54) delta: -0.38%
    * PercentileMs99 : 257.95 ± 0.0000% (std=0.00) delta: 0.00%
            
test-3: < oltp_insert >
    * QPS : 22199.54 ± 0.6677% (std=100.56) delta: 0.77% (p=0.303)
    * AvgMs : 11.53 ± 0.6766% (std=0.05) delta: -0.77%
    * PercentileMs99 : 23.35 ± 1.0791% (std=0.21) delta: -0.71%
            
test-4: < oltp_update_index >
    * QPS : 17530.75 ± 0.6709% (std=77.49) delta: -1.05% (p=0.498)
    * AvgMs : 14.45 ± 0.7887% (std=0.07) delta: -0.10%
    * PercentileMs99 : 31.60 ± 1.0823% (std=0.28) delta: 0.00%
            
test-5: < oltp_update_non_index >
    * QPS : 29919.09 ± 0.2269% (std=45.56) delta: 0.22% (p=0.632)
    * AvgMs : 8.55 ± 0.2572% (std=0.01) delta: -0.17%
    * PercentileMs99 : 21.11 ± 0.0000% (std=0.00) delta: 0.00%
            

https://perf.pingcap.com

@sre-bot
Copy link
Contributor

sre-bot commented Oct 15, 2019

cherry pick to release-3.0 in PR #12707

@sre-bot
Copy link
Contributor

sre-bot commented Oct 15, 2019

cherry pick to release-3.1 in PR #12708

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/tikv sig/transaction SIG:Transaction status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants