Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: support innodb_lock_wait_timeout for pessimistic transaction #13103

Merged
merged 4 commits into from
Nov 5, 2019

Conversation

cfzjywxk
Copy link
Contributor

@cfzjywxk cfzjywxk commented Nov 4, 2019

What problem does this PR solve?

make innodb_lock_wait_timeout work for pessimistic transactions in tidb

in MySQL, this is a system variable to set lock wait time for transactions,
The length of time in seconds an InnoDB transaction waits for a row lock before giving up. The default value is 50 seconds. A transaction that tries to access a row that is locked by another InnoDB transaction waits at most this many seconds for write access to the row before issuing the following error:
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
When a lock wait timeout occurs, the current statement is rolled back (not the entire transaction).

mysql-doc

What is changed and how it works?

  • make system variable innodb_lock_wait_timeout effective,the default value is same with mysql
  • check wait time when met with ErrLocked error from kv, and report lock wait timout error if wait time greater than lockWaitTime, the current statement will be rollbacked like mysql behaviors
  • innodb_lock_wait_timeout does affect only user sessions dmls

TODO:

  1. refactor handlePessimiticError to always update for forUpdateTs
  2. make lock wait action in mvcc_leveldb same with action in tikv.

Check List

Tests

  • Unit test

Code changes

  • Has exported variable/fields change

Side effects

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release note

  • Write release note for bug-fix or new feature.

@codecov
Copy link

codecov bot commented Nov 4, 2019

Codecov Report

Merging #13103 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #13103   +/-   ##
===========================================
  Coverage   80.6801%   80.6801%           
===========================================
  Files           468        468           
  Lines        113153     113153           
===========================================
  Hits          91292      91292           
  Misses        15039      15039           
  Partials       6822       6822

tk3.MustExec("begin pessimistic")
start := time.Now()
_, err := tk3.Exec("select * from tk where c1 = 1 for update")
c.Check(time.Since(start), GreaterEqual, time.Duration(1*time.Second))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to check the duration is not too long.

@@ -447,6 +447,9 @@ type SessionVars struct {
isolationReadEngines map[kv.StoreType]struct{}

PlannerSelectBlockAsName []ast.HintTable

// Lock wait timeout for pessimistic transaction in milliseconds, `innodb_lock_wait_timeout` is in seconds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put is in seconds at the last is too easy to be misunderstood as LockWaitTimeout is in seconds.

@@ -447,6 +447,9 @@ type SessionVars struct {
isolationReadEngines map[kv.StoreType]struct{}

PlannerSelectBlockAsName []ast.HintTable

// Lock wait timeout for pessimistic transaction in milliseconds, `innodb_lock_wait_timeout` is in seconds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Lock wait timeout for pessimistic transaction in milliseconds, `innodb_lock_wait_timeout` is in seconds
// LockWaitTimeout is used for pessimistic transaction in milliseconds, `innodb_lock_wait_timeout` is in seconds.

@@ -815,7 +815,7 @@ func (e *SelectLockExec) Next(ctx context.Context, req *chunk.Chunk) error {
}
return nil
}
var lockWaitTime = kv.LockAlwaysWait
var lockWaitTime = e.ctx.GetSessionVars().LockWaitTimeout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var lockWaitTime = e.ctx.GetSessionVars().LockWaitTimeout
lockWaitTime := e.ctx.GetSessionVars().LockWaitTimeout

if !c.store.oracle.IsExpired(lock.TxnID, lock.TTL) {
return ErrLockAcquireFailAndNoWaitSet
}
} else if action.lockWaitTime == kv.LockAlwaysWait {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to handle LockAlwaysWait, innodb_wait_timeout is always positive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coocood some inner usages which calls doLockKeys (like RecoverIndexExec, addIndexWorker) will set this param LockAlwaysWait

if !c.store.oracle.IsExpired(lock.TxnID, lock.TTL) {
return ErrLockAcquireFailAndNoWaitSet
}
} else if action.lockWaitTime == kv.LockAlwaysWait {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

Suggested change
} else if action.lockWaitTime == kv.LockAlwaysWait {
} else if action.lockWaitTime != kv.LockAlwaysWait {

then do something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current we have 3 paths(negative, 0, positive value), I think it's more clear to make them like noodle style, and we just need to add/remove one more else if path if we want to modify the protocol

@@ -812,6 +816,9 @@ func (s *SessionVars) SetSystemVar(name string, val string) error {
case MaxExecutionTime:
timeoutMS := tidbOptPositiveInt32(val, 0)
s.MaxExecutionTime = uint64(timeoutMS)
case InnodbLockWaitTimeout:
lockWaitSec := tidbOptInt64(val, DefInnodbLockWaitTimeout)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to validate the range.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is done in function ValidateSetSystemVar

case InnodbLockWaitTimeout:
	return checkUInt64SystemVar(name, value, 1, 1073741824, vars)
mysql> set global innodb_lock_wait_timeout = 0;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+---------------------------------------------------------+
| Level   | Code | Message                                                 |
+---------+------+---------------------------------------------------------+
| Warning | 1292 | Truncated incorrect innodb_lock_wait_timeout value: '0' |
+---------+------+---------------------------------------------------------+
1 row in set (0.00 sec)

tk3.MustExec("set innodb_lock_wait_timeout = 1")
tk3.MustQuery(`show variables like "innodb_lock_wait_timeout"`).Check(testkit.Rows("innodb_lock_wait_timeout 1"))

tk2.MustExec("set @@autocommit = 0")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why set autocommit here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set this before next begin statement below

@@ -676,6 +676,7 @@ func (action actionPessimisticLock) handleSingleBatch(c *twoPhaseCommitter, bo *
IsFirstLock: c.isFirstLock,
WaitTimeout: action.lockWaitTime,
}, pb.Context{Priority: c.priority, SyncLog: c.syncLog})
lockWaitStartTime := time.Now()
for {
Copy link
Member

@coocood coocood Nov 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update the req.WaitTimeout . so if the wait timeout is less than 3s,we can return sooner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

roger

return ErrLockAcquireFailAndNoWaitSet
if lock.LockType == pb.Op_PessimisticLock {
if action.lockWaitTime == kv.LockNoWait {
// the pessimistic lock found could be lock left behind(timeout but not recycled yet)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to improve the English of this comment.

} else if action.lockWaitTime == kv.LockAlwaysWait {
// do nothing but keep wait
} else {
// user has set the `InnodbLockWaitTimeout`, check timeout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to improve the English of this comment.

Copy link
Member

@jackysp jackysp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jackysp jackysp added the status/LGT1 Indicates that a PR has LGTM 1. label Nov 5, 2019
@coocood
Copy link
Member

coocood commented Nov 5, 2019

LGTM

@cfzjywxk
Copy link
Contributor Author

cfzjywxk commented Nov 5, 2019

@tiancaiamao @lysu PTAL

@coocood
Copy link
Member

coocood commented Nov 5, 2019

@youjiali1995 PTAL

Copy link
Contributor

@youjiali1995 youjiali1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coocood
Copy link
Member

coocood commented Nov 5, 2019

/merge

@sre-bot sre-bot added the status/can-merge Indicates a PR has been approved by a committer. label Nov 5, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Nov 5, 2019

/run-all-tests

@sre-bot sre-bot merged commit 5fd8b4d into pingcap:master Nov 5, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Nov 5, 2019

cherry pick to release-3.0 failed

@sre-bot
Copy link
Contributor

sre-bot commented Apr 7, 2020

It seems that, not for sure, we failed to cherry-pick this commit to release-3.0. Please comment '/run-cherry-picker' to try to trigger the cherry-picker if we did fail to cherry-pick this commit before. @cfzjywxk PTAL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution sig/transaction SIG:Transaction status/can-merge Indicates a PR has been approved by a committer. status/LGT1 Indicates that a PR has LGTM 1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants