-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
doc: lazy constraint check in pessimistic txn #36889
doc: lazy constraint check in pessimistic txn #36889
Conversation
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
5fd5435
to
40e0d74
Compare
Code Coverage Details: https://codecov.io/github/pingcap/tidb/commit/a197163ea7165467fe83c34dd1a0f31139b5db12 |
8401fe6
to
36027d1
Compare
Shall we add a section to demonstrate the safety of the change since the constraint check mechanism is subtle and vulnerable? |
|
||
Note that the feature is only needed if part of your transaction still needs to acquire pessimistic locks. Otherwise, use the optimistic transaction mode instead. | ||
|
||
* `INSERT` rows to a table without any unique key. TiDB does not generate duplicated implicit row ID, so normally there cannot be unique key conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Limiting the use cases to simple INSERT
statements is needed.
/* s1 */ INSERT INTO t1 VALUES (1); | ||
/* s1 */ COMMIT; | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For such cases,
/* init */ CREATE TABLE t1 (id INT NOT NULL PRIMARY KEY, value int);
/* init */ INSERT INTO t1 VALUES (1, 1);
/* s1 */ BEGIN PESSIMISTIC;
/* s1 */ INSERT INTO t1 VALUES (1, 2) LAZY CHECK; -- Skip acquiring the lock. So the statement would succeed.
/* s1 */ SELECT * FROM t1 FOR UPDATE; -- Here the pessimistic lock on rowkey '1' would be acquired
Here it's a bit difficult to decide the execution result of the for update
statement, should the memory buffer content be returned or the row value(1, 1), or a duplicated entry error detected acquiring the pessimistic lock?
Actually, the snapshot read semantic becomes weird in some cases where the pessimistic current read is used together such as #24195. A side effect may be that the current read semantic would become weird too in some special cases, especially some historical technical debt like the behaviors of the unionScan
executor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right... This makes the case of #24195 more complex.
I don't think the complete solution to this problem should be part of this proposal. So, for this case, personally I think we can treat it as if the pessimistic lock of the INSERT
gets lost (due to failed pipelined or in-memory locking). This can happen in a normal pessimistic transaction now. In this case, the value in the table can be different from those in the buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
treat it as if the pessimistic lock of the INSERT gets lost
Make sense, so we could just ignore the lazily written result if it's conflicted with some returned current read result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "Acquisition of lock always resolves constraint check" is simpler in semantics, which requires 1 additional state of keys in prewrite: no lock + check
.
While treating as if the lock is lost requires 2 additional states: lock + check
and no lock + check
in prewrite. It also introduces complexity in the read/write procedures, which increases the risk of doing it right.
It seems the cost is throwing "Duplicate key" errors in SELECT FOR UPDATE
or other DMLs that acquire locks. I assume users should be ready for handling such cases since the feature is not for everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, yes. As long as we can accept throwing "duplicated entry" errors in a read-only statement, "Acquisition of lock always resolves constraint check" is the easiest to implement and brings least problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I follow @ekexium's idea and add a section to define the behavior in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here it's a bit difficult to decide the execution result of the for update statement, should the memory buffer content be returned or the row value(1, 1), or a duplicated entry error detected acquiring the pessimistic lock?
If 'deferred insert' mixed with 'lock if exists', it will make pessimistic lock invalid in some cases. The feature 'lock if exists' may not coexist with 'deferred insert'.
create table t(a int primary key, b int)
begin;
insert into t values(1,1) /* lazy checked*/ ; -- cached in buffer and don't acquire pessimistic lock or unique check
select * from t where a = 1 for update /lock if exists/; -- tikv can't find the record and doesn't make a pessimistic lock
commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
insert into t values(1,1) /* lazy checked*/ ; -- cached in buffer and don't acquire pessimistic lock or unique check
select * from t where a = 1 for update /lock if exists/; -- tikv can't find the record and doesn't make a pessimistic lock
commit
In this case, the late acquisition of lock should override lock_if_exists. Maybe we have to get from the mem buffer first to decide how we lock the keys.
22d2d4d
to
b9b8a26
Compare
A case similar to that in #24195, if we enable the feature: create table t2 (id int primary key, uk int, unique key i1(uk))
insert into t2 values (1, 1)
set @@tidb_txn_assertion_level=off
begin pessimistic
insert into t2 values (2, 1)
select * from t2 use index(primary) for update -- same result whether for update or not
delete from t2 where id = 1
commit
admin check table t2 The select result is 2 rows "Acquisition of lock always resolves constraint check" doesn't help. The feature sort of enlarges the anomaly |
For reasoning its safety, I'm considering what if we allow an arbitrary mix of optimistic and pessimistic mutations in one transaction? Will it break any safety property? |
Pessimistic mutations are inheritantly no different from optimistic mutations after they are turned into 2pc locks in prewrite, so I think it is very unlikely that this will break any properties. Instead, I worry more about the handling about union scan and generating the mutations. It's more complicated when the result set sometimes break constraints but the transaction can commit in the end. |
There're several cases we may need to design and handle carefully, for example:
|
When a key that has its constraint check deferred in a previous statement gets modified, we have 2 options:
|
What I'm not sure about the second option is the risk of how we handle a result set with broken constraints. And you have found an example above that there can be inconsistency unless we add new special ops. There could be other possible issues that we haven't found. |
Yes, I'm pursuing proof of the safety of this change, whichever options we take. |
@ekexium |
@cfzjywxk The problem exists in the implementation of option 2 mentioned above (so neither row key nor unique key is locked in the |
For option 1, would this still be a problem if the lock on a unique key is deferred and then the |
I think so. I mean it's free from data-index inconsistency. The seemingly invalid |
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
…read from the membuf Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
c5accb5
to
9a6f4df
Compare
|
||
Because `enum` is compatible with `bool` on the wire, we can seamlessly extend the `is_pessimistic_lock` field. In a TiDB cluster, TiKV instances are upgraded before TiDB. So, we can make sure TiKV can handle it correctly when TiDB starts using the new protocol. | ||
|
||
When `tidb_constraint_check_in_place_pessimistic` is off, all the keys that have `PresumeKeyNotExists` flags don't need to be locked when executing the DML. They will be marked with a new special flag `NeedConstraintCheckInPrewrite`. When the transaction commits, TiDB will set the actions of these keys to `DO_CONSTRAINT_CHECK`. So, TiKV will not check the existence of pessimistic locks of these keys. Instead, constraint and write conflict checks should be done for these keys. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could clarify the detailed affected DML types by this variable, like:
- The simple
Insert
andUpdate
statements are considered. - Extended grammar like
Insert on duplicate
andInsert ignore
is not considered.
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/merge |
This pull request has been accepted and is ready to merge. Commit hash: cf11337
|
TiDB MergeCI notify
|
What problem does this PR solve?
Issue Number: ref #36579
Check List
Tests
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.