doc: lazy constraint check in pessimistic txn #36889

sticnarf · 2022-08-04T06:54:28Z

What problem does this PR solve?

Issue Number: ref #36579

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

ti-chi-bot · 2022-08-04T06:54:29Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

cfzjywxk
ekexium

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

sticnarf · 2022-08-04T06:57:13Z

cc @ekexium @TonsnakeLin @cfzjywxk

sre-bot · 2022-08-04T07:03:30Z

Code Coverage Details: https://codecov.io/github/pingcap/tidb/commit/a197163ea7165467fe83c34dd1a0f31139b5db12

docs/design/2022-08-04-lazy-constraint-check.md

ekexium · 2022-08-04T07:53:28Z

Shall we add a section to demonstrate the safety of the change since the constraint check mechanism is subtle and vulnerable?

cfzjywxk · 2022-08-04T13:03:06Z

docs/design/2022-08-04-lazy-constraint-check.md

+
+Note that the feature is only needed if part of your transaction still needs to acquire pessimistic locks. Otherwise, use the optimistic transaction mode instead.
+
+* `INSERT` rows to a table without any unique key. TiDB does not generate duplicated implicit row ID, so normally there cannot be unique key conflicts.


Limiting the use cases to simple INSERT statements is needed.

cfzjywxk · 2022-08-04T13:11:16Z

docs/design/2022-08-04-lazy-constraint-check.md

+/* s1 */ INSERT INTO t1 VALUES (1);
+/* s1 */ COMMIT;
+```
+


For such cases,

/* init */ CREATE TABLE t1 (id INT NOT NULL PRIMARY KEY, value int); /* init */ INSERT INTO t1 VALUES (1, 1); /* s1 */ BEGIN PESSIMISTIC; /* s1 */ INSERT INTO t1 VALUES (1, 2) LAZY CHECK; -- Skip acquiring the lock. So the statement would succeed. /* s1 */ SELECT * FROM t1 FOR UPDATE; -- Here the pessimistic lock on rowkey '1' would be acquired

Here it's a bit difficult to decide the execution result of the for update statement, should the memory buffer content be returned or the row value(1, 1), or a duplicated entry error detected acquiring the pessimistic lock?

Actually, the snapshot read semantic becomes weird in some cases where the pessimistic current read is used together such as #24195. A side effect may be that the current read semantic would become weird too in some special cases, especially some historical technical debt like the behaviors of the unionScan executor.

Right... This makes the case of #24195 more complex.

I don't think the complete solution to this problem should be part of this proposal. So, for this case, personally I think we can treat it as if the pessimistic lock of the INSERT gets lost (due to failed pipelined or in-memory locking). This can happen in a normal pessimistic transaction now. In this case, the value in the table can be different from those in the buffer.

treat it as if the pessimistic lock of the INSERT gets lost

Make sense, so we could just ignore the lazily written result if it's conflicted with some returned current read result.

I think "Acquisition of lock always resolves constraint check" is simpler in semantics, which requires 1 additional state of keys in prewrite: no lock + check.
While treating as if the lock is lost requires 2 additional states: lock + check and no lock + check in prewrite. It also introduces complexity in the read/write procedures, which increases the risk of doing it right.

It seems the cost is throwing "Duplicate key" errors in SELECT FOR UPDATE or other DMLs that acquire locks. I assume users should be ready for handling such cases since the feature is not for everyone.

Umm, yes. As long as we can accept throwing "duplicated entry" errors in a read-only statement, "Acquisition of lock always resolves constraint check" is the easiest to implement and brings least problems.

I follow @ekexium's idea and add a section to define the behavior in this case.

Here it's a bit difficult to decide the execution result of the for update statement, should the memory buffer content be returned or the row value(1, 1), or a duplicated entry error detected acquiring the pessimistic lock?

If 'deferred insert' mixed with 'lock if exists', it will make pessimistic lock invalid in some cases. The feature 'lock if exists' may not coexist with 'deferred insert'.
create table t(a int primary key, b int)
begin;
insert into t values(1,1) /* lazy checked*/ ; -- cached in buffer and don't acquire pessimistic lock or unique check
select * from t where a = 1 for update /lock if exists/; -- tikv can't find the record and doesn't make a pessimistic lock
commit

insert into t values(1,1) /* lazy checked*/ ; -- cached in buffer and don't acquire pessimistic lock or unique check
select * from t where a = 1 for update /lock if exists/; -- tikv can't find the record and doesn't make a pessimistic lock
commit

In this case, the late acquisition of lock should override lock_if_exists. Maybe we have to get from the mem buffer first to decide how we lock the keys.

ekexium · 2022-08-09T10:20:02Z

I am somewhat convinced of the safety of write: writes won't break the integrity of data in the storage.
While the safety of read, namely read statements won't return results that break the constraint, may require further consideration.

A case similar to that in #24195, if we enable the feature:

create table t2 (id int primary key, uk int, unique key i1(uk))
insert into t2 values (1, 1)
set @@tidb_txn_assertion_level=off

begin pessimistic
insert into t2 values (2, 1)
select * from t2 use index(primary) for update -- same result whether for update or not
delete from t2 where id = 1
commit

admin check table t2

The select result is 2 rows (1, 1) and (2, 1), which breaks the uniqueness constraint. While the final commit can succeed. And the admin check will report data-index inconsistency. BTW, assertion could prevent the txn from committing.

"Acquisition of lock always resolves constraint check" doesn't help. The feature sort of enlarges the anomaly

ekexium · 2022-08-10T05:17:03Z

For reasoning its safety, I'm considering what if we allow an arbitrary mix of optimistic and pessimistic mutations in one transaction? Will it break any safety property?

sticnarf · 2022-08-10T06:15:01Z

For reasoning its safety, I'm considering what if we allow an arbitrary mix of optimistic and pessimistic mutations in one transaction? Will it break any safety property?

Pessimistic mutations are inheritantly no different from optimistic mutations after they are turned into 2pc locks in prewrite, so I think it is very unlikely that this will break any properties.

Instead, I worry more about the handling about union scan and generating the mutations. It's more complicated when the result set sometimes break constraints but the transaction can commit in the end.

cfzjywxk · 2022-08-10T09:29:15Z

the handling about union scan and generating the mutations

There're several cases we may need to design and handle carefully, for example:

delete-your-write from which we've encountered data inconsistency issues before. The inserted content may not need to be committed but constraint checks are still required.
Report duplicate error in time and avoid further weird behaviors considering the unionScan executor.
The transactional key flags and statement level key flags processing which are easy to make mistakes.

ekexium · 2022-08-10T09:36:05Z

When a key that has its constraint check deferred in a previous statement gets modified, we have 2 options:

Lock it as we did before, resolve the constraint check and unset the defer flag
Then we need some kind of a "once flag" mechanism, considering the possibility that a statement may involve multiple operations on a key. Any modifications to the key (even in the same statement) will unset the flag.
Skip the lock. The reason is the same as the one that let us skip the constraint check: if the user believes the key is less likely to have conflicts, why don't we skip the lock as well? It doesn't affect the success rate after all. It requires PresumeNotExist flags on more than PUT mutations. An extra Op like DeleteMustExist needs to be added, at least.

sticnarf · 2022-08-10T09:40:21Z

2. Skip the lock. The reason is the same as the one that let us skip the constraint check: if the user believes the key is less likely to have conflicts, why don't we skip the lock as well? It doesn't affect the success rate after all. It requires PresumeNotExist flags on more than PUT mutations. An extra Op like DeleteMustExist needs to be added, at least.

What I'm not sure about the second option is the risk of how we handle a result set with broken constraints. And you have found an example above that there can be inconsistency unless we add new special ops. There could be other possible issues that we haven't found.

ekexium · 2022-08-10T09:43:19Z

There could be other possible issues that we haven't found.

Yes, I'm pursuing proof of the safety of this change, whichever options we take.

cfzjywxk · 2022-08-11T06:47:38Z

select * from t2 use index(primary) for update -- same result whether for upda

@ekexium
Does it happen because the select for update statement does not acquire the pessimistic lock on the unique key but just the row key? Actually, the unique key should also be locked though it's not changed as described in the issue.
It seems that if the row key and unique key are always locked Acquisition of lock always resolves constraint check should be able to work.

ekexium · 2022-08-11T06:59:09Z

select * from t2 use index(primary) for update -- same result whether for upda
@ekexium Does it happen because the select for update statement does not acquire the pessimistic lock on the unique key but just the row key? Actually, the unique key should also be locked though it's not changed as described in the issue. It seems that if the row key and unique key are always locked Acquisition of lock always resolves constraint check should be able to work.

@cfzjywxk The problem exists in the implementation of option 2 mentioned above (so neither row key nor unique key is locked in the select for update and delete). Option 1 is free from this issue.

cfzjywxk · 2022-08-11T07:15:29Z

Actually, the unique key should also be locked though it's not changed as described in the #36438.

For option 1, would this still be a problem if the lock on a unique key is deferred and then the select for update does not lock it as expected because of the issue 36438 so the constraint resolving does not work on this unique key?

ekexium · 2022-08-11T07:25:19Z

Actually, the unique key should also be locked though it's not changed as described in the #36438.

For option 1, would this still be a problem if the lock on a unique key is deferred and then the select for update does not lock it as expected because of the issue 36438 so the constraint resolving does not work on this unique key?

I think so. I mean it's free from data-index inconsistency. The seemingly invalid select result is an old issue. It's better to fix it, while our bottom line is that we guarantee the constraint satisfaction of read results only if the transaction successfully commits.

docs/design/2022-08-04-pessimistic-lazy-constraint-check.md

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

…read from the membuf Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

cfzjywxk · 2022-08-16T12:23:24Z

docs/design/2022-08-04-pessimistic-lazy-constraint-check.md

+
+Because `enum` is compatible with `bool` on the wire, we can seamlessly extend the  `is_pessimistic_lock` field. In a TiDB cluster, TiKV instances are upgraded before TiDB. So, we can make sure TiKV can handle it correctly when TiDB starts using the new protocol.
+
+When `tidb_constraint_check_in_place_pessimistic` is off, all the keys that have `PresumeKeyNotExists` flags don't need to be locked when executing the DML. They will be marked with a new special flag `NeedConstraintCheckInPrewrite`. When the transaction commits, TiDB will set the actions of these keys to `DO_CONSTRAINT_CHECK`. So, TiKV will not check the existence of pessimistic locks of these keys. Instead, constraint and write conflict checks should be done for these keys.


We could clarify the detailed affected DML types by this variable, like:

The simple Insert and Update statements are considered.

Extended grammar like Insert on duplicate and Insert ignore is not considered.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

cfzjywxk

LGTM

cfzjywxk · 2022-08-17T13:31:28Z

/merge

ti-chi-bot · 2022-08-17T13:31:33Z

This pull request has been accepted and is ready to merge.

Commit hash: cf11337

sre-bot · 2022-08-17T14:41:02Z

TiDB MergeCI notify

CI Name	Result	Duration	Compare with Parent commit
idc-jenkins-ci/integration-cdc-test	🟢 all 36 tests passed	54 min	Existing passed
idc-jenkins-ci-tidb/tics-test	🟢 all 1 tests passed	18 min	Existing passed
idc-jenkins-ci-tidb/integration-common-test	🟢 all 17 tests passed	12 min	Existing passed
idc-jenkins-ci-tidb/common-test	🟢 all 11 tests passed	12 min	Existing passed
idc-jenkins-ci-tidb/integration-ddl-test	🟢 all 6 tests passed	5 min 51 sec	Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2	🟢 all 28 tests passed	5 min 31 sec	Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1	🟢 all 26 tests passed	4 min 18 sec	Existing passed
idc-jenkins-ci-tidb/mybatis-test	🟢 all 1 tests passed	3 min 43 sec	Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test	🟢 all 1 tests passed	3 min 12 sec	Existing passed
idc-jenkins-ci-tidb/plugin-test	🟢 build success, plugin test success	4min	Existing passed

ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 4, 2022

sticnarf force-pushed the pessimistic-lazy-constraint-check-doc branch from 5fd5435 to 40e0d74 Compare August 4, 2022 06:54

ekexium reviewed Aug 4, 2022

View reviewed changes

sticnarf force-pushed the pessimistic-lazy-constraint-check-doc branch from 8401fe6 to 36027d1 Compare August 4, 2022 07:45

cfzjywxk reviewed Aug 4, 2022

View reviewed changes

sticnarf force-pushed the pessimistic-lazy-constraint-check-doc branch 2 times, most recently from 22d2d4d to b9b8a26 Compare August 9, 2022 03:49

cfzjywxk requested review from ekexium and TonsnakeLin August 12, 2022 10:08

ekexium reviewed Aug 15, 2022

View reviewed changes

docs/design/2022-08-04-pessimistic-lazy-constraint-check.md Outdated Show resolved Hide resolved

docs/design/2022-08-04-pessimistic-lazy-constraint-check.md Outdated Show resolved Hide resolved

docs/design/2022-08-04-pessimistic-lazy-constraint-check.md Outdated Show resolved Hide resolved

sticnarf added 4 commits August 15, 2022 16:29

doc: lazy constraint check in pessimistic txn

dc4270a

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Address comments

f1fd3ef

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Define the behavior of locking lazy checked keys

72900d7

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Briefly state safety of the proposal

28164fb

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

sticnarf added 4 commits August 15, 2022 16:29

Update system variable

3cd9009

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Fix the delete-your-write case in safety section

043ee08

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

Acquire lock whenever a key with NeedConflictCheckInPrewrite flag is …

a268ee9

…read from the membuf Signed-off-by: Yilin Chen <sticnarf@gmail.com>

address comments

9a6f4df

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

sticnarf force-pushed the pessimistic-lazy-constraint-check-doc branch from c5accb5 to 9a6f4df Compare August 15, 2022 08:29

cfzjywxk reviewed Aug 16, 2022

View reviewed changes

Clarify affected statements

cf11337

Signed-off-by: Yilin Chen <sticnarf@gmail.com>

cfzjywxk approved these changes Aug 17, 2022

View reviewed changes

ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Aug 17, 2022

ekexium approved these changes Aug 17, 2022

View reviewed changes

ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Aug 17, 2022

ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Aug 17, 2022

Merge branch 'master' into pessimistic-lazy-constraint-check-doc

a197163

ti-chi-bot merged commit 9709249 into pingcap:master Aug 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

doc: lazy constraint check in pessimistic txn #36889

doc: lazy constraint check in pessimistic txn #36889

sticnarf commented Aug 4, 2022

ti-chi-bot commented Aug 4, 2022 •

edited

Loading

sticnarf commented Aug 4, 2022

sre-bot commented Aug 4, 2022 •

edited

Loading

ekexium commented Aug 4, 2022

cfzjywxk Aug 4, 2022

cfzjywxk Aug 4, 2022

sticnarf Aug 5, 2022

cfzjywxk Aug 5, 2022 •

edited

Loading

ekexium Aug 5, 2022

sticnarf Aug 5, 2022

sticnarf Aug 5, 2022

TonsnakeLin Aug 11, 2022 •

edited

Loading

sticnarf Aug 11, 2022

ekexium commented Aug 9, 2022 •

edited

Loading

ekexium commented Aug 10, 2022 •

edited

Loading

sticnarf commented Aug 10, 2022

cfzjywxk commented Aug 10, 2022 •

edited

Loading

ekexium commented Aug 10, 2022 •

edited

Loading

sticnarf commented Aug 10, 2022

ekexium commented Aug 10, 2022

cfzjywxk commented Aug 11, 2022 •

edited

Loading

ekexium commented Aug 11, 2022

cfzjywxk commented Aug 11, 2022 •

edited

Loading

ekexium commented Aug 11, 2022 •

edited

Loading

cfzjywxk Aug 16, 2022

cfzjywxk left a comment

cfzjywxk commented Aug 17, 2022

ti-chi-bot commented Aug 17, 2022

sre-bot commented Aug 17, 2022


		Note that the feature is only needed if part of your transaction still needs to acquire pessimistic locks. Otherwise, use the optimistic transaction mode instead.

		* `INSERT` rows to a table without any unique key. TiDB does not generate duplicated implicit row ID, so normally there cannot be unique key conflicts.


		Because `enum` is compatible with `bool` on the wire, we can seamlessly extend the `is_pessimistic_lock` field. In a TiDB cluster, TiKV instances are upgraded before TiDB. So, we can make sure TiKV can handle it correctly when TiDB starts using the new protocol.

		When `tidb_constraint_check_in_place_pessimistic` is off, all the keys that have `PresumeKeyNotExists` flags don't need to be locked when executing the DML. They will be marked with a new special flag `NeedConstraintCheckInPrewrite`. When the transaction commits, TiDB will set the actions of these keys to `DO_CONSTRAINT_CHECK`. So, TiKV will not check the existence of pessimistic locks of these keys. Instead, constraint and write conflict checks should be done for these keys.

doc: lazy constraint check in pessimistic txn #36889

doc: lazy constraint check in pessimistic txn #36889

Conversation

sticnarf commented Aug 4, 2022

What problem does this PR solve?

Check List

Release note

ti-chi-bot commented Aug 4, 2022 • edited Loading

sticnarf commented Aug 4, 2022

sre-bot commented Aug 4, 2022 • edited Loading

ekexium commented Aug 4, 2022

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cfzjywxk Aug 5, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TonsnakeLin Aug 11, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ekexium commented Aug 9, 2022 • edited Loading

ekexium commented Aug 10, 2022 • edited Loading

sticnarf commented Aug 10, 2022

cfzjywxk commented Aug 10, 2022 • edited Loading

ekexium commented Aug 10, 2022 • edited Loading

sticnarf commented Aug 10, 2022

ekexium commented Aug 10, 2022

cfzjywxk commented Aug 11, 2022 • edited Loading

ekexium commented Aug 11, 2022

cfzjywxk commented Aug 11, 2022 • edited Loading

ekexium commented Aug 11, 2022 • edited Loading

Choose a reason for hiding this comment

cfzjywxk left a comment

Choose a reason for hiding this comment

cfzjywxk commented Aug 17, 2022

ti-chi-bot commented Aug 17, 2022

sre-bot commented Aug 17, 2022

TiDB MergeCI notify

ti-chi-bot commented Aug 4, 2022 •

edited

Loading

sre-bot commented Aug 4, 2022 •

edited

Loading

cfzjywxk Aug 5, 2022 •

edited

Loading

TonsnakeLin Aug 11, 2022 •

edited

Loading

ekexium commented Aug 9, 2022 •

edited

Loading

ekexium commented Aug 10, 2022 •

edited

Loading

cfzjywxk commented Aug 10, 2022 •

edited

Loading

ekexium commented Aug 10, 2022 •

edited

Loading

cfzjywxk commented Aug 11, 2022 •

edited

Loading

cfzjywxk commented Aug 11, 2022 •

edited

Loading

ekexium commented Aug 11, 2022 •

edited

Loading