lightning: add complex integration tests for lightning post-import conflict detection "replace" mode #47460
…ost-import conflict detection
merge master
merge master
…port conflict detection 'replace' mode
Hi @lyzx2001. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Codecov Report
Additional details and impacted files

@@               Coverage Diff                @@
##             master     #47460        +/-   ##
================================================
+ Coverage   71.8856%   75.8521%   +3.9664%
================================================
  Files          1398       1447        +49
  Lines        405057     418804     +13747
================================================
+ Hits         291178     317672     +26494
+ Misses        94273      81140     -13133
- Partials      19606      19992       +386

Flags with carried forward coverage won't be shown.
/test pull-br-integration-test
@lyzx2001: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.
/test pull-br-integration-test
/retest
rest lgtm
@@ -94,6 +95,7 @@ const (
  raw_value mediumblob NOT NULL COMMENT 'the value of the conflicted key',
  raw_handle mediumblob NOT NULL COMMENT 'the data handle derived from the conflicted key or value',
  raw_row mediumblob NOT NULL COMMENT 'the data retrieved from the handle',
+ is_data_kv tinyint(1) NOT NULL,
Maybe kv_group, with values data/index; or, if we want finer granularity, we can use the index ID as the name.
The current version LGTM too.
Maybe the current is_data_kv is simpler to implement: it does not seem necessary to decide whether the value is data or index based on the result of tablecodec.IsRecordKey(conflictInfo.RawKey).
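As a rough illustration of the point above, a boolean column can store a key-prefix check directly, with no mapping to a data/index label. The sketch below is self-contained and does not call the TiDB tablecodec package: buildKey and isDataKV are hypothetical helpers, and the key layout (the byte 't', an 8-byte table ID, then "_r" for row keys or "_i" for index keys) is a simplified assumption about the encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// buildKey is a hypothetical helper (not a TiDB API). It assembles a key with
// the assumed layout: 't' + 8-byte table ID + '_' + kind ('r' or 'i') + suffix.
func buildKey(tableID uint64, kind byte, suffix []byte) []byte {
	key := make([]byte, 11, 11+len(suffix))
	key[0] = 't'
	binary.BigEndian.PutUint64(key[1:9], tableID)
	key[9] = '_'
	key[10] = kind
	return append(key, suffix...)
}

// isDataKV is a hypothetical stand-in for the check that would fill the
// is_data_kv column: true when the key looks like a row (data) key.
func isDataKV(key []byte) bool {
	return len(key) > 11 && key[0] == 't' && key[9] == '_' && key[10] == 'r'
}

func main() {
	rowKey := buildKey(42, 'r', []byte{0, 0, 0, 0, 0, 0, 0, 1})
	idxKey := buildKey(42, 'i', []byte{0, 0, 0, 0, 0, 0, 0, 1})
	fmt.Println(isDataKV(rowKey), isDataKV(idxKey))
}
```

With a boolean, the result of the check is written as-is; a kv_group column would need an extra step to translate it into a name.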
c int not null,
d text,
key key_b(b),
key key_c(b)
index on c?
modified
check_contains 'a: 1'
check_contains 'b: 1'
check_contains 'c: 1'
check_contains 'd: 1.csv'
This looks more like ignore semantics in this case.
I did not quite understand. Here 1,1,1,1.csv and 1,1,2,2.csv conflict on the PK, and under replace semantics we keep the first row and delete the second.
ignore keeps the first row it meets, as is done here; replace keeps the last.
Here batch-size = 1, so the actual import order differs from the order presented in dup_resolve.a.1
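The exchange above turns on two things: which row a policy keeps (first seen versus last seen) and the order in which rows actually arrive. A minimal Go sketch of that interaction follows. The Row type and the keepFirst and keepLast functions are hypothetical and are not Lightning's implementation; the two rows come from the thread, and the reversed arrival order is an assumption made only to show the effect.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for one imported row (a, b, c, d),
// with column a acting as the primary key.
type Row struct {
	A, B, C int
	D       string
}

// keepFirst keeps the first row seen for each primary key.
func keepFirst(rows []Row) map[int]Row {
	kept := make(map[int]Row)
	for _, r := range rows {
		if _, seen := kept[r.A]; !seen {
			kept[r.A] = r
		}
	}
	return kept
}

// keepLast keeps the last row seen for each primary key.
func keepLast(rows []Row) map[int]Row {
	kept := make(map[int]Row)
	for _, r := range rows {
		kept[r.A] = r
	}
	return kept
}

func main() {
	// Order as written in the source file.
	fileOrder := []Row{{1, 1, 1, "1.csv"}, {1, 1, 2, "2.csv"}}
	// Assumed arrival order when the rows are imported in a different sequence.
	arrival := []Row{fileOrder[1], fileOrder[0]}

	fmt.Println(keepFirst(fileOrder)[1].D) // 1.csv
	fmt.Println(keepLast(fileOrder)[1].D)  // 2.csv
	fmt.Println(keepLast(arrival)[1].D)    // 1.csv
}
```

Under these assumptions, a keep-last policy applied to a reordered stream can yield the same survivor as keep-first applied to the file order, which is one way the observed result can look like ignore semantics.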
c int not null,
d text,
key key_b(b),
key key_c(b)
ditto
/lgtm
/test pull-br-integration-test
@@ -186,7 +186,7 @@ require (
  github.com/eapache/go-resiliency v1.2.0 // indirect
  github.com/eapache/go-xerial-snappy v0.0.0-20180814174437-776d5712da21 // indirect
  github.com/eapache/queue v1.1.0 // indirect
- github.com/fatih/structtag v1.2.0 // indirect
+ github.com/fatih/structtag v1.2.0
Does this become a direct dependency of the Go module?
This deletion was performed automatically by make tidy.
/review default |
@wuhuizuo:
Suggestions:
diff --git a/br/pkg/lightning/config/config.go b/br/pkg/lightning/config/config.go
index 6d71f35795b923..03d97522a4ebd2 100644
--- a/br/pkg/lightning/config/config.go
+++ b/br/pkg/lightning/config/config.go
@@ -588,16 +588,16 @@ const (
// DupeResAlgNone doesn't detect duplicate.
DupeResAlgNone DuplicateResolutionAlgorithm = iota
- // DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v1` table on the target TiDB.
+ // DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v2` table on the target TiDB.
DupeResAlgRecord
// DupeResAlgRemove records all duplicate records like the 'record' algorithm and remove all information related to the
- // duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v1 table to add back the correct rows.
+ // duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v2 table to add back the correct rows.
DupeResAlgRemove
// DupeResAlgReplace records all duplicate records like the 'record' algorithm, and remove some rows with conflict
// and reserve other rows that can be kept and not cause conflict anymore. Users need to analyze the
- // lightning_task_info.conflict_error_v1 table to check whether the reserved data cater to their need and check whether
+ // lightning_task_info.conflict_error_v2 table to check whether the reserved data cater to their need and check whether
// they need to add back the correct rows
......
In the code all |
/cc @3pointer |
Yes, the AI misjudged it. |
/ok-to-test |
/cc @easonn7 |
@lyzx2001: GitHub didn't allow me to request PR reviews from the following users: easonn7. Note that only pingcap members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: 3pointer, D3Hunter, easonn7, lance6716 The full list of commands accepted by this bot can be found here. The pull request process is described here
/retest |
What problem does this PR solve?
Issue Number: ref #45774
Problem Summary:
Currently Lightning supports only "remove" mode for post-import conflict detection, but many customers have asked for a "replace" mode.
We would like to support "replace" mode for Lightning post-import conflict detection:
To resolve conflicting rows, instead of deleting every row involved in a conflict (the algorithm for "remove" mode), we delete only some of the conflicting rows and keep the others, provided the kept rows no longer conflict. Because only the rows necessary to resolve the conflicts are deleted, more of the original rows survive than in "remove" mode.
The algorithm for index KV checking is in #45926.
The algorithm for data KV checking is in #46763.
This PR contains complex integration tests for the correctness and performance of these algorithms.
The tests vary the engine batch-size setting to cover a range of complex cases.
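To make the difference between the two modes concrete, here is a small Go sketch under simplifying assumptions: conflicts arise only on a single primary key, and "replace" keeps the last row seen for each key. Record, resolveRemove, and resolveReplace are hypothetical names and do not reflect the algorithms in #45926 or #46763, which also have to handle index KVs.

```go
package main

import (
	"fmt"
	"sort"
)

// Record is a hypothetical row with a primary key and a payload.
type Record struct {
	PK  int
	Val string
}

// resolveRemove drops every row whose primary key appears more than once.
func resolveRemove(rows []Record) []Record {
	count := make(map[int]int)
	for _, r := range rows {
		count[r.PK]++
	}
	var out []Record
	for _, r := range rows {
		if count[r.PK] == 1 {
			out = append(out, r)
		}
	}
	return out
}

// resolveReplace keeps exactly one row per primary key. Keeping the last row
// seen is an assumption made for this sketch.
func resolveReplace(rows []Record) []Record {
	last := make(map[int]Record)
	for _, r := range rows {
		last[r.PK] = r
	}
	out := make([]Record, 0, len(last))
	for _, r := range last {
		out = append(out, r)
	}
	// Sort by key so the result is deterministic despite map iteration order.
	sort.Slice(out, func(i, j int) bool { return out[i].PK < out[j].PK })
	return out
}

func main() {
	rows := []Record{{1, "a"}, {1, "b"}, {2, "c"}}
	fmt.Println(len(resolveRemove(rows)), len(resolveReplace(rows)))
}
```

On this input the remove-style function leaves one row (key 2), while the replace-style function leaves two (keys 1 and 2), which is the "keep more original rows" property described above.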
What is changed and how it works?
Demo code for 'replace' mode of lightning post-import conflict detection:
https://github.com/lyzx2001/tidb-conflict-replace