Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lightning: add complex integration tests for lightning post-import conflict detection "replace" mode #47460

Merged
merged 45 commits into from
Nov 3, 2023

Conversation

lyzx2001
Copy link
Contributor

@lyzx2001 lyzx2001 commented Oct 8, 2023

What problem does this PR solve?

Issue Number: ref #45774

Problem Summary:

Currently lightning only supports "remove" mode for post-import conflict detection, but many customers request lightning to support "replace" mode for lightning post-import conflict detection.

We would like to support "replace" mode for lightning post-import conflict detection:
To resolve rows with conflict, instead of deleting all the rows that are engaged in conflict (the algorithm for remove), we delete some of the rows with conflict and reserve other rows that can be kept and not cause conflict anymore. Under this circumstance, we only delete the necessary rows to resolve conflicts, so that we can keep more original rows than remove mode as long as the conflicts are resolved.

The algorithms for index KV checking is contained in #45926

The algorithms for data KV checking is contained in #46763

This PR contains complex integration tests to test the correctness and performance of the algorithms.

The tests vary in the setting of engine batch-size to cover various complex cases .

What is changed and how it works?

Demo code for 'replace' mode of lightning post-import conflict detection:

https://github.com/lyzx2001/tidb-conflict-replace

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 8, 2023
@tiprow
Copy link

tiprow bot commented Oct 8, 2023

Hi @lyzx2001. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@codecov
Copy link

codecov bot commented Oct 8, 2023

Codecov Report

Merging #47460 (26f5578) into master (91a8023) will increase coverage by 3.9665%.
Report is 155 commits behind head on master.
The diff coverage is 81.3559%.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #47460        +/-   ##
================================================
+ Coverage   71.8856%   75.8521%   +3.9664%     
================================================
  Files          1398       1447        +49     
  Lines        405057     418804     +13747     
================================================
+ Hits         291178     317672     +26494     
+ Misses        94273      81140     -13133     
- Partials      19606      19992       +386     
Flag Coverage Δ
integration 49.5630% <81.3559%> (?)
unit 72.2566% <57.6271%> (+0.3709%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 53.9874% <ø> (-0.0039%) ⬇️
parser ∅ <ø> (∅)
br 67.3197% <81.3559%> (+14.3386%) ⬆️

@lyzx2001 lyzx2001 changed the title [WIP] lightning: add complex unit tests and integration tests for lightning post-import conflict detection [WIP] lightning: add complex unit tests and integration tests (part 1) for lightning post-import conflict detection Oct 9, 2023
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 9, 2023
@lyzx2001
Copy link
Contributor Author

lyzx2001 commented Oct 9, 2023

/test pull-br-integration-test

@tiprow
Copy link

tiprow bot commented Oct 9, 2023

@lyzx2001: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test pull-br-integration-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lyzx2001 lyzx2001 changed the title [WIP] lightning: add complex unit tests and integration tests (part 1) for lightning post-import conflict detection [WIP] lightning: add complex unit tests and integration tests (part 1) for lightning post-import conflict detection "replace" mode Oct 9, 2023
@lyzx2001
Copy link
Contributor Author

lyzx2001 commented Oct 9, 2023

/test pull-br-integration-test

@tiprow
Copy link

tiprow bot commented Oct 30, 2023

@lyzx2001: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lyzx2001
Copy link
Contributor Author

/retest

@tiprow
Copy link

tiprow bot commented Oct 30, 2023

@lyzx2001: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Contributor

@D3Hunter D3Hunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest lgtm

@@ -94,6 +95,7 @@ const (
raw_value mediumblob NOT NULL COMMENT 'the value of the conflicted key',
raw_handle mediumblob NOT NULL COMMENT 'the data handle derived from the conflicted key or value',
raw_row mediumblob NOT NULL COMMENT 'the data retrieved from the handle',
is_data_kv tinyint(1) NOT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe kv_group, values can be data/index, or if we want finer grain, we can use index id as the name.

current is lgtm too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe current is_data_kv is simpler in implemention, it seems like it is not necessary to determine whether the value is data or index based on the result tablecodec.IsRecordKey(conflictInfo.RawKey)

c int not null,
d text,
key key_b(b),
key key_c(b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index on c?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified

Comment on lines +30 to +33
check_contains 'a: 1'
check_contains 'b: 1'
check_contains 'c: 1'
check_contains 'd: 1.csv'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more like ignore semantic in this case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not quite understand. Here 1,1,1,1.csv and 1,1,2,2.csv have conflicts due to PK, and here in replace semantic we reserve the first row and delete the second row.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignore will keep the first row met, like what's done here, replace will keep the last

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here batch-size = 1, the actual import order is not the same as that presented in dup_resolve.a.1

c int not null,
d text,
key key_b(b),
key key_c(b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doitto

@D3Hunter
Copy link
Contributor

D3Hunter commented Nov 1, 2023

/lgtm

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Nov 1, 2023
Copy link

ti-chi-bot bot commented Nov 1, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-10-30 06:07:48.546654075 +0000 UTC m=+2846866.133764221: ☑️ agreed by lance6716.
  • 2023-11-01 03:13:39.747115176 +0000 UTC m=+3009217.334225338: ☑️ agreed by D3Hunter.

@lyzx2001
Copy link
Contributor Author

lyzx2001 commented Nov 1, 2023

/test pull-br-integration-test

Copy link

tiprow bot commented Nov 1, 2023

@lyzx2001: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test pull-br-integration-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -186,7 +186,7 @@ require (
github.com/eapache/go-resiliency v1.2.0 // indirect
github.com/eapache/go-xerial-snappy v0.0.0-20180814174437-776d5712da21 // indirect
github.com/eapache/queue v1.1.0 // indirect
github.com/fatih/structtag v1.2.0 // indirect
github.com/fatih/structtag v1.2.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it become a direct dependency go module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deletion was auto performed by make tidy

@wuhuizuo
Copy link
Contributor

wuhuizuo commented Nov 1, 2023

/review default

Copy link

ti-chi-bot bot commented Nov 1, 2023

@wuhuizuo:
Potential Problems:

  1. In the config.go file, the comments for DupeResAlgRecord and DupeResAlgRemove still mention the old lightning_task_info.conflict_error_v1 table instead of the new lightning_task_info.conflict_error_v2 table.

Suggestions:

  1. Update the comments for DupeResAlgRecord and DupeResAlgRemove to mention the new lightning_task_info.conflict_error_v2 table instead of the old lightning_task_info.conflict_error_v1 table.
diff --git a/br/pkg/lightning/config/config.go b/br/pkg/lightning/config/config.go
index 6d71f35795b923..03d97522a4ebd2 100644
--- a/br/pkg/lightning/config/config.go
+++ b/br/pkg/lightning/config/config.go
@@ -588,16 +588,16 @@ const (
 	// DupeResAlgNone doesn't detect duplicate.
 	DupeResAlgNone DuplicateResolutionAlgorithm = iota
 
-	// DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v1` table on the target TiDB.
+	// DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v2` table on the target TiDB.
 	DupeResAlgRecord
 
 	// DupeResAlgRemove records all duplicate records like the 'record' algorithm and remove all information related to the
-	// duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v1 table to add back the correct rows.
+	// duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v2 table to add back the correct rows.
 	DupeResAlgRemove
 
 	// DupeResAlgReplace records all duplicate records like the 'record' algorithm, and remove some rows with conflict
 	// and reserve other rows that can be kept and not cause conflict anymore. Users need to analyze the
-	// lightning_task_info.conflict_error_v1 table to check whether the reserved data cater to their need and check whether
+	// lightning_task_info.conflict_error_v2 table to check whether the reserved data cater to their need and check whether
 	// they need to add back the correct rows

......

Response is trunked for length limits.

In response to this:

/review default

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lyzx2001
Copy link
Contributor Author

lyzx2001 commented Nov 1, 2023

@wuhuizuo: Potential Problems:

  1. In the config.go file, the comments for DupeResAlgRecord and DupeResAlgRemove still mention the old lightning_task_info.conflict_error_v1 table instead of the new lightning_task_info.conflict_error_v2 table.

Suggestions:

  1. Update the comments for DupeResAlgRecord and DupeResAlgRemove to mention the new lightning_task_info.conflict_error_v2 table instead of the old lightning_task_info.conflict_error_v1 table.
diff --git a/br/pkg/lightning/config/config.go b/br/pkg/lightning/config/config.go
index 6d71f35795b923..03d97522a4ebd2 100644
--- a/br/pkg/lightning/config/config.go
+++ b/br/pkg/lightning/config/config.go
@@ -588,16 +588,16 @@ const (
 	// DupeResAlgNone doesn't detect duplicate.
 	DupeResAlgNone DuplicateResolutionAlgorithm = iota
 
-	// DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v1` table on the target TiDB.
+	// DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v2` table on the target TiDB.
 	DupeResAlgRecord
 
 	// DupeResAlgRemove records all duplicate records like the 'record' algorithm and remove all information related to the
-	// duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v1 table to add back the correct rows.
+	// duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v2 table to add back the correct rows.
 	DupeResAlgRemove
 
 	// DupeResAlgReplace records all duplicate records like the 'record' algorithm, and remove some rows with conflict
 	// and reserve other rows that can be kept and not cause conflict anymore. Users need to analyze the
-	// lightning_task_info.conflict_error_v1 table to check whether the reserved data cater to their need and check whether
+	// lightning_task_info.conflict_error_v2 table to check whether the reserved data cater to their need and check whether
 	// they need to add back the correct rows

......

Response is trunked for length limits.

In the code all lightning_task_info.conflict_error_v1 have already been changed to lightning_task_info.conflict_error_v2, we may ignore this suggestion.

@lyzx2001 lyzx2001 requested a review from D3Hunter November 1, 2023 06:27
@lyzx2001
Copy link
Contributor Author

lyzx2001 commented Nov 2, 2023

/cc @3pointer

@ti-chi-bot ti-chi-bot bot requested a review from 3pointer November 2, 2023 03:59
@wuhuizuo
Copy link
Contributor

wuhuizuo commented Nov 2, 2023

@wuhuizuo: Potential Problems:

  1. In the config.go file, the comments for DupeResAlgRecord and DupeResAlgRemove still mention the old lightning_task_info.conflict_error_v1 table instead of the new lightning_task_info.conflict_error_v2 table.

Suggestions:

  1. Update the comments for DupeResAlgRecord and DupeResAlgRemove to mention the new lightning_task_info.conflict_error_v2 table instead of the old lightning_task_info.conflict_error_v1 table.
diff --git a/br/pkg/lightning/config/config.go b/br/pkg/lightning/config/config.go
index 6d71f35795b923..03d97522a4ebd2 100644
--- a/br/pkg/lightning/config/config.go
+++ b/br/pkg/lightning/config/config.go
@@ -588,16 +588,16 @@ const (
 	// DupeResAlgNone doesn't detect duplicate.
 	DupeResAlgNone DuplicateResolutionAlgorithm = iota
 
-	// DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v1` table on the target TiDB.
+	// DupeResAlgRecord only records duplicate records to `lightning_task_info.conflict_error_v2` table on the target TiDB.
 	DupeResAlgRecord
 
 	// DupeResAlgRemove records all duplicate records like the 'record' algorithm and remove all information related to the
-	// duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v1 table to add back the correct rows.
+	// duplicated rows. Users need to analyze the lightning_task_info.conflict_error_v2 table to add back the correct rows.
 	DupeResAlgRemove
 
 	// DupeResAlgReplace records all duplicate records like the 'record' algorithm, and remove some rows with conflict
 	// and reserve other rows that can be kept and not cause conflict anymore. Users need to analyze the
-	// lightning_task_info.conflict_error_v1 table to check whether the reserved data cater to their need and check whether
+	// lightning_task_info.conflict_error_v2 table to check whether the reserved data cater to their need and check whether
 	// they need to add back the correct rows

......

Response is trunked for length limits.

In the code all lightning_task_info.conflict_error_v1 have already been changed to lightning_task_info.conflict_error_v2, we may ignore this suggestion.

Yes, the AI misjudged it.

@hawkingrei
Copy link
Member

/ok-to-test

@ti-chi-bot ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Nov 3, 2023
@lyzx2001
Copy link
Contributor Author

lyzx2001 commented Nov 3, 2023

/cc @easonn7

Copy link

ti-chi-bot bot commented Nov 3, 2023

@lyzx2001: GitHub didn't allow me to request PR reviews from the following users: easonn7.

Note that only pingcap members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @easonn7

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@easonn7
Copy link

easonn7 commented Nov 3, 2023

/approve

Copy link

ti-chi-bot bot commented Nov 3, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 3pointer, D3Hunter, easonn7, lance6716

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the approved label Nov 3, 2023
@lyzx2001
Copy link
Contributor Author

lyzx2001 commented Nov 3, 2023

/retest

@ti-chi-bot ti-chi-bot bot merged commit 37965e5 into pingcap:master Nov 3, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants