Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: Reorg partition fix delete ranges and handling non-clustered tables with concurrent DML #57114

Merged
merged 63 commits into from
Nov 29, 2024

Conversation

mjonss
Copy link
Contributor

@mjonss mjonss commented Nov 5, 2024

What problem does this PR solve?

Issue Number: close #56822 ref #45133 close #57510

based on #56974.

Problem Summary:
First the DeleteRange did not include the old replaced Global Indexes,
Then I found an issue with non-clustered tables and reorganize partition,
where double written records during StateWriteOnly/StateWriteReorganization would get Duplicate key errors, due to they would generate new _tidb_rowid in the new partition, making it impossible to detect if an index entry came from Backfilling or if it is new, when recreating the indexes.

What changed and how does it work?

Fixed the DeleteRange, by adding a list of TableID+IndexID ranges that should also be cleaned by DeleteRanges.

Instead of first copy all records and then recreate the indexes, I now use AddRecord table interface, and created both the table record as well as all the index entries for that row. Note that AddRecord will also generate a new _tidb_rowid, so that part of the reorg code is removed as well.

Drawbacks are:

  • I'm not sure if we can workaround an index lookup for each unique index entry needed to be written (i.e. if we can batch them instead of do one lookup at a time, or rely on optimistic commit).
  • Big change and little time to test it further, including performance difference (correctness is more important though).

Benefits are:

  • Simpler code, just check if the record was written with the original handle (_tidb_rowid) and skip if so, otherwise use AddRecord.
  • Skipping reading the partitions twice, once for the 'table' records and once for generating the indexes.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

Fix for REORGANIZE PARTITION to remove replaced Global Indexes and handle non-clustered tables unique indexes.

mjonss added 30 commits October 22, 2024 16:40
Still some debug and cleanup to do.
Note that TestMultiSchemaDropListColumnsDefaultPartition is failing now!
Next step is to investigate this...
@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Nov 28, 2024
pkg/ddl/partition.go Show resolved Hide resolved
panic(fmt.Sprintf("expect delete range count %d, actual count %d", expectedCnt, actualCnt))
switch job.Type {
case model.ActionReorganizePartition, model.ActionRemovePartitioning, model.ActionAlterTablePartitioning:
if actualCnt < expectedCnt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think modify expectedDeleteRangeCnt is better.

Agree. This is more like a test, it is better to change the test itself instead of the assertion.

Copy link

ti-chi-bot bot commented Nov 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Defined2014, tangenta

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm approved and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Nov 28, 2024
Copy link

ti-chi-bot bot commented Nov 28, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-11-28 01:45:18.548962945 +0000 UTC m=+687306.168617461: ☑️ agreed by Defined2014.
  • 2024-11-28 12:07:08.8963275 +0000 UTC m=+724616.515982021: ☑️ agreed by tangenta.

@mjonss
Copy link
Contributor Author

mjonss commented Nov 28, 2024

/retest

4 similar comments
@mjonss
Copy link
Contributor Author

mjonss commented Nov 28, 2024

/retest

@mjonss
Copy link
Contributor Author

mjonss commented Nov 28, 2024

/retest

@mjonss
Copy link
Contributor Author

mjonss commented Nov 28, 2024

/retest

@mjonss
Copy link
Contributor Author

mjonss commented Nov 29, 2024

/retest

Copy link

tiprow bot commented Nov 29, 2024

@mjonss: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
fast_test_tiprow 7556105 link true /test fast_test_tiprow

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ti-chi-bot ti-chi-bot bot merged commit b6025b9 into pingcap:master Nov 29, 2024
26 of 27 checks passed
@mjonss mjonss deleted the reorg-partition-fix-delete-ranges branch November 29, 2024 00:32
@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Nov 29, 2024
ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Nov 29, 2024
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #57812.

mjonss added a commit to ti-chi-bot/tidb that referenced this pull request Nov 29, 2024
ti-chi-bot bot pushed a commit that referenced this pull request Nov 29, 2024
@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. label Dec 10, 2024
ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Dec 10, 2024
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Member

In response to a cherrypick label: new pull request created to branch release-7.5: #58124.

mjonss added a commit to ti-chi-bot/tidb that referenced this pull request Dec 12, 2024
ti-chi-bot bot pushed a commit that referenced this pull request Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved lgtm needs-cherry-pick-release-7.5 Should cherry pick this PR to release-7.5 branch. needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. ok-to-test Indicates a PR is ready to be tested. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Found extra duplicate rows after reorg partition REORGANIZE PARTITION does not cleanup replaced indexes.
5 participants