lightning: optimize region split check logic (#30428) #30876
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
/run-all-tests
@glorv you're already a collaborator in bot's repo.
/merge
This pull request has been accepted and is ready to merge. Commit hash: cf38c22
@ti-srebot: Your PR was out of date, I have automatically updated it for you. At the same time I will also trigger all tests for you:
/run-all-tests
If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.
cherry-pick #30428 to release-5.3
You can switch your code base to this Pull Request by using git-extras:
# In tidb repo:
git pr https://github.com/pingcap/tidb/pull/30876
After applying modifications, you can push your change to this PR via:
What problem does this PR solve?
Issue Number: close #30018
Problem Summary: After a tidb-lightning import, there are a lot of empty regions in TiDB, and these regions are not automatically merged until 1h later.
What is changed and how it works?
Analysis:
In the current implementation, Lightning ingests an SST file into one region with at most 1440k KVs or 128MiB in size (with the default region-split-size). 1440k is the key-count threshold at which TiKV auto-splits a region.
With some data sets, because Lightning can accurately estimate the region range, the number of KVs in a region range slightly exceeds the key-count threshold. Lightning then ingests these keys into two regions: one with 1440k KVs and a small one that PD treats as an empty region. After the import, the bigger region triggers a region auto-split. Due to PD's region merge rule, PD prevents one of the newly split regions and the nearly empty region from merging until 1h later.
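The arithmetic behind the analysis can be sketched in a few lines of Go. This is only an illustration of the problem described above, not code from this PR: `splitByKeyLimit` and `maxKeysPerRegion` are hypothetical names, and the 1,450,000-key range is an assumed example input.

```go
package main

import "fmt"

// maxKeysPerRegion mirrors the 1440k key-count threshold quoted in the
// analysis. The name is hypothetical and not taken from the Lightning code.
const maxKeysPerRegion = 1_440_000

// splitByKeyLimit is a hypothetical helper that cuts a range holding
// totalKeys KV pairs into chunks of at most limit keys each, filling every
// chunk to the limit before starting the next one.
func splitByKeyLimit(totalKeys, limit int) []int {
	var chunks []int
	for totalKeys > 0 {
		n := totalKeys
		if n > limit {
			n = limit
		}
		chunks = append(chunks, n)
		totalKeys -= n
	}
	return chunks
}

func main() {
	// A range that exceeds the threshold by only 10k keys (assumed input).
	fmt.Println(splitByKeyLimit(1_450_000, maxKeysPerRegion))
	// prints: [1440000 10000]
}
```

The first chunk sits exactly at the auto-split threshold and the second is the small leftover region that PD treats as empty, which is the combination the analysis identifies as the cause of the delayed merges.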
Test result:
Before this PR, after a Lightning import there are a lot of empty regions in Grafana, and the TiKV logs contain 1k+ entries with the following pattern:
After this PR:
There are still empty regions after a Lightning import, but these regions can be merged within a few minutes. There is no region auto-split after the import.
Check List
Tests
Side effects
Documentation
Release note