scheduler: make hot v2 more suitable small hot region #6827

Merged
merged 5 commits into from
Jul 31, 2023

Conversation

lhy1024
Contributor

@lhy1024 lhy1024 commented Jul 20, 2023

What problem does this PR solve?

Issue Number: Close #6645

What is changed and how does it work?

On master, when the load difference between the high and low nodes is large, a hot peer must be larger than 20% of the diff or 2% of the low node before it is scheduled.

This PR introduces the 20th node as a further reference: it is compared with 2% of the low node and the smaller of the two values is used, so that scheduling is not blocked when the load consists entirely of small hotspots.

We also cache filterHotpeers to avoid redundant calculations.
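The caching idea can be illustrated with a minimal sketch. This is not the PR's implementation: the `hotPeer` and `peerCache` types, the `calls` counter, and keying the cache by store ID are all assumptions made for illustration. Only the idea of reusing a previously filtered peer list comes from the description above.

```go
package main

import "fmt"

// hotPeer is a hypothetical, simplified stand-in for a hot peer statistic.
type hotPeer struct {
	regionID uint64
	rate     float64
}

// peerCache memoizes the filtered peer list per store (hypothetical type).
type peerCache struct {
	filtered map[uint64][]hotPeer
	calls    int // number of times the filter actually ran
}

// filterHotPeers returns the peers of a store whose rate is at least minRate.
// The result is cached by store ID, so this sketch assumes minRate does not
// change for a given store while the cache is alive.
func (c *peerCache) filterHotPeers(storeID uint64, peers []hotPeer, minRate float64) []hotPeer {
	if got, ok := c.filtered[storeID]; ok {
		return got // cache hit: skip the recomputation
	}
	c.calls++
	out := make([]hotPeer, 0, len(peers))
	for _, p := range peers {
		if p.rate >= minRate {
			out = append(out, p)
		}
	}
	c.filtered[storeID] = out
	return out
}

func main() {
	c := &peerCache{filtered: map[uint64][]hotPeer{}}
	peers := []hotPeer{{1, 5}, {2, 50}, {3, 500}}
	a := c.filterHotPeers(1, peers, 10)
	b := c.filterHotPeers(1, peers, 10) // served from the cache
	fmt.Println(len(a), len(b), c.calls)
}
```

A cache like this would need to be scoped to a single scheduling round so that stale statistics are not reused; that scoping is also an assumption here.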

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  1. Test with some small hot regions
    master after evict leader:
    image
    image
    this PR after evict leader:
    image
    image
  2. In contrast to 7.2, there is no rollback of TPC-C loads with this PR
    image

Release note

None.

@ti-chi-bot
Contributor

ti-chi-bot bot commented Jul 20, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • bufferflies
  • nolouch

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot
Contributor

ti-chi-bot bot commented Jul 20, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. labels Jul 20, 2023
@ti-chi-bot ti-chi-bot bot requested review from disksing and rleungx July 20, 2023 16:17
@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 20, 2023
@lhy1024
Contributor Author

lhy1024 commented Jul 20, 2023

/run-build-arm64 comment=true

@lhy1024
Contributor Author

lhy1024 commented Jul 20, 2023

/run-build-arm64 comment=true

@sre-bot
Contributor

sre-bot commented Jul 20, 2023

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@lhy1024
Contributor Author

lhy1024 commented Jul 20, 2023

/ok-to-test

@ti-chi-bot ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Jul 20, 2023
@codecov

codecov bot commented Jul 20, 2023

Codecov Report

Merging #6827 (7e29313) into master (fe52361) will decrease coverage by 0.11%.
Report is 1 commits behind head on master.
The diff coverage is 67.88%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6827      +/-   ##
==========================================
- Coverage   74.21%   74.10%   -0.11%     
==========================================
  Files         417      418       +1     
  Lines       43926    43998      +72     
==========================================
+ Hits        32599    32605       +6     
- Misses       8430     8482      +52     
- Partials     2897     2911      +14     
Flag Coverage Δ
unittests 74.10% <67.88%> (-0.11%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

@lhy1024 lhy1024 marked this pull request as ready for review July 22, 2023 05:22
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 22, 2023
Signed-off-by: lhy1024 <admin@liudos.us>
@@ -262,6 +268,7 @@ func (bs *balanceSolver) getScoreByPriorities(dim int, rs *rankV2Ratios) int {
// maxBetterRate may be less than minBetterRate, in which case a positive fraction cannot be produced.
minNotWorsenedRate = -bs.getMinRate(dim)
minBetterRate = math.Min(minBalancedRate*rs.perceivedRatio, lowRate*rs.minHotRatio)
minBetterRate = math.Min(minBetterRate, topnRate)
Contributor

Why is this not needed for the pre-balanced state? And should maxBetterRate be influenced by it?

Contributor Author

@lhy1024 lhy1024 Jul 25, 2023

In this branch, it means that there is a big difference between a high store and a low store.

Previously, we limited the scheduling by minBalancedRate*rs.perceivedRatio and lowRate*rs.minHotRatio to avoid scheduling a region that is too small.

However, these values are not flexible enough for the scenario involved in the issue, which is a hotspot store composed of small regions, so we use topn for evaluation.

So this new topn will only influence minBetterRate in this branch, where there is a big difference between a high store and a low store.

pkg/schedule/schedulers/hot_region.go Show resolved Hide resolved
@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 27, 2023
Contributor

@nolouch nolouch left a comment

lgtm

@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 31, 2023
@lhy1024
Contributor Author

lhy1024 commented Jul 31, 2023

/merge

@ti-chi-bot
Contributor

ti-chi-bot bot commented Jul 31, 2023

@lhy1024: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Contributor

ti-chi-bot bot commented Jul 31, 2023

This pull request has been accepted and is ready to merge.

Commit hash: 3588db3

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 31, 2023
@ti-chi-bot ti-chi-bot bot merged commit 16926ad into tikv:master Jul 31, 2023
21 of 23 checks passed
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Jul 31, 2023
close tikv#6645

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-6.5: #6864.

@ti-chi-bot
Member

In response to a cherrypick label: new pull request created to branch release-7.1: #6865.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Jul 31, 2023
close tikv#6645

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot bot added a commit that referenced this pull request Aug 7, 2023
close #6645

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: lhy1024 <admin@liudos.us>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ti-chi-bot bot pushed a commit that referenced this pull request Aug 7, 2023
close #6645

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: lhy1024 <admin@liudos.us>

Co-authored-by: lhy1024 <admin@liudos.us>
Labels
needs-cherry-pick-release-6.5 Should cherry pick this PR to release-6.5 branch. needs-cherry-pick-release-7.1 Should cherry pick this PR to release-7.1 branch. ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

scheduler: hot region scheduler doesn't work in v2 policy
5 participants