Speed up balance leader #4610

Closed
nolouch opened this issue Jan 25, 2022 · 5 comments · Fixed by #4742 or #4747
Labels: type/development (the issue belongs to a development task)

Comments


nolouch commented Jan 25, 2022

Development Task

Currently, balance leader tops out at 100 op/s. We want to increase that speed for the case where a big cluster (2M regions) restarts.

One possible approach: #4008

nolouch added the type/development label Jan 25, 2022

nolouch commented Jan 25, 2022

cc @CabinfeverB, would you like to take a look? PTAL @rleungx

CabinfeverB commented

I will take a look


nolouch commented Feb 8, 2022

/assign @CabinfeverB


CabinfeverB commented Feb 18, 2022

Motivation

Currently, the MinScheduleInterval parameter determines the balance-leader speed. With MinScheduleInterval set to 10 ms, balance-leader generates at most 100 op/s.

If 100K regions need their leaders rebalanced when a big cluster (2M regions) restarts, it will take about 30 minutes. This is an unacceptable time cost.
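
As a rough check of these numbers, the Python sketch below derives the operator ceiling from the scheduling interval. The function names are illustrative and not taken from PD, and the model assumes every scheduling round yields exactly one operator, so the result is a lower bound on the balancing time. Under that assumption 100K transfers need about 1,000 s (roughly 17 minutes); the comment quotes 30 minutes and does not say what accounts for the difference.

```python
def max_ops_per_second(min_schedule_interval_ms: float) -> float:
    """Upper bound on operators per second with one operator per round."""
    return 1000.0 / min_schedule_interval_ms


def min_seconds_to_balance(pending_transfers: int,
                           min_schedule_interval_ms: float) -> float:
    """Lower bound on balancing time, assuming no round is wasted."""
    return pending_transfers * min_schedule_interval_ms / 1000.0


print(max_ops_per_second(10))                            # 100.0
print(round(min_seconds_to_balance(100_000, 10) / 60, 1))  # 16.7 (minutes)
```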

Detailed Design

Because the scheduler should not be triggered too frequently, we decided to add a Batch field to the balance-leader scheduler. It speeds up leader balancing by increasing the number of operators generated in each scheduling round.

In TiKV, we believe the performance overhead of transferring a leader within a Raft group is small, so the transfer-leader operator does not consume the store limit. This means that regions can be selected repeatedly from the same store, so a priority-queue-like approach can be adopted. Operators are extracted from the store with the highest (or lowest) leader score, and after each one we calculate its influence and adjust the position of that store at the top of the 'heap'. We keep extracting from that store until it is really impossible to produce another operator, then move on to the next highest (or lowest) store, and so on.
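
The selection loop described here can be sketched roughly as follows. This is a simplified Python illustration, not the PD implementation: `schedule_batch`, `can_transfer`, and `tolerance` are hypothetical names, each transfer is assumed to shift the leader score by exactly one unit, only the source side (highest score first) is modelled, and the 'heap' is approximated with `max()` and a stable `sorted()` over a small dict.

```python
from typing import Callable, Dict, List, Tuple


def schedule_batch(
    leader_scores: Dict[str, float],
    batch: int,
    can_transfer: Callable[[str, str], bool] = lambda source, target: True,
    tolerance: float = 1.0,
) -> List[Tuple[str, str]]:
    """Return up to `batch` (source, target) transfer-leader operators."""
    scores = dict(leader_scores)   # copy: pending influence is applied here
    sources = list(scores)         # stores still worth trying as a source
    operators: List[Tuple[str, str]] = []

    while len(operators) < batch and sources:
        # Top of the "heap": the store with the highest leader score.
        source = max(sources, key=scores.__getitem__)
        # Try targets from the lowest score upwards.
        target = next(
            (t for t in sorted(scores, key=scores.__getitem__)
             if t != source
             and scores[source] - scores[t] > tolerance
             and can_transfer(source, t)),
            None,
        )
        if target is None:
            # Really impossible to extract from this store: try the next one.
            sources.remove(source)
            continue
        operators.append((source, target))
        # Apply the operator's influence so the next pick sees updated scores.
        scores[source] -= 1
        scores[target] += 1

    return operators


print(schedule_batch({"s1": 10, "s2": 9, "s3": 2, "s4": 3}, batch=4))
```

Because the scores are adjusted after every operator, the same store can be picked several times in one round, and a different store takes over as soon as it becomes the highest.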

Usage Desc

Since we consider balance-leader an urgent scheduler, we set its Batch parameter to 4 by default. Because other schedulers may compete with it, we have also added an API to configure the value. Matching configuration commands will be added to pd-ctl as well.
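
To make the default concrete, here is a minimal Python sketch of how a configuration payload for this parameter might be decoded. It does not reproduce the PD API: the JSON key `batch`, the function name, and the upper bound of 10 are all assumptions made for illustration. Only the default of 4 comes from the comment above.

```python
import json

DEFAULT_BATCH = 4   # default stated in the design comment
MAX_BATCH = 10      # hypothetical upper bound, not taken from the source


def decode_balance_leader_config(raw: str) -> dict:
    """Parse a JSON config string and fill in the default batch size."""
    cfg = json.loads(raw) if raw.strip() else {}
    if "batch" not in cfg:
        cfg["batch"] = DEFAULT_BATCH
        return cfg
    batch = cfg["batch"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(batch, bool) or not isinstance(batch, int) \
            or not 1 <= batch <= MAX_BATCH:
        raise ValueError(f"batch must be an integer in [1, {MAX_BATCH}]")
    return cfg


print(decode_balance_leader_config("{}"))
print(decode_balance_leader_config('{"batch": 8}'))
```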

Development Plan

Subtasks

#4652 is a required subtask

Test Plan

For a fixed cluster size, measuring the time to reach the equilibrium state with different Batch values should show an approximately linear speedup.

To verify that the original goal is met, the test is best run on a large cluster.
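
The expected linear trend can be illustrated with a toy simulation before running on a real cluster. The Python sketch below is a simplified model, not a PD test: it assumes the leader count is the score, every round emits up to `batch` transfers, each transfer moves one leader from the fullest store to the emptiest, and no transfer fails. The function name is hypothetical.

```python
from typing import List


def rounds_to_equilibrium(leader_counts: List[int], batch: int) -> int:
    """Count scheduling rounds until stores differ by at most one leader."""
    counts = list(leader_counts)
    rounds = 0
    while max(counts) - min(counts) > 1:
        for _ in range(batch):
            hi = counts.index(max(counts))
            lo = counts.index(min(counts))
            if counts[hi] - counts[lo] <= 1:
                break  # balanced before the batch was used up
            counts[hi] -= 1
            counts[lo] += 1
        rounds += 1
    return rounds


for batch in (1, 2, 4):
    print(batch, rounds_to_equilibrium([1000, 0, 0, 0], batch))
```

In this model the round count drops from 750 to 375 to 188 as the batch goes from 1 to 2 to 4, which is the roughly linear behaviour the test plan expects to observe in wall-clock time.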

CabinfeverB commented

cc @mayjiang0203

ti-chi-bot added a commit that referenced this issue Mar 14, 2022
…rs (#4652)

ref #4008, ref #4610

speed up balance leader by batch

Signed-off-by: Cabinfever_B <cabinfeveroier@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ti-chi-bot added a commit that referenced this issue Mar 16, 2022
ref #4610, ref #4652

Add `balance-leader-leader` config API.

Signed-off-by: Cabinfever_B <cabinfeveroier@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ti-chi-bot added a commit that referenced this issue Mar 16, 2022
ref #4610, ref #4652, ref #4655

pdctl supports update balance-leader-scheduler config

Signed-off-by: Cabinfever_B <cabinfeveroier@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ti-chi-bot added a commit that referenced this issue Mar 17, 2022
close #4610

add lock to avoid data race in balance-leader-scheduler

Signed-off-by: Cabinfever_B <cabinfeveroier@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
rleungx pushed a commit to rleungx/pd that referenced this issue Mar 17, 2022
ref tikv#4610, ref tikv#4652, ref tikv#4655

pdctl supports update balance-leader-scheduler config

Signed-off-by: Cabinfever_B <cabinfeveroier@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
rleungx pushed a commit to rleungx/pd that referenced this issue Mar 17, 2022
close tikv#4610

add lock to avoid data race in balance-leader-scheduler

Signed-off-by: Cabinfever_B <cabinfeveroier@gmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
ti-chi-bot pushed a commit that referenced this issue Mar 17, 2022
…4747)

close #4610

adjust `Batch` size when created by `ConfigJSONDecoder`

Signed-off-by: Cabinfever_B <cabinfeveroier@gmail.com>