Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

placement: refine the bundle contruction logic #28450

Merged
merged 17 commits into from
Oct 2, 2021

Conversation

xhebox
Copy link
Contributor

@xhebox xhebox commented Sep 28, 2021

What problem does this PR solve?

Issue Number: #18030

Problem Summary: Several days ago, we had a meeting and decided to change the logic a little bit, refer weekly report.

What is changed and how it works?

  1. voterConstraints and voters are removed. Later parser should remove the counter part, too.
  2. NewMergeRules is replaced with NewBundleFromOptions for validation.
  3. primaryRegion must be included in regions.
  4. Add locationLabels: [region, zone, rack, host] to every rule.
  5. Use locationLabels to balance the schedule automatically, rather than constructing for every region separately. It is safe since PD will exhaust all combinations recursively for the best isolation result.
  6. Introduce NewConstraintsDirect and NewConstraintDirect helper.

Check List

Tests

  • Unit test
  • Integration test

Documentation

Release note

None

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Sep 28, 2021

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • AilinKid
  • lcwangchao

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added the release-note-none Denotes a PR that doesn't merit a release note. label Sep 28, 2021
Comment on lines -197 to -223
if uint64(len(regions)) > followers {
return nil, fmt.Errorf("%w: remain %d region to schedule, only %d follower left", ErrInvalidPlacementOptions, uint64(len(regions)), followers)
}

if len(regions) == 0 {
constraints, err := NewConstraints(nil)
if err != nil {
return nil, err
}
Rules = append(Rules, NewRule(Follower, followers, constraints))
} else {
count := followers / uint64(len(regions))
rem := followers - count*uint64(len(regions))
for _, region := range regions {
constraints, err = NewConstraints([]string{fmt.Sprintf("+region=%s", region)})
if err != nil {
return nil, fmt.Errorf("%w: invalid region of 'Regions', '%s'", err, region)
}
replica := count
if rem > 0 {
replica += 1
rem--
}
Rules = append(Rules, NewRule(Follower, replica, constraints))
}
}

Copy link
Contributor Author

@xhebox xhebox Sep 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old logic is replaced by region in [a,b,c] completely, which relies on PD to balance(the result is deterministic). And to balance it perfectly, we must set location labels manually for every rule. PD will score every schedule combination to see if every single rule has the best possible isolation, i.e. peers are far from each other.

Previously we will construct region in [a] for every region a.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make PD balance the peers, I think it is required that the label must be included in the LocationLabels?

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 28, 2021
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 28, 2021
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
Comment on lines -197 to -223
if uint64(len(regions)) > followers {
return nil, fmt.Errorf("%w: remain %d region to schedule, only %d follower left", ErrInvalidPlacementOptions, uint64(len(regions)), followers)
}

if len(regions) == 0 {
constraints, err := NewConstraints(nil)
if err != nil {
return nil, err
}
Rules = append(Rules, NewRule(Follower, followers, constraints))
} else {
count := followers / uint64(len(regions))
rem := followers - count*uint64(len(regions))
for _, region := range regions {
constraints, err = NewConstraints([]string{fmt.Sprintf("+region=%s", region)})
if err != nil {
return nil, fmt.Errorf("%w: invalid region of 'Regions', '%s'", err, region)
}
replica := count
if rem > 0 {
replica += 1
rem--
}
Rules = append(Rules, NewRule(Follower, replica, constraints))
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make PD balance the peers, I think it is required that the label must be included in the LocationLabels?

ddl/placement/bundle.go Show resolved Hide resolved
@xhebox
Copy link
Contributor Author

xhebox commented Sep 28, 2021

To make PD balance the peers, I think it is required that the label must be included in the LocationLabels?

Here:

https://github.com/pingcap/tidb/pull/28450/files#diff-0b441afe10e04791ae9ef8a5c351c9e75c6477867ae8db765d70df1ded3da6dbR56-R62

For sugar options, there are only 4 location labels, according to https://github.com/pingcap/tidb/blob/master/docs/design/2020-06-24-placement-rules-in-sql.md#direct-assignment.

Signed-off-by: xhe <xw897002528@gmail.com>
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 28, 2021
ddl/placement/bundle.go Outdated Show resolved Hide resolved
Signed-off-by: xhe <xw897002528@gmail.com>
Signed-off-by: xhe <xw897002528@gmail.com>
ddl/placement/bundle.go Outdated Show resolved Hide resolved
Copy link
Contributor

@AilinKid AilinKid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

Role: role,
Count: int(replicas),
Constraints: cnst,
LocationLabels: []string{"region", "zone", "rack", "host"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one question: what this two fields is for?

Copy link
Contributor Author

@xhebox xhebox Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For isolation scores, it is used to compare which schedule(in all possible schedule combination cases) is better. Say region=sh,bj,us, 2,1,0 will have a lower score than 1,1,1 because of the existence of LocationLaels: []string{..., region, ...}. Similar arguments apply to zone,rack,host.

As for IsolationLevel, it means two locations are far enough and will affect schedule behaviors. Refer https://github.com/tikv/pd/blob/740fac18e89fa1754a40f5ec885b5e5b6caa2471/server/schedule/filter/filters.go#L696-L701.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Oct 2, 2021
@AilinKid
Copy link
Contributor

AilinKid commented Oct 2, 2021

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 946e6ab

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Oct 2, 2021
@ti-chi-bot ti-chi-bot merged commit 2a02e6e into pingcap:master Oct 2, 2021
@xhebox xhebox deleted the placement_4 branch October 29, 2021 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants