mcs: dynamic enable scheduling jobs #7325

Merged
merged 15 commits into from
Nov 20, 2023
Merged

Conversation

Member

@rleungx rleungx commented Nov 6, 2023

What problem does this PR solve?

Issue Number: Ref #5839. Also close #7375.

What is changed and how does it work?

This PR supports dynamically enabling and disabling the scheduling service.

Check List

Tests

  • Unit test

Release note

None.

Contributor

ti-chi-bot bot commented Nov 6, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • CabinfeverB
  • lhy1024

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

Contributor

ti-chi-bot bot commented Nov 6, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 6, 2023
@ti-chi-bot ti-chi-bot bot requested review from disksing and Yisaer November 6, 2023 10:13
@rleungx rleungx force-pushed the dynamic-scheduling branch 2 times, most recently from 65f4a45 to 9ca0d80 Compare November 14, 2023 04:15
@rleungx rleungx marked this pull request as ready for review November 14, 2023 04:15
@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. do-not-merge/needs-linked-issue size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 14, 2023
@rleungx rleungx force-pushed the dynamic-scheduling branch 2 times, most recently from 7c6b012 to 1271686 Compare November 15, 2023 02:54
@rleungx rleungx requested review from nolouch, lhy1024 and CabinfeverB and removed request for disksing and Yisaer November 15, 2023 02:54
server/api/middleware.go Show resolved Hide resolved
schedulerStatusGauge.Reset()
ruleStatusGauge.Reset()
// create in map again
rulesCntStatusGauge = ruleStatusGauge.WithLabelValues("rule_count")
groupsCntStatusGauge = ruleStatusGauge.WithLabelValues("group_count")
ruleStatusGauge.WithLabelValues("rule_count").Set(0)
Member

Why call Set(0) after Reset()?
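
To make the question concrete, here is a toy model of a labeled gauge vector. It assumes that resetting the vector drops every child, that handles obtained earlier become detached, and that a newly created child starts at zero. `toyGauge` and `toyGaugeVec` are hypothetical types written for this sketch and are not the Prometheus client types.

```go
package main

import "fmt"

// toyGauge and toyGaugeVec are simplified, hypothetical models of a labeled
// gauge vector. They are not the Prometheus client types.
type toyGauge struct{ value float64 }

func (g *toyGauge) Set(v float64) { g.value = v }

type toyGaugeVec struct{ children map[string]*toyGauge }

func newToyGaugeVec() *toyGaugeVec {
	return &toyGaugeVec{children: map[string]*toyGauge{}}
}

// WithLabelValues returns the child for label, creating it at zero if absent.
func (v *toyGaugeVec) WithLabelValues(label string) *toyGauge {
	g, ok := v.children[label]
	if !ok {
		g = &toyGauge{}
		v.children[label] = g
	}
	return g
}

// Reset drops every child; handles obtained earlier become detached.
func (v *toyGaugeVec) Reset() { v.children = map[string]*toyGauge{} }

// exported reports the value the vector would expose for label, if any.
func (v *toyGaugeVec) exported(label string) (float64, bool) {
	g, ok := v.children[label]
	if !ok {
		return 0, false
	}
	return g.value, true
}

func main() {
	vec := newToyGaugeVec()
	cached := vec.WithLabelValues("rule_count")
	cached.Set(5)

	vec.Reset()
	cached.Set(7) // writes to a detached child, invisible to the vector
	_, ok := vec.exported("rule_count")
	fmt.Println(ok) // false

	cached = vec.WithLabelValues("rule_count") // create in map again
	val, ok := vec.exported("rule_count")
	fmt.Println(val, ok) // 0 true
}
```

In this model the cached handle has to be fetched again after the reset, which matches the "create in map again" comment in the diff. The new child already reads zero, so an explicit `Set(0)` would not change its value here.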

server/cluster/cluster.go Show resolved Hide resolved
server/cluster/cluster.go Show resolved Hide resolved
@@ -1562,6 +1562,7 @@ func TestTransferLeaderBack(t *testing.T) {
 	svr := leaderServer.GetServer()
 	rc := cluster.NewRaftCluster(ctx, svr.ClusterID(), syncer.NewRegionSyncer(svr), svr.GetClient(), svr.GetHTTPClient())
 	rc.InitCluster(svr.GetAllocator(), svr.GetPersistOptions(), svr.GetStorage(), svr.GetBasicCluster(), svr.GetHBStreams(), svr.GetKeyspaceGroupManager())
+	rc.SchedulingController = cluster.NewSchedulingController(ctx, rc.GetBasicCluster(), rc.GetOpts(), rc.GetRuleManager())
Member

How about putting the initialization of SchedulingController into InitCluster? That way we save one function call and avoid forgetting to initialize it.

Member Author

It doesn't init RuleManager.

Member

Also put the initialization of RuleManager into InitCluster?

Member Author

done

Member

We can remove it now
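
The suggestion in this thread can be sketched as follows: a single initializer builds the rule manager first and then the controller that depends on it, so callers have no separate step to forget. Every type and field below is a hypothetical miniature for illustration and not PD's real definition.

```go
package main

import "fmt"

// All types below are hypothetical miniatures, not PD's real definitions.
type toyRuleManager struct{ initialized bool }

type toySchedulingController struct{ rules *toyRuleManager }

type toyRaftCluster struct {
	ruleManager *toyRuleManager
	controller  *toySchedulingController
}

// InitCluster builds dependencies in order: the rule manager first, then the
// controller that needs it, so callers cannot forget the second step.
func (rc *toyRaftCluster) InitCluster() {
	rc.ruleManager = &toyRuleManager{initialized: true}
	rc.controller = &toySchedulingController{rules: rc.ruleManager}
}

func main() {
	rc := &toyRaftCluster{}
	rc.InitCluster()
	// The controller is ready without any extra assignment by the caller.
	fmt.Println(rc.controller != nil && rc.controller.rules.initialized) // true
}
```

With the wiring done inside the initializer, a test no longer needs an explicit assignment like the added line in the hunk above.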

Signed-off-by: Ryan Leung <rleungx@gmail.com>

codecov bot commented Nov 16, 2023

Codecov Report

Merging #7325 (013577b) into master (dda748a) will decrease coverage by 0.09%.
The diff coverage is 88.97%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7325      +/-   ##
==========================================
- Coverage   74.27%   74.19%   -0.09%     
==========================================
  Files         451      451              
  Lines       48967    49044      +77     
==========================================
+ Hits        36372    36387      +15     
- Misses       9375     9441      +66     
+ Partials     3220     3216       -4     
Flag Coverage Δ
unittests 74.19% <88.97%> (-0.09%) ⬇️

Flags with carried forward coverage won't be shown.

Member

@CabinfeverB CabinfeverB left a comment

rest LGTM!

 )

-type schedulingController struct {
+// SchedulingController is used to manage all schedulers and checkers.
+type SchedulingController struct {
Member

And maybe we can keep it unexported :)

@@ -52,26 +56,23 @@ type schedulingController struct {
 	running bool
 }

-func newSchedulingController(parentCtx context.Context) *schedulingController {
+// NewSchedulingController creates a new scheduling controller.
+func NewSchedulingController(parentCtx context.Context, basicCluster *core.BasicCluster, opt sc.ConfProvider, ruleManager *placement.RuleManager) *SchedulingController {
Member

ditto

@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label Nov 16, 2023
Signed-off-by: Ryan Leung <rleungx@gmail.com>
Contributor

@lhy1024 lhy1024 left a comment

LGTM. Do we need a manual test for it?

@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Nov 16, 2023
Member Author

rleungx commented Nov 16, 2023

> LGTM. Do we need a manual test for it?

Yes, will do more tests later.

Member Author

rleungx commented Nov 17, 2023

@nolouch PTAL

Member Author

rleungx commented Nov 20, 2023

/merge

Contributor

ti-chi-bot bot commented Nov 20, 2023

@rleungx: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Contributor

ti-chi-bot bot commented Nov 20, 2023

This pull request has been accepted and is ready to merge.

Commit hash: 013577b

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label Nov 20, 2023
@ti-chi-bot ti-chi-bot bot merged commit 9845c12 into tikv:master Nov 20, 2023
26 checks passed
@rleungx rleungx deleted the dynamic-scheduling branch November 20, 2023 03:14
rleungx added a commit to rleungx/pd that referenced this pull request Dec 1, 2023
ref tikv#5839, close tikv#7375

Signed-off-by: Ryan Leung <rleungx@gmail.com>
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

potencial deadlock in tests/server/api
3 participants