Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

master/worker(dm): suppport update config when task not run #4760

Merged
merged 29 commits into from
Mar 10, 2022
Merged

master/worker(dm): suppport update config when task not run #4760

merged 29 commits into from
Mar 10, 2022

Conversation

Ehco1996
Copy link
Contributor

@Ehco1996 Ehco1996 commented Mar 3, 2022

What problem does this PR solve?

Issue Number: ref #4484

What is changed and how it works?

for master:

  • add enable filed in source cfg for later use this will added in next pr
  • can update source cfg when no running tasks and no relay workers
  • can update task cfg when no running tasks and all subtasks is in sync unit, also only limited fields can be updated

for worker

  • provide a new rpc :CheckSubtasksCanUpdate for master
  • refresh source cfg when enable relay or resume subtask
  • refresh subtask cfg when resume subtask

but there is no call now, just inter logic

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

Release note

`None`.

@Ehco1996 Ehco1996 added the area/dm Issues or PRs related to DM. label Mar 3, 2022
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Mar 3, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • D3Hunter
  • lance6716

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 3, 2022
dm/dm/config/source_config.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler.go Show resolved Hide resolved
dm/dm/master/scheduler/scheduler_test.go Outdated Show resolved Hide resolved
dm/dm/worker/source_worker.go Outdated Show resolved Hide resolved
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 3, 2022
}
w.cfg = sourceCfg
// we need update config from etcd first in case this cfg is updated by master
if refreshErr := w.refreshSourceCfg(); refreshErr != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we only need to refresh when user want to enable-relay or restart task for source

failpoint.Inject("SkipRefreshFromETCDInUT", func(_ failpoint.Value) {
failpoint.Goto("bypassRefresh")
})
if op == pb.TaskOp_Resume || op == pb.TaskOp_AutoResume {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when user want to restart or start a stopped task, use the newest cfg from etcd.

dynamic update will impl later

@@ -1039,6 +1027,63 @@ func (s *Scheduler) RemoveSubTasks(task string, sources ...string) error {
return nil
}

// UpdateSubTasks update the information of one or more subtasks for one task.
// now only support update a task that not in running stage.
func (s *Scheduler) UpdateSubTasks(cfgs ...config.SubTaskConfig) error {
Copy link
Contributor Author

@Ehco1996 Ehco1996 Mar 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need check if this field is safe to updated, also need to check current unit with this subtask check this in worker or master? if checked in worekr i need to added new rpc

Copy link
Contributor Author

@Ehco1996 Ehco1996 Mar 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actully we can check this just on master side
no we need send request to worker to check the current task unit anyway.... so we better just do this check on worker side

@codecov-commenter
Copy link

codecov-commenter commented Mar 4, 2022

Codecov Report

Merging #4760 (e2aa469) into master (9607554) will increase coverage by 0.0736%.
The diff coverage is 53.4869%.

Flag Coverage Δ
cdc 59.8003% <53.4869%> (-0.1219%) ⬇️
dm 52.3426% <ø> (+0.3137%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

@@               Coverage Diff                @@
##             master      #4760        +/-   ##
================================================
+ Coverage   55.6402%   55.7138%   +0.0736%     
================================================
  Files           494        521        +27     
  Lines         61283      64930      +3647     
================================================
+ Hits          34098      36175      +2077     
- Misses        23750      25201      +1451     
- Partials       3435       3554       +119     

@ti-chi-bot ti-chi-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 4, 2022
@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 4, 2022

/run-dm-integration-tests

@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 4, 2022

/run-dm-integration-tests

@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 4, 2022

/run-dm-integration-tests

@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 4, 2022

/run-dm-integration-tests

1 similar comment
@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 4, 2022

/run-dm-integration-tests

@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 4, 2022

/run-dm-integration-tests

1 similar comment
@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 4, 2022

/run-dm-integration-tests

// 3. check this cfg is ok to update.
if !checkSourceCfgCanUpdated(s.sourceCfgs[cfg.SourceID], cfg) {
return terror.ErrSchedulerSourceCfgUpdate.Generate()
// 3. check if there is relay workers for this source
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember since v5.4, source config with enable-relay: true will not record a relay worker. Should we also allow update for this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now use expectRelayStages to check if this source is enable relay, seems this can avoid this case

7aa818a

return terror.ErrSchedulerNotStarted.Generate()
}
if len(cfgs) == 0 {
return nil // no subtasks need to add, this should not happen.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can change the function parameter to (context, SubTaskConfig, ...SubTaskConfig) to tell the compiler there must be at least one SubTaskConfig.

OK to not change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can't find the example of how to use unnamed paramaters , how can i refer the unnamed stcfgs?

seems the only use of unnamed parameters is when you have to define a function with a specific
https://stackoverflow.com/questions/24000305/how-to-refer-to-an-unnamed-function-argument-in-go

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(context, ...SubTaskConfig) -> (context, SubTaskConfig, ...SubTaskConfig)

dm/dm/master/scheduler/scheduler.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler.go Outdated Show resolved Hide resolved
dm/dm/worker/source_worker.go Outdated Show resolved Hide resolved
dm/dm/worker/source_worker.go Outdated Show resolved Hide resolved
@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 9, 2022

/run-dm-integration-tests

Copy link
Contributor

@D3Hunter D3Hunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest lgtm

dm/syncer/syncer.go Outdated Show resolved Hide resolved
dm/syncer/syncer.go Show resolved Hide resolved
@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 9, 2022
@Ehco1996
Copy link
Contributor Author

Ehco1996 commented Mar 9, 2022

/run-dm-integration-tests

dm/dm/master/scheduler/scheduler_test.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler_test.go Show resolved Hide resolved
dm/dm/master/scheduler/scheduler_test.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler_test.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler_test.go Outdated Show resolved Hide resolved
@Ehco1996
Copy link
Contributor Author

/run-verify

dm/dm/worker/source_worker.go Show resolved Hide resolved
dm/dm/worker/source_worker.go Outdated Show resolved Hide resolved
dm/dm/worker/source_worker.go Outdated Show resolved Hide resolved
dm/dm/master/scheduler/scheduler.go Show resolved Hide resolved
dm/pkg/terror/error_list.go Outdated Show resolved Hide resolved
func (s *Syncer) CheckCanUpdateCfg(newCfg *config.SubTaskConfig) error {
s.RLock()
defer s.RUnlock()
// can't update when in sharding merge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in some optimistic case, like partially drop column, BAlist should also not be updated

@GMHDBJD PTAL, can you list the forbidden cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's hard to list or check all the case ... maybe let the user to bear the risk of changing config

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also remember to add this limitation to doc

dm/syncer/syncer.go Outdated Show resolved Hide resolved
dm/syncer/syncer.go Outdated Show resolved Hide resolved
@Ehco1996
Copy link
Contributor Author

/run-dm-integration-tests

@@ -640,3 +664,80 @@ func loadSourceConfigWithoutPassword(c *C) *config.SourceConfig {
sourceCfg.From.Password = "" // no password set
return sourceCfg
}

func (t *testServer) testSourceWorker(c *C) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why move it? the content of this function is related to source worker, so it's natural to put it in source_worker_test.go

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Mar 10, 2022
@Ehco1996
Copy link
Contributor Author

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 6074849

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Mar 10, 2022
@ti-chi-bot ti-chi-bot merged commit 65db3c5 into pingcap:master Mar 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/dm Issues or PRs related to DM. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. status/ptal Could you please take a look?
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants