operator: check store status for running operators #4223

Merged 7 commits on Nov 23, 2021

disksing (Contributor) commented Oct 19, 2021

What problem does this PR solve?

Fix #3353

What is changed and how it works?

  • For TransferLeader, AddLearner, and AddPeer steps, check whether the target store is down and cancel the operator if needed.
  • Other types of OperatorStep may or may not get blocked by a down store; this PR does not restrict them.
  • Add tests
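
The check described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the `StoreState`, `Cluster`, `Step`, `checkStep`, and `shouldCancel` names are hypothetical stand-ins, and treating an unknown store as a failure is an assumption made only for this sketch.

```go
package main

import "fmt"

// StoreState is a hypothetical, simplified view of a store's health.
type StoreState int

const (
	StoreUp StoreState = iota
	StoreDown
)

// Cluster is a hypothetical lookup from store ID to store state.
type Cluster map[uint64]StoreState

// Step is a hypothetical, simplified operator step.
type Step struct {
	Kind    string // for example "TransferLeader", "AddLearner", "AddPeer", "RemovePeer"
	ToStore uint64 // target store of the step
}

// checkStep returns an error when a restricted step targets a store that is
// down. Only the three step kinds named in the PR description are checked;
// every other kind passes through unchecked.
func checkStep(c Cluster, s Step) error {
	switch s.Kind {
	case "TransferLeader", "AddLearner", "AddPeer":
		state, ok := c[s.ToStore]
		if !ok {
			// Assumption for this sketch only: an unknown store also fails the check.
			return fmt.Errorf("%s: target store %d not found", s.Kind, s.ToStore)
		}
		if state == StoreDown {
			return fmt.Errorf("%s: target store %d is down", s.Kind, s.ToStore)
		}
	}
	return nil
}

// shouldCancel reports whether any step of a running operator fails the check.
func shouldCancel(c Cluster, steps []Step) bool {
	for _, s := range steps {
		if checkStep(c, s) != nil {
			return true
		}
	}
	return false
}

func main() {
	cluster := Cluster{1: StoreUp, 2: StoreDown}
	// A leader transfer to the down store should be cancelled.
	fmt.Println(shouldCancel(cluster, []Step{{Kind: "TransferLeader", ToStore: 2}}))
	// Step kinds outside the restricted set are left alone.
	fmt.Println(shouldCancel(cluster, []Step{{Kind: "RemovePeer", ToStore: 2}}))
}
```

Restricting the check to an explicit set of step kinds mirrors the second bullet: steps that may or may not be blocked by a down store are deliberately left unchecked.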

Check List

Tests

  • Unit test

Related changes

  • Need to cherry-pick to the release branch

Release note

Fix the issue that an operator can get blocked by a down store

close tikv#3353

Signed-off-by: disksing <i@disksing.com>
disksing added labels on Oct 19, 2021:
  • type/bugfix: This PR fixes a bug.
  • needs-cherry-pick-release-4.0: The PR needs to cherry pick to release-4.0 branch.
  • needs-cherry-pick-release-5.0: The PR needs to cherry pick to release-5.0 branch.
  • needs-cherry-pick-release-5.1: Type: Need cherry pick to release-5.1
  • needs-cherry-pick-release-5.2: Type: Need cherry pick to release-5.2
ti-chi-bot (Member) commented Oct 19, 2021

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • HunDunDM
  • nolouch


@ti-chi-bot ti-chi-bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Oct 19, 2021
codecov bot commented Oct 19, 2021

Codecov Report

Merging #4223 (46e139b) into master (dbe5e29) will increase coverage by 0.00%.
The diff coverage is 83.33%.

@@           Coverage Diff           @@
##           master    #4223   +/-   ##
=======================================
  Coverage   75.02%   75.02%           
=======================================
  Files         263      263           
  Lines       27307    27317   +10     
=======================================
+ Hits        20486    20494    +8     
+ Misses       5015     5011    -4     
- Partials     1806     1812    +6     
Flag        Coverage Δ
unittests   75.02% <83.33%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files                           Coverage Δ
server/schedule/operator/step.go         74.73% <81.81%> (+1.67%) ⬆️
server/schedule/operator_controller.go   83.50% <100.00%> (ø)
pkg/errs/errs.go                         75.00% <0.00%> (-25.00%) ⬇️
server/region_syncer/client.go           78.90% <0.00%> (-4.69%) ⬇️
pkg/dashboard/adapter/manager.go         79.78% <0.00%> (-3.20%) ⬇️
server/election/leadership.go            77.31% <0.00%> (-3.10%) ⬇️
server/tso/tso.go                        63.63% <0.00%> (-2.28%) ⬇️
server/core/storage.go                   69.31% <0.00%> (-0.76%) ⬇️
server/cluster/cluster.go                82.56% <0.00%> (-0.47%) ⬇️
server/server.go                         71.70% <0.00%> (-0.30%) ⬇️
... and 11 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

nolouch (Contributor) commented Oct 19, 2021

/run-unit-tests

disksing (Contributor Author) commented:

/run-unit-tests

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Oct 21, 2021
@@ -165,6 +165,25 @@ func (t *testOperatorControllerSuite) TestFastFailOperator(c *C) {
c.Assert(oc.GetOperator(region.GetID()), IsNil)
}

// Issue 3353
func (t *testOperatorControllerSuite) TestFastFailOperator2(c *C) {
Review comment (Member):

How about TestFastFailWithUnhealthyStore?

Reply (Contributor Author):

ok

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Nov 16, 2021
disksing (Contributor Author) commented:

/merge

ti-chi-bot (Member) commented:

@disksing: It seems you want to merge this PR. I will help you trigger all the tests:

/run-all-tests


ti-chi-bot (Member) commented:

This pull request has been accepted and is ready to merge.

Commit hash: 2c0adbe

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Nov 16, 2021
nolouch (Contributor) commented Nov 22, 2021

/run-unit-tests

nolouch (Contributor) commented Nov 22, 2021

/run-unit-tests

nolouch (Contributor) commented Nov 22, 2021

/run-unit-tests

disksing (Contributor Author) commented:

/run-unit-tests

ti-chi-bot (Member) commented:

In response to a cherrypick label: new pull request created: #4365.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Nov 23, 2021
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot (Member) commented:

In response to a cherrypick label: new pull request created: #4366.

ti-chi-bot (Member) commented:

In response to a cherrypick label: new pull request created: #4367.

ti-chi-bot (Member) commented:

In response to a cherrypick label: new pull request created: #4368.

@disksing disksing deleted the issue-3353 branch November 23, 2021 09:04
IcePigZDB pushed a commit to IcePigZDB/pd that referenced this pull request Nov 29, 2021
* operator: check store status for running operators

close tikv#3353

Signed-off-by: disksing <i@disksing.com>

* add test

Signed-off-by: disksing <i@disksing.com>

* add tests

Signed-off-by: disksing <i@disksing.com>

* address comment

Signed-off-by: disksing <i@disksing.com>
disksing pushed a commit that referenced this pull request Nov 30, 2021
* operator: check store status for running operators

close #3353

Signed-off-by: disksing <i@disksing.com>
disksing pushed a commit that referenced this pull request Dec 1, 2021
* operator: check store status for running operators

close #3353

Signed-off-by: disksing <i@disksing.com>
ti-chi-bot added a commit that referenced this pull request Dec 1, 2021
* operator: check store status for running operators

close #3353

Signed-off-by: disksing <i@disksing.com>

* add test

Signed-off-by: disksing <i@disksing.com>

* add tests

Signed-off-by: disksing <i@disksing.com>

* address comment

Signed-off-by: disksing <i@disksing.com>

* fix build

Signed-off-by: disksing <i@disksing.com>

* fix ci (try)

Signed-off-by: disksing <i@disksing.com>

Co-authored-by: disksing <i@disksing.com>
Labels
  • needs-cherry-pick-release-4.0: The PR needs to cherry pick to release-4.0 branch.
  • needs-cherry-pick-release-5.0: The PR needs to cherry pick to release-5.0 branch.
  • needs-cherry-pick-release-5.1: Type: Need cherry pick to release-5.1
  • needs-cherry-pick-release-5.2: Type: Need cherry pick to release-5.2
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • type/bugfix: This PR fixes a bug.
Linked issue that merging this pull request may close:

PD keeps transfering leader to a down store
4 participants