Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mcs: fix rule sync when meet "no rule left" and concurrency #7481

Closed
wants to merge 18 commits into from

Conversation

lhy1024
Copy link
Contributor

@lhy1024 lhy1024 commented Nov 29, 2023

What problem does this PR solve?

Issue Number: Close #7418

What is changed and how does it work?

  1. To address scenarios where deletion precedes an update, we've introduced a pendingDeletion structure. This structure tracks rules or rule groups marked for deletion. If there's an inability to delete immediately due to a lack of rules, it's temporarily stored. Subsequently, during future rule or rule group placement events, the system allows their removal once it becomes feasible due to the availability of required rules.
  2. To solve concurrent issues, I merged two watchers that operate on a shared path but handle rules and groups differently. The new watcher now collectively manages the handling of rules and groups, ensuring coordinated and consistent operations in the face of concurrent access.
  3. add more tests.

Check List

Tests

  • Unit test
  • Integration test

Release note

None.

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
Copy link
Contributor

ti-chi-bot bot commented Nov 29, 2023

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • rleungx

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/needs-triage-completed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed do-not-merge/needs-triage-completed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 29, 2023
Copy link

codecov bot commented Nov 29, 2023

Codecov Report

Merging #7481 (4087916) into master (f51f913) will decrease coverage by 0.01%.
The diff coverage is 81.37%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7481      +/-   ##
==========================================
- Coverage   74.65%   74.64%   -0.01%     
==========================================
  Files         454      454              
  Lines       50142    50177      +35     
==========================================
+ Hits        37432    37454      +22     
- Misses       9385     9399      +14     
+ Partials     3325     3324       -1     
Flag Coverage Δ
unittests 74.64% <81.37%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

env.RunTestInTwoModes(suite.checkDeleteAndUpdate)
}

func (suite *ruleTestSuite) checkDeleteAndUpdate(cluster *tests.TestCluster) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it fail in the previous implementation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes
image

@@ -1330,5 +1330,5 @@ func checkRegionsReplicated(c *gin.Context) {
c.String(http.StatusBadRequest, err.Error())
return
}
c.String(http.StatusOK, state)
c.IndentedJSON(http.StatusOK, state)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this modification? Do we need to modify the corresponding API interface for the non-API mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid failure in JSON unmarshalling and maintain consistency with PD mode.

Comment on lines 63 to 64
ruleWatcher *etcdutil.LoopWatcher
groupWatcher *etcdutil.LoopWatcher
Copy link
Member

@rleungx rleungx Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still worry about whether two watchers will get the event disordered and overwrite the same rule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a test about it. If necessary, I will merge these watchers.

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@lhy1024 lhy1024 changed the title mcs: fix rule sync when meet "no rule left" mcs: fix rule sync when meet "no rule left" and concurrency Dec 4, 2023
Signed-off-by: lhy1024 <admin@liudos.us>
pkg/mcs/scheduling/server/rule/watcher.go Outdated Show resolved Hide resolved
pkg/mcs/scheduling/server/rule/watcher.go Outdated Show resolved Hide resolved
pkg/mcs/scheduling/server/rule/watcher.go Outdated Show resolved Hide resolved
Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 7, 2023
Copy link
Member

@CabinfeverB CabinfeverB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM. Any manual test?

Signed-off-by: lhy1024 <admin@liudos.us>
Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 8, 2023
Copy link
Contributor

ti-chi-bot bot commented Dec 8, 2023

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 8, 2023
Signed-off-by: lhy1024 <admin@liudos.us>
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 8, 2023
return false
}
}
return true
log.Info("respBundle", zap.Any("respBundle", respBundle), zap.Any("bundle", bundle))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it log too much?

@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 12, 2023
rw.pendingDeletion.kvs[path] = [2]string{groupID, ruleID}
}

func (rw *Watcher) tryFinishPendingDeletion() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am worried about if the put and delete can be disordered. If so, the newly added rule might be deleted unexpectedly.

@lhy1024
Copy link
Contributor Author

lhy1024 commented Dec 14, 2023

We will use replace this pr with txn.

@lhy1024 lhy1024 closed this Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/LGT1 Indicates that a PR has LGTM 1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mcs: inconsistency in watch placement rule
4 participants