fix: delete the cluster object when giving up the leadership #774
Conversation
In the previous implementation, the cluster object wasn't removed when the controller gave up leadership, so stale error state stayed cached even after the next leader took office, harming normal operation. In this pull request, the cluster is deleted once leadership is given up, and the retry logic (for clusters added in the last term) is aborted if the context is canceled. Signed-off-by: Chao Zhang <tokers@apache.org>
Codecov Report
@@ Coverage Diff @@
## master #774 +/- ##
==========================================
+ Coverage 32.28% 32.56% +0.27%
==========================================
Files 66 69 +3
Lines 6808 7285 +477
==========================================
+ Hits 2198 2372 +174
- Misses 4353 4643 +290
- Partials 257 270 +13
Continue to review full report at Codecov.
CI failed.
The Go version on my local machine is
I used go 1.16.10 to re-generate some files.
tools.go (Outdated)
@@ -1,3 +1,4 @@
//go:build tools
This is new in Go 1.17; I want to delete it before we fully switch to v1.17.
Removed.
You haven't pushed the latest code yet?
ping @tokers
@tokers any update?
I have updated it, but I don't have time to validate the e2e case until this weekend.
It would be better to add an e2e test for this.
First of all, this is not a feature but a bugfix; also, it's tough to add test cases for it.
Maybe we can introduce a chaos test later. If you want to add tests here, we may also need to rely on #770 to confirm that the current cluster is not available.
I think the bugfix also needs an e2e test. You can restart the APISIX pod during the test, then check the behavior of the ingress controller.
license-eye has checked 473 files in total.
Valid | Invalid | Ignored | Fixed
---|---|---|---
249 | 1 | 223 | 0
Click to see the invalid file list
- test/e2e/chaos/chaos.go
Signed-off-by: Jintao Zhang <zhangjintao9020@gmail.com>
LGTM
Please answer these questions before submitting a pull request
Why submit this pull request?
Bugfix
New feature provided
Improve performance
Backport patches
Related issues
bug: Unable to reconnect to apisix, when all ep are deleted under svc of apisix #769
In the previous implementation, the cluster object wasn't removed when the controller gave up leadership, so stale error state stayed cached even after the next leader took office, harming normal operation. In this pull request, the cluster is deleted once leadership is given up, and the retry logic (for clusters added in the last term) is aborted if the context is canceled.
Signed-off-by: Chao Zhang tokers@apache.org
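The cleanup half of the fix (deleting cached cluster state on resignation) can be sketched as follows. This is a hypothetical illustration only: the `Provider` and `Cluster` types and the `ResignLeadership` method are invented names, not the real apisix-ingress-controller types.

```go
// Hypothetical sketch of the fix described above: when leadership is
// resigned, delete the cached cluster entries so the next leader starts
// from a clean state instead of inheriting stale error state.
package main

import (
	"fmt"
	"sync"
)

// Cluster stands in for per-cluster cached state (illustrative only).
type Cluster struct{ name string }

// Provider stands in for the component that caches clusters per term.
type Provider struct {
	mu       sync.Mutex
	clusters map[string]*Cluster
}

// AddCluster caches a cluster for the current leadership term.
func (p *Provider) AddCluster(name string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.clusters[name] = &Cluster{name: name}
}

// ResignLeadership removes all cluster state cached during this term,
// so stale error state cannot leak into the next leader's term.
func (p *Provider) ResignLeadership() {
	p.mu.Lock()
	defer p.mu.Unlock()
	for name := range p.clusters {
		delete(p.clusters, name)
	}
}

func main() {
	p := &Provider{clusters: make(map[string]*Cluster)}
	p.AddCluster("default")
	fmt.Println(len(p.clusters)) // 1
	p.ResignLeadership()
	fmt.Println(len(p.clusters)) // 0
}
```

In the real controller this cleanup would run alongside canceling the term's context, so background retries and cached state are torn down together.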