-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix][broker] Fix the broker registery cannot recover from the metadata node deletion #23359
[fix][broker] Fix the broker registery cannot recover from the metadata node deletion #23359
Conversation
...ava/org/apache/pulsar/broker/loadbalance/extensions/channel/ServiceUnitStateChannelImpl.java
Outdated
Show resolved
Hide resolved
I wonder if we need to periodically check this broker registration lock in the monitor thread as well for better fault tolerance. |
Yeah it can handle the case when there is a bug with the metadata store client. But even if it has a bug, we can easily find it from the alerts and fix it by manually restarting the broker or creating the metadata node. Therefore, I neither support nor oppose this idea. |
...ava/org/apache/pulsar/broker/loadbalance/extensions/channel/ServiceUnitStateChannelImpl.java
Outdated
Show resolved
Hide resolved
33d630b
to
5c720d8
Compare
...test/java/org/apache/pulsar/broker/loadbalance/extensions/BrokerRegistryIntegrationTest.java
Show resolved
Hide resolved
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #23359 +/- ##
============================================
+ Coverage 73.57% 74.56% +0.99%
- Complexity 32624 33942 +1318
============================================
Files 1877 1934 +57
Lines 139502 145035 +5533
Branches 15299 15848 +549
============================================
+ Hits 102638 108150 +5512
+ Misses 28908 28605 -303
- Partials 7956 8280 +324
Flags with carried forward coverage won't be shown. Click here to find out more.
|
This PR introduced a flaky test #23365, @BewareMyPower do you have a chance to fix it? |
Motivation
#23298 introduce a regression that once the metadata node of this broker was deleted (e.g. by session timeout), the broker register would never have a chance to recover. In this case, the clients whose owner is this broker would never be able to produce or consume.
Modifications
Add a default listener to
BrokerRegisterImpl
that will register itself again in an asynchronous way if the deleted node is the current broker. Add a new stateUnregistering
to prevent the broker from registering itself again afterunregister()
is called.Add
BrokerRegistryIntegrationTest
to verify this fix and the behavior introduced from #23298Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: