[fix][broker] Fix the broker registery cannot recover from the metadata node deletion #23359

Conversation

BewareMyPower
Contributor

@BewareMyPower BewareMyPower commented Sep 26, 2024

Motivation

#23298 introduced a regression: once the metadata node of this broker is deleted (e.g. by a session timeout), the broker registry never gets a chance to recover. In that case, clients whose topics are owned by this broker can no longer produce or consume.

Modifications

Add a default listener to BrokerRegistryImpl that asynchronously registers the broker again if the deleted node belongs to the current broker. Add a new state, Unregistering, to prevent the broker from registering itself again after unregister() is called.

Add BrokerRegistryIntegrationTest to verify this fix and the behavior introduced by #23298.
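
For illustration only, the listener and state guard described above can be sketched roughly as follows. This is not the code in this PR: the class RegistrySketch, the onNodeDeleted and restoreNode methods, and the in-memory Set standing in for the metadata store are all hypothetical. Only the Unregistering state and unregister() are taken from the description.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical sketch: not the real BrokerRegistryImpl or metadata store API.
class RegistrySketch {
    enum State { Started, Registered, Unregistering }

    private final String brokerPath;  // this broker's metadata node (assumed layout)
    private final Set<String> nodes;  // in-memory stand-in for the metadata store
    private final Executor executor;  // runs the asynchronous re-registration
    private State state = State.Started;

    RegistrySketch(String brokerPath, Set<String> nodes, Executor executor) {
        this.brokerPath = brokerPath;
        this.nodes = nodes;
        this.executor = executor;
    }

    synchronized void register() {
        if (state == State.Started) {
            nodes.add(brokerPath);
            state = State.Registered;
        }
    }

    synchronized void unregister() {
        if (state != State.Registered) {
            return;
        }
        // Entering Unregistering first means the deletion below is not
        // mistaken for an unexpected loss of the node.
        state = State.Unregistering;
        nodes.remove(brokerPath);
        onNodeDeleted(brokerPath); // a real store would deliver this notification
        state = State.Started;
    }

    // Default listener: reacts only when this broker's own node disappears
    // while the registry still believes it is registered.
    synchronized CompletableFuture<Void> onNodeDeleted(String deletedPath) {
        if (!brokerPath.equals(deletedPath) || state != State.Registered) {
            return CompletableFuture.completedFuture(null);
        }
        return CompletableFuture.runAsync(this::restoreNode, executor);
    }

    private synchronized void restoreNode() {
        // Re-check the state: unregister() may have run before this task did.
        if (state == State.Registered) {
            nodes.add(brokerPath);
        }
    }
}
```

In this sketch, removing the node from the set and then calling onNodeDeleted with the broker's path simulates a session timeout and puts the node back, while the same notification during or after unregister() is ignored.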

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@BewareMyPower BewareMyPower added the type/bug, area/broker, and release/3.3.2 labels Sep 26, 2024
@BewareMyPower BewareMyPower added this to the 4.0.0 milestone Sep 26, 2024
@BewareMyPower BewareMyPower self-assigned this Sep 26, 2024
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Sep 26, 2024
@BewareMyPower BewareMyPower changed the title from "[fix][broker] Fix the broker register cannot recover from the metadata node deletion" to "[fix][broker] Fix the broker registery cannot recover from the metadata node deletion" Sep 26, 2024
@heesung-sn
Contributor

I wonder if we need to periodically check this broker registration lock in the monitor thread as well for better fault tolerance.

@BewareMyPower
Contributor Author

> periodically check this broker registration lock in the monitor thread as well for better fault tolerance.

Yeah, it can handle the case where there is a bug in the metadata store client. But even if there is such a bug, we can easily find it from the alerts and fix it by manually restarting the broker or creating the metadata node. Therefore, I neither support nor oppose this idea.

@BewareMyPower BewareMyPower force-pushed the bewaremypower/broker-registry-session-timeout branch from 33d630b to 5c720d8 Compare September 27, 2024 05:51
@codecov-commenter

codecov-commenter commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 71.42857% with 8 lines in your changes missing coverage. Please review.

Project coverage is 74.56%. Comparing base (bbc6224) to head (5c720d8).
Report is 606 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ker/loadbalance/extensions/BrokerRegistryImpl.java | 71.42% | 5 Missing and 3 partials ⚠️ |

@@             Coverage Diff              @@
##             master   #23359      +/-   ##
============================================
+ Coverage     73.57%   74.56%   +0.99%     
- Complexity    32624    33942    +1318     
============================================
  Files          1877     1934      +57     
  Lines        139502   145035    +5533     
  Branches      15299    15848     +549     
============================================
+ Hits         102638   108150    +5512     
+ Misses        28908    28605     -303     
- Partials       7956     8280     +324     

| Flag | Coverage Δ |
|---|---|
| inttests | 27.75% <46.42%> (+3.16%) ⬆️ |
| systests | 24.56% <0.00%> (+0.24%) ⬆️ |
| unittests | 73.91% <71.42%> (+1.07%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| ...ker/loadbalance/extensions/BrokerRegistryImpl.java | 81.41% <71.42%> (-3.35%) ⬇️ |

... and 600 files with indirect coverage changes

@BewareMyPower BewareMyPower merged commit 95bd1d1 into apache:master Sep 27, 2024
51 checks passed
@BewareMyPower BewareMyPower deleted the bewaremypower/broker-registry-session-timeout branch September 27, 2024 11:49
@lhotari
Member

lhotari commented Sep 28, 2024

This PR introduced a flaky test (#23365). @BewareMyPower, do you have a chance to fix it?

Labels
area/broker, cherry-picked/branch-3.3, doc-not-needed, release/3.3.2, type/bug
5 participants