[fix][broker] new load balancer system topic should not be auto-created now #20566
// ServiceUnitStateChannelImpl.TOPIC expects to be a non-partitioned-topic now.
// We don't allow the auto-creation here.
// ServiceUnitStateChannelImpl.start() is responsible to create the topic.
if (ServiceUnitStateChannelImpl.TOPIC.equals(topicName.toString())) {
This change makes sense to me, but we should eventually change it to a partitioned topic.
Yes, this is another optimization area for scalability.
Why do we auto-create topics as non-partitioned in some cases, but not auto-create this one at all?
I am not sure I understood your question. Since the system topic is currently expected to be a non-partitioned topic, we explicitly create it in ServiceUnitStateChannelImpl.start(). This change blocks auto-creation of the system topic (for example, when another broker tries to auto-create it because of a race condition).
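The behavior described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `AutoCreationGuard` and `isAutoCreationAllowed` are hypothetical names, and the value of `CHANNEL_TOPIC` is an assumption inferred from the `-partition-0` topic name in the PR description.

```java
import java.util.Objects;

/** Hypothetical stand-in for the broker-side auto-creation check; not Pulsar's actual class. */
final class AutoCreationGuard {
    // Assumed value of ServiceUnitStateChannelImpl.TOPIC, inferred from the
    // "-partition-0" topic name quoted in the PR description.
    static final String CHANNEL_TOPIC =
            "persistent://pulsar/system/loadbalancer-service-unit-state";

    // Whether auto-creation is enabled for ordinary topics.
    private final boolean autoCreationEnabled;

    AutoCreationGuard(boolean autoCreationEnabled) {
        this.autoCreationEnabled = autoCreationEnabled;
    }

    /** Returns true only if the topic may be created implicitly on first use. */
    boolean isAutoCreationAllowed(String topicName) {
        Objects.requireNonNull(topicName, "topicName");
        // The channel topic is created explicitly at startup, so it is never
        // auto-created, regardless of the general setting.
        if (CHANNEL_TOPIC.equals(topicName)) {
            return false;
        }
        return autoCreationEnabled;
    }
}
```

With this shape, a broker that loses the startup race gets a "topic not found" style failure instead of silently creating the topic with the wrong layout.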
Sorry, I was mixing up #20370 and #20397.
My primary concern is that the solution in this PR creates a special case instead of creating a framework to make it easier to build system topics in the future. In my opinion, the load manager should be contained by the relevant interfaces and should not be referenced here. There is some relevant discussion about system topics in this PR too: #20514 (comment). There is also relevant discussion on the ML here: https://lists.apache.org/thread/f0q8n0hf1lgw9r2j53tm4yjjfdyr9kjd
I think it is relevant that until now, we have always allowed system topics to be auto-created to prevent classes of failures.
If I were to guess, the problem this PR is trying to solve is actually more generic than the system topics we're working with here. It is likely a challenge for all Pulsar users to know how to create their topics correctly. That is one reason I want to push for thinking about a general solution.
I think we can make this fix general in one of the following ways.
- Maintain a whitelist of system topics that are pre-created. We can check this list and reject any auto-creation here.
- Or make Pulsar pre-create all system topics, and then we can reject auto-creation for all topics under the system namespace.
I think the first option is less intrusive.
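As a rough sketch of how the two options might look, assuming hypothetical names throughout (`PreCreatedSystemTopics`, `rejectsAutoCreation`, and `isUnderSystemNamespace` are not existing Pulsar APIs, and the `pulsar/system` prefix is taken from the topic name in the PR description):

```java
import java.util.Set;

/** Hypothetical sketch of the two proposals; none of these names exist in Pulsar. */
final class PreCreatedSystemTopics {
    // Prefix of the system namespace, as seen in the topic name in the PR description.
    static final String SYSTEM_NAMESPACE_PREFIX = "persistent://pulsar/system/";

    // Option 1: explicit list of topics that some component pre-creates itself.
    private final Set<String> preCreated;

    PreCreatedSystemTopics(Set<String> preCreated) {
        this.preCreated = Set.copyOf(preCreated); // defensive, immutable copy
    }

    /** Option 1: reject auto-creation only for topics on the list. */
    boolean rejectsAutoCreation(String topicName) {
        return preCreated.contains(topicName);
    }

    /** Option 2: reject auto-creation for every topic under the system namespace. */
    static boolean isUnderSystemNamespace(String topicName) {
        return topicName.startsWith(SYSTEM_NAMESPACE_PREFIX);
    }
}
```

Option 1 touches only the listed topics, which is why it is the less intrusive of the two; option 2 changes behavior for every system topic and requires all of them to be pre-created.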
A third option is to only auto-create topics as non-partitioned topics. Any component that needs to create a system topic as a partitioned topic could do so on startup.
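A toy in-memory model of this third option, purely for illustration: `TopicRegistrySketch` and its methods are hypothetical and stand in for the broker's metadata store, with a partition count of 0 meaning "non-partitioned".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory model of topic metadata; not a Pulsar class. */
final class TopicRegistrySketch {
    // Topic name -> partition count; 0 means the topic is non-partitioned.
    private final ConcurrentMap<String, Integer> topics = new ConcurrentHashMap<>();

    /** Explicit creation by a component at startup. Returns false if the topic already exists. */
    boolean createPartitioned(String topic, int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        return topics.putIfAbsent(topic, partitions) == null;
    }

    /** Auto-creation path: a missing topic is always created as non-partitioned. */
    int getOrAutoCreate(String topic) {
        return topics.computeIfAbsent(topic, t -> 0);
    }
}
```

The sketch also shows the remaining ordering concern: if the auto-creation path runs before the component's startup creation, `createPartitioned` returns false and the topic stays non-partitioned, so the component still has to handle that case.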
I proposed as much here on the ML: https://lists.apache.org/thread/1lndgf1hx821fc12t0pc6j6zdrhkntht.
Codecov Report
@@ Coverage Diff @@
## master #20566 +/- ##
=============================================
+ Coverage 33.50% 72.92% +39.42%
- Complexity 12053 31930 +19877
=============================================
Files 1613 1867 +254
Lines 126120 138592 +12472
Branches 13749 15223 +1474
=============================================
+ Hits 42254 101072 +58818
+ Misses 78332 29499 -48833
- Partials 5534 8021 +2487
+1
Motivation
Example Error Logs
This system topic
persistent://pulsar/system/loadbalancer-service-unit-state-partition-0
should be a non-partitioned topic after the fix in 1080ad5. There appears to be a race condition in which topic auto-creation creates this system topic.
Modifications
Do not auto-create the new load balancer system topic.
Verifying this change
This change added tests.
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: