Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supports evenly distribute topics count when splits bundle #6241

Merged
merged 9 commits into from
Feb 13, 2020

Conversation

codelipenghui
Copy link
Contributor

Motivation

Currently, bundle split splits the bundle into two parts of the same size. When there are fewer topics, bundle split does not work well. The topic assigned to the broker according to the topic name hash value, hashing is not effective in a small number of topics bundle split.

So, this PR introduces an option(-balance-topic-count) for bundle split. When setting it to true, the given bundle splits to 2 parts, each part has the same amount of topics.

And introduce a new Load Manager implementation named org.apache.pulsar.broker.loadbalance.impl.BalanceTopicCountModularLoadManager. The new Load Manager implementation splits bundle with balance topics count, others are not different from ModularLoadManagerImpl.

Verifying this change

Test is coming

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)

@codelipenghui codelipenghui self-assigned this Feb 6, 2020
@jiazhai
Copy link
Member

jiazhai commented Feb 6, 2020

+1 good for small number of partitions

Copy link
Member

@sijie sijie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall the direction looks good. However, I don't suggest just adding a simple boolean flag to introduce another split algorithm. This makes code maintenance pretty hard and not easy to extend in the future. Instead, I would suggest introducing an interface about BundleSplitAlgorithm. The algorithm takes NamespaceService and a Namespace Bundle as the inputs and returns a Long number as the split boundary.

interface BundleSplitAlgorithm {
    CompletableFuture<Long> split(NamespaceService service, NamespaceBundle bundle);
}

We can have two algorithms in place. One is the default implementation, splitting a bundle by dividing the key range by half; the other one is the new implementation, splitting the number by number of topics.

You can have two configurations in the broker.conf called supportedSplitAlgorithms and 'defaultBundleSplitAlgorithm'. It allows brokers to configure the supported algorithms in the broker. Then in the admin endpoint, you can add a string parameter to allow specifying a split algorithm. If users don't specify the algorithm then use the default split algorithm.

We can also add this setting to namespace policy to specify the default split algorithm for a given namespace.

Hope these make sense to you.

@sijie sijie added area/broker type/feature The PR added a new feature or issue requested a new feature labels Feb 7, 2020
@sijie sijie added this to the 2.6.0 milestone Feb 7, 2020
@sijie sijie added the doc-required Your PR changes impact docs and you will update later. label Feb 7, 2020
@codelipenghui
Copy link
Contributor Author

@sijie Good idea, I will try to apply your comment. Thanks

@codelipenghui
Copy link
Contributor Author

@sijie I have applied your comments, please take a look, thanks.

Copy link
Member

@sijie sijie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codelipenghui the code change looks good to me. Do you mind adding documentation about this?

@codelipenghui
Copy link
Contributor Author

@sijie I have added the description in reference-configuration. PTAL.

@sijie sijie merged commit 1c099da into apache:master Feb 13, 2020
@codelipenghui codelipenghui deleted the load_balance_by_topic branch February 14, 2020 01:04
@tuteng
Copy link
Member

tuteng commented Mar 21, 2020

Add label release-2.5.1, due to #6178 dependency

tuteng pushed a commit to AmateurEvents/pulsar that referenced this pull request Mar 21, 2020
### Motivation

Currently, bundle split splits the bundle into two parts of the same size. When there are fewer topics, bundle split does not work well. The topic assigned to the broker according to the topic name hash value, hashing is not effective in a small number of topics bundle split.

So, this PR introduces an option(-balance-topic-count) for bundle split.  When setting it to true, the given bundle splits to 2 parts, each part has the same amount of topics.

And introduce a new Load Manager implementation named `org.apache.pulsar.broker.loadbalance.impl.BalanceTopicCountModularLoadManager`.  The new Load Manager implementation splits bundle with balance topics count, others are not different from ModularLoadManagerImpl.

(cherry picked from commit 1c099da)
tuteng pushed a commit that referenced this pull request Apr 13, 2020
### Motivation

Currently, bundle split splits the bundle into two parts of the same size. When there are fewer topics, bundle split does not work well. The topic assigned to the broker according to the topic name hash value, hashing is not effective in a small number of topics bundle split.

So, this PR introduces an option(-balance-topic-count) for bundle split.  When setting it to true, the given bundle splits to 2 parts, each part has the same amount of topics.

And introduce a new Load Manager implementation named `org.apache.pulsar.broker.loadbalance.impl.BalanceTopicCountModularLoadManager`.  The new Load Manager implementation splits bundle with balance topics count, others are not different from ModularLoadManagerImpl.

(cherry picked from commit 1c099da)
jiazhai pushed a commit to jiazhai/pulsar that referenced this pull request May 18, 2020
### Motivation

Currently, bundle split splits the bundle into two parts of the same size. When there are fewer topics, bundle split does not work well. The topic assigned to the broker according to the topic name hash value, hashing is not effective in a small number of topics bundle split.

So, this PR introduces an option(-balance-topic-count) for bundle split.  When setting it to true, the given bundle splits to 2 parts, each part has the same amount of topics.

And introduce a new Load Manager implementation named `org.apache.pulsar.broker.loadbalance.impl.BalanceTopicCountModularLoadManager`.  The new Load Manager implementation splits bundle with balance topics count, others are not different from ModularLoadManagerImpl.
(cherry picked from commit 1c099da)
@Anonymitaet
Copy link
Member

@Anonymitaet Anonymitaet removed the doc-required Your PR changes impact docs and you will update later. label Jun 10, 2020
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
### Motivation

Currently, bundle split splits the bundle into two parts of the same size. When there are fewer topics, bundle split does not work well. The topic assigned to the broker according to the topic name hash value, hashing is not effective in a small number of topics bundle split.

So, this PR introduces an option(-balance-topic-count) for bundle split.  When setting it to true, the given bundle splits to 2 parts, each part has the same amount of topics.

And introduce a new Load Manager implementation named `org.apache.pulsar.broker.loadbalance.impl.BalanceTopicCountModularLoadManager`.  The new Load Manager implementation splits bundle with balance topics count, others are not different from ModularLoadManagerImpl.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker release/2.5.1 type/feature The PR added a new feature or issue requested a new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants