Skip to content
This repository has been archived by the owner on Jan 24, 2024. It is now read-only.

Fix publish throttling and LoadManager API for pulsar upgrade #488

Merged

Conversation

BewareMyPower
Copy link
Collaborator

This PR upgrades pulsar dependency to 2.8.0-rc-202105092228, which has two major API changes.

apache/pulsar#10391 changed LoadManager API so that MetadataCache is used instead of ZookeeperCache in this PR.

apache/pulsar#7406 changed the throttling strategy. However, currently KoP is different from Pulsar that the produce and its callback may be in different threads. KoP calls PersistentTopic#publishMessages in a callback of KafkaTopicManager#getTopic if the returned future is not completed immediately. Otherwise, it's called just in the I/O thread. Therefore, here we still use a channel based publish bytes stats for throttling, while apache/pulsar#7406 uses a thread based publish bytes stats.

The other refactors are:

  1. Change the throttling related fields from InternalServerCnx to KafkaRequestHandler.
  2. Use BrokerService#getPausedConnections to check if the channel's auto read is disabled and modify the tests as well.

Copy link
Contributor

@dockerzhang dockerzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

BewareMyPower added a commit to BewareMyPower/kop that referenced this pull request May 11, 2021
…native#488)

This PR upgrades pulsar dependency to 2.8.0-rc-202105092228, which has
two major API changes.

apache/pulsar#10391 changed `LoadManager` API so
that `MetadataCache` is used instead of `ZookeeperCache` in this PR.

apache/pulsar#7406 changed the throttling
strategy. However, currently KoP is different from Pulsar that the
produce and its callback may be in different threads. KoP calls
`PersistentTopic#publishMessages` in a callback of
`KafkaTopicManager#getTopic` if the returned future is not completed
immediately. Otherwise, it's called just in the I/O thread. Therefore,
here we still use a **channel based** publish bytes stats for
throttling, while apache/pulsar#7406 uses a
**thread based** publish bytes stats.

The other refactors are:
1. Change the throttling related fields from `InternalServerCnx` to
   `KafkaRequestHandler`.
2. Use `BrokerService#getPausedConnections` to check if the channel's
   auto read is disabled and modify the tests as well.
@BewareMyPower
Copy link
Collaborator Author

@jiazhai PTAL, it looks like the dependency from bintray is not available from now. See another PR's CI: https://github.com/streamnative/kop/pull/495/checks?check_run_id=2568314587

Failed to execute goal org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (process-resource-bundles) on project pulsar-protocol-handler-kafka-parent: Failed to resolve dependencies for one or more projects in the reactor. Reason: Unable to get dependency information for org.apache.pulsar:pulsar-broker:jar:2.8.0-rc-202104202206: Failed to retrieve POM for org.apache.pulsar:pulsar-broker:jar:2.8.0-rc-202104202206: Could not transfer artifact org.apache.pulsar:pulsar-broker:pom:2.8.0-rc-202104202206 from/to bintray-streamnative-maven (https://dl.bintray.com/streamnative/maven): authorization failed for https://dl.bintray.com/streamnative/maven/org/apache/pulsar/pulsar-broker/2.8.0-rc-202104202206/pulsar-broker-2.8.0-rc-202104202206.pom, status: 403 Forbidden

@jiazhai jiazhai merged commit a50af46 into streamnative:master May 13, 2021
@BewareMyPower BewareMyPower deleted the bewaremypower/upgrade-pulsar branch May 16, 2021 16:21
BewareMyPower added a commit that referenced this pull request Aug 18, 2021
### Motivation

When there're a lot of topics, there could be a lot of `MetadataCache` instances in memory.

![image](https://user-images.githubusercontent.com/18204803/129903767-4b2f3961-c0e1-4ad5-88b0-7451ed41393a.png)

This bug was introduced from #488 because the Pulsar side adopted a new class `MetadataCache` as ZK cache.  However, each time `MetadataStore#getMetadataCache` is called, a new `MetadataCache` instance will be created. In KoP, it means that each time a topic lookup is performed, a new `MetadataCache` instance will be created.

### Modifications

Only call `MetadataStore#getMetadataCache` once when KoP starts and reuse the same `MetadataCache` instance for each `KafkaRequestHandler` instance.
BewareMyPower added a commit that referenced this pull request Aug 19, 2021
### Motivation

When there're a lot of topics, there could be a lot of `MetadataCache` instances in memory.

![image](https://user-images.githubusercontent.com/18204803/129903767-4b2f3961-c0e1-4ad5-88b0-7451ed41393a.png)

This bug was introduced from #488 because the Pulsar side adopted a new class `MetadataCache` as ZK cache.  However, each time `MetadataStore#getMetadataCache` is called, a new `MetadataCache` instance will be created. In KoP, it means that each time a topic lookup is performed, a new `MetadataCache` instance will be created.

### Modifications

Only call `MetadataStore#getMetadataCache` once when KoP starts and reuse the same `MetadataCache` instance for each `KafkaRequestHandler` instance.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants