Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ZooKeeper to 3.6.2 and Curator to 5.1.0 #8590

Merged
merged 29 commits into from
Feb 11, 2021

Conversation

eolivelli
Copy link
Contributor

Motivation

Upgrade to latest stable ZK version 3.6.2. The new minor versions brings several advantages:

Performance improvements (eg: tuning group commit on txn log)
Prometheus based metrics (so that we can get rid of AspectJ hacky way to instrument ZK)
New features like persistent recursive watches which would greatly simplify the logic to handle metadata cache invalidations.
The possibility of rollback to previous version has also been validated.

Modifications

  • Update ZooKeeper to 3.6.2
  • Updated Apache Curator to 5.1.0 (that supports ZK 3.6.x), Curator was implicitly imported by BookKeeper Storage Service

Verifying this change

  • This change is already covered by existing tests, such as (please describe tests).
  • Test manually a Pulsar service and Pulsar Standalone
  • Test ZooKeeper metrics

Problems

Unfortunately we cannot upgrade Curator because there is an error with "pulsar standalone":
Basically we have to upgrade ZooKeeper and Curator on BookKeeper before doing this change here

09:40:36.781 [main] ERROR org.apache.bookkeeper.common.component.AbstractLifecycleComponent - Failed to start Component: zk-storage-container-manager
java.lang.NoSuchMethodError: 'org.apache.curator.framework.listen.ListenerContainer org.apache.curator.framework.recipes.cache.NodeCache.getListenable()'
	at org.apache.bookkeeper.stream.storage.impl.cluster.ZkClusterMetadataStore.watchClusterAssignmentData(ZkClusterMetadataStore.java:169) ~[org.apache.bookkeeper-stream-storage-service-impl-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.stream.storage.impl.sc.ZkStorageContainerManager.doStart(ZkStorageContainerManager.java:105) ~[org.apache.bookkeeper-stream-storage-service-impl-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.stream.storage.impl.StorageContainerStoreImpl.doStart(StorageContainerStoreImpl.java:97) [org.apache.bookkeeper-stream-storage-service-impl-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.stream.server.service.StorageService.doStart(StorageService.java:47) [org.apache.bookkeeper-stream-storage-server-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.common.component.LifecycleComponentStack.lambda$start$4(LifecycleComponentStack.java:144) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
	at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:405) [com.google.guava-guava-30.0-jre.jar:?]
	at org.apache.bookkeeper.common.component.LifecycleComponentStack.start(LifecycleComponentStack.java:144) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.stream.server.StreamStorageLifecycleComponent.doStart(StreamStorageLifecycleComponent.java:51) [org.apache.bookkeeper-stream-storage-server-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:83) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
	at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.runStreamStorage(LocalBookkeeperEnsemble.java:350) [org.apache.pulsar-pulsar-zookeeper-utils-2.7.0-SNAPSHOT.jar:2.7.0-SNAPSHOT]
	at org.apache.pulsar.zookeeper.LocalBookkeeperEnsemble.startStandalone(LocalBookkeeperEnsemble.java:432) [org.apache.pulsar-pulsar-zookeeper-utils-2.7.0-SNAPSHOT.jar:2.7.0-SNAPSHOT]
	at org.apache.pulsar.PulsarStandalone.start(PulsarStandalone.java:258) [org.apache.pulsar-pulsar-broker-2.7.0-SNAPSHOT.jar:2.7.0-SNAPSHOT]
	at org.apache.pulsar.PulsarStandaloneStarter.main(PulsarStandaloneStarter.java:115) [org.apache.pulsar-pulsar-broker-2.7.0-SNAPSHOT.jar:2.7.0-SNAPSHOT]
09:40:36.782 [main] ERROR org.apache.bookkeeper.common.component.AbstractLifecycleComponent - Failed to start Component: range-service
java.lang.NoSuchMethodError: 'org.apache.curator.framework.listen.ListenerContainer org.apache.curator.framework.recipes.cache.NodeCache.getListenable()'
	at org.apache.bookkeeper.stream.storage.impl.cluster.ZkClusterMetadataStore.watchClusterAssignmentData(ZkClusterMetadataStore.java:169) ~[org.apache.bookkeeper-stream-storage-service-impl-4.12.0.jar:4.12.0]
	at org.apache.bookkeeper.stream.storage.impl.sc.ZkStorageContainerManager.doStart(ZkStorageContainerManager.java:105) ~[org.apache.bookkeeper-stream-storage-service-impl-4.12.0.jar:4.12.0]

@eolivelli
Copy link
Contributor Author

@sijie @jiazhai
This is the patch on BookKeeper, we could make a 4.12.1 release in the future
apache/bookkeeper#2491

there is no hurry at all

@eolivelli
Copy link
Contributor Author

This patch is waiting for BK 4.12.1 release, a VOTE is pending on BK community

@eolivelli
Copy link
Contributor Author

BK vote completed
now this patch depends on the BK upgrade to 4.12.1
#9019

@codelipenghui PTAL

@eolivelli eolivelli marked this pull request as ready for review January 14, 2021 09:08
@eolivelli
Copy link
Contributor Author

@merlimat @codelipenghui this patch is good to go as soon as CI passes

@@ -84,7 +84,16 @@
<classifier>tests</classifier>
<scope>test</scope>
</dependency>

<dependency>
<groupId>io.dropwizard.metrics</groupId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why do we need this? I failed to see this dependency was used here. The same question apply to the other dependency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because from zk 3.6 onwards the "zookeeper" Maven artifact does not import those dependencies anymore to the dependant projects.
This is because we (ZooKeeper project community) want to say that this artifact is only the "zookeeper java client" and the zookeeper java client does not need such jars.
But we still do not provide a "zookeeper server" jar

In Pulsar we are also running the ZooKeeper server (both for production and for tests) and so we need to explicitly add those two jar to the build, otherwise the ZooKeeper server won't start

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to leave a comment with the reason we're adding these, otherwise down the road it will be difficult to remember the rationale.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sijie PTAL the description was added on the new deps

@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

2 similar comments
@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

@eolivelli
Copy link
Contributor Author

/pulsarbot run-failure-checks

- io.dropwizard.metrics-metrics-core-3.1.0.jar
- io.dropwizard.metrics-metrics-graphite-3.1.0.jar
- io.dropwizard.metrics-metrics-jvm-3.1.0.jar
- io.dropwizard.metrics-metrics-core-3.2.5.jar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bookkeeper uses Zookeeper 3.6.2 and dropwizard 3.1.0, I think it makes sense to keep dependency versions in sync to avoid surprises later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dlg99 I can't remember why I had to make this bump.
Probably it is the same version as in ZK.

I would prefer to upgrade this dep in BK 4.13.

Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@merlimat
Copy link
Contributor

@sijie PTAL again

@sijie sijie merged commit 9c39ca0 into apache:master Feb 11, 2021
@eolivelli eolivelli deleted the fix/zk-3.6 branch February 11, 2021 07:24
lhotari added a commit to lhotari/pulsar that referenced this pull request Feb 11, 2021
codelipenghui pushed a commit that referenced this pull request Jun 3, 2021
…10803)

### Motivation
After upgrade zookeeper version to 3.6.2 in #8590 and removed AspectJ based metrics for ZooKeeper in #10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

### Modification
1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
yangl pushed a commit to yangl/pulsar that referenced this pull request Jun 23, 2021
…pache#10803)

### Motivation
After upgrade zookeeper version to 3.6.2 in apache#8590 and removed AspectJ based metrics for ZooKeeper in apache#10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

### Modification
1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
pkumar-singh pushed a commit to pkumar-singh/pulsar that referenced this pull request Aug 10, 2021
…pache#10803)

After upgrade zookeeper version to 3.6.2 in apache#8590 and removed AspectJ based metrics for ZooKeeper in apache#10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…pache#10803)

### Motivation
After upgrade zookeeper version to 3.6.2 in apache#8590 and removed AspectJ based metrics for ZooKeeper in apache#10533, the zookeeper's prometheus metric has lost if we start zookeeper by `bin/puldar-daemon start zookeeper`.

Due to zookeeper 3.6.0+ has add internal prometheus metric provider, so we can turn on by default in pulsar.

### Modification
1. turn on zookeeper prometheus metric provider by default in `conf/zookeeper.conf` and use 8000 as default port sync with old zookeeper metric port
2. add grafana panel for new zookeeper metrics
3. remove old prometheus metric provider in `ZooKeeperStarter` and `ConfigurationStoreStarter`.
4. update reference-metric.md doc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants