Cluster setup - ZooKeeper errors #19

Closed
radekg opened this issue Sep 17, 2016 · 6 comments

@radekg
Contributor

radekg commented Sep 17, 2016

Expected behavior

Bookie starts and confirms it is running.

Actual behavior

Service dies with the following error:

2016-09-17 20:55:37,605 - WARN  [GarbageCollectorThread-1-1:GarbageCollectorThread@392] - Exception in gc
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /ledgers/LAYOUT
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
    at org.apache.bookkeeper.meta.LedgerLayout.store(LedgerLayout.java:146)
    at org.apache.bookkeeper.meta.LedgerManagerFactory.createNewLMFactory(LedgerManagerFactory.java:214)
    at org.apache.bookkeeper.meta.LedgerManagerFactory.newLedgerManagerFactory(LedgerManagerFactory.java:126)
    at org.apache.bookkeeper.bookie.GarbageCollectorThread$LedgerManagerProviderImpl.getLedgerManager(GarbageCollectorThread.java:635)
    at org.apache.bookkeeper.bookie.GarbageCollectorThread.safeRun(GarbageCollectorThread.java:346)
    at org.apache.bookkeeper.util.SafeRunnable.run(SafeRunnable.java:31)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:745)

My local ZooKeeper tells me:

[zk: localhost:2181(CONNECTED) 0] ls /
[namespace, admin, loadbalance, zookeeper, ledgers, managed-ledgers]
[zk: localhost:2181(CONNECTED) 1] ls /ledgers
[available, cookies, LAYOUT]

Steps to reproduce

Set up a clustered Pulsar installation as described in the README. Start the bookkeeper.

System configuration

Pulsar version: 1.14

@merlimat
Contributor

This looks strange: the bookie gets a NoNodeException for /ledgers/LAYOUT, but the znode appears to be there from the zk shell.

A couple of thoughts:

  1. Verify the bookie is configured with the right ZK connection string.
  2. Verify all the ZK servers are using the same configuration (e.g., if one of them is running in standalone mode, you can see odd behavior).
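
Both checks can be scripted. The following is a minimal sketch, not part of Pulsar or BookKeeper: it splits a ZooKeeper connect string into its servers and asks each one for its mode using ZooKeeper's `srvr` four-letter command. The helper names (`parse_connect_string`, `parse_mode`, `query_mode`, `check_ensemble`, `looks_like_quorum`) are hypothetical, and it assumes four-letter commands are enabled on the servers and that the addresses are IPv4 or hostnames.

```python
import socket


def parse_connect_string(connect_string):
    """Split 'host:port,host:port[/chroot]' into ([(host, port), ...], chroot)."""
    servers_part, slash, chroot = connect_string.partition("/")
    servers = []
    for entry in servers_part.split(","):
        entry = entry.strip()
        host, _, port = entry.rpartition(":")
        if not host:
            # No port given: assume ZooKeeper's usual client port.
            host, port = entry, "2181"
        servers.append((host, int(port)))
    return servers, (slash + chroot) if slash else ""


def parse_mode(srvr_output):
    """Return the value of the 'Mode:' line in `srvr` output, or None."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_mode(host, port, timeout=3.0):
    """Send `srvr` to one server and return the mode it reports."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return parse_mode(b"".join(chunks).decode("utf-8", "replace"))


def check_ensemble(connect_string):
    """Map each (host, port) in the connect string to its reported mode."""
    servers, _chroot = parse_connect_string(connect_string)
    modes = {}
    for host, port in servers:
        try:
            modes[(host, port)] = query_mode(host, port)
        except OSError as exc:
            modes[(host, port)] = "unreachable: %s" % exc
    return modes


def looks_like_quorum(modes):
    """True if exactly one server is the leader and the rest are followers."""
    values = list(modes.values())
    return values.count("leader") == 1 and all(
        v in ("leader", "follower") for v in values
    )
```

Calling `check_ensemble(...)` with the same connect string the bookie uses should show one `leader` and the rest `follower`. A server reporting `standalone` is not part of the ensemble, so its znodes can differ from what the other servers return. Ensembles with observers would need `looks_like_quorum` relaxed accordingly.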

@radekg
Contributor Author

radekg commented Sep 18, 2016

A question, then: do I need to run ZooKeeper with Pulsar, as described in the README? Or can I set up my own ZooKeeper and configure it to accept connections from Pulsar? I'm currently doing the latter, using version 3.4.8. Would there be a difference?

@radekg radekg closed this as completed Sep 18, 2016
@radekg
Contributor Author

radekg commented Sep 18, 2016

I found it; the problem was entirely on my side. About to verify a fix.

@radekg
Contributor Author

radekg commented Sep 18, 2016

I have fixed my ZooKeeper servers; they now report leaders and followers correctly. However, what I see now is the following:

  1. On the bookkeeper:
...
2016-09-18 20:24:31,011 - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=10.239.182.38:2181,10.239.182.39:2181,10.239.182.54:2181 sessionTimeout=10000 watcher=org.apache.bookkeeper.bookie.Bookie$7@3f3afe78
2016-09-18 20:24:31,028 - INFO  [main-SendThread(10.239.182.54:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server 10.239.182.54/10.239.182.54:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-18 20:24:31,100 - INFO  [main-SendThread(10.239.182.54:2181):ClientCnxn$SendThread@876] - Socket connection established to 10.239.182.54/10.239.182.54:2181, initiating session
2016-09-18 20:24:31,108 - INFO  [main-SendThread(10.239.182.54:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server 10.239.182.54/10.239.182.54:2181, sessionid = 0x3573ee2302d05a4, negotiated timeout = 10000
2016-09-18 20:24:31,202 - INFO  [main:Bookie@422] - INSTANCEID not exists in zookeeper. Not considering it for data verification
...
2016-09-18 20:28:47,484 - INFO  [GarbageCollectorThread-1-1-SendThread(10.239.182.39:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server 10.239.182.39/10.239.182.39:2181, sessionid = 0x2573ee2c2a9067f, negotiated timeout = 10000
2016-09-18 20:28:47,485 - INFO  [GarbageCollectorThread-1-1:GarbageCollectorThread$LedgerManagerProviderImpl@636] - instantiate ledger manager org.apache.bookkeeper.meta.FlatLedgerManagerFactory
2016-09-18 20:28:47,490 - INFO  [GarbageCollectorThread-1-1:ZooKeeper@684] - Session: 0x2573ee2c2a9067f closed
2016-09-18 20:28:47,490 - INFO  [GarbageCollectorThread-1-1-EventThread:ClientCnxn$EventThread@519] - EventThread shut down for session: 0x2573ee2c2a9067f
2016-09-18 20:28:48,479 - INFO  [GarbageCollectorThread-1-1:ZooKeeper@438] - Initiating client connection, connectString=10.239.182.38:2181,10.239.182.39:2181,10.239.182.54:2181 sessionTimeout=10000 watcher=org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase@15e7f5ee
2016-09-18 20:28:48,479 - INFO  [GarbageCollectorThread-1-1-SendThread(10.239.182.39:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server 10.239.182.39/10.239.182.39:2181. Will not attempt to authenticate using SASL (unknown error)
2016-09-18 20:28:48,480 - INFO  [GarbageCollectorThread-1-1-SendThread(10.239.182.39:2181):ClientCnxn$SendThread@876] - Socket connection established to 10.239.182.39/10.239.182.39:2181, initiating session
2016-09-18 20:28:48,484 - INFO  [GarbageCollectorThread-1-1-SendThread(10.239.182.39:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server 10.239.182.39/10.239.182.39:2181, sessionid = 0x2573ee2c2a90680, negotiated timeout = 10000
...

It keeps going on like this. I don't think that's normal?

  2. On the broker, I get this:
2016-09-18 20:31:08,470 - WARN  [main:ZkBookieRackAffinityMapping@140] - Error getting bookie info from zk, using default rack node /default-rack: KeeperErrorCode = NoNode for /bookies
2016-09-18 20:31:08,481 - INFO  [main:RackawareEnsemblePlacementPolicy@339] - Initialize rackaware ensemble placement policy @ <Bookie:10.239.182.50:0> : com.yahoo.pulsar.zookeeper.ZkBookieRackAffinityMapping.
2016-09-18 20:31:08,579 - WARN  [main-EventThread:ZkBookieRackAffinityMapping@140] - Error getting bookie info from zk, using default rack node /default-rack: KeeperErrorCode = NoNode for /bookies
2016-09-18 20:31:08,579 - INFO  [main-EventThread:NetworkTopology@394] - Adding a new node: /default-rack/10.239.182.34:3181
2016-09-18 20:31:08,585 - WARN  [main-EventThread:ZkBookieRackAffinityMapping@140] - Error getting bookie info from zk, using default rack node /default-rack: KeeperErrorCode = NoNode for /bookies
2016-09-18 20:31:08,585 - INFO  [main-EventThread:NetworkTopology@394] - Adding a new node: /default-rack/10.239.182.47:3181
2016-09-18 20:31:08,595 - WARN  [main-EventThread:ZkBookieRackAffinityMapping@140] - Error getting bookie info from zk, using default rack node /default-rack: KeeperErrorCode = NoNode for /bookies
2016-09-18 20:31:08,595 - INFO  [main-EventThread:NetworkTopology@394] - Adding a new node: /default-rack/10.239.182.27:3181
2016-09-18 20:31:08,601 - ERROR [main:PulsarService@311] - Configured layout org.apache.bookkeeper.meta.HierarchicalLedgerManagerFactory does not match existing layout org.apache.bookkeeper.meta.FlatLedgerManagerFactory
java.io.IOException: Configured layout org.apache.bookkeeper.meta.HierarchicalLedgerManagerFactory does not match existing layout org.apache.bookkeeper.meta.FlatLedgerManagerFactory
    at org.apache.bookkeeper.meta.LedgerManagerFactory.newLedgerManagerFactory(LedgerManagerFactory.java:159)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:299)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:266)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:243)
    at com.yahoo.pulsar.broker.BookKeeperClientFactoryImpl.create(BookKeeperClientFactoryImpl.java:79)
    at com.yahoo.pulsar.broker.ManagedLedgerClientFactory.<init>(ManagedLedgerClientFactory.java:38)
    at com.yahoo.pulsar.broker.PulsarService.start(PulsarService.java:231)
    at com.yahoo.pulsar.PulsarBrokerStarter.main(PulsarBrokerStarter.java:59)
2016-09-18 20:31:08,603 - ERROR [main:PulsarBrokerStarter@62] - Failed to start pulsar service.
com.yahoo.pulsar.broker.PulsarServerException: java.io.IOException: Configured layout org.apache.bookkeeper.meta.HierarchicalLedgerManagerFactory does not match existing layout org.apache.bookkeeper.meta.FlatLedgerManagerFactory
    at com.yahoo.pulsar.broker.PulsarService.start(PulsarService.java:312)
    at com.yahoo.pulsar.PulsarBrokerStarter.main(PulsarBrokerStarter.java:59)
Caused by: java.io.IOException: Configured layout org.apache.bookkeeper.meta.HierarchicalLedgerManagerFactory does not match existing layout org.apache.bookkeeper.meta.FlatLedgerManagerFactory
    at org.apache.bookkeeper.meta.LedgerManagerFactory.newLedgerManagerFactory(LedgerManagerFactory.java:159)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:299)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:266)
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:243)
    at com.yahoo.pulsar.broker.BookKeeperClientFactoryImpl.create(BookKeeperClientFactoryImpl.java:79)
    at com.yahoo.pulsar.broker.ManagedLedgerClientFactory.<init>(ManagedLedgerClientFactory.java:38)
    at com.yahoo.pulsar.broker.PulsarService.start(PulsarService.java:231)
    ... 1 more

I don't see any way to specify the layout. Is there one?

@radekg radekg reopened this Sep 18, 2016
@radekg
Contributor Author

radekg commented Sep 18, 2016

Right, the setting I'm looking for is ledgerManagerType=hierarchical
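
To illustrate why this setting matters, here is a rough sketch of the comparison behind the broker error above. It is not BookKeeper code. The mapping from `ledgerManagerType` values to factory classes is inferred from the class names in the logs, the `flat` default is an assumption based on the bookie instantiating `FlatLedgerManagerFactory` without the setting, and the function names are hypothetical.

```python
# Assumed mapping, inferred from the class names in the logs above.
FACTORY_BY_TYPE = {
    "flat": "org.apache.bookkeeper.meta.FlatLedgerManagerFactory",
    "hierarchical": "org.apache.bookkeeper.meta.HierarchicalLedgerManagerFactory",
}


def read_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def configured_factory(conf_text, default_type="flat"):
    """Return the factory class implied by ledgerManagerType in a config file.

    The 'flat' default is an assumption, not taken from BookKeeper's docs.
    """
    manager_type = read_properties(conf_text).get("ledgerManagerType", default_type)
    return FACTORY_BY_TYPE[manager_type]


def layout_mismatch(conf_text, existing_factory):
    """Return an error message if the config disagrees with the stored layout."""
    configured = configured_factory(conf_text)
    if configured == existing_factory:
        return None
    return "Configured layout %s does not match existing layout %s" % (
        configured,
        existing_factory,
    )
```

With no `ledgerManagerType` line, `configured_factory` falls back to the flat factory, which matches what the bookie log shows. A config containing `ledgerManagerType=hierarchical` lines up with the layout the broker was configured for.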

@radekg
Contributor Author

radekg commented Sep 18, 2016

The referenced PR fixes the issues described here for the cluster setup.

@radekg radekg closed this as completed Sep 18, 2016