Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug][broker] BrokerId npe when broker restart #22975

Closed
3 tasks done
TakaHiR07 opened this issue Jun 25, 2024 · 1 comment · Fixed by #22977
Closed
3 tasks done

[Bug][broker] BrokerId npe when broker restart #22975

TakaHiR07 opened this issue Jun 25, 2024 · 1 comment · Fixed by #22977
Labels
type/bug The PR fixed a bug or issue reported a bug

Comments

@TakaHiR07
Copy link
Contributor

Search before asking

  • I searched in the issues and found nothing similar.

Read release policy

  • I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

broker: 3.0.4
client: 2.9.5

Minimal reproduce step

client continue to produce and consume

broker restart

What did you expect to see?

no npe

What did you see instead?

2024-06-25T13:27:33,051+0800 [main] INFO  org.apache.pulsar.broker.service.BrokerService - Started Pulsar Broker service on /0.0.0.0:6651, TLS: (none), listener: (none)
2024-06-25T13:27:33,129+0800 [main] INFO  org.apache.pulsar.broker.service.BrokerService - Started Pulsar Broker service on /0.0.0.0:6751, TLS: OPENSSL, listener: (none)
2024-06-25T13:27:33,153+0800 [main] INFO  org.apache.pulsar.broker.service.BrokerService - Scheduling a thread to check backlog quota after [60] seconds in background
2024-06-25T13:27:33,155+0800 [main] INFO  org.apache.pulsar.broker.service.persistent.DispatchRateLimiter - configured broker message-dispatch rate DispatchRateImpl(dispatchThrottlingRateInMsg=0, dispatchThrottlingRateInByte=0, relativeToPublishRate=false, ratePeriodInSecond=1)
2024-06-25T13:27:33,155+0800 [main] INFO  org.apache.pulsar.broker.service.persistent.DispatchRateLimiter - setting message-dispatch-rate DispatchRateImpl(dispatchThrottlingRateInMsg=0, dispatchThrottlingRateInByte=0, relativeToPublishRate=false, ratePeriodInSecond=1)
2024-06-25T13:27:33,175+0800 [main] INFO  org.eclipse.jetty.util.log - Logging initialized @15239ms to org.eclipse.jetty.util.log.Slf4jLog
2024-06-25T13:27:33,175+0800 [pulsar-io-23-3] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51004] connected with role=admin using authMethod=token, clientVersion=2.9.5, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,175+0800 [pulsar-io-23-6] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51010] connected with role=admin using authMethod=token, clientVersion=2.9.5, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,175+0800 [pulsar-io-23-5] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51008] connected with role=admin using authMethod=token, clientVersion=2.9.4, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,175+0800 [pulsar-io-23-1] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51000] connected with role=admin using authMethod=token, clientVersion=2.9.5, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,175+0800 [pulsar-io-23-4] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51006] connected with role=admin using authMethod=token, clientVersion=2.9.4, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,175+0800 [pulsar-io-23-2] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51002] connected with role=admin using authMethod=token, clientVersion=2.9.5, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,191+0800 [pulsar-io-23-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51012] connected with role=admin using authMethod=token, clientVersion=2.9.4, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,202+0800 [pulsar-io-23-4] INFO  org.apache.pulsar.broker.service.ServerCnx - [[id: 0xf8f4511f, L:/xxx:6651 - R:/xxx:51006]] Subscribing on topic xxx-partition-115 / debug_gateway. consumerId: 981
2024-06-25T13:27:33,202+0800 [pulsar-io-23-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [[id: 0xb3980641, L:/xxx:6651 - R:/xxx:51012]] Subscribing on topic xxx-partition-115 / debug_gateway. consumerId: 1148
2024-06-25T13:27:33,202+0800 [pulsar-io-23-5] INFO  org.apache.pulsar.broker.service.ServerCnx - [[id: 0xd26bffc1, L:/xxx:6651 - R:/xxx:51008]] Subscribing on topic xxx-partition-129 / debug_gateway. consumerId: 1113
2024-06-25T13:27:33,242+0800 [pulsar-io-23-9] INFO  org.apache.pulsar.broker.service.ServerCnx - [/xxx:51014] connected with role=admin using authMethod=token, clientVersion=2.9.4, clientProtocolVersion=19, proxyVersion=null
2024-06-25T13:27:33,243+0800 [pulsar-io-23-9] INFO  org.apache.pulsar.broker.service.ServerCnx - [[id: 0x2e7cd8b2, L:/xxx:6651 - R:/xxx:51014]] Subscribing on topic xxx-partition-108 / debug_gateway. consumerId: 897
2024-06-25T13:27:33,378+0800 [metadata-store-29-1] ERROR org.apache.pulsar.broker.service.ServerCnx - [/xxx:51000] Failed to create topic xxx-partition-56, producerId=56

java.util.concurrent.CompletionException: java.lang.NullPointerException: brokerId is not initialized before start has been called
        at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:368) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:377) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:1097) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
        at org.apache.pulsar.broker.service.BrokerService.lambda$loadOrCreatePersistentTopic$59(BrokerService.java:1607) ~[org.apache.pulsar-pulsar-broker-3.0.4.1.jar:3.0.4.1]
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
        at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147) ~[?:?]
        at org.apache.pulsar.common.naming.NamespaceBundleFactory.lambda$doLoadBundles$2(NamespaceBundleFactory.java:107) ~[org.apache.pulsar-pulsar-broker-3.0.4.1.jar:3.0.4.1]
        at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:718) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
        at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147) ~[?:?]
        at org.apache.pulsar.metadata.impl.ZKMetadataStore.handleGetResult(ZKMetadataStore.java:275) ~[org.apache.pulsar-pulsar-metadata-3.0.4.1.jar:3.0.4.1]
        at org.apache.pulsar.metadata.impl.ZKMetadataStore.lambda$batchOperation$5(ZKMetadataStore.java:219) ~[org.apache.pulsar-pulsar-metadata-3.0.4.1.jar:3.0.4.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty-netty-common-4.1.105.Final.jar:4.1.105.Final]
        at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: java.lang.NullPointerException: brokerId is not initialized before start has been called
        at java.util.Objects.requireNonNull(Objects.java:233) ~[?:?]
        at org.apache.pulsar.broker.PulsarService.getBrokerId(PulsarService.java:1804) ~[org.apache.pulsar-pulsar-broker-3.0.4.1.jar:3.0.4.1]
        at org.apache.pulsar.broker.service.BrokerService.lambda$checkTopicNsOwnership$100(BrokerService.java:2231) ~[org.apache.pulsar-pulsar-broker-3.0.4.1.jar:3.0.4.1]
        at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150) ~[?:?]
        ... 15 more

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@lhotari
Copy link
Member

lhotari commented Jun 26, 2024

This is a regression caused by #21894 which was released in versions 3.2.0, 3.0.3 and 3.1.3. However, the root cause existed already before that change. The fix in PR #22977 will address the root cause. The root cause is that the broker starts serving requests before it is ready.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
2 participants