This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Updated Bookkeeper to 4.11.0 #3571

Merged
merged 6 commits into from
Jul 20, 2020

Conversation

nicknezis
Contributor

Upgrading Bookkeeper to a newer version which resolves various issues.

@nicknezis nicknezis requested review from huijunwu and nwangtw July 17, 2020 08:26
@nicknezis
Contributor Author

I believe the bookie-format init container is no longer needed, but I need @huijunwu to verify that it is OK to remove. I was able to run the modified Helm chart in Minikube with a ZK replica size of 1 and a BK replica size of 1. I have not yet tested with multiple BK instances, nor have I tested the other Kubernetes yamls.

@nicknezis nicknezis self-assigned this Jul 17, 2020
@nicknezis nicknezis added dependencies Pull requests that update a dependency file Kubernetes labels Jul 17, 2020
@nicknezis nicknezis marked this pull request as ready for review July 17, 2020 15:45
@huijunwu
Member

I have an Ubuntu 20.04 box. I will test bookie-format and get back to you soon.

@huijunwu
Member

Ubuntu 20.04, Minikube.
Looks like some config is missing for BookKeeper: Exception in thread "main" org.apache.commons.configuration.ConversionException: 'httpServerPort' doesn't map to an Integer object

$ kubectl logs bookie-9hbmp
......
2020-07-18 04:34:17,715 - INFO  - [main-SendThread(zookeeper:2181):ClientCnxn$SendThread@1394] - Session establishment complete on server zookeeper/172.18.0.3:2181, sessionid = 0x1002a08191e0009, negotiated timeout = 10000
2020-07-18 04:34:17,716 - INFO  - [main-EventThread:ZooKeeperWatcherBase@130] - ZooKeeper client is connected now.
2020-07-18 04:34:17,807 - ERROR - [main:RackawareEnsemblePlacementPolicyImpl@267] - Failed to initialize DNS Resolver org.apache.bookkeeper.net.ScriptBasedMapping, used default subnet resolver : java.lang.RuntimeException: No network topology script is found when using script based DNS resolver.
2020-07-18 04:34:17,832 - INFO  - [main:RackawareEnsemblePlacementPolicyImpl@214] - Initialize rackaware ensemble placement policy @ <Bookie:172.18.0.4:0> @ /default-rack : org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy$DefaultResolver.
2020-07-18 04:34:17,832 - INFO  - [main:RackawareEnsemblePlacementPolicyImpl@224] - Not weighted
2020-07-18 04:34:17,840 - INFO  - [main:BookKeeper@509] - Weighted ledger placement is not enabled
Jul 18, 2020 4:34:18 AM io.vertx.core.spi.resolver.ResolverProvider
INFO: Using the default address resolver as the dns resolver could not be loaded
2020-07-18 04:34:18,205 - INFO  - [main:Main@345] - Load lifecycle component : org.apache.bookkeeper.server.service.HttpService
Exception in thread "main" org.apache.commons.configuration.ConversionException: 'httpServerPort' doesn't map to an Integer object
        at org.apache.commons.configuration.AbstractConfiguration.getInteger(AbstractConfiguration.java:848)
        at org.apache.commons.configuration.AbstractConfiguration.getInt(AbstractConfiguration.java:822)
        at org.apache.bookkeeper.conf.ServerConfiguration.getHttpServerPort(ServerConfiguration.java:3118)
        at org.apache.bookkeeper.server.service.HttpService.publishInfo(HttpService.java:73)
        at org.apache.bookkeeper.common.component.LifecycleComponentStack.lambda$publishInfo$2(LifecycleComponentStack.java:130)
        at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:408)
        at org.apache.bookkeeper.common.component.LifecycleComponentStack.publishInfo(LifecycleComponentStack.java:126)
        at org.apache.bookkeeper.common.component.ComponentStarter.startComponent(ComponentStarter.java:82)
        at org.apache.bookkeeper.server.Main.doMain(Main.java:234)
        at org.apache.bookkeeper.server.Main.main(Main.java:208)
Caused by: org.apache.commons.configuration.ConversionException: Could not convert  to java.lang.Integer
        at org.apache.commons.configuration.PropertyConverter.toNumber(PropertyConverter.java:461)
        at org.apache.commons.configuration.PropertyConverter.toInteger(PropertyConverter.java:294)
        at org.apache.commons.configuration.AbstractConfiguration.getInteger(AbstractConfiguration.java:844)
        ... 9 more
Caused by: java.lang.NumberFormatException: For input string: ""
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:592)
        at java.lang.Integer.<init>(Integer.java:867)
        at sun.reflect.GeneratedConstructorAccessor1.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.commons.configuration.PropertyConverter.toNumber(PropertyConverter.java:457)
        ... 11 more


$ kubectl logs heron-apiserver-65b9f7c79f-vb5lp
.....
2020-07-18 04:37:07,986 ERROR org.apache.heron.apiserver.resources.TopologyResource submit qtp2100961961-16 error submitting topology acking
org.apache.heron.spi.uploader.UploaderException: Encountered exceptions on uploading the package 'acking-ubuntu-tag-0--6015305960494613991.tar.gz'
        at org.apache.heron.uploader.dlog.DLUploader.uploadPackage(DLUploader.java:167)
        at org.apache.heron.scheduler.SubmitterMain.uploadPackage(SubmitterMain.java:549)
        at org.apache.heron.scheduler.SubmitterMain.submitTopology(SubmitterMain.java:452)
        at org.apache.heron.apiserver.actions.SubmitTopologyAction.execute(SubmitTopologyAction.java:39)
        at org.apache.heron.apiserver.resources.TopologyResource.submit(TopologyResource.java:230)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161)
        at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:160)
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
        at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
        at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
        at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
        at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:841)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:535)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
        at org.eclipse.jetty.server.Server.handle(Server.java:564)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:317)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:110)
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
        at org.eclipse.jetty.util.thread.Invocable.invokePreferred(Invocable.java:128)
        at org.eclipse.jetty.util.thread.Invocable$InvocableExecutor.invoke(Invocable.java:222)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:294)
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:199)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:673)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:591)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.distributedlog.exceptions.WriteException: Write rejected because stream acking-ubuntu-tag-0--6015305960494613991.tar.gz has encountered an error : writer has been closed due to error.
        at org.apache.distributedlog.BKAsyncLogWriter.doGetLogSegmentWriter(BKAsyncLogWriter.java:218)
        at org.apache.distributedlog.BKAsyncLogWriter.getLogSegmentWriter(BKAsyncLogWriter.java:208)
        at org.apache.distributedlog.BKAsyncLogWriter.getLogSegmentWriterForEndOfStream(BKAsyncLogWriter.java:237)
        at org.apache.distributedlog.BKAsyncLogWriter.markEndOfStream(BKAsyncLogWriter.java:475)
        at org.apache.distributedlog.AppendOnlyStreamWriter.markEndOfStream(AppendOnlyStreamWriter.java:82)
        at org.apache.heron.dlog.DLOutputStream.close(DLOutputStream.java:74)
        at org.apache.heron.uploader.dlog.DLUploader.doUploadPackage(DLUploader.java:196)
        at org.apache.heron.uploader.dlog.DLUploader.uploadPackage(DLUploader.java:161)
        ... 53 more
.....

to reproduce:
  ./docker/scripts/build-docker.sh ubuntu20.04 0.0.0 ~/heron-release
  DIR=./deploy/kubernetes/minikube
  STR='s!heron/heron:latest!heron/heron:0.0.0!g'
  sed ${STR} ${DIR}/zookeeper.yaml > /tmp/zookeeper.yaml
  sed ${STR} ${DIR}/tools.yaml > /tmp/tools.yaml
  sed ${STR} ${DIR}/apiserver.yaml > /tmp/apiserver.yaml
  kubectl create -f /tmp/zookeeper.yaml
  kubectl create -f ${DIR}/bookkeeper.yaml
  kubectl create -f /tmp/tools.yaml
  kubectl create -f /tmp/apiserver.yaml
  kubectl get pods
  kubectl proxy -p 8001 &
  curl http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy/api/v1/version
  heron config kubernetes set service_url http://localhost:8001/api/v1/namespaces/default/services/heron-apiserver:9000/proxy
  heron submit kubernetes ~/.heron/examples/heron-api-examples.jar org.apache.heron.examples.api.AckingTopology acking
[2020-07-18 04:47:56 +0000] [ERROR]: Encountered exceptions on uploading the package 'acking-ubuntu-tag-0--2714210181101618016.tar.gz'
[2020-07-18 04:47:56 +0000] [ERROR]: Failed to launch topology 'acking'
  kubectl  get pods
  heron submit kubernetes ~/.heron/examples/heron-api-examples.jar org.apache.heron.examples.api.AckingTopology acking
[2020-07-18 04:47:56 +0000] [ERROR]: Encountered exceptions on uploading the package 'acking-ubuntu-tag-0--2714210181101618016.tar.gz'
[2020-07-18 04:47:56 +0000] [ERROR]: Failed to launch topology 'acking'
  kubectl describe pod heron-apiserver-65b9f7c79f-vb5lp
  kubectl logs heron-apiserver-65b9f7c79f-vb5lp
  kubectl get pods
  kubectl logs bookie-9hbmp

@nicknezis
Contributor Author

Thank you! I added the missing httpServerPort to the Helm chart, but forgot to commit the change to the minikube folder's yamls.
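
The bookie trace above shows `httpServerPort` resolving to an empty string, which cannot be parsed as an integer. As a minimal sketch of a pre-flight check for such a value, the helper below rejects empty or non-numeric ports. The names `is_valid_port` and `check_port` are hypothetical and not part of this PR, and 8000 is only a sample value, not necessarily the port the chart uses.

```shell
# Hypothetical helper: succeed only if the argument is a decimal TCP port.
# An empty string (the value BookKeeper saw in the trace above) is rejected.
is_valid_port() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Print a one-line verdict for a candidate port value.
check_port() {
  if is_valid_port "$1"; then
    echo "ok: '$1'"
  else
    echo "invalid: '$1'"
  fi
}

check_port 8000   # sample value only
check_port ""     # mirrors the empty setting from the trace
```

A check like this could run before `kubectl create -f` so that an unset port fails fast instead of crashing the bookie at startup.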

@huijunwu
Member

huijunwu commented Jul 18, 2020

Looks like we still need the bookie-format init container.

$ kubectl logs bookie-2rl2j
...
2020-07-18 11:05:17,827 - INFO  - [main-EventThread:ZooKeeperWatcherBase@130] - ZooKeeper client is connected now.
2020-07-18 11:05:18,233 - INFO  - [main:BookieNettyServer@424] - Shutting down BookieNettyServer
2020-07-18 11:05:18,268 - ERROR - [main:Main@228] - Failed to build bookie server
org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: instanceId 4eae15a0-4a00-44a1-849d-df5fcb9d2789 is not matching with 395e2565-3488-4e4c-b8d2-9a6795ee42ae
        at org.apache.bookkeeper.bookie.Cookie.verifyInternal(Cookie.java:142)
        at org.apache.bookkeeper.bookie.Cookie.verify(Cookie.java:147)
        at org.apache.bookkeeper.bookie.Bookie.verifyAndGetMissingDirs(Bookie.java:372)
        at org.apache.bookkeeper.bookie.Bookie.checkEnvironmentWithStorageExpansion(Bookie.java:435)
        at org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:253)
        at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:699)
        at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:140)
        at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:108)
        at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52)
        at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:313)
        at org.apache.bookkeeper.server.Main.doMain(Main.java:226)
        at org.apache.bookkeeper.server.Main.main(Main.java:208)

How to reproduce: in the same Ubuntu 20.04 environment as before, delete the old cluster, start a new one, then submit the job again.

$ DIR=./deploy/kubernetes/minikube
$ STR='s!heron/heron:latest!heron/heron:0.0.0!g' 
$ sed ${STR} ${DIR}/zookeeper.yaml > /tmp/zookeeper.yaml
$ sed ${STR} ${DIR}/tools.yaml > /tmp/tools.yaml
$ sed ${STR} ${DIR}/apiserver.yaml > /tmp/apiserver.yaml

kubectl delete -f /tmp/zookeeper.yaml
kubectl delete -f ${DIR}/bookkeeper.yaml
kubectl delete -f /tmp/tools.yaml
kubectl delete -f /tmp/apiserver.yaml

kubectl create -f /tmp/zookeeper.yaml 
kubectl create -f ${DIR}/bookkeeper.yaml
kubectl create -f /tmp/tools.yaml
kubectl create -f /tmp/apiserver.yaml 
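
The delete/create cycle above can be wrapped in a small function. This is only a sketch: the function name `recreate_cluster` and the `KUBECTL` override are hypothetical additions so the sequence can be dry-run, while the manifest paths are the ones used in the steps above (the `/tmp` files are assumed to have been rendered with `sed` already).

```shell
# Hypothetical wrapper around the manual delete/create steps listed above.
# KUBECTL can be overridden (for example with "echo") to print the commands
# instead of running them against a cluster.
KUBECTL="${KUBECTL:-kubectl}"
DIR="${DIR:-./deploy/kubernetes/minikube}"

recreate_cluster() {
  # Same manifest order as the manual steps.
  set -- /tmp/zookeeper.yaml "${DIR}/bookkeeper.yaml" /tmp/tools.yaml /tmp/apiserver.yaml

  # Tear down the old cluster first.
  for manifest in "$@"; do
    "$KUBECTL" delete -f "$manifest"
  done

  # Then bring up the new one from the same manifests.
  for manifest in "$@"; do
    "$KUBECTL" create -f "$manifest"
  done
}
```

Running `KUBECTL=echo recreate_cluster` prints the eight commands without touching the cluster; calling `recreate_cluster` on its own runs them with `kubectl`.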

@nicknezis
Contributor Author

Oh, I see. I deleted the pod as a test, but I did not try deleting the cluster while keeping the old Persistent Volume. I will try to add the init container back. Its command may have changed with 4.11.0.

@nicknezis
Contributor Author

@huijunwu OK, I added back the format init container. At first it complained about the cookie not existing; it seems that newer BookKeeper versions might do some extra checks. I ended up removing the -deleteCookie parameter. To compensate, I added the BK_useHostNameAsBookieID environment variable. It should now use the HostIP as the advertised address.

Member

@huijunwu huijunwu left a comment

Minikube test passes on Ubuntu 20.04.

@nicknezis nicknezis merged commit 8a825e4 into master Jul 20, 2020
nicknezis added a commit that referenced this pull request Sep 14, 2020
@nicknezis nicknezis deleted the nicknezis/bookkeeper-upgrade branch September 14, 2020 20:42