
Bookie restarted with exception "java.lang.OutOfMemoryError: Java heap space" #141

Closed
vedanthh opened this issue Mar 25, 2019 · 18 comments
Labels
kind/bug Something isn't working

Comments

@vedanthh

While running a moderate IO workload in a longevity run (total: 7 readers, 14 writers, 2500 events/sec, ~20 MB/s IO), we observed multiple Bookies (bookie1, bookie2, and bookie3) restart with the exception java.lang.OutOfMemoryError: Java heap space on a PKS environment.

Environment details: PKS / K8s with a medium cluster:

3 masters: xlarge: 4 CPU, 16 GB RAM, 32 GB disk
5 workers: 2xlarge: 8 CPU, 32 GB RAM, 64 GB disk
Tier-1 storage is from the vSAN datastore
Tier-2 storage is carved out on the NFS client provisioner using Isilon as the backend
Pravega version: zk-closed-client-issue-0.5.0-2161.60655bf
Zookeeper Operator : pravega/zookeeper-operator:0.2.1
Pravega Operator: adrianmo/pravega-operator:issue-134-5

bookie_controller_log.zip

NAME                                                 READY     STATUS    RESTARTS   AGE
pravega-bookie-0                                     1/1       Running   0          3d
pravega-bookie-1                                     1/1       Running   1          3d
pravega-bookie-2                                     1/1       Running   1          3d
pravega-bookie-3                                     1/1       Running   2          3d
pravega-operator-775554574d-7flrw                    1/1       Running   0          3d
pravega-pravega-controller-77cbcc59c7-hm9p9          1/1       Running   2          3d
pravega-pravega-controller-77cbcc59c7-rms82          1/1       Running   2          3d
pravega-pravega-segmentstore-0                       1/1       Running   1          3d
pravega-pravega-segmentstore-1                       1/1       Running   0          3d
pravega-pravega-segmentstore-2                       1/1       Running   1          3d
pravega-pravega-segmentstore-3                       1/1       Running   0          3d
pravega-zk-0                                         1/1       Running   0          3d
pravega-zk-1                                         1/1       Running   0          3d
pravega-zk-2                                         1/1       Running   0          3d
shaka-zulu-nfs-client-provisioner-59d7f8f84c-xvktb   1/1       Running   0          3d
zookeeper-operator-685bfcbbc5-tjtjr                  1/1       Running   0          3d
2019-03-22 12:51:20,370 - INFO  - [ZKC-connect-executor-0:ZooKeeperClient$1$1@130] - ZooKeeper session 1000209059e3f7f is created to 10.100.200.57:2181.

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "BookieStateManagerService-0"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "LedgerDirsMonitorThread"
Exception in thread "BookieWriteThreadPool-OrderedExecutor-0-0" java.lang.OutOfMemoryError: Java heap space
Exception in thread "SortedLedgerStorage-0" java.lang.OutOfMemoryError: Java heap space
2019-03-22 12:53:22,574 - ERROR - [ForceWriteThread:BookieCriticalThread@41] - Uncaught exception in thread ForceWriteThread and is exiting!
java.lang.OutOfMemoryError: Java heap space
2019-03-22 12:53:22,574 - ERROR - [bookie-io-1-1:BookieRequestHandler@70] - Unhandled exception occurred in I/O thread or handler
io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1
        at io.netty.buffer.AbstractReferenceCountedByteBuf.release0(AbstractReferenceCountedByteBuf.java:100)
        at io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:84)
        at io.netty.util.ReferenceCountUtil.release(ReferenceCountUtil.java:88)
        at org.apache.bookkeeper.proto.BookieProtoEncoding$RequestDecoder.channelRead(BookieProtoEncoding.java:443)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1414)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:945)
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:806)
        at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$1.run(AbstractEpollChannel.java:386)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:309)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)
@RaulGracia

@vedanthh, could you describe the resources requested for each of the services? (kubectl describe command)
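For reference, a minimal way to collect that information, using the pod names from the listing above (adjust names and namespace as needed):

```sh
# Capture resource requests/limits and recent container state for each pod of interest
for pod in pravega-bookie-0 pravega-bookie-1 pravega-bookie-2 pravega-bookie-3 \
           pravega-pravega-segmentstore-0 pravega-pravega-segmentstore-1; do
  kubectl describe pod "$pod" > "describe-${pod}.txt"
done
```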

@vedanthh
Author

Attaching logs
Kubectl_describe_pravega-k8s-logging.zip

@adrianmo
Contributor

adrianmo commented Mar 25, 2019

@vedanthh please use the latest operator version (pravega/pravega-operator:0.3.1) and set higher resource limits for the Pravega components and BookKeeper.

You can use the resource values set in the example file here: https://github.com/pravega/pravega-operator/blob/master/example/cr-detailed.yaml

Your PKS should be able to handle it.
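If it helps, the example values can be pulled down and adapted locally before re-applying the PravegaCluster manifest (a sketch; the raw URL below is inferred from the blob link above, so double-check it):

```sh
# Fetch the example manifest, edit the pravega/bookkeeper resources sections, then apply
curl -LO https://raw.githubusercontent.com/pravega/pravega-operator/master/example/cr-detailed.yaml
kubectl apply -f cr-detailed.yaml
```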

EDIT: I was looking at the wrong logs. @vedanthh is actually setting high resource limits in the manifest.

@RaulGracia

RaulGracia commented Mar 25, 2019

@vedanthh @adrianmo I see the following:

  1. 2 out of 4 Bookies have crashed:
State:          Running
      Started:      Fri, 22 Mar 2019 08:55:56 -0400
    Last State:     Terminated
      Reason:       Error
      Exit Code:    5
      Started:      Thu, 21 Mar 2019 06:34:02 -0400
      Finished:     Fri, 22 Mar 2019 08:55:55 -0400
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     2
      memory:  5Gi
    Requests:
      cpu:      1
      memory:   3Gi
Last State:     Terminated
      Reason:       Error
      Exit Code:    5
      Started:      Thu, 21 Mar 2019 06:34:00 -0400
      Finished:     Fri, 22 Mar 2019 11:40:32 -0400
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     2
      memory:  5Gi
    Requests:
      cpu:      1
      memory:   3Gi
  2. After the last crash of a Bookie, we see the Segment Store crashing with an OOM:
State:          Running
      Started:      Fri, 22 Mar 2019 05:49:27 -0400
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Thu, 21 Mar 2019 06:33:50 -0400
      Finished:     Fri, 22 Mar 2019 05:49:25 -0400
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     2
      memory:  5Gi
    Requests:
      cpu:      1
      memory:   3Gi

It is expected that with only 2 out of 4 Bookies working, the Segment Store eventually crashes, because the write quorum (3 by default) cannot be met. The real question is: why are the Bookies failing? We may need the BK and ZK logs to understand that.
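One way to confirm how many Bookies the cluster currently considers writable is to run the BookKeeper shell from inside one of the Bookie pods (a sketch; it assumes the BookKeeper distribution lives under /opt/bookkeeper in the Bookie image, and the exact flags can differ between BookKeeper versions):

```sh
# List the read-write (writable) and read-only bookies registered in ZooKeeper
kubectl exec pravega-bookie-0 -- /opt/bookkeeper/bin/bookkeeper shell listbookies -rw
kubectl exec pravega-bookie-0 -- /opt/bookkeeper/bin/bookkeeper shell listbookies -ro
```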

@adrianmo
Contributor

I've seen the following error in the logs when Bookies start:

/opt/bookkeeper/conf/bkenv.sh: line 35: -XX:+UnlockExperimentalVMOptions: command not found

It looks like the JVM options are not being set. This could explain why @vedanthh is hitting OOM issues despite setting high resource limits in the manifest. I'll investigate it and get back to this issue.
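For context, this exact bash error is typically produced when an option string escapes its variable assignment, so the shell tries to execute the JVM flag as a command. A purely illustrative sketch (variable name hypothetical, not the actual bkenv.sh contents):

```sh
# 1) A space after '=' makes bash run the flag as a command with an empty variable:
BOOKIE_EXTRA_OPTS= -XX:+UnlockExperimentalVMOptions    # -> "-XX:+UnlockExperimentalVMOptions: command not found"

# 2) A missing line-continuation backslash turns the next line into a "command":
BOOKIE_EXTRA_OPTS="-Xms1g -Xmx2g"
  -XX:+UnlockExperimentalVMOptions                     # -> same error

# Correct form: keep the whole option string inside one quoted assignment
BOOKIE_EXTRA_OPTS="-Xms1g -Xmx2g -XX:+UnlockExperimentalVMOptions"
```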

@vedanthh
Author

Attaching Bookie, Controller, and ZK logs.

bookie_controller_zk_log.zip

@sumit-bm sumit-bm added the kind/bug Something isn't working label Mar 25, 2019
@adrianmo
Contributor

adrianmo commented Mar 25, 2019

@vedanthh I've built adrianmo/pravega-operator:pr-145-1 based on PR #145. Can you test it out and provide your feedback on the PR? Thanks.

@vedanthh
Author

vedanthh commented Mar 26, 2019

@adrianmo I created a cluster with adrianmo/pravega-operator:pr-145-1 and started the longevity run on it; I will share observations as they come in.

Environment details: PKS / K8s with a medium cluster:

3 masters: xlarge: 4 CPU, 16 GB RAM, 32 GB disk
5 workers: 2xlarge: 8 CPU, 32 GB RAM, 64 GB disk
Tier-1 storage is from the vSAN datastore
Tier-2 storage is carved out on the NFS client provisioner using Isilon as the backend

@vedanthh
Author

ISSUE - 141.zip
@adrianmo I observed that all Bookies and 1 Segment Store restarted several times within ~30 minutes of the IO run.

[root@manager1 ~]# kubectl get po
NAME                                                 READY     STATUS    RESTARTS   AGE
pravega-bookie-0                                     1/1       Running   3          55m
pravega-bookie-1                                     1/1       Running   3          55m
pravega-bookie-2                                     1/1       Running   3          55m
pravega-bookie-3                                     1/1       Running   3          55m
pravega-operator-778b8c9485-6g9xh                    1/1       Running   0          1h
pravega-pravega-controller-68c4bd549c-5vqcm          1/1       Running   0          55m
pravega-pravega-controller-68c4bd549c-knxl9          1/1       Running   0          55m
pravega-pravega-segmentstore-0                       1/1       Running   0          55m
pravega-pravega-segmentstore-1                       1/1       Running   1          54m
pravega-pravega-segmentstore-2                       1/1       Running   0          54m
pravega-pravega-segmentstore-3                       1/1       Running   0          53m
pravega-zk-0                                         1/1       Running   0          1h
pravega-zk-1                                         1/1       Running   0          1h
pravega-zk-2                                         1/1       Running   0          1h
shaka-zulu-nfs-client-provisioner-59d7f8f84c-xvktb   1/1       Running   0          4d
zookeeper-operator-685bfcbbc5-tjtjr                  1/1       Running   0          4d

@adrianmo
Contributor

@vedanthh based on the Bookie pod descriptions, I see that you haven't set proper resource requests and limits.

Name:               pravega-bookie-0
...
    State:          Running
      Started:      Tue, 26 Mar 2019 02:10:31 -0400
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 26 Mar 2019 02:06:48 -0400
      Finished:     Tue, 26 Mar 2019 02:10:03 -0400
    Ready:          True
    Restart Count:  3
    Limits:
      cpu:     1
      memory:  2Gi
    Requests:
      cpu:      500m
      memory:   1Gi
...

Please set custom resource requests and limits as in the following example manifest: https://github.com/pravega/pravega-operator/blob/master/example/cr-detailed.yaml

Thanks.
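
Once the updated manifest is applied, the requests and limits that actually landed on a Bookie pod can be double-checked with, for example:

```sh
# Print the resource requests/limits Kubernetes applied to the bookie container
kubectl get pod pravega-bookie-0 -o jsonpath='{.spec.containers[0].resources}{"\n"}'
```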

@vedanthh
Author

vedanthh commented Mar 27, 2019

@adrianmo Observing a Bookie restart during the longevity run with the adrianmo/pravega-operator:pr-145-1 build after ~10 hours, even with custom resource requests and limits set.

# kubectl describe po/pravega-bookie-0
    State:          Running
      Started:      Tue, 26 Mar 2019 18:38:49 -0400
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 26 Mar 2019 07:47:32 -0400
      Finished:     Tue, 26 Mar 2019 18:38:49 -0400
    Ready:          True
    Restart Count:  1
    Limits:
      cpu:     2
      memory:  4Gi
    Requests:
      cpu:      1
      memory:   2Gi

# kubectl get po
NAME                                                 READY     STATUS    RESTARTS   AGE
pravega-bookie-0                                     1/1       Running   1          16h
pravega-bookie-1                                     1/1       Running   1          16h
pravega-bookie-2                                     1/1       Running   0          16h
pravega-bookie-3                                     1/1       Running   1          16h
pravega-operator-778b8c9485-tnhjl                    1/1       Running   1          16h
pravega-pravega-controller-c67d6b758-gcmht           1/1       Running   1          16h
pravega-pravega-controller-c67d6b758-j6btb           1/1       Running   1          16h
pravega-pravega-segmentstore-0                       1/1       Running   0          16h
pravega-pravega-segmentstore-1                       1/1       Running   0          16h
pravega-pravega-segmentstore-2                       1/1       Running   0          16h
pravega-pravega-segmentstore-3                       1/1       Running   0          16h
pravega-zk-0                                         1/1       Running   0          16h
pravega-zk-1                                         1/1       Running   0          16h
pravega-zk-2                                         1/1       Running   0          16h
shaka-zulu-nfs-client-provisioner-59d7f8f84c-xvktb   1/1       Running   0          5d
zookeeper-operator-685bfcbbc5-tjtjr                  1/1       Running   0          5d

@adrianmo
Contributor

@vedanthh the initial BK restarts are expected: the Bookies all come up at the same time and all try to initialize the BK cluster, but only one of them succeeds. However, that OOMKilled is something we should never see; it means that Kubernetes has killed the container for exceeding its memory limit. @Tristan1900 found out something very interesting in this regard (see: #145 (comment)) and I made the modifications in the PR.

I've built a new image that contains the latest modifications based on @Tristan1900's suggestions, which should prevent Kubernetes from OOMKilling the containers.

Image is adrianmo/pravega-operator:pr-145-2, please give it a try and let us know.

Thanks!
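
Since an OOMKill means the container's total memory usage (JVM heap plus direct and native memory) crossed the cgroup limit, a quick sanity check is to compare the JVM options the Bookie process was actually launched with against the Kubernetes memory limit. A sketch, assuming the Bookie JVM runs as PID 1 inside the container (otherwise find the Java PID first):

```sh
# Show the JVM options the running Bookie process was started with (e.g. -Xmx, direct memory settings)
kubectl exec pravega-bookie-0 -- cat /proc/1/cmdline | tr '\0' ' '; echo

# Compare against the memory limit Kubernetes enforces on the container
kubectl get pod pravega-bookie-0 -o jsonpath='{.spec.containers[0].resources.limits.memory}{"\n"}'
```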

@vedanthh
Author

@adrianmo I created a cluster with adrianmo/pravega-operator:pr-145-2 and started the longevity run on it; I will share observations as they come in.

@sumit-bm

@adrianmo, regarding the initial BK restarts, could we implement something like the Segment Store logic: a pod readiness check that sequences the initialization of the Bookies, i.e. start the first Bookie, wait until it is healthy/running, then initialize the second, and so on, to avoid these restarts? Any restart doesn't look good, and we didn't use to see them earlier. You implemented #102; would something similar help here?

@vedanthh
Author

@adrianmo The longevity run has been going fine for ~14 hours with adrianmo/pravega-operator:pr-145-2; no Bookie restart observed so far.

However, I am observing "Readiness probe failed" warnings and "ERROR Closing ledger 23272 due to NotEnoughBookiesException: Not enough non-faulty bookies available" errors in BookKeeper, so I have reported issue #146.

@adrianmo
Contributor

@sumit-bm that would work, but the reason we want to start all Bookies at the same time is that the Segment Stores wait until all Bookies are ready; starting them sequentially would increase the deployment time. I'll explore other options to prevent Bookies from restarting at the beginning.

@vedanthh good news - I'll merge PR #145, close this issue, and release a new operator. Please reopen the issue if this OOM error pops up again. Thanks!

@adrianmo
Contributor

Fixed in #145
