Skip to content
This repository has been archived by the owner on May 28, 2018. It is now read-only.

Jersey threads stuck in WAITING #3800

Open
joekarl opened this issue Apr 5, 2018 · 4 comments
Open

Jersey threads stuck in WAITING #3800

joekarl opened this issue Apr 5, 2018 · 4 comments

Comments

@joekarl
Copy link

joekarl commented Apr 5, 2018

As part of the upgrade path from Dropwizard 0.x -> 1.x the jersey version was bumped to 2.25.1

What we've noticed is that across a number of our production services, we're seeing threads randomly moving from a RUNNABLE to a WAITING state and staying in that waiting state. This seems to occur randomly, but mostly centered around higher throughput paths that can sometimes incur higher latency. As these blocked threads increase over time, the thread pool is eventually starved and the service topples over.

This seems somewhat related to this previous issue (targeted for 2.26+ milestone) https://github.com/jersey/jersey/issues/3207

Here's an example from a recent thread dump. We see something similar for each stuck thread and the entries in the thread dump slowly increase over time as they're not evicted from the thread pool.

"dw-2113 - GET /path/being/called..." id=2113 state=WAITING
    - waiting on <0x1f2f7a6b> (a jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync)
    - locked <0x1f2f7a6b> (a jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:285)
    at jersey.repackaged.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
    at org.glassfish.jersey.servlet.internal.ResponseWriter.getResponseContext(ResponseWriter.java:299)
    at org.glassfish.jersey.servlet.internal.ResponseWriter.callSendError(ResponseWriter.java:215)
    at org.glassfish.jersey.servlet.internal.ResponseWriter.commit(ResponseWriter.java:194)
    at org.glassfish.jersey.server.ContainerResponse.close(ContainerResponse.java:413)
    at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:784)
    at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:444)
    at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:490)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:334)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154)
    at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:473)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:427)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:388)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:341)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:228)
    at io.dropwizard.jetty.NonblockingServletHolder.handle(NonblockingServletHolder.java:49)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
    at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:34)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
    at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:45)
    at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:39)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
    at com.amazonaws.xray.javax.servlet.AWSXRayServletFilter.doFilter(AWSXRayServletFilter.java:151)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1637)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
    at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at com.codahale.metrics.jetty9.InstrumentedHandler.handle(InstrumentedHandler.java:241)
    at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
    at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:527)
    at io.dropwizard.jetty.BiDiGzipHandler.handle(BiDiGzipHandler.java:68)
    at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:56)
    at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:169)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at org.eclipse.jetty.server.Server.handle(Server.java:561)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:334)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:104)
    at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:289)
    at org.eclipse.jetty.io.ssl.SslConnection$3.succeeded(SslConnection.java:149)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:104)
    at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:243)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:679)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:597)
    at java.lang.Thread.run(Thread.java:748)
@joekarl
Copy link
Author

joekarl commented Apr 5, 2018

And to be clear, all of these requests are synchronous. They do reach out to other network resources (other services/database/etc...) but those are generally done synchronously within the request thread (sometimes we'll spawn futures internally to fetch resources in parallel, but we resolve those futures synchronously in the request thread).

@daniloradenovic
Copy link

Could this be related to #3772 ?

@joekarl
Copy link
Author

joekarl commented Apr 6, 2018

We've discovered that while we can't seem to figure out why the threads never return, we can at least forcefully clean them up using Jetty's blockingTimeout config option http://download.eclipse.org/jetty/stable-9/apidocs/org/eclipse/jetty/server/HttpConfiguration.html#getBlockingTimeout--

Though seems more of a bandaid than an actual fix.

@joekarl
Copy link
Author

joekarl commented Apr 6, 2018

Reporting back, the blockingTimeout setting didn't actually help the situation, still seeing threads getting in a stuck state.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants