Potential deadlock with Jetty #19938

Closed
mperktold opened this issue Sep 11, 2024 · 11 comments

@mperktold

Description of the bug

We have several reports of our application not being able to shut down. Apparently, some VaadinSessions stay alive and cannot be destroyed. Here are the stack dumps of two such cases:
StackTraces1.txt
StackTraces2.txt

I found some common patterns in these dumps:

One of the threads blocks on a Jetty semaphore while reading the request content of a UIDL request. Note that this thread holds the lock on the VaadinSession while blocking.

java.base@21.0.4/jdk.internal.misc.Unsafe.park(Native Method)
java.base@21.0.4/java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(Unknown Source)
java.base@21.0.4/java.util.concurrent.ForkJoinPool.unmanagedBlock(Unknown Source)
java.base@21.0.4/java.util.concurrent.ForkJoinPool.managedBlock(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
app//org.eclipse.jetty.ee10.servlet.AsyncContentProducer$LockedSemaphore.acquire(AsyncContentProducer.java:393)
app//org.eclipse.jetty.ee10.servlet.BlockingContentProducer.nextChunk(BlockingContentProducer.java:119)
app//org.eclipse.jetty.ee10.servlet.HttpInput.read(HttpInput.java:245)
app//org.eclipse.jetty.ee10.servlet.HttpInput.read(HttpInput.java:226)
java.base@21.0.4/sun.nio.cs.StreamDecoder.readBytes(Unknown Source)
java.base@21.0.4/sun.nio.cs.StreamDecoder.implRead(Unknown Source)
java.base@21.0.4/sun.nio.cs.StreamDecoder.lockedRead(Unknown Source)
java.base@21.0.4/sun.nio.cs.StreamDecoder.read(Unknown Source)
java.base@21.0.4/java.io.InputStreamReader.read(Unknown Source)
java.base@21.0.4/java.io.BufferedReader.read1(Unknown Source)
java.base@21.0.4/java.io.BufferedReader.implRead(Unknown Source)
java.base@21.0.4/java.io.BufferedReader.read(Unknown Source)
java.base@21.0.4/java.io.Reader.read(Unknown Source)
app//com.vaadin.flow.server.communication.ServerRpcHandler.getMessage(ServerRpcHandler.java:503)
app//com.vaadin.flow.server.communication.ServerRpcHandler.handleRpc(ServerRpcHandler.java:253)
app//com.vaadin.flow.server.communication.UidlRequestHandler.synchronizedHandleRequest(UidlRequestHandler.java:114)
app//com.vaadin.flow.server.SynchronizedRequestHandler.handleRequest(SynchronizedRequestHandler.java:40)
app//com.vaadin.flow.server.VaadinService.handleRequest(VaadinService.java:1584)
app//com.vaadin.flow.server.VaadinServlet.service(VaadinServlet.java:398)
app//jakarta.servlet.http.HttpServlet.service(HttpServlet.java:614)
app//org.eclipse.jetty.ee10.servlet.ServletHolder.handle(ServletHolder.java:736)

A second thread blocks on the VaadinSession while trying to close the websocket:

java.base@21.0.4/jdk.internal.misc.Unsafe.park(Native Method)
java.base@21.0.4/java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.ReentrantLock$Sync.lock(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.ReentrantLock.lock(Unknown Source)
app//com.vaadin.flow.server.VaadinService.lockSession(VaadinService.java:792)
app//com.vaadin.flow.server.VaadinService.findOrCreateVaadinSession(VaadinService.java:839)
app//com.vaadin.flow.server.VaadinService.findVaadinSession(VaadinService.java:684)
app//com.vaadin.flow.server.communication.PushHandler.handleConnectionLost(PushHandler.java:408)
app//com.vaadin.flow.server.communication.PushHandler.connectionLost(PushHandler.java:368)
app//com.vaadin.flow.server.communication.PushAtmosphereHandler.onStateChange(PushAtmosphereHandler.java:62)
app//org.atmosphere.cpr.AsynchronousProcessor.invokeAtmosphereHandler(AsynchronousProcessor.java:538)
app//org.atmosphere.cpr.AsynchronousProcessor.completeLifecycle(AsynchronousProcessor.java:480)
app//org.atmosphere.cpr.AsynchronousProcessor.endRequest(AsynchronousProcessor.java:584)
app//org.atmosphere.websocket.DefaultWebSocketProcessor.close(DefaultWebSocketProcessor.java:639)
app//org.atmosphere.container.JSR356Endpoint.onClose(JSR356Endpoint.java:318)
java.base@21.0.4/java.lang.invoke.LambdaForm$DMH/0x000000001e1a4000.invokeVirtual(LambdaForm$DMH)
java.base@21.0.4/java.lang.invoke.LambdaForm$MH/0x000000001f292000.invoke(LambdaForm$MH)
java.base@21.0.4/java.lang.invoke.LambdaForm$MH/0x000000001ee44800.invoke_MT(LambdaForm$MH)
app//org.eclipse.jetty.ee10.websocket.jakarta.common.JakartaWebSocketFrameHandler.notifyOnClose(JakartaWebSocketFrameHandler.java:295)
app//org.eclipse.jetty.ee10.websocket.jakarta.common.JakartaWebSocketFrameHandler.onClose(JakartaWebSocketFrameHandler.java:267)
app//org.eclipse.jetty.ee10.websocket.jakarta.common.JakartaWebSocketFrameHandler.onFrame(JakartaWebSocketFrameHandler.java:255)
app//org.eclipse.jetty.websocket.core.WebSocketCoreSession$IncomingAdaptor.onFrame(WebSocketCoreSession.java:680)
app//org.eclipse.jetty.websocket.core.AbstractExtension.nextIncomingFrame(AbstractExtension.java:145)
app//org.eclipse.jetty.websocket.core.internal.PerMessageDeflateExtension.nextIncomingFrame(PerMessageDeflateExtension.java:239)
app//org.eclipse.jetty.websocket.core.internal.PerMessageDeflateExtension$IncomingFlusher$$Lambda/0x000000001e90bd08.onFrame(Unknown Source)
app//org.eclipse.jetty.websocket.core.util.DemandingFlusher.emitFrame(DemandingFlusher.java:143)
app//org.eclipse.jetty.websocket.core.internal.PerMessageDeflateExtension$IncomingFlusher.handle(PerMessageDeflateExtension.java:382)
app//org.eclipse.jetty.websocket.core.util.DemandingFlusher.process(DemandingFlusher.java:167)
app//org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:262)
app//org.eclipse.jetty.util.IteratingCallback.succeeded(IteratingCallback.java:401)
app//org.eclipse.jetty.websocket.core.util.DemandingFlusher.onFrame(DemandingFlusher.java:105)
app//org.eclipse.jetty.websocket.core.internal.PerMessageDeflateExtension.onFrame(PerMessageDeflateExtension.java:96)
app//org.eclipse.jetty.websocket.core.ExtensionStack.onFrame(ExtensionStack.java:113)
app//org.eclipse.jetty.websocket.core.WebSocketCoreSession.onFrame(WebSocketCoreSession.java:463)
app//org.eclipse.jetty.websocket.core.WebSocketConnection.onFrame(WebSocketConnection.java:254)
app//org.eclipse.jetty.websocket.core.WebSocketConnection.fillAndParse(WebSocketConnection.java:447)
app//org.eclipse.jetty.websocket.core.WebSocketConnection.onFillable(WebSocketConnection.java:332)
app//org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
app//org.eclipse.jetty.http2.HTTP2StreamEndPoint.process(HTTP2StreamEndPoint.java:497)
app//org.eclipse.jetty.http2.HTTP2StreamEndPoint.processDataAvailable(HTTP2StreamEndPoint.java:484)
app//org.eclipse.jetty.http2.server.internal.ServerHTTP2StreamEndPoint.onDataAvailable(ServerHTTP2StreamEndPoint.java:40)
app//org.eclipse.jetty.http2.server.internal.HTTP2ServerConnection.onDataAvailable(HTTP2ServerConnection.java:158)
app//org.eclipse.jetty.http2.server.HTTP2ServerConnectionFactory$HTTPServerSessionListener.onDataAvailable(HTTP2ServerConnectionFactory.java:153)
app//org.eclipse.jetty.http2.HTTP2Stream.notifyDataAvailable(HTTP2Stream.java:861)
app//org.eclipse.jetty.http2.HTTP2Stream.processData(HTTP2Stream.java:543)
app//org.eclipse.jetty.http2.HTTP2Stream.onData(HTTP2Stream.java:461)
app//org.eclipse.jetty.http2.HTTP2Stream.process(HTTP2Stream.java:368)
app//org.eclipse.jetty.http2.HTTP2Session.onData(HTTP2Session.java:280)
app//org.eclipse.jetty.http2.HTTP2Connection.onData(HTTP2Connection.java:246)
app//org.eclipse.jetty.http2.parser.BodyParser.notifyData(BodyParser.java:103)
app//org.eclipse.jetty.http2.parser.DataBodyParser.onData(DataBodyParser.java:145)
app//org.eclipse.jetty.http2.parser.DataBodyParser.onData(DataBodyParser.java:140)
app//org.eclipse.jetty.http2.parser.DataBodyParser.parse(DataBodyParser.java:106)
app//org.eclipse.jetty.http2.parser.Parser.parseBody(Parser.java:229)
app//org.eclipse.jetty.http2.parser.Parser.parse(Parser.java:156)
app//org.eclipse.jetty.http2.parser.ServerParser.parse(ServerParser.java:121)
app//org.eclipse.jetty.http2.HTTP2Connection$HTTP2Producer.produce(HTTP2Connection.java:342)
app//org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produceTask(AdaptiveExecutionStrategy.java:512)
app//org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:258)
app//org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
app//org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
app//org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
app//org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
app//org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
java.base@21.0.4/java.lang.Thread.runWith(Unknown Source)
java.base@21.0.4/java.lang.Thread.run(Unknown Source)

A third thread also blocks on the VaadinSession while handling a connection loss, but this one comes from the HeartbeatInterceptor:

java.base@21.0.4/jdk.internal.misc.Unsafe.park(Native Method)
java.base@21.0.4/java.util.concurrent.locks.LockSupport.park(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.ReentrantLock$Sync.lock(Unknown Source)
java.base@21.0.4/java.util.concurrent.locks.ReentrantLock.lock(Unknown Source)
app//com.vaadin.flow.server.VaadinService.lockSession(VaadinService.java:798)
app//com.vaadin.flow.server.VaadinService.findOrCreateVaadinSession(VaadinService.java:845)
app//com.vaadin.flow.server.VaadinService.findVaadinSession(VaadinService.java:690)
app//com.vaadin.flow.server.communication.PushHandler.handleConnectionLost(PushHandler.java:414)
app//com.vaadin.flow.server.communication.PushHandler.connectionLost(PushHandler.java:368)
app//com.vaadin.flow.server.communication.PushAtmosphereHandler$AtmosphereResourceListener.onDisconnect(PushAtmosphereHandler.java:113)
app//org.atmosphere.cpr.AtmosphereResourceImpl.onDisconnect(AtmosphereResourceImpl.java:752)
app//org.atmosphere.cpr.AtmosphereResourceImpl.notifyListeners(AtmosphereResourceImpl.java:644)
app//org.atmosphere.cpr.AtmosphereResponseImpl.handleException(AtmosphereResponseImpl.java:732)
app//org.atmosphere.cpr.AtmosphereResponseImpl.access$1500(AtmosphereResponseImpl.java:57)
app//org.atmosphere.cpr.AtmosphereResponseImpl$Stream.write(AtmosphereResponseImpl.java:958)
app//org.atmosphere.cpr.AtmosphereResponseImpl.write(AtmosphereResponseImpl.java:805)
app//org.atmosphere.interceptor.HeartbeatInterceptor.lambda$clock$0(HeartbeatInterceptor.java:367)
app//org.atmosphere.interceptor.HeartbeatInterceptor$$Lambda/0x0000000021c1cf78.call(Unknown Source)
java.base@21.0.4/java.util.concurrent.FutureTask.run(Unknown Source)
java.base@21.0.4/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
java.base@21.0.4/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
java.base@21.0.4/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
java.base@21.0.4/java.lang.Thread.runWith(Unknown Source)
java.base@21.0.4/java.lang.Thread.run(Unknown Source)

I'm not sure where things go wrong, but this does look a bit suspicious to me.

Expected behavior

The application should shut down without getting stuck in a deadlock.

Minimal reproducible example

Unfortunately, I don't have a reproducer. I hope that the stack traces are good enough.

Versions

  • Vaadin / Flow version: 24.4.10
  • Java version: Eclipse Temurin 21.0.3
  • OS version: Windows 11
  • Application Server (if applicable): Jetty 12.0.13
@Legioth
Member

Legioth commented Sep 12, 2024

The big question on my mind is which thread holds the lock that the semaphore is waiting for? Thread dumps do typically also include information on locks held by each thread but that information doesn't seem to be there in the dumps you shared. It might be useful if you could somehow find a way of getting a thread dump that includes that data.
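For a custom dump that includes lock ownership, one option is the standard `ThreadMXBean` API from `java.lang.management`. The following is only a minimal sketch: the class name `LockAwareThreadDump` and the output format are made up for illustration and are not part of Vaadin or Jetty.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.MonitorInfo;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Vaadin or Jetty.
public class LockAwareThreadDump {

    /** Returns a dump of all live threads, including the locks each thread holds. */
    public static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // First flag: include held monitors (synchronized blocks).
        // Second flag: include held ownable synchronizers such as ReentrantLock.
        ThreadInfo[] infos = threads.dumpAllThreads(true, true);
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            // The object this thread is currently blocked or parked on, if any.
            LockInfo waitingOn = info.getLockInfo();
            if (waitingOn != null) {
                out.append("  waiting on ").append(waitingOn);
                if (info.getLockOwnerName() != null) {
                    out.append(" owned by \"").append(info.getLockOwnerName()).append('"');
                }
                out.append('\n');
            }
            for (MonitorInfo monitor : info.getLockedMonitors()) {
                out.append("  holds monitor ").append(monitor).append('\n');
            }
            for (LockInfo sync : info.getLockedSynchronizers()) {
                out.append("  holds synchronizer ").append(sync).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

With both flags set, a thread that holds a `ReentrantLock` should be listed with a "holds synchronizer" line. A thread parked in `Condition.await` would typically report the condition object it waits on without an owner, since a condition is not owned by any thread.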

@mperktold
Author

This thread dump is a custom one produced by our code. I can try to come up with a better one.

In this case, however, the thread waits on a condition, not a lock, so no other thread is "holding" the lock. The blocked thread is waiting in BlockingContentProducer.nextChunk until content is available.

From what I see, the semaphore should eventually be released in BlockingContentProducer.onContentAvailable when new content is available, which should be called from HttpInput.run.

I'm not really seeing the deadlock here, but I find it suspicious that some threads wait on the VaadinSession and another holds it while blocking for some other reason.
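The suspicious pattern can be reproduced in isolation. The sketch below is not Vaadin or Jetty code: `sessionLock` merely stands in for the VaadinSession lock, and a `CountDownLatch` stands in for request content that has not arrived yet. It shows that any other thread needing the lock is stuck for as long as the "read" blocks.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative stand-in only; all names here are hypothetical.
public class LockHeldWhileBlocked {

    /** Returns { acquiredWhileReaderBlocked, acquiredAfterContentArrived }. */
    public static boolean[] demo() {
        ReentrantLock sessionLock = new ReentrantLock();          // stands in for the session lock
        CountDownLatch contentAvailable = new CountDownLatch(1);  // stands in for request content
        CountDownLatch readerHoldsLock = new CountDownLatch(1);

        Thread reader = new Thread(() -> {
            sessionLock.lock();
            try {
                readerHoldsLock.countDown();
                contentAvailable.await(); // blocking "read" while holding the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                sessionLock.unlock();
            }
        }, "uidl-reader");
        reader.start();

        try {
            readerHoldsLock.await();
            // A second thread (think: a connection-lost handler) cannot get the lock.
            boolean whileBlocked = sessionLock.tryLock(200, TimeUnit.MILLISECONDS);
            if (whileBlocked) {
                sessionLock.unlock();
            }

            // Once content arrives, the reader finishes and releases the lock.
            contentAvailable.countDown();
            reader.join();
            boolean afterwards = sessionLock.tryLock(1, TimeUnit.SECONDS);
            if (afterwards) {
                sessionLock.unlock();
            }
            return new boolean[] { whileBlocked, afterwards };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("acquired while reader blocked: " + result[0]);   // false
        System.out.println("acquired after content arrived: " + result[1]);  // true
    }
}
```

If the event that would deliver the content itself depends on one of the waiting threads making progress, the wait never ends, which would match the observed hang.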

I also thought about reporting this to the Jetty team, but since the locks on the VaadinSession seem to be involved, I figured you could have a more complete picture.

@Legioth
Member

Legioth commented Sep 13, 2024

I assume that one of those two threads waiting for the session lock would eventually do something that would allow the BlockingContentProducer thread to proceed. Someone from the Jetty team might have better insights into where to look for that to help gain a full understanding of how this could be resolved.

@mperktold
Author

@sbordet from the Jetty team suggests not to perform a blocking read (or write) while holding a VaadinSession lock:
jetty/jetty.project#12272 (comment)

I'm not sure whether that would be compatible with how syncIds are checked and incremented, but you should probably at least investigate this further.

@Legioth
Member

Legioth commented Sep 16, 2024

If my hypothesis about TCP head-of-line blocking holds true, then I don't see any other way around it than changing our code to not read while holding the session lock.

I see two potential ways of achieving that:

  • We could read the whole request body up-front before locking. This uses slightly more resources since the memory remains allocated in the JVM for a longer time (though it correspondingly releases memory in the TCP stack so it might balance out). We would also have to check for edge cases causing issues if there's a route to some other code path that expects to be able to read the payload since those would fail if the servlet request stream has already been read.
  • We could temporarily unlock while reading and then lock again. This might be fragile with reentrant locks and so on. I don't see any direct issue with sync ids since the client should anyways not send a new UIDL request before it has received a response to the previous one.
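The first option could look roughly like the sketch below. It is a simplified illustration, not the actual Flow implementation: `ReadBeforeLock`, `readFully` and `handle` are hypothetical names, a plain `Reader` stands in for the servlet request reader, and a plain `ReentrantLock` stands in for the session lock.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical sketch; not the Vaadin Flow implementation.
public class ReadBeforeLock {

    /** Drains the reader completely. This is the only step that may block on I/O. */
    static String readFully(Reader reader) {
        try {
            StringBuilder body = new StringBuilder();
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                body.append(buffer, 0, n);
            }
            return body.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the payload without the lock, then processes it while holding the lock. */
    public static <T> T handle(Reader requestBody, ReentrantLock sessionLock,
            Function<String, T> handler) {
        String message = readFully(requestBody); // blocking read, no lock held
        sessionLock.lock();
        try {
            return handler.apply(message);       // in-memory work only
        } finally {
            sessionLock.unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantLock sessionLock = new ReentrantLock();
        int length = handle(new StringReader("{\"syncId\":1}"), sessionLock, String::length);
        System.out.println("payload length: " + length); // payload length: 12
    }
}
```

The trade-off named in the first bullet is visible here: the whole payload is buffered in a `String` before the lock is taken, and nothing else can read the request stream afterwards.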

There's also a risk that we'd have to do something similar for writing the response but I'm not 100% sure about that.

@TatuLund
Contributor

TatuLund commented Oct 8, 2024

Is this related to #18077 ?

@mperktold
Author

@TatuLund I don't think so.

I just tried to shut down the server (we use Jetty, not Tomcat) with Atmosphere warnings enabled, but I didn't see anything.

@mshabarov mshabarov moved this from 🪵Product backlog to 🟢Ready to Go in Vaadin Flow ongoing work (Vaadin 10+) Oct 16, 2024
@tepi tepi self-assigned this Nov 11, 2024
@tepi tepi moved this from 🔖 High Priority (P1) to 🏗 WIP in Vaadin Flow bugs & maintenance (Vaadin 10+) Nov 11, 2024
@tepi tepi moved this from 🟢Ready to Go to ⚒️ In progress in Vaadin Flow ongoing work (Vaadin 10+) Nov 11, 2024
@tepi
Contributor

tepi commented Nov 11, 2024

Looks like this could actually be a Jetty issue (jetty/jetty.project#12272 (comment)). Let's wait for Jetty project people to work that ticket to a conclusion before attempting to fix something in Vaadin.

@tepi tepi moved this from 🏗 WIP to 🔖 High Priority (P1) in Vaadin Flow bugs & maintenance (Vaadin 10+) Nov 11, 2024
@tepi tepi moved this from ⚒️ In progress to 🟢Ready to Go in Vaadin Flow ongoing work (Vaadin 10+) Nov 11, 2024
@tepi tepi moved this from 🟢Ready to Go to 🪵Product backlog in Vaadin Flow ongoing work (Vaadin 10+) Nov 11, 2024
@sbordet

sbordet commented Nov 11, 2024

@tepi, eventually it was a Jetty issue for that very specific corner case.

However, in general, what was discussed with @Legioth above still holds: Vaadin should not hold the session lock while calling blocking code (either blocking reads or blocking writes).

Even with the Jetty fix, the blocking code may be woken up only after a long idle timeout, and meanwhile the session lock would cause other threads to block, causing a cascading effect where all threads could block and lock up the server.

@tepi
Contributor

tepi commented Nov 12, 2024

Fair enough. The Jetty fix looks good. I'll start looking into how we could move the blocking reads and writes out of the code that runs while the session lock is held.

@tepi tepi moved this from 🟢Ready to Go to ⚒️ In progress in Vaadin Flow ongoing work (Vaadin 10+) Nov 13, 2024
@tepi tepi moved this from 🔖 High Priority (P1) to 🏗 WIP in Vaadin Flow bugs & maintenance (Vaadin 10+) Nov 13, 2024
tepi added a commit that referenced this issue Nov 19, 2024
vaadin-bot pushed a commit that referenced this issue Nov 19, 2024
mcollovati pushed a commit that referenced this issue Nov 19, 2024
Move blocking calls outside session lock (#19938)

Co-authored-by: Teppo Kurki <teppo.kurki@vaadin.com>
tepi added a commit that referenced this issue Nov 19, 2024
tepi added a commit that referenced this issue Nov 19, 2024
@tepi
Contributor

tepi commented Nov 19, 2024

Blocking calls for UidlRequestHandler have now been moved outside VaadinSession lock for 24.4, 24.5 and main branches, so I'll close this ticket. If there are further problems, please create a new issue.
