Description
We have been tracking strange, seemingly random failures in our application: eventually, after many browser closes and/or page refreshes, our SSE event flux would stop sending anything to new clients.
We identified that the threads used for async task execution (configured via AsyncSupportConfigurer) can deadlock because of a timing-dependent ABBA deadlock between SseEmitter sending and the error handling triggered when the connection is closed.
With the default SimpleAsyncTaskExecutor, which starts a new thread for each task, there is no user-visible effect (except that the deadlocked, unused threads consume resources on the server).
But with a fixed number of threads (as in our application), the system may eventually come to a halt once all worker threads are deadlocked.
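To illustrate why a fixed-size pool eventually stops serving new clients, here is a minimal JDK-only sketch. It does not use Spring: the class name `PoolStarvationSketch` and the method `newTaskStarved` are hypothetical, and a `CountDownLatch` stands in for the deadlocked senders so that the example can terminate cleanly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolStarvationSketch {

    /**
     * Blocks every worker of a fixed pool (a stand-in for deadlocked senders),
     * then reports whether a newly submitted task fails to run within waitMillis.
     */
    static boolean newTaskStarved(int poolSize, long waitMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy all workers with tasks that never finish on their own.
            for (int i = 0; i < poolSize; i++) {
                pool.submit(() -> { release.await(); return null; });
            }
            // A task for a "new client" is queued behind the blocked workers.
            Future<String> next = pool.submit(() -> "event sent");
            try {
                next.get(waitMillis, TimeUnit.MILLISECONDS);
                return false; // the task ran, so the pool was not starved
            } catch (TimeoutException e) {
                return true;  // no free worker picked the task up
            }
        } finally {
            // Unblock the workers so this sketch can exit; a real deadlock offers no such release.
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("new task starved: " + newTaskStarved(2, 200));
    }
}
```

With an executor that creates one thread per task, the same blocked tasks would only leak threads instead of stalling new work, which matches the difference described above.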
What we think is going on (see example jstack trace captured):
- The client is connected to the SSE endpoint:
  - For an object to be written to the SSE flux:
    - the SseEmitter locks itself (synchronized on super.send(...))
    - then the object is serialized (Jackson)
    - eventually, writing/flushing on the output stream requires StandardServletAsyncWebRequest to acquire its stateLock ReentrantLock
- The client disconnects abruptly (browser closed) while the object is being serialized:
  - the server "sees" that the connection has been closed (probably because another object is sent concurrently on the SseEmitter)
  - this triggers the StandardServletAsyncWebRequest.onError handling, which first locks the stateLock ReentrantLock and then calls SseEmitter.completeWithError, which wants to lock the emitter itself (synchronized)
So, we think that there's a potential ABBA deadlock between SSE error handling and concurrent SSE sending.
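The suspected lock ordering can be sketched with plain JDK primitives. This is only a model under stated assumptions, not Spring's code: `emitter` stands in for the SseEmitter monitor, `stateLock` for the ReentrantLock in StandardServletAsyncWebRequest, and the class name `AbbaDeadlockSketch` is hypothetical. The sender uses `tryLock` with a timeout where the real code path blocks in `lock()`, so that the sketch reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class AbbaDeadlockSketch {

    /**
     * Returns true if the sender obtained stateLock while holding the emitter monitor.
     * A result of false means a plain lock() at that point would have deadlocked.
     */
    static boolean senderGotStateLock(long timeoutMillis) throws InterruptedException {
        final Object emitter = new Object();                 // stand-in for the SseEmitter monitor
        final ReentrantLock stateLock = new ReentrantLock(); // stand-in for the request's stateLock
        final CountDownLatch senderHoldsEmitter = new CountDownLatch(1);
        final CountDownLatch handlerHoldsStateLock = new CountDownLatch(1);
        final AtomicBoolean acquired = new AtomicBoolean(false);

        // Lock order A then B: emitter monitor first, stateLock second (the send path).
        Thread sender = new Thread(() -> {
            synchronized (emitter) {
                senderHoldsEmitter.countDown();
                try {
                    handlerHoldsStateLock.await();
                    if (stateLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                        acquired.set(true);
                        stateLock.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "task-sender");

        // Lock order B then A: stateLock first, emitter monitor second (the onError path).
        Thread errorHandler = new Thread(() -> {
            stateLock.lock();
            try {
                handlerHoldsStateLock.countDown();
                senderHoldsEmitter.await();
                synchronized (emitter) {
                    // completeWithError would run here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                stateLock.unlock();
            }
        }, "exec-error-handler");

        sender.start();
        errorHandler.start();
        sender.join();
        errorHandler.join();
        return acquired.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // prints "sender acquired stateLock: false"
        System.out.println("sender acquired stateLock: " + senderGotStateLock(200));
    }
}
```

The two latches force the unlucky interleaving on every run; in the real application the same interleaving depends on timing, which is why the deadlock only shows up intermittently.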
Things are highly timing-dependent, but we have been able to reproduce our observed deadlocks with high probability:
the idea is to emit big JSON objects (which widens the race window) and normal objects concurrently on the flux, while closing the browser.
Reproduced with Spring Boot 3.3.2 (Spring Framework 6.1.11) and also 3.3.3 (Spring Framework 6.1.12).
Steps to reproduce:
Using the attached minimal example:
- Launch the main class
- note the process PID for further reference
- Using a browser as the HTTP client, open index.html at http://localhost:8080
- Refresh the browser
- Run the "jstack $PID" command
- You should notice 1 deadlock (varying probability, but still very likely)
- For each additional refresh, an additional deadlock may appear
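As an alternative to running jstack by hand, the same check could be done from inside the application JVM with the standard `ThreadMXBean` API. This is a minimal sketch, assuming it runs in the same process as the SSE endpoint (for example from a scheduled task or a diagnostic endpoint); the class name `DeadlockProbe` and its methods are hypothetical. `findDeadlockedThreads()` is used rather than `findMonitorDeadlockedThreads()` because the cycle here involves both an object monitor and a ReentrantLock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockProbe {

    /** Number of threads currently deadlocked on monitors or ownable synchronizers. */
    static int countDeadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Returns null when no deadlock is found. May throw UnsupportedOperationException
        // on JVMs that do not support monitoring of ownable synchronizers.
        long[] ids = threads.findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    /** One line per deadlocked thread: its name, the lock it waits for, and the lock owner. */
    static String describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            return "no deadlock found";
        }
        StringBuilder report = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            report.append(info.getThreadName())
                  .append(" waits for ").append(info.getLockName())
                  .append(" held by ").append(info.getLockOwnerName())
                  .append(System.lineSeparator());
        }
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.println(describeDeadlocks());
    }
}
```

Polling `countDeadlockedThreads()` after each browser refresh would give the same count that jstack reports in its "Found N deadlock" summary, counted per thread rather than per cycle.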
Example of deadlock captured using jstack:
Found one Java-level deadlock:
=============================
"http-nio-8080-exec-3":
waiting to lock monitor 0x000001dfe2866ac0 (object 0x0000000614fc1408, a org.springframework.web.servlet.mvc.method.annotation.SseEmitter),
which is held by "task-4"
"task-4":
waiting for ownable synchronizer 0x0000000614e998c0, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
which is held by "http-nio-8080-exec-3"
Java stack information for the threads listed above:
===================================================
"http-nio-8080-exec-3":
at org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitter.completeWithError(ResponseBodyEmitter.java:264)
- waiting to lock <0x0000000614fc1408> (a org.springframework.web.servlet.mvc.method.annotation.SseEmitter)
at org.springframework.web.servlet.mvc.method.annotation.ReactiveTypeHandler$AbstractEmitterSubscriber$$Lambda/0x000001df9d4acaf0.accept(Unknown Source)
at org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitter$ErrorCallback.accept(ResponseBodyEmitter.java:396)
at org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitter$ErrorCallback.accept(ResponseBodyEmitter.java:383)
at org.springframework.web.context.request.async.DeferredResult$1.handleError(DeferredResult.java:319)
at org.springframework.web.context.request.async.DeferredResultInterceptorChain.triggerAfterError(DeferredResultInterceptorChain.java:99)
at org.springframework.web.context.request.async.WebAsyncManager.lambda$startDeferredResultProcessing$6(WebAsyncManager.java:451)
at org.springframework.web.context.request.async.WebAsyncManager$$Lambda/0x000001df9d4b0220.accept(Unknown Source)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest.lambda$onError$0(StandardServletAsyncWebRequest.java:193)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest$$Lambda/0x000001df9d4c1970.accept(Unknown Source)
at java.util.ArrayList.forEach(java.base@21.0.3/ArrayList.java:1596)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest.onError(StandardServletAsyncWebRequest.java:193)
at org.apache.catalina.core.AsyncListenerWrapper.fireOnError(AsyncListenerWrapper.java:49)
at org.apache.catalina.core.AsyncContextImpl.setErrorState(AsyncContextImpl.java:415)
at org.apache.catalina.connector.CoyoteAdapter.asyncDispatch(CoyoteAdapter.java:155)
at org.apache.coyote.AbstractProcessor.dispatch(AbstractProcessor.java:243)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:57)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:904)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1190)
at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:63)
at java.lang.Thread.runWith(java.base@21.0.3/Thread.java:1596)
at java.lang.Thread.run(java.base@21.0.3/Thread.java:1583)
"task-4":
at jdk.internal.misc.Unsafe.park(java.base@21.0.3/Native Method)
- parking to wait for <0x0000000614e998c0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(java.base@21.0.3/LockSupport.java:221)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@21.0.3/AbstractQueuedSynchronizer.java:754)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@21.0.3/AbstractQueuedSynchronizer.java:990)
at java.util.concurrent.locks.ReentrantLock$Sync.lock(java.base@21.0.3/ReentrantLock.java:153)
at java.util.concurrent.locks.ReentrantLock.lock(java.base@21.0.3/ReentrantLock.java:322)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest$LifecycleHttpServletResponse.obtainLockAndCheckState(StandardServletAsyncWebRequest.java:306)
at org.springframework.web.context.request.async.StandardServletAsyncWebRequest$LifecycleServletOutputStream.write(StandardServletAsyncWebRequest.java:373)
at org.springframework.util.StreamUtils$NonClosingOutputStream.write(StreamUtils.java:261)
at com.fasterxml.jackson.core.json.UTF8JsonGenerator._flushBuffer(UTF8JsonGenerator.java:2210)
at com.fasterxml.jackson.core.json.UTF8JsonGenerator.close(UTF8JsonGenerator.java:1234)
at org.springframework.http.converter.json.AbstractJackson2HttpMessageConverter.writeInternal(AbstractJackson2HttpMessageConverter.java:452)
at org.springframework.http.converter.AbstractGenericHttpMessageConverter.writeInternal(AbstractGenericHttpMessageConverter.java:123)
at org.springframework.http.converter.AbstractHttpMessageConverter.write(AbstractHttpMessageConverter.java:235)
at org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitterReturnValueHandler$HttpMessageConvertingHandler.sendInternal(ResponseBodyEmitterReturnValueHandler.java:221)
at org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitterReturnValueHandler$HttpMessageConvertingHandler.send(ResponseBodyEmitterReturnValueHandler.java:212)
at org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitter.sendInternal(ResponseBodyEmitter.java:223)
at org.springframework.web.servlet.mvc.method.annotation.ResponseBodyEmitter.send(ResponseBodyEmitter.java:214)
- locked <0x0000000614fc1408> (a org.springframework.web.servlet.mvc.method.annotation.SseEmitter)
at org.springframework.web.servlet.mvc.method.annotation.SseEmitter.send(SseEmitter.java:135)
at org.springframework.web.servlet.mvc.method.annotation.ReactiveTypeHandler$SseEmitterSubscriber.send(ReactiveTypeHandler.java:389)
at org.springframework.web.servlet.mvc.method.annotation.ReactiveTypeHandler$AbstractEmitterSubscriber.run(ReactiveTypeHandler.java:332)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@21.0.3/ThreadPoolExecutor.java:1144)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@21.0.3/ThreadPoolExecutor.java:642)
at java.lang.Thread.runWith(java.base@21.0.3/Thread.java:1596)
at java.lang.Thread.run(java.base@21.0.3/Thread.java:1583)
Found 1 deadlock.