ReservedThreadExecutor blocked in tryExecute #6495
Can you cause this to happen? Using: 9.4.31.v20200723
I can see two threads trying to produce, but waiting on the lock:
and
which is an already producing thread offering a produced task to a reserved thread. But the offer is blocked on the lock 0x00000000ffa73038... which is not held by any other thread, the reason being that it is a `SynchronousQueue`. So how can we offer a task to a reserved thread that is not wanting one? Note this is not a classic deadlock, nor does the problem look to be in EWYK. It seems to me to be a problem in the `ReservedThreadExecutor`.
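The blocking behaviour itself is easy to see in isolation. Below is a minimal, standalone illustration (not Jetty code) of why an offer to a reserved thread that is not actually waiting can hang forever: `SynchronousQueue.put()` has no buffer, so the producer parks until a consumer arrives.

```java
import java.util.concurrent.SynchronousQueue;

// Standalone illustration (not Jetty code): SynchronousQueue.put() has no
// internal capacity, so the producing thread parks until some consumer calls
// take()/poll(). With no consumer, the put() below never returns, which is
// the same shape as the blocked offer in the thread dump above.
public class SyncQueueBlockDemo
{
    public static void main(String[] args) throws InterruptedException
    {
        SynchronousQueue<Runnable> task = new SynchronousQueue<>();
        // No thread is polling this queue, so this call blocks forever.
        task.put(() -> System.out.println("never handed off"));
    }
}
```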
The only change between 9.4.31 and head in `ReservedThreadExecutor` ...
So tasks are only offered to a reserved thread that is taken from the stack. So for this problem to happen, there must somehow be a reserved thread on the stack that is not listening (i.e. not blocked in `reservedWait()`). A reserved thread puts itself on the stack and then waits:

```java
_stack.offerFirst(this);
Runnable task = reservedWait();
```

So perhaps we need to check the return of the `offerFirst`? Once in `reservedWait()`, an idle timeout removes the thread from the stack before stopping:

```java
if (_stack.remove(this))
{
    _size.decrementAndGet();
    return STOP;
}
```

So that looks good as well. The only other way a reserved thread can get on the stack is if it adds itself back when the hand-off of a task fails:

```java
try
{
    _task.put(task);
    return true;
}
catch (Throwable e)
{
    _size.getAndIncrement();
    _stack.offerFirst(this);
    return false;
}
```

I'm not sure how/why that `put` could throw. I can't see any other removal cases from the stack... so there is no obvious smoking gun from code inspection.
The other thing of note is that there are zero reserved threads. The most common snapshot of a server should have a few reserved threads, so it is unusual to see absolutely no reserved threads. This offer must have been for the last reserved thread...
The size and pending atomics could probably be merged into an AtomicBiInteger, but again I can't see how that is related.
We don't decrement the pending count if an execution of a reserved thread is rejected, but again that doesn't put a thread onto the stack, so not related.
@zjffdu how many times has this occurred? Does it always happen for you after a few days? If you leave a server idle does it happen? Can you try a different JVM?
@gregw This is the first time I hit this issue. I could try again to see whether I can reproduce it.
@zjffdu it is a very VERY strange thread dump. We have no idea how it can happen like that!! So any extra information you can give us would be most appreciated... even if it is the same information from a re-occurrence, or if it doesn't reoccur for some time, etc.
Hi. I have the same problem on a test server with two server connectors on two different ports in the following environment: I also can't really reproduce it, but I hope this info helps you to find the problem. I encountered the problem after some other services just blocked in an API call on the Jetty server. After some testing with cURL calls I could see that every second call on one connector blocked indefinitely. Calls on the other connector worked just fine. In the trace logs I could see that every second call "dies" with the acceptor thread. These are the logs on a successful call:
On a blocked call only the first line is logged and nothing more:
The stack traces in the thread dump show the same problem as the ones from @zjffdu (only the "htp" threads from the Jetty server, for simplicity):
I'll restart the Jetty server and will also see if I can reproduce it.
@macmon-secure thanks for that report. We have 2 theories... either the same cosmic ray that hit @zjffdu's server continued on through the earth and hit yours as well; OR there is a real problem!! We have looked hard at our code and can't see an issue, so suspect a possible JVM library issue, but you saw the problem on a different JVM, so that is even less likely. We can think of some changes to make that would potentially avoid the block-forever semantic, but might leak a thread. That's probably preferable to blocking forever.
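As a rough illustration of the trade-off mentioned above, a timed hand-off is one hypothetical shape such a change could take. This is only a sketch, not the change that was actually made: the producer can no longer block forever, but if the reserved thread on the other side is not really polling, the hand-off simply fails and that reserved thread is effectively leaked.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a bounded hand-off. The class name and the timeout
// value are illustrative assumptions, not actual Jetty code.
class TimedHandoff
{
    private final SynchronousQueue<Runnable> task = new SynchronousQueue<>();

    boolean offer(Runnable r)
    {
        try
        {
            // A timed offer() instead of put(): worst case we give up after
            // 10 seconds and the caller runs the task itself, but the reserved
            // thread that should have taken it may be left dangling.
            return task.offer(r, 10, TimeUnit.SECONDS);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```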
@gregw thanks for the fast reply. I'll try to use a small timeout on the client side to work around the problem for now.
@macmon-secure if the problem is really an issue for you, then you can configure the server to use 0 reserved threads. This will have a performance cost, but should avoid this issue entirely.
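For reference, a minimal sketch of that configuration, assuming the embedded Jetty 9.4.x API (`QueuedThreadPool.setReservedThreads(0)`); if the server is configured via XML/modules, the equivalent thread pool property would be set there instead.

```java
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

// Sketch, assuming Jetty 9.4.x embedded APIs: run with zero reserved threads
// so EatWhatYouKill never hands work to a ReservedThread. Pool sizes and the
// port are placeholder values.
public class NoReservedThreadsServer
{
    public static void main(String[] args) throws Exception
    {
        QueuedThreadPool threadPool = new QueuedThreadPool(200, 8);
        threadPool.setReservedThreads(0); // 0 = none; -1 would pick a heuristic default

        Server server = new Server(threadPool);
        ServerConnector connector = new ServerConnector(server);
        connector.setPort(8080);
        server.addConnector(connector);

        server.start();
        server.join();
    }
}
```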
@gregw thanks. I'll consider that solution on the production server. If I can reproduce the problem on the test server I'll let you know.
@macmon-secure what is your exact JDK version/vendor? Can you please report the full output of `java -version`?
Full output of `java -version`:
@macmon-secure your call, but I would advise against Debian builds of OpenJDK. If possible, use AdoptOpenJDK builds; at least then it is clear and known what source they are built from (i.e. the official OpenJDK tags).
@macmon-secure @zjffdu when you reproduce this issue, could you take a server dump as explained here: That would hopefully allow us to see the internal state of the `ReservedThreadExecutor`.
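For anyone following along, here is a small sketch of how such a dump can be obtained from embedded code, assuming the Jetty 9.4.x `Dumpable` API on `Server` (the dump of the component tree includes the thread pool and its reserved threads):

```java
import org.eclipse.jetty.server.Server;

// Sketch, assuming Jetty 9.4.x: two ways to obtain a server dump. The dump of
// the component tree includes the QueuedThreadPool and its reserved threads.
public class ServerDumpExample
{
    // Dump the full component tree to stderr once the server has started.
    public static void dumpOnStart(Server server)
    {
        server.setDumpAfterStart(true);
    }

    // On-demand dump, e.g. triggered from a diagnostic servlet or via JMX.
    public static String dumpNow(Server server)
    {
        return server.dump();
    }
}
```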
@sbordet I can't change the environment where the application is running.
Use MAX_VALUE rather than -1 as the stopped marker value.
Remove the stack data structure entirely. ReservedThreads all poll the same SynchronousQueue and tryExecute does a non-blocking offer.
Remember last time we hit zero reserved threads
Whilst we never found the root cause of this problem, we have completely refactored the ReservedThreadExecutor to be simpler and to totally avoid blocking.
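A minimal sketch of the refactored pattern described in the notes above (one shared SynchronousQueue, non-blocking offer in tryExecute). This is a simplified illustration under those assumptions, not the actual ReservedThreadExecutor code, and all names below are hypothetical.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical sketch of the pattern: every reserved thread polls
// the same SynchronousQueue and tryExecute only does a non-blocking offer,
// so a producer can never be parked forever on the hand-off.
class SimpleReservedExecutor
{
    private final Executor executor;
    private final SynchronousQueue<Runnable> queue = new SynchronousQueue<>();
    private final AtomicInteger size = new AtomicInteger();
    private final int capacity;
    private final long idleTimeoutMs;

    SimpleReservedExecutor(Executor executor, int capacity, long idleTimeoutMs)
    {
        this.executor = executor;
        this.capacity = capacity;
        this.idleTimeoutMs = idleTimeoutMs;
    }

    // Non-blocking: either a reserved thread is already waiting in poll() and
    // takes the task immediately, or we return false and the caller falls
    // back to its normal execution path.
    boolean tryExecute(Runnable task)
    {
        boolean handed = queue.offer(task);
        if (!handed)
            startReservedThread(); // top up the pool for the next attempt
        return handed;
    }

    private void startReservedThread()
    {
        int s = size.get();
        if (s < capacity && size.compareAndSet(s, s + 1))
            executor.execute(this::reservedLoop);
    }

    private void reservedLoop()
    {
        try
        {
            while (true)
            {
                // Wait for a hand-off; give up and exit after the idle timeout.
                Runnable task = queue.poll(idleTimeoutMs, TimeUnit.MILLISECONDS);
                if (task == null)
                    return;
                task.run();
            }
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
        }
        finally
        {
            size.decrementAndGet();
        }
    }
}
```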
Reopened for port to 10/11.
@gregw sorry for the late answer but no, we weren't able to reproduce it.
Closing as forward port PR #6559 has been merged.
Jetty version(s): 9.4.31.v20200723
Java version/vendor (use: java -version): openjdk version "1.8.0_252"
OS type/version: CentOS 7
Description: Jetty stops responding, and there are no logs in the Jetty server.
How to reproduce? It would not be easy to reproduce; I had run the Jetty server in my app for more than 5 days, and it happened suddenly.
Here's the thread dump. I suspect it might be due to a deadlock in EatWhatYouKill, because I see that 2 EatWhatYouKill threads are blocked.