100% CPU usage in Selector using Jetty on Linux #3022
Comments
If the selector is in a loop, you should have a ton of logs like the first snippet you posted, in particular like:
Can you please configure the Jetty thread pool in this way: QueuedThreadPool threadPool = ...; threadPool.setDetailedDump(true); and take a server dump when you see 100% CPU?
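For illustration, a minimal embedded-Jetty setup along those lines might look like the sketch below. The connector, the port, and the inline dumpStdErr() call are assumptions for the example, not taken from the reporter's configuration; normally you would trigger the dump via JMX when the CPU spikes.

```java
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

public class DumpableServer {
    public static void main(String[] args) throws Exception {
        // Detailed dumps include per-thread stack traces in the server dump.
        QueuedThreadPool threadPool = new QueuedThreadPool();
        threadPool.setDetailedDump(true);

        Server server = new Server(threadPool);
        ServerConnector connector = new ServerConnector(server);
        connector.setPort(8080); // illustrative port
        server.addConnector(connector);
        server.start();

        // Shown inline for brevity: prints the full server dump (selectors,
        // thread pool, connections) to stderr. In practice, invoke this when
        // you actually observe the 100% CPU condition.
        server.dumpStdErr();

        server.join();
    }
}
```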
I am still working on getting the server dump, but I did get a thread dump with threadPool.setDetailedDump(true). Yes, when the CPU goes to 100% I get the above log entries thousands of times continuously, like in an infinite loop. The thread with nid=0x3ec5 was consuming 100% of the CPU. Can the continuous log entries themselves be considered a sign of a bug? Or are there occasions where Jetty can output thousands of debug logs like this and still work fine?
The selector should not be woken up if there is nothing to do. After #2755 (fixed in 9.4.12) we are also taking care of interrupted threads that continuously wake up the selector. We need a server dump and possibly the DEBUG logs. Can you reproduce this issue reliably?
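To illustrate the interrupted-thread case mentioned there: if a thread's interrupt status is left set, Selector.select() returns immediately with zero selected keys, so a selector thread with a leaked interrupt status spins exactly like this. A tiny standalone demo (plain NIO, not Jetty code; the class name is made up):

```java
import java.nio.channels.Selector;

public class InterruptedSelectorSpin {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        // Simulate a task that returned to the selector thread with the
        // interrupt status still set.
        Thread.currentThread().interrupt();

        long start = System.nanoTime();
        long wakeups = 0;
        while (System.nanoTime() - start < 1_000_000_000L) {
            selector.select(); // returns immediately, 0 keys selected
            wakeups++;
        }
        System.out.println("zero-key wakeups in 1s: " + wakeups);
        selector.close();
    }
}
```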
Yes, I can reproduce the issue, but only in a Tinycore virtual appliance with 2 cores. I couldn't reproduce it with more than two cores or on my Mac. I have attached the DEBUG logs and thread dump. I wonder whether this could be related to the following bug report.
Can you test with a recent (not-EOL) Linux 4.x kernel?
I tried with Linux Kernel 4.19 in CentOS 7. Got the same issue. I attached the full log file. Below is an extract.
@prasadlvi, the log only shows the loop, not what happened before the loop started, which is the critical information.
Yes, I understand. I'll attach the necessary information as soon as possible.
I attached the server dump and the full log file.
@prasadlvi it seems you can reproduce this easily, so can you try a recent Linux kernel and JDK 11 (since in JDK 11 the NIO code was basically rewritten)?
The logs show this:
I cannot see anything wrong in the way Jetty handles this particular case. @prasadlvi it's now even more important that you test with JDK 11.
@sbordet Co-worker of @prasadlvi here; we have tried on several 3.x and 4.x kernels.
And all of them show the spin loop? Did you manage to try JDK 11?
I tried with JDK 11 and Linux 4.19.0 (jetty-9.4.12.v20180830) and was not able to reproduce the issue.
@gregw so this seems a JDK issue rather than an OS issue. We would need to put a counter in the selector loop, detect when it spin-loops (with zero keys?), and destroy and rebuild the selector, but we would also need to exclude that selector from being chosen when a new connection is accepted - we would need a lock 😞
Or we just System.exit(1) and let the server be restarted externally?
@gregw we can't
I think that random JVM failures are like out-of-memory exceptions: who knows what else is broken in the JVM, so it's best to exit and try again. Perhaps if a selector wakes up for no reason (0 keys and no tasks), then we can sleep for a while (say 50ms). That would stop the busy loop and we'd recover within 50ms if the selector comes good. It would be interesting to learn whether the selector ever does come good.
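As a rough illustration of that idea (this is not Jetty's ManagedSelector code; the class, the spin threshold, and the warning message are made up for the sketch, and the 50ms pause is just the value suggested above), a guarded NIO select loop might look like this:

```java
import java.io.IOException;
import java.nio.channels.Selector;

public class GuardedSelectLoop implements Runnable {
    private static final int SPIN_THRESHOLD = 256; // illustrative value

    private final Selector selector;

    public GuardedSelectLoop(Selector selector) {
        this.selector = selector;
    }

    @Override
    public void run() {
        int spins = 0;
        while (selector.isOpen()) {
            try {
                int selected = selector.select();
                if (selected == 0 && selector.selectedKeys().isEmpty()) {
                    // Woken up with nothing to do: a possible JVM/epoll spin.
                    if (++spins >= SPIN_THRESHOLD) {
                        System.err.println("Selector spinning with 0 keys; backing off 50ms");
                        Thread.sleep(50); // stop the busy loop, recover if it comes good
                        spins = 0;        // an alternative is rebuilding the Selector here
                    }
                    continue;
                }
                spins = 0;
                // ... dispatch selector.selectedKeys() to connection handlers ...
                selector.selectedKeys().clear();
            } catch (IOException | InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```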
Do not ever, under any circumstance, call System.exit(). You have no idea what the impact of that could be. Airplanes could crash, missiles could launch, respirators could stop. Sleeping 50ms seems perfectly reasonable. Logging something ominous every minute that the condition persists is also a good idea, so as to not sweep the problem under the rug.
@brettwooldridge I think the primary action should be to triage the JVM bug and make every effort to get it fixed. A secondary action would be to either find a change of behaviour in Jetty that avoids triggering the bug, or perhaps insert a pause to mitigate the impact of trying to continue on in the face of the bug.
I'm -1 on using System.exit().
Sorry about my "suggestion" of System.exit(1). Let's triage the issue some more and see if we can quantify exactly when it happens and whether it can be raised with OpenJDK. Only then can we work out whether it can be detected in the code and a non-obnoxious workaround put in place... but if the problem is not happening on current JVMs, then perhaps there is nothing we need to do, as the problem is already fixed?
Last update pointed to a JVM and/or kernel issue. |
So, how do we solve this?
@XuHe1 please open a new issue and be sure to add a server dump and a proof that you have a spin loop in the selector, along with exact Jetty version, OS, JDK version, etc. |
@prasadlvi how did you guys manage this behavior? |
@bhuvangu It was fixed after updating to Java 11. |
I am running embedded Jetty 9.4.7.v20170914 and observed 100% CPU usage when there is no load on the server. After noticing #2205 I upgraded to the latest Jetty 9.4.12.v20180830, but the observed result was the same.
Environment
java version "1.8.0_181"
Linux 3.16.38-tinycore64 #1 SMP Thu Nov 17 15:37:06 UTC 2016 x86_64 GNU/Linux
I found the CPU-consuming threads using:
top -H -o %CPU -p <process id>
When looking at the thread dump, it seems the same two threads in the above log consume most of the CPU. Does this look like a bug in Jetty? Or could it be a problem with my application?
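For reference, the same per-thread CPU information can also be collected from inside the JVM with ThreadMXBean, which makes it easier to match hot threads to the names in the thread dump (a generic sketch, not part of the original report; the class name is made up). The nid value shown in a thread dump is the native thread id in hex, i.e. the PID column that top -H reports, so the two views can be correlated either by id or by thread name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadCpuReport {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if unsupported or disabled
            if (info != null && cpuNanos > 0) {
                // Thread names here match the names in the jstack/thread dump,
                // e.g. the Jetty selector threads.
                System.out.printf("%-60s cpu=%d ms%n",
                        info.getThreadName(), cpuNanos / 1_000_000);
            }
        }
    }
}
```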