CPU steal stuck at 100% #1210
Comments
I currently own the VPS affected by this problem, in case something needs to be tested on the machine itself.
Mmm can you try this and paste the output?
|
|
can you please also post the results of:
|
Timeline of events:
|
From what I see in the logs requested by @giampaolo, the "steal" value actually decreases every second instead of going up (the values are supposed to be cumulative). Looking at the first two results:
When we compute the percentage, we divide the difference in the specific field (steal) by the total difference of the CPU times. In this case almost all of the difference is the decrease in steal time, so we return 100%:
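As a rough illustration of that arithmetic (this is not the snippet originally posted in this comment, and not psutil's actual code), the sketch below divides the steal delta by the total delta for two samples. The field subset, the sample values, and the clamping to the 0-100 range are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical subset of the cumulative CPU fields; the values are made up.
CpuTimes = namedtuple("CpuTimes", ["user", "system", "idle", "steal"])

before = CpuTimes(user=100, system=50, idle=1000, steal=5000)
after = CpuTimes(user=101, system=50, idle=1001, steal=4900)  # steal went DOWN


def naive_steal_percent(t1, t2):
    """Steal delta divided by the total delta, with no guard for decreases."""
    steal_delta = t2.steal - t1.steal   # -100 for the samples above
    total_delta = sum(t2) - sum(t1)     # -98: dominated by the steal decrease
    if total_delta == 0:
        return 0.0
    percent = 100.0 * steal_delta / total_delta
    # Assumption: the result is clamped to the 0..100 range.
    return min(max(percent, 0.0), 100.0)


print(naive_steal_percent(before, after))
```

Both deltas are negative, so their ratio is positive and slightly above 1, which ends up reported as 100% steal.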
A decrease in the cumulative steal time should not happen, but apparently it can happen erroneously: psutil should probably ignore negative differences if values in "/proc/stat" decrease. Something like:
"top" is doing this: https://github.com/thlorenz/procps/blob/faa41f864a599854ceafa4ea634b29a6924bbbe6/deps/procps/top/top.c#L5017
Wow! Thanks for the fine brain work, Arnon! psutil already provides some logic to prevent numbers from wrapping:
I think we need to use the same logic as in "top" which is to wrap ("TRIMZ") each delta separately (they do it for all values, not just "steal") and then sum up the "trimmed" deltas to get the total to divide by. This way negative differences won't impact other fields when we use |
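A minimal sketch of the per-field trimming idea described in this comment, assuming a hypothetical `trimmed_percentages` helper and made-up sample values (this is not the patch that was eventually merged): each delta is clamped at zero on its own, and the total is the sum of the clamped deltas.

```python
from collections import namedtuple

# Hypothetical subset of the cumulative CPU fields; the values are made up.
CpuTimes = namedtuple("CpuTimes", ["user", "system", "idle", "steal"])

before = CpuTimes(user=100, system=50, idle=1000, steal=5000)
after = CpuTimes(user=101, system=50, idle=1001, steal=4900)  # steal went DOWN


def trimmed_percentages(t1, t2):
    """Clamp every per-field delta at zero, then divide by the clamped total."""
    deltas = {
        field: max(0, getattr(t2, field) - getattr(t1, field))
        for field in t1._fields
    }
    total = sum(deltas.values())
    if total == 0:
        return {field: 0.0 for field in deltas}
    return {field: 100.0 * delta / total for field, delta in deltas.items()}


result = trimmed_percentages(before, after)
print(result)
```

Because the total is built from the trimmed deltas, a counter that goes backwards contributes nothing to either the numerator or the denominator, so it cannot distort the other fields.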
Arnon: don't know, let me check.
Yes, you need to be a collaborator. I just made you one (then I suppose you can assign the issue to yourself).
@giampaolo Yes, it is running KVM on Xen (AWS)
Thanks!
The only thing is that |
Actually I'm not entirely sure
...in which case
This case is different as AFAIU we can have numbers like this:
I suppose what should happen in this case is that 955 should be translated to 1000 (wrap_numbers would translate it to 1955, which is wrong).
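To make the difference concrete, here is a small sketch contrasting the two behaviours. Both helpers are hypothetical: `unwrap_counter` is a simplified stand-in for wrap-around handling, not psutil's actual `wrap_numbers` implementation, and the leading sample value 900 is made up (1000 and 955 come from the comment above).

```python
def clamp_monotonic(samples):
    """Never let the reported value go below the highest value seen so far."""
    out = []
    high = None
    for value in samples:
        high = value if high is None else max(high, value)
        out.append(high)
    return out


def unwrap_counter(samples):
    """Treat any decrease as a counter wrap and add the pre-wrap value back."""
    out = []
    offset = 0
    prev = None
    for value in samples:
        if prev is not None and value < prev:
            offset += prev  # assumes the counter restarted from zero
        out.append(value + offset)
        prev = value
    return out


samples = [900, 1000, 955]
print(clamp_monotonic(samples))  # [900, 1000, 1000]
print(unwrap_counter(samples))   # [900, 1000, 1955]
```

Wrap-around handling assumes the counter restarted from zero, which inflates the value when the counter merely stepped back a little; clamping to the previous maximum avoids that.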
Patch in #1214 merged. Please close this issue.
okay
…etimes go backwards Signed-off-by: Giampaolo Rodola <g.rodola@gmail.com>
Occasionally psutil returns a CPU steal time of 100%; the only way to get this back to the correct value is by rebooting the system.
Running version 5.4.3 on AWS Ubuntu 16.04 xenial.