[Windows] CPU percent is incorrect (perf counters) #2467
Internally the code uses psutil/psutil/arch/windows/cpu.c Lines 103 to 105 in 7cae974
Unfortunately that function's documentation says
Of course the alternate function is completely wrong, it is the one that only gives System times:
I've seen other functions change behavior in Windows 11. This code should probably be switched to use performance counters ("Processor Information").
According to this description, both Note: internally
So are we sure For reference, here's the links to psutil implementation
Related: #2384 (comment).
ChatGPT seems to confirm
It's unfortunate that I have to learn this from an AI instead of the MS docs. :-\
If this is true, it may indeed make sense to calculate system CPU times by using perf counters. I remember that you, Daniel (@dbwiddis), did something similar: you replaced a native Windows API with performance counters for
There seems to be one problem: according to the code (e.g. see here and here), some performance counters may be disabled and fail. As such, we should probably ship a dual implementation: try perf counters first, and fall back to the native Windows API if they fail.
One point is still unsolved, since we're discussing 2 problems here: it's not clear how to replace
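The "dual implementation" idea above can be sketched as a simple fallback chain. This is only an illustration under stated assumptions: `from_perf_counters`, `from_native_api`, `CounterUnavailable` and the `(user, system, idle)` tuple shape are hypothetical stand-ins, not psutil or Windows APIs.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical result shape: (user, system, idle) seconds.
CpuTimes = Tuple[float, float, float]


class CounterUnavailable(Exception):
    """Raised by a backend that cannot provide data (e.g. counters disabled)."""


def first_available(backends: Sequence[Callable[[], CpuTimes]]) -> CpuTimes:
    """Return the result of the first backend that works, in priority order."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except CounterUnavailable as exc:
            # Remember why this backend failed, then try the next one.
            errors.append(f"{backend.__name__}: {exc}")
    raise CounterUnavailable("; ".join(errors))


def from_perf_counters() -> CpuTimes:
    # Stand-in for a perf-counter query; here it simulates disabled counters.
    raise CounterUnavailable("performance counters are disabled")


def from_native_api() -> CpuTimes:
    # Stand-in for the existing native API path; the values are made up.
    return (12.0, 3.0, 85.0)


times = first_available([from_perf_counters, from_native_api])
print(times)
```

Keeping the backends in an ordered list makes it easy to add a third source (for example a WMI query) later without touching the calling code.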
Yes, that's generally what I've done over on the Java/JNA side.
Having navigated through the range of associated problems over the years and implemented multiple fallbacks, yes, "it's complicated". Here are some of the obstacles:
In both of the above cases, it may be possible to use a WMI table to fetch the counters from the same source without using the PDH functions. It can be slower (COM overhead) but typically works as a backup.
When they're disabled, you can't do anything; WMI doesn't even work as a backup. Just say so in an error message, but allow for configuration to minimize log messages in that case. :)
Those are the "Processor Information" performance counters. Here's the corresponding WMI table (it's the 'formatted' one that gives the usage metrics you'd expect; the 'raw' data gives "ticks").
Note that "Processor Information" is processor-group aware but requires Windows 7+. There is a similar "Processor" performance counter that can be used pre-Win7, but it is not processor-group aware.
Also note that "Processor Information" can give you "real" tick counts, but then your users will complain that you don't match the Task Manager output, so you'll need a configuration option to choose whether to use the "Utility" counters rather than the "Percent" counters.
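To illustrate the 'raw' versus 'formatted' distinction: raw counters are cumulative ticks, so a usage figure has to be derived from the difference between two samples. A minimal sketch, assuming each sample exposes cumulative idle and total ticks (the `TickSample` type and its field names are hypothetical, not real counter names):

```python
from typing import NamedTuple


class TickSample(NamedTuple):
    """Hypothetical snapshot of cumulative tick counts for one CPU."""
    idle: int
    total: int


def busy_percent(prev: TickSample, curr: TickSample) -> float:
    """Percentage of non-idle ticks between two snapshots."""
    total_delta = curr.total - prev.total
    if total_delta <= 0:
        # No time elapsed (or the counter was reset): report no load.
        return 0.0
    idle_delta = curr.idle - prev.idle
    busy = 100.0 * (total_delta - idle_delta) / total_delta
    # Clamp to guard against small inconsistencies between the two counters.
    return round(max(0.0, min(100.0, busy)), 1)


before = TickSample(idle=100, total=1000)
after = TickSample(idle=400, total=2000)
print(busy_percent(before, after))
```

In this made-up example 300 of the 1000 elapsed ticks were idle, so the function reports 70.0.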
That's a lot to chew on. Let's see what I can do. In the meantime... thanks as always. =) The above info is very useful.
Thank you for taking the time to address this issue. Answering your question: yes, both versions (percpu=True and percpu=False) produce incorrect values.
Summary
Description
When using cpu_percent with percpu=False to display CPU load, the value is always much lower than expected: cpu_percent returns a single-digit percentage while the CPU is actually loaded to around 50-70% according to Task Manager. When using percpu=True, only one element in the list contains a large number (the high-load element seems to change from run to run), and that number roughly corresponds to the full CPU utilization (see the example output below). The CPU has 12 cores and 24 threads.
Code snippet:
Example output:
CPU load: [0.0, 0.0, 1.6, 3.1, 0.0, 3.1, 0.0, 4.7, 0.0, 0.0, 0.0, 1.6, 0.0, 4.7, 1.6, 0.0, 1.6, 3.1, 3.1, 0.0, 0.0, 3.1, 1.6, 42.4]%
CPU load: [3.1, 3.1, 6.2, 1.6, 0.0, 3.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.6, 0.0, 0.0, 1.6, 0.0, 0.0, 3.1, 1.6, 41.5]%
CPU load: [0.0, 1.6, 6.2, 6.2, 0.0, 0.0, 1.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.1, 0.0, 0.0, 0.0, 0.0, 0.0, 70.1]%
CPU load: [4.6, 0.0, 3.1, 4.7, 0.0, 1.6, 1.6, 1.6, 1.6, 1.6, 4.7, 3.1, 0.0, 3.1, 10.9, 3.1, 0.0, 4.7, 3.1, 10.9, 1.6, 0.0, 3.1, 50.0]%
CPU load: [0.0, 0.0, 0.0, 6.3, 0.0, 0.0, 1.6, 3.1, 0.0, 0.0, 3.1, 0.0, 0.0, 3.1, 3.1, 1.6, 1.6, 3.1, 0.0, 3.1, 0.0, 1.6, 0.0, 35.4]%
That can't be correct behavior. Expected result would be to have roughly even load across all cores as seen in the attached screenshot.
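As a quick arithmetic check on the example output, the snippet below averages the first row and locates its largest element. It assumes the aggregate (percpu=False) figure is close to the mean of the per-CPU values, which is an approximation rather than a statement about how psutil computes it.

```python
from statistics import mean

# First row of the example output above (24 logical CPUs).
sample = [0.0, 0.0, 1.6, 3.1, 0.0, 3.1, 0.0, 4.7, 0.0, 0.0, 0.0, 1.6,
          0.0, 4.7, 1.6, 0.0, 1.6, 3.1, 3.1, 0.0, 0.0, 3.1, 1.6, 42.4]

average = mean(sample)
# Index of the logical CPU reporting the highest value.
hottest = max(range(len(sample)), key=sample.__getitem__)

print(f"mean={average:.1f}% hottest index={hottest} ({sample[hottest]}%)")
```

The mean comes out at roughly 3%, matching the single-digit aggregate described above, while the one large value sits in the last element of the list.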