
Bug in impl Sub<CpuUsage> for CpuUsage #199

Open
nbigaouette opened this issue Feb 12, 2020 · 3 comments
Labels
A-cpu Area: heim-cpu crate C-bug Category: something isn't working C-design Category: design decisions and discussions

Comments


nbigaouette commented Feb 12, 2020

As reported on Gitter, I think I've found a bug in the impl Sub<CpuUsage> for CpuUsage.

I was trying to get the CPU usage and verify it against htop's CPU% column. It turned out there was a factor of 2 between what heim was reporting and what htop was reporting.

The verification is simple: I create two threads that busy-loop long enough to be measurable (say 5 seconds). Because the two threads are fully busy, I expected to see 200% as the CPU usage. This is what htop reports; heim, on the contrary, reports 100%.

Note that this is hardware dependent. I've run this on my MacBook Pro, which contains an i7-7820HQ. This (single) CPU is a quad-core with hyperthreading.

❯ sysctl hw.logicalcpu; sysctl hw.physicalcpu; sysctl hw.ncpu
hw.logicalcpu: 8
hw.physicalcpu: 4
hw.ncpu: 8

htop sees the 8 cores and reports a % for each.

Here's the really simple code I've used:

async fn run() {
    let process = heim::process::current().await.unwrap();

    let cpu_usage_before = process.cpu_usage().await.unwrap();
    let child1 = std::thread::spawn(move || {
        // spin loop
        for _i in 0..100_000_000 {}
    });
    let child2 = std::thread::spawn(move || {
        // spin loop
        for _i in 0..100_000_000 {}
    });
    child1.join().unwrap();
    child2.join().unwrap();
    let cpu_usage_after = process.cpu_usage().await.unwrap();
    println!(
        "Usage (%): {}",
        (cpu_usage_after - cpu_usage_before).get::<heim::units::ratio::percent>()
    );
}

fn main() {
    let mut rt = tokio::runtime::Runtime::new().unwrap();
    rt.block_on(run());
}
Cargo.toml:

[dependencies]
heim = "0.0.9"
tokio = { version = "0.2", default-features = false }

On my machine, the busy-loops take just under 5 seconds.

When run, the code above reports:

❯ cargo run    
   Compiling t v0.1.0 (/tmp)
    Finished dev [unoptimized + debuginfo] target(s) in 0.99s
     Running `target/debug/t`
Usage (%): 102.349174

Note that I am not using --release as I don't want the loop to be optimized out.

I looked at the impl Sub<CpuUsage> for CpuUsage code and I think I've found the bug.

After having calculated the time difference between the two CpuUsage:
https://github.com/heim-rs/heim/blob/v0.0.9/heim-process/src/process/cpu_usage.rs#L25
this difference is multiplied by the cpu_count:
https://github.com/heim-rs/heim/blob/v0.0.9/heim-process/src/process/cpu_usage.rs#L33-L34

A ratio is obtained by dividing the user+system time difference by this scaled time difference between the two CpuUsage values:
https://github.com/heim-rs/heim/blob/v0.0.9/heim-process/src/process/cpu_usage.rs#L37
This results in the ratio carrying a factor of 1 / cpu_count.

Then later on the ratio is multiplied by the cpu_count:
https://github.com/heim-rs/heim/blob/v0.0.9/heim-process/src/process/cpu_usage.rs#L38
This cancels out the previous 1 / cpu_count.

At this point, the variable single_cpu_ratio is independent of cpu_count, which I believe is not the expected behaviour.
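The cancellation can be sketched with plain arithmetic. This is a simplified model of the subtraction as described above, not heim's actual code; all names and signatures are illustrative:

```rust
// Simplified model of heim v0.0.9's `impl Sub<CpuUsage> for CpuUsage`
// as described above. Names are illustrative, not heim's API.
fn usage_percent(busy_delta_s: f64, wall_delta_s: f64, cpu_count: f64) -> f64 {
    // The time difference between the two `CpuUsage` is multiplied by cpu_count...
    let time_delta = wall_delta_s * cpu_count;
    // ...so the ratio carries a hidden factor of 1 / cpu_count...
    let single_cpu_ratio = busy_delta_s / time_delta;
    // ...which the final multiplication by cpu_count cancels out again.
    single_cpu_ratio * cpu_count * 100.0
}

fn main() {
    // Two fully busy threads for 5 s of wall time: 10 s of user+system time.
    for cpus in [1.0, 4.0, 8.0] {
        // Prints 200 for every cpu_count: the two factors cancel.
        println!("cpu_count = {cpus}: {} %", usage_percent(10.0, 5.0, cpus));
    }
}
```

Whatever cpu_count is, the result is busy time divided by wall time.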

I was told the code was inspired by / ported from Python's psutil package:
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L958-L1048

Looking at this code, the comments do mention the per-core scaling. See:
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L1024-L1026
and:
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L1033-L1035

Surprisingly, the Python code seems to suffer from the same "ratio is independent of cpu count" issue! The timer() function's output is scaled by the number of CPUs:
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L999
This timer() output is used to measure the time difference delta_time:
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L1002
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L1018
So at this point delta_time is scaled by the number of CPUs.

The ratio is then taken:
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L1027
resulting in a factor of 1 / cpu_count in overall_cpus_percent. Finally, the overall_cpus_percent variable is multiplied back by the number of CPUs, cancelling the factor out!
https://github.com/giampaolo/psutil/blob/e65cc95de72828caed74c7916530dd74fca351e3/psutil/__init__.py#L1047
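The psutil arithmetic can be sketched the same way; here the cpu_count scaling is baked into the timer rather than applied to the delta. Again a simplified model with illustrative names, not psutil's actual implementation:

```rust
// Simplified model of psutil's cpu_percent() arithmetic as described above.
// Names are illustrative; this is not psutil's actual code.
fn cpu_percent(delta_proc_s: f64, wall_delta_s: f64, num_cpus: f64) -> f64 {
    // timer() returns wall time * num_cpus, so delta_time is pre-scaled.
    let delta_time = wall_delta_s * num_cpus;
    // overall_cpus_percent therefore carries a factor of 1 / num_cpus...
    let overall_cpus_percent = delta_proc_s / delta_time * 100.0;
    // ...which the final scaling by num_cpus cancels out again.
    overall_cpus_percent * num_cpus
}

fn main() {
    // The result is identical whether num_cpus is 1 or 8: the scaling cancels.
    println!("{} {}", cpu_percent(10.0, 5.0, 1.0), cpu_percent(10.0, 5.0, 8.0));
}
```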

Looking at psutil's issue tracker, I found the following:


nbigaouette commented Feb 12, 2020

I got the same behaviour on a Linux machine (Intel(R) Xeon(R) CPU E5-2650, 48 cores, including HT).


nbigaouette commented Feb 12, 2020

I tried inside a Docker container on my mac (docker run -it --rm rust@sha256:7b7fb0682b12e5d0683a2a85e7f8fe34236d9b4b9c137d7395cec00dca09bcc5):

Note that I assigned a single core to the Docker VM on my mac:

root@2755be76a731:~/c# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 158
model name      : Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
stepping        : 9
cpu MHz         : 2900.000
cache size      : 8192 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch pti fsgsbase bmi1 hle avx2 bmi2 erms rtm xsaveopt arat
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips        : 5808.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:
root@2755be76a731:~/c# /usr/local/cargo/bin/cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.11s
     Running `target/debug/c`
Usage (%): 50.306522

If I look at htop during this run, only a single CPU is shown. I see both threads using 50% each for a total of 100%:

  CPU[||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]   Tasks: 7, 3 thr; 1 running
  Mem[|||||||||||||||||||||||||||||||||||||||||||||||||   302M/1.94G]   Load average: 0.74 0.49 0.30
  Swp[|                                                   780K/1024M]   Uptime: 00:15:51

  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 1935 root       21   1  5324  3696  2736 R  0.0  0.2  0:00.17 htop
 1927 root       20   0  4000  3240  2736 S  0.0  0.2  0:00.01 -bash
 1926 root       20   0  8360  3348  2748 S  0.0  0.2  0:00.17 tmux
    1 root       20   0  3992  2084  1756 S  0.0  0.1  0:00.05 bash
 1931 root       20   0  4000  3020  2612 S  0.0  0.1  0:00.00 -bash
 1942 root       20   0  8180  3212  2876 S  0.0  0.2  0:00.00 tmux a
 1961 root       20   0  201M   876   748 S 99.3  0.0  0:12.67 /usr/local/rustup/toolchains/1.41.0-x86_64-unknown-linux-gnu/bin/cargo run
 2046 root       20   0  201M   876   748 S  0.0  0.0  0:00.00 target/debug/c
 2047 root       21   1  201M   876   748 R 44.4  0.0  0:05.96 target/debug/c
 2048 root       20   0  201M   876   748 R 55.0  0.0  0:06.63 target/debug/c

See PID 2047 and 2048 which are the two threads.

What is weird about the above is that the VM has a single CPU (a single core, even). Both heim::cpu::logical_count() and heim::cpu::physical_count() report 1, so the "cancelling out cpu_count" issue should not affect this case...

@nbigaouette

I've opened an issue on psutil which seems to suffer from this too: giampaolo/psutil#1700

@svartalf svartalf added A-cpu Area: heim-cpu crate C-bug Category: something isn't working C-design Category: design decisions and discussions labels May 29, 2020