Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Linux] process IO wait #1527

Closed
giampaolo opened this issue Jun 11, 2019 · 3 comments
Closed

[Linux] process IO wait #1527

giampaolo opened this issue Jun 11, 2019 · 3 comments
Assignees

Comments

@giampaolo
Copy link
Owner

I'm benchmarking a network app by using 2 different algorithms: plain send() and zero-copy sendfile(). I would like to know how much CPU time the process spends waiting for IO to complete (e.g. on a single send() or sendfile() call). Linux exposes this information at system level (psutil.cpu_times().iowait). It seems we can do this also on a per-process level by reading delayacct_blkio_ticks field from /proc/pid/stat:
https://unix.stackexchange.com/questions/459288/where-to-find-iowait-stat-for-a-single-process-as-a-file
I still have to understand whether this is really useful to have and whether other platforms provide the same metric (I guess they don't), so basically this is a possible enhancement proposal. The API should be exposed as a Process.iowait() method.

@giampaolo
Copy link
Owner Author

...or possibly it can be added to the namedtuple returned by Process.io_counters().

@giampaolo
Copy link
Owner Author

giampaolo commented Jun 11, 2019

On a second thought it's better to return this as part of Process cpu_times() (it also requires less privileges). Attaching a patch here. I wrote a script which keeps writing stuff in a file though, and psutil.cpu_times().iowait increases quite a bit while psutil.Process().cpu_times().iowait does much more rarely so I'm not sure how to interpret this.

diff --git a/HISTORY.rst b/HISTORY.rst
index b9e08da1..be2d88ce 100644
--- a/HISTORY.rst
+++ b/HISTORY.rst
@@ -7,6 +7,8 @@ XXXX-XX-XX
 
 **Enhancements**
 
+- 1527_: [Linux] process cpu_times() now reports 'iowait' time, which is the
+  time spent waiting for blocking I/O to complete.
 - 1530_: [NetBSD] add process cwd() support.  (patch by Kamil Rytarowski)
 
 **Bug fixes**
diff --git a/docs/index.rst b/docs/index.rst
index f068967e..f0f7cabb 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1403,16 +1403,25 @@ Process class
 
   .. method:: cpu_times()
 
-    Return a `(user, system, children_user, children_system)` named tuple
-    representing the accumulated process time, in seconds (see
-    `explanation <http://stackoverflow.com/questions/556405/>`__).
-    On Windows and macOS only *user* and *system* are filled, the others are
-    set to ``0``.
+    Return a named tuple representing the accumulated process times, in seconds
+    (see `explanation <http://stackoverflow.com/questions/556405/>`__).
     This is similar to `os.times`_ but can be used for any process PID.
 
+    - **user**: time spent in user mode.
+    - **system**: time spent in kernel mode.
+    - **children_user**: user time of all child processes (always ``0`` on
+      Windows and macOS).
+    - **system_user**: user time of all child processes (always ``0`` on
+      Windows and macOS).
+    - **iowait**: (Linux only) time spent waiting for blocking I/O to complete.
+      This value is excluded from `user` and `system` times count.
+
     .. versionchanged::
       4.1.0 return two extra fields: *children_user* and *children_system*.
 
+    .. versionchanged::
+      5.6.4 added *iowait* on Linux.
+
   .. method:: cpu_percent(interval=None)
 
     Return a float representing the process CPU utilization as a percentage
diff --git a/psutil/_pslinux.py b/psutil/_pslinux.py
index e4bc7d75..b094aba0 100644
--- a/psutil/_pslinux.py
+++ b/psutil/_pslinux.py
@@ -200,6 +200,10 @@ pmmap_ext = namedtuple(
 pio = namedtuple('pio', ['read_count', 'write_count',
                          'read_bytes', 'write_bytes',
                          'read_chars', 'write_chars'])
+# psutil.Process.cpu_times()
+pcputimes = namedtuple('pcputimes',
+                       ['user', 'system', 'children_user', 'children_system',
+                        'iowait'])
 
 
 # =====================================================================
@@ -1576,6 +1580,7 @@ class Process(object):
         ret['children_stime'] = fields[14]
         ret['create_time'] = fields[19]
         ret['cpu_num'] = fields[36]
+        ret['blkio_ticks'] = fields[39]  # aka 'delayacct_blkio_ticks'
 
         return ret
 
@@ -1707,7 +1712,8 @@ class Process(object):
         stime = float(values['stime']) / CLOCK_TICKS
         children_utime = float(values['children_utime']) / CLOCK_TICKS
         children_stime = float(values['children_stime']) / CLOCK_TICKS
-        return _common.pcputimes(utime, stime, children_utime, children_stime)
+        iowait = float(values['blkio_ticks']) / CLOCK_TICKS
+        return pcputimes(utime, stime, children_utime, children_stime, iowait)
 
     @wrap_exceptions
     def cpu_num(self):
diff --git a/psutil/tests/test_process.py b/psutil/tests/test_process.py
index 512a8437..82c79c47 100755
--- a/psutil/tests/test_process.py
+++ b/psutil/tests/test_process.py
@@ -251,6 +251,8 @@ class TestProcess(unittest.TestCase):
         assert (times.user > 0.0) or (times.system > 0.0), times
         assert (times.children_user >= 0.0), times
         assert (times.children_system >= 0.0), times
+        if LINUX:
+            assert times.iowait >= 0.0, times
         # make sure returned values can be pretty printed with strftime
         for name in times._fields:
             time.strftime("%H:%M:%S", time.localtime(getattr(times, name)))

@giampaolo
Copy link
Owner Author

I confirm this does what intended: it's a counter which keeps track of how much time the CPU waits for blocking I/O to complete. To test this I created a 1G file, I then evicted the page/file cache with vmtouch utility, then I ran this script which read()s the whole file into memory:

import psutil
sysio1 = psutil.cpu_times().iowait
f = open('filein', 'rb')
f.read()
print(psutil.Process().cpu_times())
sysio2 = psutil.cpu_times().iowait
print(sysio2 - sysio1)

It prints:

pcputimes(user=0.03, system=0.67, children_user=0.0, children_system=0.0, iowait=0.08)
0.08000000000004093

Process iowait and system-wide iowait counters match, so they refer to the same thing. Similarly to system-wide iowait time, this represents an "idle" time, meaning the CPU is not being used. As such the doc should probably mention that sum(proc.cpu_times()) will not represent the total "busy" time of the process anymore.

We do a similar thing when calculating system-wide CPU percent:

psutil/psutil/__init__.py

Lines 1775 to 1781 in c10df5a

# Linux: "iowait" is time during which the CPU does not do anything
# (waits for IO to complete). On Linux IO wait is *not* accounted
# in "idle" time so we subtract it. Htop does the same.
# References:
# https://github.com/torvalds/linux/blob/
# 447976ef4fd09b1be88b316d1a81553f1aa7cd07/kernel/sched/cputime.c#L244
busy -= getattr(times, "iowait", 0)

...we take into account iowait and substract it from the total "busy" CPU time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant