Speedup Process methods #799

Closed
giampaolo opened this issue Mar 29, 2016 · 20 comments
giampaolo commented Mar 29, 2016

This is something I've been thinking about for a while. The problem with the current Process class implementation is that if you want to fetch multiple pieces of process info, the underlying (C / Python) implementation may unnecessarily do the same work more than once.

For instance, on Linux we read the /proc/pid/stat file to get terminal, cpu_times and create_time, and each time we invoke one of those methods we open the file and read it in full. We extract the one piece of info we're interested in and discard the rest.
A similar thing happens on basically every OS. For instance, on BSD we use the kinfo_proc sysctl call to get basically 80% of all process info (uids, gids, create_time, ppid, io_counters, status, etc.).
Again, all this info is retrieved in a single shot (in C) but re-requested every time we call a Process method.
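As an illustration (a minimal sketch following proc(5)'s documented field layout, not psutil's actual code), a single read of /proc/<pid>/stat can serve several of those metrics at once:

import os

CLOCK_TICKS = os.sysconf("SC_CLK_TCK")  # kernel jiffies per second

def stat_info(pid):
    # A single open()/read() of /proc/<pid>/stat yields the terminal,
    # the CPU times and the start time all at once.
    with open("/proc/%s/stat" % pid, "rb") as f:
        data = f.read()
    # The process name sits in parentheses and may itself contain
    # spaces, so split on the *last* ")".
    fields = data[data.rfind(b")") + 2:].split()
    return {
        "tty_nr": int(fields[4]),                      # field 7 in proc(5)
        "utime": float(fields[11]) / CLOCK_TICKS,      # field 14
        "stime": float(fields[12]) / CLOCK_TICKS,      # field 15
        "starttime": float(fields[19]) / CLOCK_TICKS,  # field 22 (since boot)
    }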

Since we typically fetch more than one piece of info per process (e.g. think about a top-like app), it appears clear that this could (and should) be done in a single operation. A possible solution would be to provide a context manager which temporarily puts the Process instance in a state such that, internally, the requested metrics are determined in a single shot and then "cached" / "stored" somewhere:

p = psutil.Process()
with p.oneshot():
    p.terminal()  # internally, this retrieves terminal, cpu_times and create time
    p.cpu_times()  # return the cached value
    p.create_time()  # return the cached value

Note: the Process.as_dict() method would use this context manager implicitly.
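A minimal sketch of how such a context manager could work (the _cache attribute and cached decorator below are hypothetical illustrations, not psutil's actual internals):

import contextlib
import functools

def cached(fun):
    # Serve the value from the per-instance cache, but only while the
    # cache is active, i.e. inside a oneshot() block.
    @functools.wraps(fun)
    def wrapper(self):
        if self._cache is None:
            return fun(self)
        try:
            return self._cache[fun.__name__]
        except KeyError:
            ret = self._cache[fun.__name__] = fun(self)
            return ret
    return wrapper

class Process:
    def __init__(self, pid):
        self.pid = pid
        self._cache = None  # caching is off outside of oneshot()

    @contextlib.contextmanager
    def oneshot(self):
        self._cache = {}  # turn caching on
        try:
            yield self
        finally:
            self._cache = None  # discard cached values on exit

    @cached
    def _parse_stat(self):
        # the expensive part: one open()/read() serving terminal(),
        # cpu_times() and create_time()
        with open("/proc/%s/stat" % self.pid) as f:
            return f.read().rpartition(")")[2].split()

    def cpu_times(self):
        fields = self._parse_stat()
        return (int(fields[11]), int(fields[12]))  # utime, stime (jiffies)

Inside the with block the first method call does the expensive read and every later call hits the cache; outside the block each call re-reads the file, preserving the current semantics.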

=== EDITS AFTER COMMENTS BELOW ===

Branch

master...oneshot#files_bucket

Benchmark scripts

Linux (+2.56x speedup)

$ python scripts/internal/bench_oneshot.py 
11 methods involved on platform 'linux2' (1000 iterations):
    cpu_percent
    cpu_times
    create_time
    gids
    name
    num_ctx_switches
    num_threads
    ppid
    status
    terminal
    uids
normal:  0.233 secs
oneshot: 0.091 secs
speedup: +2.56x

Windows (+1.9x or +6.5x speedup)

current user's process:

C:\Python27\python.exe scripts\internal\bench_oneshot.py
13 methods involved on platform 'win32' (1000 iterations, psutil 4.5.0)
    cpu_affinity
    cpu_percent
    cpu_times
    io_counters
    ionice
    memory_info
    memory_percent
    nice
    num_ctx_switches
    num_handles
    num_threads
    parent
    ppid
normal:  1.243 secs
oneshot: 0.655 secs
speedup: +1.90x

other user's process:

C:\Python27\python.exe scripts\internal\bench_oneshot.py
11 methods involved on platform 'win32' (1000 iterations, psutil 4.4.2):
    cpu_percent
    cpu_times
    create_time
    io_counters
    memory_info
    memory_percent
    num_ctx_switches
    num_handles
    num_threads
    parent
    ppid
normal:  5.027 secs
oneshot: 0.765 secs
speedup: +6.57x

FreeBSD (+2.18x speedup)

$ python scripts/internal/bench_oneshot.py 
13 methods involved on platform 'freebsd10' (1000 iterations):
    cpu_percent
    cpu_times
    create_time
    gids
    io_counters
    memory_full_info
    memory_info
    memory_percent
    num_ctx_switches
    ppid
    status
    terminal
    uids
normal:  0.121 secs
oneshot: 0.056 secs
speedup: +2.18x

OSX (+1.92x speedup)

$ python scripts/internal/bench_oneshot.py
14 methods involved on platform 'darwin' (1000 iterations):
    cpu_percent
    cpu_times
    create_time
    gids
    memory_info
    memory_percent
    name
    num_ctx_switches
    num_threads
    parent
    ppid
    terminal
    uids
    username
normal:  0.200 secs
oneshot: 0.104 secs
speedup: +1.92x

SunOS (+1.37x speedup)

$ python scripts/internal/bench_oneshot.py
12 methods involved on platform 'sunos5' (1000 iterations):
    cmdline
    create_time
    gids
    memory_full_info
    memory_info
    memory_percent
    name
    num_threads
    ppid
    status
    terminal
    uids
normal:  0.087 secs
oneshot: 0.064 secs
speedup: +1.37x
nicolargo (Contributor) commented:
+1 for this enhancement request. It will be awesome for the Glances project.

giampaolo commented Apr 30, 2016

I started working on this in a separate branch (master...oneshot#files_bucket) and completed the Linux implementation. The code below runs about twice as fast:

import psutil
import time

attrs = ['ppid', 'uids', 'gids', 'num_ctx_switches', 'num_threads', 'status',
         'name', 'cpu_times', 'terminal']
p = psutil.Process()
t = time.time()
for x in range(1000):
    p.as_dict(attrs)
print(time.time() - t)

nicolargo (Contributor) commented:
Any heads-up on this enhancement?

giampaolo (Owner) commented:
I completed the Linux implementation but I still have to benchmark it properly. All other platform implementations are still missing. It's gonna take a while.

giampaolo (Owner) commented:
Linux benchmark: with this I get a 2x speedup (twice as fast) when involving all the "one shot" methods, i.e. emulating the best possible scenario:

import psutil
import time


def doit(p):
    # these metrics are parsed from /proc/<pid>/stat
    p.name()
    p.terminal()
    p.cpu_times()
    p.create_time()
    p.status()
    p.ppid()
    # ...and these from /proc/<pid>/status
    p.num_ctx_switches()
    p.num_threads()
    p.uids()
    p.gids()


p = psutil.Process()

t = time.time()
for x in range(1000):
    doit(p)
print("normal:  %f" % (time.time() - t))

t = time.time()
for x in range(1000):
    with p.oneshot():
        doit(p)
print("oneshot: %f" % (time.time() - t))

Output:

normal:  0.189042
oneshot: 0.097632

giampaolo commented Aug 2, 2016

On FreeBSD, the impact of fetching multiple (14) pieces of info when only 1 is needed is negligible (0.46 secs vs. 0.42), so even when NOT using oneshot(), retrieving a single piece of process info does not slow things down.

giampaolo (Owner) commented:
Linux speedup went from 1.9x to 2.6x after f851be9.

giampaolo commented Aug 3, 2016

The BSD implementation is complete. On FreeBSD I get a +2.18x speedup.
I also added a benchmark script here: https://github.com/giampaolo/psutil/blob/oneshot/scripts/internal/bench_oneshot.py.

nicolargo commented Aug 4, 2016 via email

giampaolo commented Aug 4, 2016

Yes, this is intended for all OSes, even though Windows is probably gonna be the most difficult platform because it has fewer C APIs which can be used to retrieve multiple pieces of info in one shot.
BSD is the exact opposite: in one shot you get a whole blob of stuff:

#ifdef __FreeBSD__

The only Windows C call I can think of which is used basically all the time is OpenProcess.
We use a wrapper around it:

...which is extensively used in the main C extension module:

~/svn/psutil {master}$ grep psutil_handle psutil/_psutil_windows.c | wc -l
16

What we can do is get the handle once, store it in Python (as an int), then pass it back to the C extension as an argument, and keep doing this for as long as we're in the oneshot context (then on __exit__ we CloseHandle() it). The methods involved should be (at least): cpu_times(), create_time(), memory_info(), nice(), io_counters(), cpu_affinity(), num_handles() and memory_maps(). So yes, also on Windows there's a lot of room for speeding things up quite a bit.
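As a rough Python-level sketch of this idea using ctypes (illustrative only; the real work would happen inside the C extension, and the access mask below is an assumption):

import contextlib
import ctypes
from ctypes import wintypes

PROCESS_QUERY_INFORMATION = 0x0400  # assumed sufficient for query-only APIs

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.OpenProcess.restype = wintypes.HANDLE
kernel32.CloseHandle.argtypes = [wintypes.HANDLE]

@contextlib.contextmanager
def process_handle(pid):
    # Open the handle once; every query inside the block would reuse it,
    # mirroring what oneshot() would do on __enter__ / __exit__.
    handle = kernel32.OpenProcess(PROCESS_QUERY_INFORMATION, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        yield handle  # pass this int to each C routine as an argument
    finally:
        kernel32.CloseHandle(handle)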

giampaolo (Owner) commented:
Solaris implementation landed in 630b40d: +1.37x speedup.

giampaolo added a commit that referenced this issue on Aug 6, 2016: "…Handle in order to keep the handle reference at Python level and allow caching"
giampaolo (Owner) commented:
It turns out storing the OpenProcess handle in Python is slower than retrieving it in C every time. I experimented with this here:
oneshot...oneshot-win#files_bucket
...and I get a 1.5x slowdown. As such, Windows is apparently the only platform which cannot take advantage of this.

giampaolo commented Oct 7, 2016

OSX implemented as of 7b2a6b3 and cf21849. The speedup is 1.8x! Unless I'm missing something else, we should be done with all platforms.

nicolargo (Contributor) commented:
Good news @giampaolo!

giampaolo (Owner) commented:
OSX: going from a 1.8x to a 1.9x speedup with 1e8cef9.

giampaolo added a commit that referenced this issue Oct 28, 2016
giampaolo commented Oct 28, 2016

It turns out the apparent slowdown occurring on Windows as per my previous message (#799 (comment)) was due to the benchmark script not being stable enough, so we're good also on Windows.
The https://github.com/giampaolo/psutil/blob/7f51f0074b6d727a01fea0290ed0988dd51ad288/scripts/internal/bench_oneshot_2.py script, which relies on the perf module, shows a +1.2x speedup.
With c10a7aa and 3efb6bf I went from +1.2x to +1.8x.

giampaolo (Owner) commented:
The interesting thing about Windows is that, because some Process methods use a dual implementation (see #304), we can get a way bigger speedup for PIDs owned by other users, for which the first "fast" implementation raises AccessDenied.
On a high-privileged PID, by using oneshot() I am now getting an awesome +6.3x speedup!
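Schematically, the dual implementation looks like this (a hedged sketch with made-up helpers, not psutil's actual code):

class AccessDenied(Exception):
    pass

def fast_impl(pid):
    # hypothetical cheap per-process API; assume it fails for PIDs
    # owned by other users
    raise AccessDenied

def slow_impl(pid):
    # hypothetical expensive fallback, e.g. extracting the value from a
    # system-wide snapshot; works for any PID but costs a lot more
    return {"read_count": 0, "write_count": 0}  # dummy value

def io_counters(pid):
    try:
        return fast_impl(pid)
    except AccessDenied:
        return slow_impl(pid)

With oneshot() the expensive fallback can be paid once and shared across several methods, which is why the win is much larger for other users' processes.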

giampaolo commented Nov 5, 2016

OK, this is now merged into master as of de41bcc.

nicolargo (Contributor) commented:
Great job @giampaolo!

Many thanks.

suzaku commented Nov 7, 2016

Great job!

nlevitt added a commit to nlevitt/psutil that referenced this issue Apr 9, 2019