process_iter(): no longer check whether PIDs have been reused #2396

giampaolo · 2024-04-08T08:12:34Z

Summary

OS: all
Type: performance

Description

For every process yielded by psutil.process_iter(), we internally check whether the process PID has been reused, in which case we return a "fresh" Process instance. To check for PID reuse we are forced to create a new Process instance, retrieve its create_time() and compare it with that of the original process. Performance-wise, it turns out this has a huge (and exponential) cost. This is particularly relevant because process_iter() is typically used to write task-manager-like apps, where the full process list is retrieved every second. I realized this at work, while writing a process monitor agent that runs on small hardware (a cleaning robot).

By removing the PID reuse check I get a 21x speedup on a Linux OS with 481 running PIDs:

import time, psutil
started = time.monotonic()
for x in range(1000):
    list(psutil.process_iter())
print(f"completed in {(time.monotonic() - started):.4f} secs")

Current master:
Number of pids: 481. Completed in 5.1079 secs

With PID reuse check removed:
Number of pids: 481. Completed in 0.2419 secs

Repercussions

PID reuse is already pre-emptively checked for "write" Process APIs such as kill(), terminate(), nice() (set), etc., so in that sense it won't make any difference and we'll remain safe.
There are some Process APIs that are cached: exe(), create_time() and name() (Windows only). In this case, if the PID has been reused, the Process instance will keep returning the old value. This doesn't happen with the current (slow) implementation, since process_iter() returns a brand new Process instance.
We may clear the Process cache on is_running(), but we cannot clear create_time()'s cache, as the old value is necessary to detect PID reuse. This basically means a PID-reused Process instance should just be discarded by process_iter() somehow (but how?).
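The three points above can be sketched with a small toy class. This is an illustration under stated assumptions, not psutil's implementation: `ToyProcess` and the dict-based process table are hypothetical, and `ProcessLookupError` stands in for whatever exception the real "write" APIs raise. It shows a write API guarded by a reuse check, a cached name() going stale, and an is_running() that clears the cached name while keeping the creation time.

```python
class ToyProcess:
    """Toy stand-in for psutil.Process backed by a dict: pid -> (create_time, name)."""

    def __init__(self, pid, table):
        self.pid = pid
        self._table = table
        self._create_time = table[pid][0]  # never cleared: needed to detect reuse
        self._name = None                  # lazily cached, like name() or exe()

    def name(self):
        # Cached after the first call, so it can go stale if the PID is reused.
        if self._name is None:
            self._name = self._table[self.pid][1]
        return self._name

    def is_running(self):
        entry = self._table.get(self.pid)
        if entry is None or entry[0] != self._create_time:
            self._name = None  # hypothetical: drop cached values that may be stale
            return False
        return True

    def kill(self):
        # "Write" APIs check for reuse first, so they never hit the wrong process.
        if not self.is_running():
            raise ProcessLookupError(f"pid {self.pid} is gone or was reused")
        del self._table[self.pid]
```

If the table entry for a PID is replaced by one with a different creation time, name() keeps returning the old cached value until is_running() is called, and kill() refuses to act. The open question from the last point remains: once is_running() returns False, the instance no longer describes any live process, so the caller (or process_iter() itself) still has to discard it.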


giampaolo · 2024-06-11T21:55:36Z

Fixed in 7556e5d and 89b6096.

giampaolo added the enhancement label Apr 8, 2024

github-actions bot added the performance label Apr 8, 2024

giampaolo mentioned this issue Apr 8, 2024

Is htop safe from PID reuse? htop-dev/htop#1441

Closed

nicolargo mentioned this issue May 4, 2024

PsUtil 6+ no longer check PID reused nicolargo/glances#2755

Closed

giampaolo closed this as completed Jun 11, 2024

freakboy3742 mentioned this issue Jun 23, 2024

Update psutil requirement from <6.0,>=5.9 to >=5.9,<7.0 beeware/briefcase#1885

Merged

Rixxan mentioned this issue Jul 29, 2024

[610] Improved process_iter EDCD/EDMarketConnector#2279

Merged

giampaolo mentioned this issue Sep 27, 2024

[Windows] process_iter() is 10x slower when running from non-admin account #2366

Closed

tim-schilling mentioned this issue Oct 4, 2024

Unpin psutil dependency scoutapp/scout_apm_python#790

Merged

ntindle mentioned this issue Oct 28, 2024

build(deps): bump psutil from 5.9.8 to 6.1.0 in /autogpt_platform/backend Significant-Gravitas/AutoGPT#8473

Merged
