get_children() sometimes contains non-children of a process #314

Closed
giampaolo opened this issue May 23, 2014 · 9 comments

Comments

@giampaolo
Owner

From stanchev.emil@gmail.com on August 06, 2012 11:49:51

What steps will reproduce the problem?  
(It is not guaranteed to reproduce every time.)
1. Create lots of processes (hundreds, maybe a thousand)
2. Run psutil.Process(pid).get_children() for all of them. 

What is the expected output?  
The actual children should be returned. 

What do you see instead?  
Some of the processes end up having "children" that are not really their children. 

What version of psutil are you using? What Python version?  
psutil-0.5, psutil-0.4.0, python-2.6.1 

On what operating system? Is it 32bit or 64bit version?  
64-bit Windows 2008 

Please provide any additional information below.  
This happens because, when scanning the process table, you only check for a 
match on p.ppid == self.pid.
PIDs can be reused, so a process can have a ppid that refers to a parent that 
is already dead, and that PID may now belong to an unrelated process.

Instead, I think you should also check that the parent's creation time is less 
than or equal to the child's creation time.
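
The proposed check can be sketched as follows. This is a minimal illustration only, not psutil's actual code: `ProcInfo` and `find_children` are hypothetical names, and the process table is a plain list of records rather than live processes.

```python
from collections import namedtuple

# Hypothetical record type for illustration; this is not psutil's Process class.
ProcInfo = namedtuple("ProcInfo", ["pid", "ppid", "create_time"])


def find_children(parent, table):
    """Return the entries of `table` that are plausibly children of `parent`."""
    children = []
    for p in table:
        if p.ppid != parent.pid:
            continue
        # A real child cannot have been created before its parent. If it was,
        # its ppid refers to a dead process whose PID has since been reused.
        if parent.create_time <= p.create_time:
            children.append(p)
    return children


table = [
    ProcInfo(pid=100, ppid=3948, create_time=10.0),   # orphan; its real parent is gone
    ProcInfo(pid=3948, ppid=1, create_time=50.0),     # new process that reuses PID 3948
    ProcInfo(pid=4000, ppid=3948, create_time=60.0),  # genuine child of the new process
]
parent = table[1]
print([p.pid for p in find_children(parent, table)])
```

Matching on ppid alone would also report PID 100 here; the creation-time comparison is what excludes it.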

Original issue: http://code.google.com/p/psutil/issues/detail?id=314

@giampaolo
Owner Author

From stanchev.emil@gmail.com on August 06, 2012 03:33:08

Adding patch that also covers the recursive version.

Attachment: children_bug_r2.patch
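
The contents of the attached patch are not shown in this thread. As a rough sketch of how the same creation-time check could carry over to a recursive lookup, assuming a hypothetical `ProcInfo` record and a hypothetical `find_descendants` helper rather than psutil's own API:

```python
from collections import namedtuple

# Hypothetical record type for illustration; this is not psutil's Process class.
ProcInfo = namedtuple("ProcInfo", ["pid", "ppid", "create_time"])


def find_descendants(root, table):
    """Return all descendants of `root`, skipping entries older than their parent."""
    found = []
    seen = {root.pid}  # guards against visiting the same PID twice
    stack = [root]
    while stack:
        parent = stack.pop()
        for p in table:
            if p.pid in seen or p.ppid != parent.pid:
                continue
            # Apply the check at every level: a process created before its
            # supposed parent is treated as a PID-reuse artifact.
            if parent.create_time <= p.create_time:
                seen.add(p.pid)
                found.append(p)
                stack.append(p)
    return found


table = [
    ProcInfo(pid=10, ppid=1, create_time=100.0),   # root
    ProcInfo(pid=11, ppid=10, create_time=110.0),  # child
    ProcInfo(pid=12, ppid=11, create_time=120.0),  # grandchild
    ProcInfo(pid=13, ppid=10, create_time=5.0),    # older than root: stale ppid
    ProcInfo(pid=14, ppid=13, create_time=6.0),    # only reachable through the stale entry
]
print(sorted(p.pid for p in find_descendants(table[0], table)))
```

Because the stale entry is rejected before it is pushed onto the stack, anything reachable only through it is excluded as well.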

@giampaolo
Owner Author

From g.rodola on August 06, 2012 04:57:54

Hmmm, I'm not sure I follow.
Are you reporting this because it actually happened, or is it just theoretical?
Please note that when we iterate through all processes we already make sure 
that no process PID has been reused: 
https://code.google.com/p/psutil/source/browse/tags/release-0.5.1/psutil/__init__.py#763
...and get_children() automatically inherits this check.
Could you provide some test code?

@giampaolo
Owner Author

From stanchev.emil@gmail.com on August 06, 2012 05:10:15

It actually happened.

I think the code you pointed out handles the case where a PID is reused 
between calls to process_iter()?

I am talking about something different. Example from my system:

1) explorer.exe has a PPID of 3948. There is no process with PID 3948 running.
2) I start a process X. It happens to reuse PID 3948.
3) X.get_children() returns explorer.exe as a child, which is obviously wrong.

Let me know if you need more information.

@giampaolo
Owner Author

From stanchev.emil@gmail.com on August 06, 2012 05:44:13

Attaching a reproduction script for Windows.
I think this bug does not apply to Linux, as an orphaned process gets adopted 
by init, so its ppid cannot point to a dead process?

Please be aware that it creates a lot of 'cmd' processes on the machine. It 
should clean up on Ctrl-C.
Here's the tail of my example run on the Windows 2008 machine using psutil-0.4:

329 (new PID=4200)
psutil.Process(pid=4200, name='cmd.exe') is not really a parent of 
psutil.Process(pid=4252, name='NetTime.exe')
psp.create_time <= c.create_time == False

Obviously NetTime.exe was not started by the script, and the processes 
started by the script do not have any children at all.

As the "False" value shows, the reported child is older than its supposed 
parent, so adding the create_time check would prevent this bug.

Attachment: reproduce_children_bug.py

@giampaolo
Owner Author

From g.rodola on August 06, 2012 05:55:59

OK, I get it now, and you're right: we should skip all children that appear 
to be older than their parents, since that means the parent's PID has been 
reused.
This should now be fixed in r1503.
At the moment I don't have a Windows box to test against, though.
Can you run reproduce_children_bug.py before and after r1503 to make sure the 
problem is fixed?

@giampaolo
Owner Author

From stanchev.emil@gmail.com on August 06, 2012 06:55:30

Verified with 5 runs of the test script on each revision: it failed all 5 times 
at r1502 and finished successfully (creating 1000 processes) all 5 times at r1503.

Thanks for the quick fix!

@giampaolo
Owner Author

From g.rodola on August 06, 2012 12:34:36

Great! Thanks for verifying.

Status: FixedInSVN
Labels: Milestone-1.0.0

@giampaolo
Owner Author

From g.rodola on August 13, 2012 09:25:14

Fixed in version 0.6.0, released just now.

Status: Fixed
Labels: -Milestone-1.0.0 Milestone-0.6.0

@giampaolo
Owner Author

From g.rodola on March 02, 2013 04:12:07

Updated csets after the SVN -> Mercurial migration: r1502 == revision ??? r1503 
== revision ???
