[FIX] MultiProc starting workers at dubious wd #2368
Conversation
When ``maxtasksperchild`` is set at a very low value, workers are created very often, sometimes at working directories deleted after the interface cleanup. That would trigger an ``OSError`` when calling ``os.getcwd()`` during ``nipype.config`` import. This PR sets an initializer for the workers that just changes to the appropriate working directory before the worker is spun up. Fixes nipreps/fmriprep#868
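The mechanism the PR describes can be sketched as follows. This is a minimal, self-contained illustration, not the PR's actual code: `_init_worker`, `_task_cwd`, and `run_pool` are hypothetical names, and a plain `multiprocessing.Pool` stands in for nipype's pool subclass. The key idea is that `Pool` runs the `initializer` once in every freshly spawned worker, so each worker starts from a directory that is guaranteed to exist.

```python
import os
import multiprocessing as mp


def _init_worker(cwd):
    # Hypothetical sketch of the PR's initializer: pin every freshly
    # spawned worker to a directory known to exist, so that code calling
    # os.getcwd() at import time cannot hit a deleted directory.
    os.chdir(cwd)


def _task_cwd(_):
    # With maxtasksperchild=1 each call lands in a brand-new worker,
    # which has already run _init_worker before receiving any task.
    return os.getcwd()


def run_pool(n_tasks=4):
    cwd = os.getcwd()
    with mp.Pool(processes=2, maxtasksperchild=1,
                 initializer=_init_worker, initargs=(cwd,)) as pool:
        return pool.map(_task_cwd, range(n_tasks))


if __name__ == '__main__':
    print(run_pool())
```

Every worker, however short-lived, reports the cached directory rather than whatever directory it happened to inherit at spawn time.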
Looks reasonable. Two questions.
nipype/pipeline/plugins/multiproc.py (outdated)
self.pool = NipypePool(
    processes=self.processors,
    maxtasksperchild=maxtasks,
    initializer=_init_worker,
Is there a reason this shouldn't just be `os.chdir`?
not really
@@ -128,6 +135,10 @@ def __init__(self, plugin_args=None):
        self._task_obj = {}
        self._taskid = 0

        # Cache current working directory and make sure we
        # change to it when workers are set up
        self._cwd = os.getcwd()
Is this the workflow base directory or the CWD of the shell? Could this cause things to dump into the local directory?
It shouldn't because interfaces handle their WD. I think this is fixing an edge case for fmriprep where we are spinning up and killing workers all the time.
Ok, it turns out that some `SimpleInterface`s are writing to the workflow base directory :(
Well that's... suboptimal.
It's actually troubling, because it means that without this patch those interfaces were being run in some other, unexpected path.
Apparently it is a problem only with `SimpleInterface`.
@satra I don't see potential side effects to this PR, but it'd be great if you took a peek through your window to the future, since we other humans don't have one.
Hold off merging, I still need to refine a minor nit.
@oesteban Still need to hold off?
No, once I identified that the problem is the …
When `maxtasksperchild` is set at a very low value (e.g. 1), workers are created very often, sometimes at working directories deleted after the interface cleanup. That would trigger an `OSError` when calling `os.getcwd()` during `nipype.config` import. This PR sets an initializer for the workers that just changes to the appropriate working directory before the worker is spun up. Fixes nipreps/fmriprep#868

The reason why one would be interested in `maxtasksperchild=1` is memory consumption. Using this option in addition to `multiprocessing` in `'forkserver'` mode ensures memory is freed after each interface has been run.
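The memory argument can be made concrete: with `maxtasksperchild=1` a worker is retired after a single task, so the OS reclaims everything it allocated, and the pool spawns a fresh replacement for the next task. A small sketch of that behavior follows; `_task_pid` and `distinct_workers` are hypothetical helpers, and the default start method is used here for portability (the author pairs this option with the `'forkserver'` start method, selectable via `multiprocessing.get_context('forkserver')`).

```python
import os
import multiprocessing as mp


def _task_pid(_):
    # Each task simply reports which process ran it.
    return os.getpid()


def distinct_workers(n_tasks=4):
    # maxtasksperchild=1 retires a worker after one task, so all of
    # its memory is released to the OS; the pool then spawns a fresh
    # worker for the next task. Every task therefore runs in its own
    # short-lived process.
    with mp.Pool(processes=2, maxtasksperchild=1) as pool:
        pids = pool.map(_task_pid, range(n_tasks))
    return len(set(pids))


if __name__ == '__main__':
    print(distinct_workers())
```

The count of distinct PIDs equals the number of tasks, confirming that no worker process survives past its first task.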