Describe the bug
On UNIX, SubprocVecEnv defaults to creating processes using the fork start method. The Python docs warn that "safely forking a multithreaded process is problematic." This is because the child process inherits all objects from the parent, including -- for example -- TensorFlow sessions and graphs. In practice, this tends to manifest as difficult-to-diagnose deadlocks.
This can be solved by switching the start method to spawn (which is already the default on Windows). Doing so adds a small fixed overhead to creating environments. I'll submit a PR that does this for discussion.
Code example
Running ci/local_tests.sh of this commit will reproduce this error (inside Docker, reproduced on my local machine and on Travis). If required, I can try to create a more minimal breaking example, but I expect this will be difficult -- race conditions tend to be fickle.
The key ingredient for triggering the problem is having created a TensorFlow session before calling SubprocVecEnv. This happens quite naturally in some contexts, e.g. population-based training, but does not occur in most code.
System Info
Docker; Ubuntu 16.04
Python 3.6.8
TensorFlow 1.12, CPU version
Stable Baselines 2.4.1, pip install