
Started multiprocessing.Process instances are unserialisable #91090

Open
maggyero mannequin opened this issue Mar 5, 2022 · 3 comments
Labels
3.11 (only security fixes) · stdlib (Python modules in the Lib dir) · topic-multiprocessing · type-feature (A feature request or enhancement)

Comments


maggyero mannequin commented Mar 5, 2022

BPO 46934
Nosy @maggyero
PRs
  • gh-91090: Make started multiprocessing.Process instances and started multiprocessing.managers.BaseManager instances serialisable #31701
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-03-05.22:19:15.660>
    labels = ['type-feature', 'library', '3.11']
    title = 'Started multiprocessing.Process instances are unserialisable'
    updated_at = <Date 2022-03-05.23:19:58.658>
    user = 'https://github.com/maggyero'

    bugs.python.org fields:

    activity = <Date 2022-03-05.23:19:58.658>
    actor = 'maggyero'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2022-03-05.22:19:15.660>
    creator = 'maggyero'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 46934
    keywords = ['patch']
    message_count = 1.0
    messages = ['414597']
    nosy_count = 1.0
    nosy_names = ['maggyero']
    pr_nums = ['31701']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue46934'
    versions = ['Python 3.11']


    The Python program:

    import multiprocessing
    import time
    
    
    class Application:
    
        def __init__(self):
            self._event = multiprocessing.Event()
            self._processes = [
                multiprocessing.Process(target=self._worker)
                for _ in range(multiprocessing.cpu_count())]
    
        def _worker(self):
            while not self._event.is_set():
                print(multiprocessing.current_process().name)
                time.sleep(1)
    
        def start(self):
            for process in self._processes:
                print('starting')
                process.start()
    
        def stop(self):
            self._event.set()
            for process in self._processes:
                process.join()
    
    
    if __name__ == '__main__':
        application = Application()
        application.start()
        time.sleep(3)
        application.stop()
    

    Its output:

    starting
    starting
    Traceback (most recent call last):
      File "/Users/maggyero/Desktop/application.py", line 31, in <module>
        application.start()
      File "/Users/maggyero/Desktop/application.py", line 21, in start
        process.start()
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/process.py", line 121, in start
        self._popen = self._Popen(self)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py", line 224, in _Popen
        return _default_context.get_context().Process._Popen(process_obj)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
        return Popen(process_obj)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
        super().__init__(process_obj)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
        self._launch(process_obj)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
        reduction.dump(process_obj, fp)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    TypeError: cannot pickle 'weakref' object
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
        exitcode = _main(fd, parent_sentinel)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
        self = reduction.pickle.load(from_parent)
      File "/usr/local/Cellar/python@3.9/3.9.10/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
        self._semlock = _multiprocessing.SemLock._rebuild(*state)
    FileNotFoundError: [Errno 2] No such file or directory
    

    In the function Application.__init__, each call multiprocessing.Process(target=self._worker) initializes a multiprocessing.Process instance with the bound method self._worker as its target argument. Since self._worker is bound to self, it carries a reference to the instance attribute self._processes.

    In the function Application.start, each call to process.start() serialises the target argument, and therefore self._processes, a list of multiprocessing.Process instances that are initially not started. The first call to process.start() starts the first multiprocessing.Process instance in that list without issue, but the second call to process.start() fails.

    So a started multiprocessing.Process instance cannot be serialised.

    The root of the problem is that the start method of a multiprocessing.Process instance sets its _popen instance attribute to a multiprocessing.popen_*.Popen instance. The initialization of that instance performs these two steps (among others):

    1. For a multiprocessing.popen_spawn_posix.Popen, multiprocessing.popen_spawn_win32.Popen, or multiprocessing.popen_forkserver.Popen instance, but not a multiprocessing.popen_fork.Popen instance (i.e. for the 'spawn' and 'forkserver' start methods but not the 'fork' start method), it serialises the multiprocessing.Process instance and writes it to the end of the pipe used by the parent process to communicate with the child process, so that the child process can execute the run method of the multiprocessing.Process instance.

    2. It sets its finalizer instance attribute to a multiprocessing.util.Finalize instance, which in turn sets its _weakref instance attribute to a weakref.ref instance, used to close, at interpreter exit, the ends of the pipes that the parent process uses to communicate with the child process. In other words, starting a multiprocessing.Process instance makes it hold a weak reference.
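    Point 2 can be observed directly. The sketch below is an editorial addition: it inspects the private CPython attributes _popen, finalizer, and _weakref (present in CPython 3.8+ but not a stable API), and uses the 'fork' context only so the example stays self-contained on POSIX:

```python
import multiprocessing
import weakref


def noop():
    """Trivial module-level target (no state to pickle)."""


def started_process_holds_weakref():
    # 'fork' avoids re-importing this module in the child (POSIX-only);
    # the weak reference is created for every start method.
    ctx = multiprocessing.get_context('fork')
    process = ctx.Process(target=noop)
    process.start()
    process.join()
    # After start(), the Popen object owns a multiprocessing.util.Finalize
    # whose _weakref attribute is a weakref.ref (private attributes,
    # observed on CPython 3.8+).
    return isinstance(process._popen.finalizer._weakref, weakref.ref)


if __name__ == '__main__':
    print(started_process_holds_weakref())
```

    The weak reference exists regardless of the start method; it only becomes a problem when combined with the serialisation described in point 1.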

    Thus, if a multiprocessing.Process instance holds a reference to a started multiprocessing.Process instance, then it transitively holds a weak reference (point 2), so starting it fails: the serialisation step (point 1) reaches the weak reference, and weak references are not serialisable:

    import multiprocessing
    
    if __name__ == '__main__':
        multiprocessing.set_start_method('spawn')  # or 'forkserver' but not 'fork'
        process_a = multiprocessing.Process()
        process_b = multiprocessing.Process()
        process_b.foo = process_a
        process_a.start()  # creates process_a._popen.finalizer._weakref
        process_b.start()  # TypeError: cannot pickle 'weakref' object
    

    A minimal Python program showing the serialisation issue:

    import pickle
    import weakref
    
    pickle.dumps(weakref.ref(int))  # TypeError: cannot pickle 'weakref' object
    

    @maggyero maggyero mannequin added the 3.11 (only security fixes), stdlib (Python modules in the Lib dir), and type-feature (A feature request or enhancement) labels on Mar 5, 2022
    @maggyero maggyero mannequin changed the title A started multiprocessing.Process instance cannot be serialised Started multiprocessing.Process instances are unserialisable Mar 5, 2022
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @saurabh02

    Up-voting this issue

    @arikon

    arikon commented Mar 27, 2023

    Does this bug have a workaround?
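    A possible workaround, added here as an editorial sketch rather than an answer from the thread: exclude the list of Process instances from the pickled state via the standard __getstate__ hook, so that when the 'spawn' (or 'forkserver') start method serialises the bound target method, the already-started siblings are never reached. The class below is a trimmed variant of the report's Application:

```python
import multiprocessing


class Application:

    def __init__(self):
        self._event = multiprocessing.Event()
        self._processes = [
            multiprocessing.Process(target=self._worker)
            for _ in range(2)]

    def __getstate__(self):
        # Exclude the (unpicklable once started) Process list from the
        # state sent to child processes; the workers never use it.
        state = self.__dict__.copy()
        del state['_processes']
        return state

    def _worker(self):
        self._event.wait()  # block until the parent signals shutdown

    def start(self):
        for process in self._processes:
            process.start()

    def stop(self):
        self._event.set()
        for process in self._processes:
            process.join()


if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')  # exercise the pickling path
    application = Application()
    application.start()
    application.stop()
    print([process.exitcode for process in application._processes])  # [0, 0]
```

    This avoids the TypeError because the weak-reference-holding _popen attributes live on the excluded Process objects; it only works when the workers themselves never need self._processes.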

    lesterfernandez added a commit to lesterfernandez/simulus that referenced this issue Feb 8, 2025
    The previous commit fixed bugs that prevented OUR classes from being pickled, but there is still one main issue.
    multiprocessing.Process itself is not picklable in Python 3.8+. This issue only arises when pickling multiple processes.
    See python/cpython#91090.
    3 participants