Regression in multiprocessing example using venv on Windows in 3.11rc2 #98360

Closed
edge-python opened this issue Oct 17, 2022 · 6 comments
Labels: 3.11, 3.12, OS-windows, release-blocker, topic-multiprocessing, type-bug


@edge-python

edge-python commented Oct 17, 2022

I'm developing in PyCharm, in a virtual environment, on Windows 11 with a 12900K CPU.

I use 2 modules:
• mp_tst_all.py
• mp_tst_a.py

mp_tst_all.py

import multiprocessing
import platform

import mp_tst_a


def run_my_multi():
    p_xyz = multiprocessing.Process(
        target=mp_tst_a.run_main,
        args=())

    # p_... = ...

    p_xyz.start()
    # p_... .start()

    p_xyz.join()
    # p_... .join()

    return


if __name__ == '__main__':

    print('py:', platform.python_version())

    run_my_multi()

mp_tst_a.py

import multiprocessing
import time
import platform

main_queue = multiprocessing.Queue()


def load_directory(def_queue):

    content_def = def_queue.get()
    content_def['load_directory_is_alive'] = True
    def_queue.put(content_def)

    # ---- do something ... --------
    print('\n' + 'LOAD DIRECTORY:')
    for x in range(3):
        time.sleep(.6)
        print('do something ...')
    print('\n' + 'LOAD ok' + '\n')


    content_def = def_queue.get()
    content_def['load_directory_is_alive'] = False
    content_def['run_load_file'] = True
    def_queue.put(content_def)

    return


def load_files(def_queue):

    content_def = def_queue.get()
    run_load_file_i = content_def['run_load_file']
    def_queue.put(content_def)

    # ---- do something ... --------
    print('LOAD FILE:')
    if run_load_file_i:
        for x in range(3):
            time.sleep(.6)
            print('FILE ' + str(x))
    print('\n' + 'LOAD ok' + '\n')

    return


def run_main():

    global main_queue

    content = {
        'load_directory_is_alive': None,
        'run_load_file': None,
        'last_download_date': None
        }

    main_queue.put(content)

    rest = 0.2  # for test, try different values
    time.sleep(rest)

    jobs = []

    content = main_queue.get()
    content['last_download_date'] = 'xx.xx.xxxx'
    main_queue.put(content)
    time.sleep(rest)


    # ---- process_1.1: load_directory -----------------------------------
    process_1_1 = multiprocessing.Process(
        target=load_directory, name='load_directory',
        args=(main_queue, ))

    jobs.append(process_1_1)
    process_1_1.start()
    time.sleep(rest)


    # ---- do not begin process_2 before process_1.1: load_directory exits
    while process_1_1.is_alive():
        pass
        time.sleep(rest)


    # ---- process_2: load_files -----------------------------------------
    process_2 = multiprocessing.Process(
        target=load_files, name='load_files',
        args=(main_queue, ))

    jobs.append(process_2)
    process_2.start()


    process_1_1.join()
    process_2.join()

    return


if __name__ == "__main__":

    print('py:', platform.python_version())
    run_main()

Bug

First: with PyCharm

Run directly, mp_tst_a.py works in both py 3.11.0_rc2 and py 3.10 (3.10.7).

However, when mp_tst_a.py is called from mp_tst_all.py, it fails in py 3.11.0_rc2 but still works in py 3.10 (3.10.7).

The error message varies depending on the value of the 'rest' variable (in run_main in mp_tst_a.py):

Process load_directory:
Traceback (most recent call last):
  File "C:\...\Python311\Lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\...Python311\Lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\...\mp_test_single_a.py", line 10, in load_directory
    content_def = def_queue.get()
                  ^^^^^^^^^^^^^^^
  File "C:\...\Python311\Lib\multiprocessing\queues.py", line 102, in get
    with self._rlock:
  File "C:\...\Python311\Lib\multiprocessing\synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
           ^^^^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [WinError 5] Zugriff verweigert (Access is denied)
Process load_directory:
Traceback (most recent call last):
  File "C:\...\Python311\Lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\...\Python311\Lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\...\mp_test_single_a.py", line 10, in load_directory
    content_def = def_queue.get()
                  ^^^^^^^^^^^^^^^
  File "C:\...\Python311\Lib\multiprocessing\queues.py", line 102, in get
    with self._rlock:
  File "C:\...\Python311\Lib\multiprocessing\synchronize.py", line 98, in __exit__
    return self._semlock.__exit__(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: [WinError 6] Das Handle ist ungültig (The handle is invalid)

Second: from a Command Prompt window (cmd), it runs correctly:

C:\...\Python311\Lib\site-packages>mp_tst_all.py
py: 3.11.0rc2

LOAD DIRECTORY:
do something ...
do something ...
do something ...

LOAD ok

LOAD FILE:
FILE 0
FILE 1
FILE 2

LOAD ok


C:\...\Python311\Lib\site-packages>

Is this a bug in 3.11.0_rc2, or do I need to change something in the code?
Thank you

@edge-python added the type-bug label Oct 17, 2022
@eryksun
Contributor

eryksun commented Oct 17, 2022

There is a bug with using multiprocessing in a virtual environment, which I presume applies to development under PyCharm.

Under a virtual environment, multiprocessing bypasses the venv launcher to instead directly run sys._base_executable, and it sets the "__PYVENV_LAUNCHER__" environment variable in order to propagate the virtual environment. It bypasses the launcher because the addition of a process in between the parent and the worker breaks the basic design of multiprocessing, which relies on manual duplication of handles from parent to child and vice versa.
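To see which of these paths a given interpreter actually reports, a small diagnostic along these lines may help. It is only a sketch: describe_interpreter is a hypothetical helper name, sys._base_executable is a private attribute (hence the getattr fallback), and whether "__PYVENV_LAUNCHER__" is still visible in os.environ inside a running process is an assumption to check rather than a guarantee.

```python
import os
import sys


def describe_interpreter():
    """Collect the interpreter paths relevant to venv-aware spawning."""
    return {
        # Standard way to detect a venv: the prefixes differ inside one.
        "in_venv": sys.prefix != sys.base_prefix,
        # In a Windows venv this is normally the launcher in Scripts\.
        "executable": sys.executable,
        # Private attribute; fall back to sys.executable if it is absent.
        "base_executable": getattr(sys, "_base_executable", sys.executable),
        # Set by the parent when it bypasses the launcher (may be None).
        "pyvenv_launcher": os.environ.get("__PYVENV_LAUNCHER__"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running it once with the venv interpreter and once with the system interpreter shows whether executable and base_executable differ.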

However, multiprocessing leaves the path of the venv launcher in the command line, which hasn't been a problem until 3.11. With the new implementation of path initialization in 3.11, the value of sys._base_executable is now based on the command line if the parsed argv[0] path contains one or more backslashes [1]. Of course this breaks spawning the next generation of worker processes.
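A reduced way to observe this across generations is to have a parent, a child and a grandchild each report the interpreter paths they see. The following is a sketch only, not the reporter's script: the names snapshot, child and grandchild are made up for illustration, and it forces the "spawn" start method so that it behaves the same way off Windows. On an affected setup (3.11.0rc2 in a Windows venv) the grandchild would be expected to fail rather than report.

```python
import multiprocessing as mp
import os
import queue
import sys


def snapshot(label):
    """Record which interpreter paths this process sees."""
    return {
        "label": label,
        "pid": os.getpid(),
        "executable": sys.executable,
        # Private attribute, so tolerate it being missing.
        "base_executable": getattr(sys, "_base_executable", None),
    }


def grandchild(results):
    results.put(snapshot("grandchild"))


def child(results):
    results.put(snapshot("child"))
    # Second generation: the step the issue reports as failing.
    worker = mp.get_context("spawn").Process(target=grandchild, args=(results,))
    worker.start()
    worker.join()


def main():
    ctx = mp.get_context("spawn")
    results = ctx.Queue()
    worker = ctx.Process(target=child, args=(results,))
    worker.start()
    reports = [snapshot("parent")]
    try:
        # Expect one report from the child and one from the grandchild.
        for _ in range(2):
            reports.append(results.get(timeout=5))
    except queue.Empty:
        print("a generation did not report; check stderr for a traceback")
    worker.join()
    for report in reports:
        print(report)


if __name__ == "__main__":
    main()
```

Comparing base_executable between the parent and the child is the interesting part: if the child reports the venv launcher there, the next spawn goes through the launcher again.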

A high-level fix in Popen.__init__(), when running in a virtual environment, would be to modify the result from spawn.get_command_line() to replace the first item with sys._base_executable. (The venv launcher itself does this, in terms of modifying the command line of the launched process.) Current source:

cmd = spawn.get_command_line(parent_pid=os.getpid(),
                             pipe_handle=rhandle)
cmd = ' '.join('"%s"' % x for x in cmd)
python_exe = spawn.get_executable()
# bpo-35797: When running in a venv, we bypass the redirect
# executor and launch our base Python.
if WINENV and _path_eq(python_exe, sys.executable):
    python_exe = sys._base_executable
    env = os.environ.copy()
    env["__PYVENV_LAUNCHER__"] = sys.executable
else:
    env = None

Footnotes

  1. In previous versions, sys._base_executable is always based on the process image path, from GetModuleFileNameW(NULL, ...). The new behavior allows the command line to override this if the command is a qualified path that contains at least one backslash (not forward slash). Refer to the source in Modules/getpath.py.

@edge-python
Author

edge-python commented Oct 18, 2022

Many Thanks,

I can confirm that py 3.11.0_rc2 works correctly when using the system interpreter in PyCharm (without a virtual environment).
It fails only in a virtual environment.

@gvanrossum changed the title from "in py 3.11.0_rc2 sample multiprocessing code not work, but same code in 3.10(.7) work" to "Regression in multiprocessing example on Windows in 3.11rc2" Oct 19, 2022
@gvanrossum changed the title from "Regression in multiprocessing example on Windows in 3.11rc2" to "Regression in multiprocessing example using venv on Windows in 3.11rc2" Oct 19, 2022
@gvanrossum
Member

@zooba Can you have a look at this release blocker?

@zooba
Member

zooba commented Oct 19, 2022

> A high-level fix in Popen.__init__(), when running in a virtual environment, would be to modify the result from spawn.get_command_line() to replace the first item with sys._base_executable. (The venv launcher itself does this, in terms of modifying the command line of the launched process.)

This sounds like the right fix here. It should be easy to test (in case someone else gets to it before I can):

if WINENV and _path_eq(python_exe, sys.executable): 
    python_exe = sys._base_executable 
    env = os.environ.copy() 
    env["__PYVENV_LAUNCHER__"] = sys.executable 
    cmd[0] = python_exe
else:
    env = None

cmd = ' '.join('"%s"' % x for x in cmd)   # moved down from earlier
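The reordering can be checked without spawning anything by modelling it as a pure function. This is a sketch under assumptions, not the code from the PR: build_spawn_command and _same_path are hypothetical names, _same_path only stands in for multiprocessing's private _path_eq helper, and the paths in the usage lines are invented.

```python
import ntpath


def _same_path(a, b):
    # Stand-in (assumption) for the private _path_eq helper:
    # compare paths exactly or case-insensitively, Windows style.
    return a == b or ntpath.normcase(a) == ntpath.normcase(b)


def build_spawn_command(cmd, python_exe, launcher_exe, base_exe,
                        in_windows_venv, environ):
    """Return (python_exe, command_line, env) for the child process.

    cmd is the argument list, launcher_exe plays the role of
    sys.executable, and base_exe plays the role of sys._base_executable.
    """
    cmd = list(cmd)  # do not mutate the caller's list
    if in_windows_venv and _same_path(python_exe, launcher_exe):
        python_exe = base_exe
        env = dict(environ)
        env["__PYVENV_LAUNCHER__"] = launcher_exe
        # The key change: argv[0] must name the base interpreter too.
        cmd[0] = python_exe
    else:
        env = None
    # Quote and join only after argv[0] has been fixed up.
    command_line = " ".join('"%s"' % part for part in cmd)
    return python_exe, command_line, env


if __name__ == "__main__":
    launcher = r"C:\venv\Scripts\python.exe"   # hypothetical path
    base = r"C:\Python311\python.exe"          # hypothetical path
    print(build_spawn_command([launcher, "-c", "pass"], launcher,
                              launcher, base, True, {}))
```

Keeping cmd as a list until the last step is what makes the argv[0] replacement trivial; once it has been joined into a quoted string, the first item can no longer be swapped cleanly.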

@zooba
Member

zooba commented Oct 19, 2022

PR posted.

@pablogsal, for your consideration for next week's release. I don't know how common it is to nest multiprocessing pools like this, but the fix is general goodness and seems unlikely to introduce other issues.

zooba added a commit that referenced this issue Oct 20, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Oct 20, 2022
…orrect argv[0] in virtual environments (pythonGH-98462)

(cherry picked from commit e48f9b2)

Co-authored-by: Steve Dower <steve.dower@python.org>
miss-islington added a commit that referenced this issue Oct 20, 2022
… argv[0] in virtual environments (GH-98462)

(cherry picked from commit e48f9b2)

Co-authored-by: Steve Dower <steve.dower@python.org>
carljm added a commit to carljm/cpython that referenced this issue Oct 20, 2022
* main: (40 commits)
  pythongh-98360: multiprocessing now spawns children on Windows with correct argv[0] in virtual environments (pythonGH-98462)
  ...
pablogsal pushed a commit that referenced this issue Oct 22, 2022
… argv[0] in virtual environments (GH-98462)

(cherry picked from commit e48f9b2)

Co-authored-by: Steve Dower <steve.dower@python.org>
@ambv
Contributor

ambv commented Dec 5, 2022

This looks fixed. Closing.

@ambv ambv closed this as completed Dec 5, 2022
Repository owner moved this from Todo to Done in Release and Deferred blockers 🚫 Dec 5, 2022