Some packages like glfw blocks forever while communicating with a subprocess #1096

Closed
wookayin opened this issue Oct 22, 2022 · 6 comments

Comments

@wookayin

wookayin commented Oct 22, 2022

Environment data

  • debugpy version: 1.6.3
  • OS and version: macOS 12.6 arm64
  • Python version (& distribution if applicable, e.g. Anaconda): anaconda 3.10.5
  • Using VS Code or Visual Studio: Using Neovim DAP.
  • glfw.__version__: '2.5.5'

Actual behavior

When importing glfw, which internally uses ctypes.CDLL, the process blocks forever. This happens only when running via debugpy with Neovim as the DAP client; it did not happen when running via VS Code + debugpy.

The line that causes the deadlock (or indefinite blocking) is the subprocess.Popen(...).communicate call at glfw/library.py#L115. It looks like the patched multiprocess/subprocess handling is causing the problem.

Steps to reproduce:

I was running the following single-file script via debugpy.

import os
print(f"PID = {os.getpid()}")

print("trying to load glfw")
import glfw  # <---------- blocks while calling _glfw_get_version()
print("glfw loaded")

More information

The stack trace (py-spy dump --pid PID) shows that the process is blocked in _glfw_get_version (glfw/library.py:115), or more precisely, in the Popen(...).communicate() call; see glfw/library.py#L115.

Full stack trace:
Thread 0x10370C580 (idle): "MainThread"
    select (selectors.py:416)
    _communicate (subprocess.py:2003)
    communicate (subprocess.py:1152)
    _glfw_get_version (glfw/library.py:115)
    _load_library (glfw/library.py:59)
    <module> (glfw/library.py:194)
    _call_with_frames_removed (<frozen importlib._bootstrap>:241)
    exec_module (<frozen importlib._bootstrap_external>:883)
    _load_unlocked (<frozen importlib._bootstrap>:688)
    _find_and_load_unlocked (<frozen importlib._bootstrap>:1006)
    _find_and_load (<frozen importlib._bootstrap>:1027)
    <module> (glfw/__init__.py:40)
    _call_with_frames_removed (<frozen importlib._bootstrap>:241)
    exec_module (<frozen importlib._bootstrap_external>:883)
    _load_unlocked (<frozen importlib._bootstrap>:688)
    _find_and_load_unlocked (<frozen importlib._bootstrap>:1006)
    _find_and_load (<frozen importlib._bootstrap>:1027)
    <module> (g.py:4)
    _run_code (_pydevd_bundle/pydevd_runpy.py:124)
    _run_module_code (_pydevd_bundle/pydevd_runpy.py:135)
    run_path (_pydevd_bundle/pydevd_runpy.py:321)
    run_file (debugpy/server/cli.py:284)
    main (debugpy/server/cli.py:430)
    <module> (debugpy/__main__.py:39)
    _run_code (runpy.py:86)
    _run_module_as_main (runpy.py:196)

I could see that the subprocess was spawned, but wrapped by pydevd and debugpy, as follows:

The subprocess command line (prettified)
.../bin/python -c
    import sys;
    sys.path.insert(0, r'.../lib/python3.10/site-packages/debugpy/_vendored/pydevd');
    import pydevd;
    pydevd.PydevdCustomization.DEFAULT_PROTOCOL='http_json';
    pydevd.settrace(
         host='127.0.0.1', port=64304, suspend=False, trace_only_current_thread=False,
         patch_multiprocessing=True, access_token=None, client_access_token='<TOKEN>',
         __setup_holder__={
            'client': '127.0.0.1',
            'client-access-token': '<TOKEN>',
            'json-dap-http': True,
            'multiprocess': True,
            'port': 64304,
            'ppid': 5994,
            'server': False,
            'skip-notify-stdin': True
        }
    )
\012import sys
\012import ctypes
\012
\012def get_version(library_handle):
\012    """
\012    Queries and returns the library version tuple or None.
\012    """
\012    major_value = ctypes.c_int(0)
\012    major = ctypes.pointer(major_value)
\012    minor_value = ctypes.c_int(0)
\012    minor = ctypes.pointer(minor_value)
\012    rev_value = ctypes.c_int(0)
\012    rev = ctypes.pointer(rev_value)
\012    if hasattr(library_handle, 'glfwGetVersion'):
\012        library_handle.glfwGetVersion(major, minor, rev)
\012        version = (major_value.value,
\012                   minor_value.value,
\012                   rev_value.value)
\012        return version
\012    else:
\012        return None
\012
\012try:
\012    input_func = raw_input
\012except NameError:
\012    input_func = input
\012filename = input_func().strip()
\012
\012try:
\012    library_handle = ctypes.CDLL(filename)
\012except OSError:
\012    pass
\012else:
\012    version = get_version(library_handle)
\012    print(version)
\012

Please note that the above code originates from https://github.com/FlorianRhiem/pyGLFW/blob/master/glfw/library.py#L69.
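
The same pattern can be exercised without glfw. The following is a minimal sketch, assuming the essential ingredients are the ones visible in the command line above: a child Python interpreter started with `-c`, one line written to its stdin, and `Popen.communicate()` waiting for the reply. The names `CHILD_SOURCE` and `query_child` are hypothetical, and the `timeout` argument is an addition that is not in glfw's code; it only turns an indefinite hang into an exception.

```python
import subprocess
import sys

# Hypothetical stand-in for the glfw helper script: it reads one line
# from stdin (glfw sends a library path) and prints a reply.
CHILD_SOURCE = "print(input().strip())"


def query_child(payload, timeout=10.0):
    """Send one line to a child interpreter and return its stripped reply."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    try:
        # communicate() writes the payload, closes stdin, and waits for exit.
        out, _ = proc.communicate(payload + "\n", timeout=timeout)
    except subprocess.TimeoutExpired:
        # Assumption: a hung child is simply killed so the caller can go on.
        proc.kill()
        proc.communicate()
        raise
    return out.strip()


if __name__ == "__main__":
    print(query_child("hello"))
```

Run without a debugger, this prints `hello`. If a debugger wraps the child interpreter and the child never runs to completion, the call should raise `subprocess.TimeoutExpired` instead of blocking silently, which makes the hang easier to report.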

I tried to reproduce the bug by launching the server and client processes from the command line (rather than through Neovim DAP), but I could not work out how to do it. I was also unable to reproduce it in a VS Code environment. If you have a hard time reproducing the bug with a Neovim setup, please let me know so I can try more (or give me some instructions). Or could this be a bug in the Python DAP adapter?

Expected behavior

The above code should run without any problem, just as it does when running without debugpy.

Workaround

As a temporary workaround, I'm patching glfw.library so that _glfw_get_version either simply returns (3, 3, 7) (depending on the actual .so file) or runs the version query in the same process.
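
For reference, the "same process" variant of the workaround could look roughly like the sketch below. It reuses the `get_version` logic from the helper script shown earlier but calls it directly instead of through a subprocess. The wrapper name `get_version_in_process` is hypothetical, and this is not how glfw itself does it; presumably the subprocess exists to isolate the main process from a misbehaving library, so this trades that isolation for not spawning a child.

```python
import ctypes


def get_version(library_handle):
    """Return the library version tuple, or None if the symbol is missing."""
    major_value = ctypes.c_int(0)
    minor_value = ctypes.c_int(0)
    rev_value = ctypes.c_int(0)
    if not hasattr(library_handle, "glfwGetVersion"):
        return None
    # glfwGetVersion fills the three integers through the pointers.
    library_handle.glfwGetVersion(
        ctypes.pointer(major_value),
        ctypes.pointer(minor_value),
        ctypes.pointer(rev_value),
    )
    return (major_value.value, minor_value.value, rev_value.value)


def get_version_in_process(filename):
    """Hypothetical wrapper: load the library here, with no child process."""
    try:
        library_handle = ctypes.CDLL(filename)
    except OSError:
        # Same fallback as the helper script: an unloadable file yields None.
        return None
    return get_version(library_handle)
```

Called with the path of the actual shared library, `get_version_in_process` should return the version tuple; a path that cannot be loaded yields `None`.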

@wookayin wookayin changed the title Some packages like glfw blocks forever while invoking Popen.communicate Some packages like glfw blocks forever while communicating with a subprocess Oct 22, 2022
@fabioz
Collaborator

fabioz commented Oct 22, 2022

This could've been caused by the issue fixed in: ea423ae (which is still not released).

Can you try to apply that change locally and see if it fixes it for you? i.e.:

Run in the debugger:

from _pydev_bundle import pydev_monkey
print(pydev_monkey.__file__)

Then open that file and add the code below in the def new_fork(): function as done in ea423ae:

            else:
                set_global_debugger(None)

@wookayin
Author

wookayin commented Oct 22, 2022

Hi @fabioz, thank you very much for the help and the detailed instructions. I tried applying ea423ae, which fixes #1005 (adding the set_global_debugger(None) line), and I also tried the HEAD version (1.6.3+28.gac646576), but neither resolves the bug. The root cause is probably different from #1005.

@fabioz
Collaborator

fabioz commented Oct 27, 2022

One question here: do you know if neovim can deal with auto-attach of subprocesses?

i.e.: in multiprocess mode the debugger may request (through a custom message) that the client create a new debug session on another port to debug a subprocess, but if the client doesn't do the attach, it's quite possible that the subprocess will just linger there and not do anything.

The workaround, if the client can't do that auto-attach, is to ask for that mode to be disabled (i.e.: "subProcess": false in the launch or attach request).
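
As an illustration of where that option goes, here is a minimal sketch of a launch request as it would appear on the wire, assuming the client uses the standard Debug Adapter Protocol framing (a `Content-Length` header followed by a JSON body). The helper `encode_dap_request` is hypothetical, and the `program` value is only the script name taken from the stack trace; a real client sends more arguments than this.

```python
import json


def encode_dap_request(seq, command, arguments):
    """Frame one DAP request: Content-Length header, blank line, JSON body."""
    body = json.dumps(
        {
            "seq": seq,
            "type": "request",
            "command": command,
            "arguments": arguments,
        }
    ).encode("utf-8")
    header = "Content-Length: {}\r\n\r\n".format(len(body)).encode("ascii")
    return header + body


# Illustrative launch arguments; only "subProcess" matters for this issue.
launch_arguments = {
    "program": "g.py",
    "subProcess": False,  # ask debugpy not to auto-attach to child processes
}

message = encode_dap_request(1, "launch", launch_arguments)
```

The same key goes into the arguments of an attach request when attaching instead of launching.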

@wookayin
Author

wookayin commented Oct 28, 2022

One question here: do you know if neovim can deal with auto-attach of subprocesses?

Probably not yet. It looks like there were some efforts for this, mfussenegger/nvim-dap-python#21.

Maybe the author of nvim-dap @mfussenegger could please give some insight on this?

The workaround if the client can't do that auto-attach is asking that mode to be disabled (i.e.: "subProcess": false in the launch or attach request).

This is it! We may want to make subProcess = false the default in nvim-dap-python's configuration, but in the meantime adding that option to the DAP launch configurations (on the user side) works around the multiprocess deadlock and makes glfw importable.

  -- A workaround for nvim-dap and nvim-dap-python users.
  -- put these somewhere after require('dap-python').setup {}
  local configurations = require('dap').configurations.python
  for _, configuration in pairs(configurations) do
    configuration.justMyCode = false
    configuration.subProcess = false  -- <----------- this line!
  end

@fabioz
Collaborator

fabioz commented Oct 28, 2022

According to mfussenegger/nvim-dap-python#21, it seems that nvim is waiting for the startDebugging reverse request before adding that support (this is being tracked in debugpy in #1074, but it is still not part of the official Debug Adapter Protocol specification, so it may take a while).

In the meantime, nvim should always set "subProcess": false, because if any Python subprocess is launched and the debugger tries to do the auto-attach, it will fail (but that's up to nvim to do; debugpy can't really do much here).

@fabioz fabioz closed this as completed Oct 28, 2022
@wookayin
Author

wookayin commented Oct 28, 2022

Thanks @fabioz, it was great to figure out why this bug happens and that it is not really a debugpy problem. For the time being I'm fine with the subProcess = false option.

mfussenegger added a commit to mfussenegger/nvim-dap-python that referenced this issue Nov 1, 2022
Until the `startDebugging` request is implemented in both nvim-dap and
debugpy it's safer to set it to false as it can cause issues with some
packages. See microsoft/debugpy#1096