On Linux, attaching debugger to a running process causes the process to segfault #882
Comments
My understanding right now is that the "The debugger will attempt to work around..." message may be unrelated. I was concerned that it was related, and did a bit of searching, but based on fabioz/ptvsd@e0b2c5e (which was written to resolve microsoft/ptvsd#1542, which was linked from https://bugs.python.org/issue37416, which was linked from the warning) it appears that if the workaround had failed, there would probably be more errors, like "Issue found when debugger was trying to work around..." or "Error patching main thread id." Of course, maybe the segfault is happening early enough that those other errors aren't being sent. 🤷♂️ Also, this person also seems to have been unable to attach to a very similar-looking test script using Ubuntu/WSL/Windows: #102 (comment) Interestingly, they don't have a segfault - and I don't get any reaction inside |
Found microsoft/vscode-python#3528, in which OP stopped getting their segfault after building a clean conda environment. Realized I could do more to isolate the problem. So: I built a clean
I removed my OSS build of VSCode (Arch's I installed only the Python and Pylance extensions: I made sure that VSCode was using the new venv: And I used the default "Python: Attach using Process ID" config that was generated for me on first run: I'm still getting the segfault. Below is the diagnostic data generated by this new environment: Diagnostic data
User Settings
Extension version: 2022.4.0 System Info
A/B Experiments
|
I went back to using the OSS build because it's my daily driver and I'm using it for work. However, I'm still using the clean venv. Here are the logs I get when triggering the segfault with $ PYTHONFAULTHANDLER=1 python -u ~/testing/test_attaching_debugger.py Results
|
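For reference, the `PYTHONFAULTHANDLER=1` environment variable used above has an in-code counterpart in the standard library's `faulthandler` module. The following is a minimal sketch, not part of the original report; the helper name `enable_crash_tracebacks` is made up for illustration.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=sys.stderr):
    """Ask Python to dump tracebacks for all threads on fatal signals such as SIGSEGV.

    Hypothetical helper: it only wraps the standard faulthandler API.
    """
    if not faulthandler.is_enabled():
        # all_threads=True dumps every thread, not only the one that crashed.
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this at the top of a test script should give roughly the same traceback-on-crash behaviour as setting the environment variable, without having to change how the script is launched.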
Still digging. Not really sure what I'm doing, or what might be useful to look at. Here's a summary of the coredump: $ coredumpctl info Results
And the results of running the $ gdb /bin/python3.9 core.python.1000.d66b55cf39fe49d9b0b079206f8b6d17.834432.1648831453000000 Results
|
I wonder if this is happening because of some incompatibility with the Arch Linux libc (I'm not really familiar with that Linux flavor). The files related to the attach are compiled at: It may be worth trying to compile those on your own machine to see if it makes any difference. To find where they are on your machine, you can execute:
and then find those files locally. Also, does regular debugging work? |
You may also want to try disabling the cython extensions in this case (you can do this by setting an environment variable such as: |
Answering from easiest to hardest: Regular debugging Disabling cython extensions Compiling for my own machine However, it doesn't have a shebang, isn't executable, and the second half is about 32 bit libs which (fuzzy on the details) I think Arch doesn't support. So (from same folder as the script) I manually ran the two lines of the script that seemed relevant:
And that certainly produced a Then I ran "Developer: Reload Window" in the editor, started the script, and tried to attach again. Still got a segfault, but at least the coredump summary seems to confirm that the library I recompiled was involved - here's the diff (original on left; recompiled on right). |
Can you do Another thing to check may be gdb (as the attach to process uses gdb to do the actual attach). Do you think you can try with a newer version of gdb? -- Installing it and making sure it's in the |
I did try Arch does rolling releases; I believe I'm running the newest version of $ gdb --version
GNU gdb (GDB) 11.2 I have a mac running OSX 10.14 and a friend running Ubuntu, maybe I'll see what happens if I try to replicate in those environments. |
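Since the attach relies on whichever `gdb` is found on the `PATH`, it can help to confirm which binary that is and what version it reports. A minimal sketch using only the standard library follows; the function name `debugger_version` is an illustration and not something debugpy provides.

```python
import shutil
import subprocess


def debugger_version(name="gdb"):
    """Return the first line of `<name> --version`, or None if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# Prints None when no gdb binary can be found on PATH.
print(debugger_version())
```

Passing `"lldb"` instead of `"gdb"` performs the same check for the debugger macOS ships with.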
Well, I'm able to attach to my test script using the Mac (no segfault, works great!) I'll ask my friend to try. |
Interesting! He was able to replicate the segfault 100% on Ubuntu 21.10. His `faulthandler` output:
His coredump summary:
|
I noticed that on the Mac,
Is that significant? |
Mac has its own fork (lldb). I'll see if I can reproduce on Ubuntu 21.10. |
If you think it might be useful and can give me steps for how to switch out which binary is used by |
Heh. I installed Different result: Nothing happens for about 15 seconds and then I get a popup saying "Timed out waiting for debug server to connect." That's exactly what happens in my Arch environment when the kernel is configured to forbid one process from snooping on another. (Note: when the Mac was using I don't think Linux's Then later I tried the reverse: on Arch, symlinked As one could probably predict, `lldb` and `gdb` do not support identical arguments and flags :)
|
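The "Timed out waiting for debug server to connect" symptom described above matches what happens when the kernel restricts one process from attaching to another. On many Linux distributions that restriction is the Yama `ptrace_scope` setting; that this is the setting involved here is an assumption, not something confirmed in the thread. A small sketch to read it:

```python
from pathlib import Path

# Meanings as described in the Linux kernel's Yama documentation.
PTRACE_SCOPE_MEANINGS = {
    0: "classic: a process may attach to other processes of the same user",
    1: "restricted: only descendants or explicitly allowed processes may be attached to",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return the ptrace_scope value, or None if it cannot be read (e.g. on macOS)."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


scope = read_ptrace_scope()
print(scope, PTRACE_SCOPE_MEANINGS.get(scope, "Yama not available or value unknown"))
```

A value other than 0 would be one plausible reason for an attach by process ID to time out rather than crash.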
Tried for a bit to compile Didn't succeed to my satisfaction; when I call Since it was a wild guess to start with, I decided to stop trying this approach for now. |
Wanted to try running VSCode as root after reading comments at the top of
|
Maybe the injection logs of debugpy would be useful. These are from a "normal" run (Arch, no symlinks, no sudo, etc.): Here are all the lines from Raw logs
Cleaned logs
|
As a note, I've been able to reproduce this with |
Ok, I think I found the culprit... try to do the following: find the file:
Note: you can do
to know the base location where the debugger resides in your installation (run that in a simple debug session). And then in that file comment out the lines with the following contents:
and try to do a new attach. That should make it work again... Those changes were done some time ago as an optimization to make the attach run faster (because it shouldn't need to load all symbols from all dlls at an attach), but this is already the 2nd time we've had issues due to that change (the first time it was solved by forcing to load I'll investigate a bit more what's the proper solution, but at least for now that can be used as a workaround. |
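One generic way to find where an installed package lives, which is not necessarily the command suggested in the comment above, is to ask `importlib` for the package's spec. This is a sketch under the assumption that `debugpy` is importable from the interpreter being used; a copy bundled inside an editor extension may not be, in which case it prints `None`.

```python
import importlib.util
from pathlib import Path


def package_dir(name):
    """Return the directory an importable package or module lives in, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None or spec.origin in ("built-in", "frozen"):
        return None
    return Path(spec.origin).parent


# Prints None if debugpy is not importable from this interpreter.
print(package_dir("debugpy"))
```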
Workaround works! Beautiful. I've been looking forward to being able to debug a running process for - probably a year or so since I saw it in the docs. Thanks for looking into this. |
I guess I'll make that the default (as it was previously)... it's better to be slower but working by default than faster and crashing sometimes depending on the Linux installation (maybe I'll create an env variable to customize that if it's too slow for someone). |
Out of curiosity, is it that
|
I wonder if this could also be the cause of sporadic failures of attach-to-PID tests in CI? |
It should just take longer to attach and it shouldn't make a difference afterwards. |
I don't think it's likely, I think it's a works/doesn't work difference -- but we'll see... I'll do the change and we'll know for certain ;) |
I'm a little curious if there's a plan for when the changes mitigating this issue might get merged (don't think 6d88802 is merged yet). I opened the issue against version |
@mosbasik it's merged but a @int19h @karthiknadig are there plans to do a release? I guess we have enough changes to make a release worthwhile at this point (Python 3.11 support is still not complete, but it should be usable and definitely better than the last release, and there are a number of fixes that should be interesting -- the major change in the repo is that frame eval mode, when available, is the default again). |
Yes, we're planning a release this week. |
Issue Type: Bug
Behaviour
Expected vs. Actual
I ran a simple python script that alternates between print() and sleep() forever, then I tried to attach the debugger to it. I expected to gain control of the process with the debugger. But actually, the process segfaulted.
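A minimal sketch of a script of the kind described, alternating between `print()` and `sleep()`, might look like the following. It is an illustration only and not the reporter's actual `test.py`; the function name `run` and its parameters are hypothetical.

```python
import os
import time


def run(iterations=None, delay=1.0):
    """Print a line, sleep, and repeat; loops forever when iterations is None."""
    tick = 0
    while iterations is None or tick < iterations:
        # Printing the PID makes it easy to pick the right process when attaching.
        print(f"pid={os.getpid()} tick={tick}", flush=True)
        time.sleep(delay)
        tick += 1
    return tick


# Call run() with no arguments to loop forever and give the debugger time to attach.
```

The `iterations` parameter exists only so the loop can be exercised briefly, for example `run(iterations=3, delay=0)`.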
Steps to reproduce:
I'm using Python 3.9 in a virtualenv on Arch Linux.
I disable all of my extensions from the Extensions tab using "Disable all installed extensions", and then re-enable only the Python extension to get the debugger.
I run the following Python script:
test.py:
And then I try to attach to it using the following debugger setup:
launch.json:
Here's what the output of the script looks like:
Diagnostic data
python.languageServer setting: Pylance
User Settings
Extension version: 2022.2.1924087327
VS Code version: Code - OSS 1.65.2 (c722ca6c7eed3d7987c0d5c3df5c45f6b15e77d1, 2022-03-10T19:23:12.185Z)
OS version: Linux x64 5.16.16-arch1-1
Restricted Mode: No
System Info
gpu_compositing: enabled
multiple_raster_threads: enabled_on
oop_rasterization: disabled_off
opengl: enabled_on
rasterization: disabled_software
skia_renderer: enabled_on
video_decode: disabled_software
vulkan: disabled_off
webgl: enabled
webgl2: enabled