debugpy gets "stuck" while using run by line in vscode jupyter notebook #1051

roblourens · 2022-09-09T21:15:23Z

Before creating a new issue, please check the FAQ to see if your question is answered there.

Environment data

debugpy version: 1.6.3
OS and version: macOS
Python version (& distribution if applicable, e.g. Anaconda): 3.9.5
Using VS Code or Visual Studio: VS Code

I am working on the "run by line" feature with a jupyter notebook in vscode. There is an issue which I don't think is new. I am just clicking the "run by line" button very quickly, which starts a debug session and sends a "stepIn" request. After trying this a few times, the debugger eventually stops responding.

Basically at this point, the cell finished execution and so we have sent a "disconnect" request to debugpy, but we never get a response. The log from the extension's side shows that as I keep clicking the button to step, vscode sends "stepIn" requests that don't get a response. Here are the debugpy + vscode-jupyter logs from this point where the debugger is not responding.

debugpy.adapter-93501.log
debugpy.pydevd.93407.log
debugpy.server-93407.log
vscodejupyter.log

But there is a workaround: if the user clicks the stop button, we execute the code pass, and this seems to unstick the debugger. We added this workaround in microsoft/vscode-jupyter#7612, although I'm not sure where it came from. Once that happens, we get responses to all the requests that previously weren't getting responses, and can start another debug session if desired. Here are the logs from after executing this workaround (includes the full log of the above, then more)

vscodejupyter.log
debugpy.adapter-93501.log
debugpy.pydevd.93407.log
debugpy.server-93407.log

Can you help me understand what's going on? We try to keep things consistent on the extension's end when stopping/starting new sessions quickly. It's possible that the requests we sent are in a weird order. One thing I have noticed is that when this happens, I see an evaluate request that we sent for variables right before the final disconnect request. If the disconnect request goes out after we got a response for evaluate, then it seems like the issue doesn't happen. I don't know whether that is part of the problem or not. But in any case, it definitely seems like the kernel/debugger is in a broken state.

The text was updated successfully, but these errors were encountered:

roblourens · 2022-09-20T04:09:12Z

@int19h I'm wondering whether the tag means you know it's a bug on debugpy's end, or whether it needs more investigation, and if so, anything I can do to help?

int19h · 2022-09-26T19:14:51Z

The adapter logs for the failing case seem to indicate that it never even received that "disconnect" - the last message from the client has seq=101, which is the eval immediately preceding the disconnect. However, it also never receives the response to that
eval from the server, and pydevd log doesn't show success of evaluation, either.

One thing of note when that last eval request (seq=101) comes in, it specifies a "frameId", but at the same time there were several "stepIn" request issued right before without a subsequent "stackTrace" - so the "frameId" is not necessarily valid anymore. However, if the frame is gone, I'd expect things to fail gracefully with an error response, not hang indefinitely. So there's definitely a bug on debugpy end here, even if only about reporting the error.

@fabioz, is there anything else suspicious that can be gleaned from the pydevd log here?

roblourens · 2022-10-13T20:01:47Z

Hey @fabioz, pinging on this in case you missed it

fabioz · 2022-10-14T13:35:12Z

I did check the logs a but nothing really did stand out for me in the pydevd side.

As a note, it's entirely possible that an evaluate request sent to pydevd doesn't have a response at all (since it's all threaded, the eval request can be made but if that thread is not stopped in a breakpoint it'll just linger there as there's no explicit control to timeout such a request if the thread starts running in the meanwhile, which appears to be what's happening here -- it's possible to see the evaluate followed by the resume due to the step in, but it never stops anywhere afterwards in the logs and thus it's just running waiting to stop somewhere, which it never does due to not reaching user code).

The fact that no disconnect appears in the jupyter log but not in the debugpy logs probably means something bad happened at that point (and I'd say that's the probable culprit).

Given that other messages are received by debugpy afterwards, my best guess is that after logging the disconnect in jupyter it didn't really send the message to debugpy and the client gets into a weird situation (but I don't know why that'd happen -- the server logs seem to correctly reflect what it receives, the one point where there's a divergence is that the jupyter logs show a disconnect and that's never received in the debugger and from following the code in debugpy I don't see any way for it to receive something in the socket and don't log it unless the connection is broken -- which is not the case as other messages are received properly afterwards)...

The difference from the case where it's responding is that the disconnect request was indeed received (and thus any request afterwards should be answered with a client not authenticated as appears in that log until the attach is done again).

One possible culprit (and now I'm just making things up because I haven't even checked the client) is that the client is waiting for a response to all messages before sending the disconnect and that wouldn't work because it will never receive a response from that evaluate...

roblourens · 2022-10-17T20:38:40Z

One possible culprit (and now I'm just making things up because I haven't even checked the client) is that the client is waiting for a response to all messages before sending the disconnect and that wouldn't work because it will never receive a response from that evaluate...

Assuming by "client" you mean vscode and the Jupyter extension, that shouldn't happen- all the messages are independent. Or do you mean ipykernel/jupyter?

Would it narrow it down at all to run this without the evaluate requests and see whether the issue still comes up in CI?

roblourens · 2022-11-02T22:58:45Z

My takeaway from the comments above is that you don't feel there's anything actionable on the debugpy/pydevd sides with the info you currently have, is that fair? If so, I can work on this to try to narrow it down some more.

This comes up pretty easily in my tests in CI, so if you can think of anything that would be helpful to log, I can run tests against a custom build with extra logging.

I can also try some different things on the extension's side that might give more info... I can remove the evaluate requests (lately we've also been looking at how these requests that originate from the variables view can be slow to process or produce large responses to pass around) if that gets us anywhere, I can try to create a minimal repro based on that. Does that make sense? Anything else I can try?

fabioz · 2022-11-03T14:59:34Z

My takeaway from the comments above is that you don't feel there's anything actionable on the debugpy/pydevd sides with the info you currently have, is that fair?

Well, I'd say that the issue is in the communication between jupyter and debugpy, as the issue seems to be that a message (logged in jupyter) is not received by debugpy (at least it's not in its logs). I did some review in the code in debugpy and I find it unlikely that a received message is not logged (but this doesn't exclude issues in the socket itself -- maybe the socket wasn't setup properly and is missing some flag and there's some buffering issue and the message isn't being received).

If there was some standalone repro that'd be really nice...

roblourens · 2022-11-04T00:57:31Z

Got it, I will dig into this a bit more this month.

roblourens · 2022-11-28T15:58:49Z

Sure enough, running tests without the variables view sending the evaluate requests seems to make the issue go away. I still want to come up with a consistent repro though.

judej · 2023-01-19T21:55:00Z

Waiting for Jupyter investigation

github-actions · 2023-03-14T00:24:36Z

This issue was closed because it has been stalled for 30 days with no activity. If the issue still persists, please reopen with the information requested. Thanks.

roblourens mentioned this issue Sep 10, 2022

Run by line test is not stopping at end of cell microsoft/vscode-jupyter#7611

Open

int19h self-assigned this Sep 12, 2022

int19h added the bug Something isn't working label Sep 12, 2022

int19h removed their assignment Sep 27, 2022

roblourens mentioned this issue Oct 26, 2022

Issues with Run By Line microsoft/vscode-jupyter#11740

Closed

2 tasks

kieferrm mentioned this issue Nov 6, 2022

Jupyter: Iteration Plan for November 2022 microsoft/vscode-jupyter#11922

Closed

23 tasks

kieferrm mentioned this issue Jan 2, 2023

Jupyter: Iteration Plan for January 2023 microsoft/vscode-jupyter#12474

Closed

20 tasks

judej added the waiting for response label Jan 19, 2023

kieferrm mentioned this issue Feb 4, 2023

Jupyter: Iteration Plan for February 2023 microsoft/vscode-jupyter#12734

Closed

17 tasks

kieferrm mentioned this issue Mar 4, 2023

Jupyter: Iteration Plan for March 2023 microsoft/vscode-jupyter#12994

Closed

15 tasks

github-actions bot added the Stale label Mar 14, 2023

github-actions bot closed this as completed Mar 14, 2023

kieferrm mentioned this issue Apr 2, 2023

Jupyter: Iteration Plan for April 2023 microsoft/vscode-jupyter#13250

Closed

16 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

debugpy gets "stuck" while using run by line in vscode jupyter notebook #1051

debugpy gets "stuck" while using run by line in vscode jupyter notebook #1051

roblourens commented Sep 9, 2022 •

edited

Loading

roblourens commented Sep 20, 2022

int19h commented Sep 26, 2022

roblourens commented Oct 13, 2022

fabioz commented Oct 14, 2022

roblourens commented Oct 17, 2022

roblourens commented Nov 2, 2022

fabioz commented Nov 3, 2022

roblourens commented Nov 4, 2022

roblourens commented Nov 28, 2022

judej commented Jan 19, 2023

github-actions bot commented Mar 14, 2023

debugpy gets "stuck" while using run by line in vscode jupyter notebook #1051

debugpy gets "stuck" while using run by line in vscode jupyter notebook #1051

Comments

roblourens commented Sep 9, 2022 • edited Loading

Environment data

roblourens commented Sep 20, 2022

int19h commented Sep 26, 2022

roblourens commented Oct 13, 2022

fabioz commented Oct 14, 2022

roblourens commented Oct 17, 2022

roblourens commented Nov 2, 2022

fabioz commented Nov 3, 2022

roblourens commented Nov 4, 2022

roblourens commented Nov 28, 2022

judej commented Jan 19, 2023

github-actions bot commented Mar 14, 2023

roblourens commented Sep 9, 2022 •

edited

Loading