-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
debugpy gets "stuck" while using run by line in vscode jupyter notebook #1051
Comments
@int19h I'm wondering whether the tag means you know it's a bug on debugpy's end, or whether it needs more investigation, and if so, anything I can do to help? |
The adapter logs for the failing case seem to indicate that it never even received that "disconnect" - the last message from the client has seq=101, which is the eval immediately preceding the disconnect. However, it also never receives the response to that One thing of note when that last eval request (seq=101) comes in, it specifies a "frameId", but at the same time there were several "stepIn" request issued right before without a subsequent "stackTrace" - so the "frameId" is not necessarily valid anymore. However, if the frame is gone, I'd expect things to fail gracefully with an error response, not hang indefinitely. So there's definitely a bug on debugpy end here, even if only about reporting the error. @fabioz, is there anything else suspicious that can be gleaned from the pydevd log here? |
Hey @fabioz, pinging on this in case you missed it |
I did check the logs a but nothing really did stand out for me in the As a note, it's entirely possible that an The fact that no disconnect appears in the Given that other messages are received by The difference from the case where it's responding is that the One possible culprit (and now I'm just making things up because I haven't even checked the client) is that the client is waiting for a response to all messages before sending the disconnect and that wouldn't work because it will never receive a response from that evaluate... |
Assuming by "client" you mean vscode and the Jupyter extension, that shouldn't happen- all the messages are independent. Or do you mean ipykernel/jupyter? Would it narrow it down at all to run this without the evaluate requests and see whether the issue still comes up in CI? |
My takeaway from the comments above is that you don't feel there's anything actionable on the debugpy/pydevd sides with the info you currently have, is that fair? If so, I can work on this to try to narrow it down some more. This comes up pretty easily in my tests in CI, so if you can think of anything that would be helpful to log, I can run tests against a custom build with extra logging. I can also try some different things on the extension's side that might give more info... I can remove the evaluate requests (lately we've also been looking at how these requests that originate from the variables view can be slow to process or produce large responses to pass around) if that gets us anywhere, I can try to create a minimal repro based on that. Does that make sense? Anything else I can try? |
Well, I'd say that the issue is in the communication between jupyter and debugpy, as the issue seems to be that a message (logged in jupyter) is not received by debugpy (at least it's not in its logs). I did some review in the code in debugpy and I find it unlikely that a received message is not logged (but this doesn't exclude issues in the socket itself -- maybe the socket wasn't setup properly and is missing some flag and there's some buffering issue and the message isn't being received). If there was some standalone repro that'd be really nice... |
Got it, I will dig into this a bit more this month. |
Sure enough, running tests without the variables view sending the evaluate requests seems to make the issue go away. I still want to come up with a consistent repro though. |
Waiting for Jupyter investigation |
This issue was closed because it has been stalled for 30 days with no activity. If the issue still persists, please reopen with the information requested. Thanks. |
Before creating a new issue, please check the FAQ to see if your question is answered there.
Environment data
I am working on the "run by line" feature with a jupyter notebook in vscode. There is an issue which I don't think is new. I am just clicking the "run by line" button very quickly, which starts a debug session and sends a "stepIn" request. After trying this a few times, the debugger eventually stops responding.
Basically at this point, the cell finished execution and so we have sent a "disconnect" request to debugpy, but we never get a response. The log from the extension's side shows that as I keep clicking the button to step, vscode sends "stepIn" requests that don't get a response. Here are the debugpy + vscode-jupyter logs from this point where the debugger is not responding.
debugpy.adapter-93501.log
debugpy.pydevd.93407.log
debugpy.server-93407.log
vscodejupyter.log
But there is a workaround: if the user clicks the stop button, we execute the code
pass
, and this seems to unstick the debugger. We added this workaround in microsoft/vscode-jupyter#7612, although I'm not sure where it came from. Once that happens, we get responses to all the requests that previously weren't getting responses, and can start another debug session if desired. Here are the logs from after executing this workaround (includes the full log of the above, then more)vscodejupyter.log
debugpy.adapter-93501.log
debugpy.pydevd.93407.log
debugpy.server-93407.log
Can you help me understand what's going on? We try to keep things consistent on the extension's end when stopping/starting new sessions quickly. It's possible that the requests we sent are in a weird order. One thing I have noticed is that when this happens, I see an
evaluate
request that we sent for variables right before the finaldisconnect
request. If thedisconnect
request goes out after we got a response forevaluate
, then it seems like the issue doesn't happen. I don't know whether that is part of the problem or not. But in any case, it definitely seems like the kernel/debugger is in a broken state.The text was updated successfully, but these errors were encountered: