Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear cell output doesn't always work #5446

Closed
joyceerhl opened this issue Apr 8, 2021 · 16 comments
Closed

Clear cell output doesn't always work #5446

joyceerhl opened this issue Apr 8, 2021 · 16 comments
Assignees
Labels
bug Issue identified by VS Code Team member as probable bug

Comments

@joyceerhl
Copy link
Contributor

joyceerhl commented Apr 8, 2021

  1. Create a cell with this code (this code is from one of our native notebooks tests)
print("Start")
import time
for i in range(100):
    time.sleep(0.1)
    print(i)

print("End")
  1. Execute it
  2. While it's running and appending to output, execute 'Clear All Cells Outputs' a bunch of times
  3. Eventually the cell will stop updating with output while continuing to run
  4. Rerunning the cell will get it into a state where it never completes
    image
@joyceerhl joyceerhl added bug Issue identified by VS Code Team member as probable bug vscode-notebook labels Apr 8, 2021
@DonJayamanne
Copy link
Contributor

Sounds like a bug in vscode, since we now never update output directly this looks like it's on them

@DonJayamanne
Copy link
Contributor

@rebornix @roblourens
Looks like we're running into issues that we had in Jupyter extension when we managed updating output our selves. You might recall we had a lot of issues with updates and we resolved them by synchronozing all of the eidts (putting them into a queue).
Looks like we're running into these issues again & since we're no longer updating anything ourselves, i suspect VS Code is now running into these issues.

@roblourens
Copy link
Member

When this happens, vscode stops getting output edit requests in the main process. If I click stop, I get this

image

so are you sure the extension isn't doing something weird?

Then after that I can't run a cell, but it starts the timer, so the extension must have started execution but won't finish it.

@roblourens
Copy link
Member

Seems like this is harder to repro today, but I can still see it. Looking at logs, when I click Clear and this reproes, the extension prints "Trusting notebook" and stops printing the "Update streamed output" messages, which makes me think the extension is doing something weird. The next logs are from when I clicked stop.

Info 2021-04-13 18:38:03: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:03: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:04: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:04: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:04: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:04: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:05: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:05: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:05: Cell Index:1, state:3, exec: 2. Update streamed output
Info 2021-04-13 18:38:06: Trusting notebook...
Info 2021-04-13 18:38:20: Cell Index:0, state:1, exec: undefined. Cell cancellation requested
Info 2021-04-13 18:38:20: Cell Index:1, state:3, exec: 2. Cell cancellation requested
Info 2021-04-13 18:38:20: Interrupt requested file:///workspaces/jupyter-container/test2.ipynb
Info 2021-04-13 18:38:20: Interrupt kernel execution
Info 2021-04-13 18:38:20: Cancel pending cells
Info 2021-04-13 18:38:20: Cell Index:1, state:3, exec: 2. Cell is already running, waiting for it to finish or kernel to start
Info 2021-04-13 18:38:20: Interrupting kernel: python386jvsc74a57bd098b0a9b7b4eaaa670588a142fd0a9b87eaafe866f1db4228be72b4211d12040f
Info 2021-04-13 18:38:20: KernelProcess output: Content-Length: 136
Content-Type: application/vscode-jsonrpc; charset=utf8

{"jsonrpc": "2.0", "method": "log", "params": {"level": "INFO", "msg": "Execute rpc method interrupt_kernel in DS Daemon", "pid": 1075}}Content-Length: 150
Content-Type: application/vscode-jsonrpc; charset=utf8

{"jsonrpc": "2.0", "method": "log", "params": {"level": "INFO", "msg": "Execute rpc method interrupt_kernel from /opt/conda/bin/python", "pid": 1075}}Content-Length: 133
Content-Type: application/vscode-jsonrpc; charset=utf8

{"jsonrpc": "2.0", "method": "log", "params": {"level": "INFO", "msg": "Interrupt kernel in DS Kernel Launcher Daemon", "pid": 1075}}Content-Length: 43
Content-Type: application/vscode-jsonrpc; charset=utf8

{"jsonrpc": "2.0", "id": 2, "result": null}
Info 2021-04-13 18:38:20: Python Daemon (pid: 1075): Execute rpc method interrupt_kernel in DS Daemon
Info 2021-04-13 18:38:20: Python Daemon (pid: 1075): Execute rpc method interrupt_kernel from /opt/conda/bin/python
Info 2021-04-13 18:38:20: Python Daemon (pid: 1075): Interrupt kernel in DS Kernel Launcher Daemon

@DonJayamanne
Copy link
Contributor

Wait, thats odd, we why's the cell getting interrupted when you haven't clicked interrup. Based on what you're trying & What @joyceerhl is trying, we're only clearing output.

Hmmn.
@joyceerhl are you still able to repro this?

@joyceerhl
Copy link
Contributor Author

@DonJayamanne The logs Rob provided are a result of clicking the cell stop button, so I think it makes sense that the cell is getting interrupted. And yes I can still repro this 😕

@DonJayamanne DonJayamanne added this to the May 2021 Release milestone Apr 19, 2021
@DonJayamanne
Copy link
Contributor

DonJayamanne commented Apr 19, 2021

And yes I can still repro this 😕

I too can repro this issue.
Will need to add more logs to find out where things are hanging, I'm still certain its VS Code (or something related to the tasks we're using in new execution API), but we'll need to add more logging to find this out.

Changing to high priority & adding to May milestone as this would be a regression for us (we've had plenty of complaints in the past for similar issues)

@DonJayamanne
Copy link
Contributor

DonJayamanne commented Apr 19, 2021

Found it, its both ends.
Looks like VS Code is crashing & we're not handling the error (I never expected VSC to fail in performing the updates to the output).

@roblourens
Here's the error we get when we update the outputs

Info 2021-04-19 09:58:29: Cell Index:0, state:3, exec: 1. Failed update
Error 2021-04-19 09:58:29: Failed Error: model index out of range -1
    at s._assertIndex (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:999:32944)
    at file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:999:24333
    at Array.map (<anonymous>)
    at s.applyEdits (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:999:24127)
    at xr.$applyEdits (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:2290:13287)
    at r._doInvokeHandler (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:1642:11977)
    at r._invokeHandler (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:1642:11661)
    at r._receiveRequest (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:1642:10331)
    at r._receiveOneMessage (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:1642:9117)
    at file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:1642:7218
    at l.fire (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:60:1836)
    at s.fire (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:76:16186)
    at i._receiveMessage (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:76:21444)
    at file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:76:18330
    at l.fire (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:60:1836)
    at u.acceptChunk (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:76:13551)
    at file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:76:12899
    at Socket.R (file:///Applications/Visual Studio Code - Insiders.app/Contents/Resources/app/out/vs/workbench/workbench.desktop.main.js:2840:38365)
    at Socket.emit (events.js:315:20)
    at addChunk (_stream_readable.js:295:12)
    at readableAddChunk (_stream_readable.js:271:9)
    at Socket.Readable.push (_stream_readable.js:212:10)
    at Pipe.onStreamRead (internal/stream_base_commons.js:186:23)
Info 2021-04-19 09:58:30: Cell Index:0, state:3, exec: 1. Update streamed output

@roblourens
Copy link
Member

roblourens commented Apr 19, 2021

I guess you are replacing output on an output item that was deleted in the renderer process by the clear command. I get it now. I am not sure what the right move is.

We could implement "clear outputs" in the extension host. Or in this scenario, we could create a new output to wrap the output item.

I can't think of any other case where the frontend is modifying outputs, so this is a special scenario. @rebornix?

Btw, any reason you aren't using appendOutputItems instead of replace? Looks like you are re-sending all the content on each update.

@DonJayamanne
Copy link
Contributor

using appendOutputItems instead of replace? Looks like you are re-sending all the conten

This was after I discussed this with Jo.
Basically we need to trim the output if there are any ANSI codes (else the ipynb just grows)
Yes, VSCode supporst ANSI but only in the renderers, we need to ensure the output in the ipynb is also trimmed, hence we need to do this.

As an optimization we can do this only if we detect ansi, but tha'ts a seprate issue.

@roblourens
Copy link
Member

I forgot about that. @rebornix suggested you could just be using appendOutput for this but not if you are changing the output to resolve those.

But in this case calling append or replaceOutputItems for a removed output should just be a no-op. So I think the only issue is that we throw and should not throw.

@DonJayamanne
Copy link
Contributor

using appendOutput for this but not if you are changing the output to resolve those.

Yes agreed, but that's merely an optimization. E.g. if we're streaming terminal output that contains ANSI, we could run into this same issue (where we keep clearing output).

@DonJayamanne
Copy link
Contributor

But in this case calling append or replaceOutputItems for a removed output should just be a no-op. So I think the only issue is that we throw and should not throw.

Wouldn't the data then get lost? I.e. we'd keep calling replaceOutputItem everytime.

I think the only way is for us (extension) to know that the output is not longer valid.
Perhaps have a disposed property on the output item or an onDidDispose event on that output item.
Once we know its no longer valid, then we can not use it.

@roblourens
Copy link
Member

It's ok for the data to be lost because the output was deleted. But I think that when there are no outputs, you call appendOutput to create a new output, and after that add output items to that. So once the output changes are synced back to the EH a few ms later, things will be good again.

After you click "clear", there is a short window where an output may be produced, but lost, and this window depends on EH latency but I think this is the right behavior.

Either that or we go back to a model where an edit can fail to be applied and the extension has to retry, which is not what we want.

@roblourens
Copy link
Member

Fixed in microsoft/vscode#121687

@joyceerhl
Copy link
Contributor Author

Validated. Tried the original repro 5 times and everything works as expected. Thank you @roblourens!

@joyceerhl joyceerhl removed their assignment Apr 30, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 7, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Issue identified by VS Code Team member as probable bug
Projects
None yet
Development

No branches or pull requests

3 participants