Today, Jupyter Server provides no reliable way to determine the state of a kernel—i.e. its lifecycle states (starting, connecting, connected, terminating, dead, restarting, ...) and execution states (busy, idle, etc.). Frontend applications must do their best to resolve this state from the influx of messages they receive on the ZMQ channels. Typically, this means listening to IOPub status messages.
The problem is that the IOPub stream is not always reliable, because both the shell and control channels trigger IOPub status messages. For example, (re)connecting a new websocket client to the ZMQChannelsHandler for an already-busy kernel "nudges" the kernel by sending kernel_info requests on both the shell and control channels. The kernel responds with two IOPub status messages: one for the shell that says "busy" and one for control that says "idle". The status relevant to the UI is the shell channel's, but the control channel's response is the last "status" the client sees. The result is a UI that says the kernel is "idle" when the kernel is actually busy.
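To make the race concrete, here is a hedged sketch of the two status messages such a nudge produces (fields abridged, values illustrative rather than captured from a real kernel):

```python
# Abridged sketch of the two IOPub status messages a nudge produces.
# msg_id values are illustrative placeholders.
status_from_shell_nudge = {
    "msg_type": "status",
    "parent_header": {"msg_type": "kernel_info_request", "msg_id": "aaa"},
    "content": {"execution_state": "busy"},  # accurate: the kernel is busy
}
status_from_control_nudge = {
    "msg_type": "status",
    "parent_header": {"msg_type": "kernel_info_request", "msg_id": "bbb"},
    "content": {"execution_state": "idle"},  # arrives last, so it wins in the UI
}
```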
Debug requests (which go over the control channel) are another source of inaccurate kernel state for busy kernels.
I think what JupyterLab (and other server clients) can do is set the status based only on messages that they themselves have sent. For example, the client sends a kernel_info request, then waits for a reply and an idle status message whose parent header points to that request; similarly for any other shell messages it sends. Clients would have to track the msg_id of each outstanding shell request.
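As a minimal sketch of that tracking approach, using jupyter_client's blocking client against an already-running kernel (the connection-file path is hypothetical):

```python
# Sketch: only update UI state from status messages whose parent_header
# points at a request *we* sent, ignoring other clients' nudges.
from queue import Empty
from jupyter_client.blocking import BlockingKernelClient

kc = BlockingKernelClient(connection_file="kernel.json")  # hypothetical path
kc.load_connection_file()
kc.start_channels()

pending = set()
pending.add(kc.execute("1 + 1"))  # execute() returns the shell msg_id

ui_state = "unknown"
while pending:
    try:
        msg = kc.get_iopub_msg(timeout=5)
    except Empty:
        break
    if msg["msg_type"] != "status":
        continue
    parent_id = msg["parent_header"].get("msg_id")
    if parent_id not in pending:
        continue  # status triggered by another client or channel; ignore
    ui_state = msg["content"]["execution_state"]
    if ui_state == "idle":
        pending.discard(parent_id)  # our request has finished
print(ui_state)
```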
Unfortunately, there is no easy way to tell which channel triggered a given IOPub message, so we can't simply cherry-pick the shell channel's statuses.
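To illustrate why: per the Jupyter messaging spec, a parent header carries only msg_id, msg_type, username, session, date, and version, so a channel-recovering helper (hypothetical name below) has nothing to work with:

```python
def parent_channel(status_msg):
    """Hypothetical helper: try to recover the parent request's channel.

    The parent header has no channel field, and since kernel_info_request
    may be sent on either shell or control, even msg_type can't tell the
    two nudge replies apart.
    """
    parent = status_msg["parent_header"]
    raise NotImplementedError(f"no channel field among {sorted(parent)}")
```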