-
Notifications
You must be signed in to change notification settings - Fork 29
Attempt to re-establish websocket connection to KG #42
Conversation
|
This may be related to #39 Sorry for no idea about how to make the test for this.
Test Result |
d9dc86a to
1639002
Compare
|
This pull request has been mentioned on Jupyter Community Forum. There might be relevant details there: |
kevin-bates
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good Evan - thank you!
@IMAM9AIS - are you able to take this for a spin in your env?
nb2kg/handlers.py
Outdated
| self.log.error("Exception reading message from websocket: {}".format(e)) # , exc_info=True) | ||
| if message is None: | ||
| if not self.disconnected: | ||
| self.log.warning("Lost connection to KG: {}".format(self.kernel_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change KG to Gateway to match others.
nb2kg/handlers.py
Outdated
| break | ||
|
|
||
| if not self.disconnected: # NOTE(esevan): if websocket is not disconnected by client, try to reconnect. | ||
| self.log.info("Try to re-establish the connection to KG: {}".format(self.kernel_id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try -> Attempting, KG -> Gateway.
nb2kg/handlers.py
Outdated
| self.ws_future.cancel() | ||
| self.ws_future_cancelled = True | ||
| self.log.debug("_disconnect: ws_future_cancelled: {}".format(self.ws_future_cancelled)) | ||
| self.log.debug("_disconnect: ws_future_cancelled: {}".format(self.disconnected)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we update the content here to be more like:
_disconnect: future cancelled, disconnected: {}
|
@esevan @kevin-bates I am sorry i could not get back on this quickly.
This check still returns a false signal, due to which a write is finally triggered but since the stream is not available the final write operation does not work. This is during the first shell message that you try to execute after the connection has been lost.
Line 247 in ddf6b7c
But since the exception is gracefully handled nothing happens and the client is still unaware of the loss. |
|
@esevan I will try the changes you have done, but i am not sure that on_close() event here :- Line 249 in ddf6b7c
will be called at all. Also, from the changes you have done (from what i can understand), let's say even if the on_close() event is being generated, it still won't reconnect the sockets but just print the log to do it, which might not be user-friendly. |
|
Rather than doing this, I have another suggestion :- What if while creating the connection between NB2KG and JEG, we also start a ping mechanism in the IOLoop, that pings every 1 min say. Now in scenario where we somehow lose a connection, the first ping will obviously do nothing but will help websocket realise that the stream is closed. The second ping will raise an exception which will propagate back here :- Line 247 in ddf6b7c
But rather than handling this exception, if you raise it this actually triggers a websocket close event back to jupyter lab. If we look into jupyter lab's websocket close event here :- Rather the immediately closing the websocket, it tries to do reconnect attempts with increasing timeouts which will immediately fix the connection if it can be fixed with the network state. |
When nb2kg lost the connection to KG, nb2kg didn't connect to KG again although the websocket connection from the client was still alive. This change recovers the connection to KG to prevent above anomaly. Signed-off-by: Eunsoo Park <esevan.park@gmail.com>
|
@kevin-bates Thanks for your review, kevin! @IMAM9AIS Websocket connection closing event is also listened by
One more thing for what it's worth, |
@esevan Thanks a lot for clearing this. I will try out changes and will let you know. |
kevin-bates
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thank you!

When nb2kg lost the connection to KG, nb2kg didn't connect to KG again
although the websocket connection from the client was still alive.
This change recovers the connection to KG to prevent above anomaly.
Signed-off-by: Eunsoo Park esevan.park@gmail.com