-
Notifications
You must be signed in to change notification settings - Fork 288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memory leak of kernel Popen object #360
Conversation
After analyzing various leaked items when running either Notebook or Jupyter Kernel Gateway, one item that recurred across each kernel startup and shutdown sequence was the Popen object stored in the kernel manager in `self.kernel`. The issue is that in normal circumstances, when a kernel's termination is successful via the ZMQ messaging, the process is never waited for (which, in this case, is probably unnecessary but advised) nor is the member variable set to None. In the failing case, where the message-based shutdown does not terminate the kernel process, the `_kill_kernel()` method is used, which performs the `wait()` and _nullifies_ the kernel member. This change ensures that sequence occurs in normal situations as well.
PR jupyter-server#279 introduced some fixes to address file descriptor leaks - one of which was to shutdown the communication port. Since the kernel launcher listening on the other side may have already terminated, the shutdown method could throw an exception. This change catches, logs, then ignores such exceptions. While looking into other leaks (in this case memory), it was discovered that the process proxy instance was being leaked across kernel cycles. This change addresses that particular leak. Note: Other PRs have also been submitted to address leaks in `jupyter_client` and `notebook`. These are: [PR 360 - Fix memory leak of kernel Popen object](jupyter/jupyter_client#360) [PR 361 - Fix memory leak of IOLoopKernelManager object](jupyter/jupyter_client#361) [PR 3424 - Fix memory leak of iopub object in activity monitoring](jupyter/notebook#3424)
While looking into other leaks (in this case memory), it was discovered that the process proxy instance was being leaked across kernel cycles. This change addresses that particular leak. Note: Other PRs have also been submitted to address leaks in `jupyter_client` and `notebook`. These are: [PR 360 - Fix memory leak of kernel Popen object](jupyter/jupyter_client#360) [PR 361 - Fix memory leak of IOLoopKernelManager object](jupyter/jupyter_client#361) [PR 3424 - Fix memory leak of iopub object in activity monitoring](jupyter/notebook#3424)
@@ -272,6 +272,10 @@ def finish_shutdown(self, waittime=None, pollinterval=0.1): | |||
if self.is_alive(): | |||
time.sleep(pollinterval) | |||
else: | |||
# If there's still a proc, wait and clear | |||
if self.has_kernel: | |||
self.kernel.wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to wait if is_alive() is False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @minrk . Since there's still a process object around, I consider it good practice and we're not seeing runtime issues in the sequence.
After analyzing various leaked items when running either Notebook or
Jupyter Kernel Gateway, one item that recurred across each kernel
startup and shutdown sequence was the Popen object stored in the kernel
manager in
self.kernel
.The issue is that in normal circumstances, when a kernel's termination
is successful via the ZMQ messaging, the process is never waited for
(which, in this case, is probably unnecessary but advised) nor is the
member variable set to None. In the failing case, where the message-based
shutdown does not terminate the kernel process, the
_kill_kernel()
method is used, which performs the
wait()
and nullifies the kernelmember. This change ensures that sequence occurs in normal situations
as well.