Lab gets laggy when many kernels are spawned #530
Comments
Jupyter Server uses … How many kernels have been spawned when you begin to see degradation? Just trying to get some more understanding. Since you also opened the Notebook issue jupyter/notebook#6077 (which really applies here), have you tried switching to async contents management as well?
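For anyone wanting to try that suggestion, here is a minimal sketch of switching to async contents management, assuming the AsyncLargeFileManager class path shipped with jupyter_server 1.x; verify the path against your installed version before relying on it.

```python
# jupyter_server_config.py -- sketch only; assumes jupyter_server 1.x ships
# AsyncLargeFileManager at this import path. Check your installed version first.
c = get_config()  # noqa: F821 -- injected by Jupyter's config loader

c.ServerApp.contents_manager_class = (
    "jupyter_server.services.contents.largefilemanager.AsyncLargeFileManager"
)
```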
Regarding jupyter/notebook#6077: I'll try it again tomorrow.
Probably 20 kernels, and each of these kernels does nothing. Maybe it's because I configured a subclass to implement …
I found a problem that might be related to this issue. Reproduction steps:
The dump result:
A lot of ThreadPoolExecutors were created, and after I shut down all of the kernels, these ThreadPoolExecutors were not destroyed. I don't know where these thread pools were created.
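As a side note, one way to see these pools from inside the server process (for example from a debugger attached to it) is to group live threads by the default worker-name prefix. A rough sketch, assuming CPython's default ThreadPoolExecutor-<pool>_<n> thread naming:

```python
# Rough inspection helper: counts live threads per ThreadPoolExecutor, relying
# on CPython's default worker names ("ThreadPoolExecutor-<pool>_<n>").
import threading
from collections import Counter

def summarize_executor_threads():
    names = [t.name for t in threading.enumerate()]
    pools = Counter(
        name.rsplit("_", 1)[0]
        for name in names
        if name.startswith("ThreadPoolExecutor")
    )
    print(f"total live threads: {len(names)}")
    for pool, workers in sorted(pools.items()):
        print(f"  {pool}: {workers} worker thread(s)")

summarize_executor_threads()
```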
@kevin-bates Oh, sorry, I confused the two issues. Let me move the problem over here.
Same problem. Can you reproduce it? I'm confused about why these thread pools are not recycled after the kernels are shut down.
I'm not sure this is related. I see the same THREE thread-pool entries both before and after starting up to three kernels...
That is, I don't see a relationship between thread pool executors and kernels. Each kernel is represented by a separate process, of course, but the server's thread pool executor count remains constant. You truncated your output. Are you seeing a one-to-one correlation between thread pool executors and kernels? Can you try using the default …?
Just try starting a few more kernels (20 or more), and then look at the stack.
Interesting. I don't see that at all. I started 20 kernels and the thread-executor counts remained constant before startup, during, and after shutdown. Where I do see a possible leak is in closing the Lab tab and opening Lab in a new tab. Each tab closure does nothing to the thread-executor count, while each new tab (with Lab) adds one to the count. I'm running Python 3.7 on macOS; you're running Python 3.6 on ??? Also, please provide responses to: …
We need to eliminate as many differences as possible.
FWIW, I ran lab on my RHEL cluster and see the same "static" behavior relative to kernels. In addition, I don't see any kind of lab tab leak like I described previously when closing/reopening the lab tab. This server also enables kernel culling, and I don't see any relation between that and the executor count either (contrary to my previous thoughts). Your …
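The kernel culling mentioned above is controlled by a few traitlets on the kernel manager; a sketch with illustrative values, assuming jupyter_server's MappingKernelManager culling options:

```python
# jupyter_server_config.py -- sketch of the kernel-culling options referenced
# above, assuming jupyter_server's MappingKernelManager traitlets. The values
# here are illustrative only.
c = get_config()  # noqa: F821

c.MappingKernelManager.cull_idle_timeout = 3600  # seconds of idleness before a kernel is culled
c.MappingKernelManager.cull_interval = 300       # how often to check for idle kernels, in seconds
c.MappingKernelManager.cull_connected = False    # leave kernels with open connections alone
```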
macOS Big Sur 11.2
I just ran lab with the default config. Log:
I also tried the following configuration and found the same problem:
This is a Python issue: https://bugs.python.org/issue24882 - you need to upgrade your Python. It has been corrected in 3.8 (python/cpython#6375), but I don't know if it was backported. That said, I'm not sure this is the reason for the lag you are seeing. I would say that 20+ kernels eats a lot of memory, especially if the associated notebooks are large.
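For context, here is a small sketch of the behavior bpo-24882 describes: before Python 3.8, ThreadPoolExecutor started a new worker on every submit until max_workers was reached, even if existing workers were idle; 3.8 reuses an idle worker instead. The _threads attribute is private and is read here only to observe the growth.

```python
# Illustration of https://bugs.python.org/issue24882. Each task completes
# before the next submit, so a worker is always idle; on Python < 3.8 the pool
# still grows toward max_workers, while 3.8+ reuses the idle worker.
import time
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=32)
for _ in range(32):
    pool.submit(time.sleep, 0.01).result()  # block until the task finishes

print("worker threads created:", len(pool._threads))  # private attr, inspection only
pool.shutdown()
```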
Thank you @fcollonval - this is beginning to make more sense. I should also point out that the RHEL scenario I ran in #530 (comment) was on Python 3.8 and saw no growth in the thread pool. @icankeep - are you in a position where you could try this out on Python 3.8? I think the ThreadPoolExecutor "leak" is a bit of a red herring. I'm not saying the lag introduced when multiple kernels are running (but idle) doesn't exist, but I'm not seeing evidence of it in this issue or in my own experience. Where I have seen substantial lags in the past is when attempting to start many slow-starting kernels simultaneously (typically on an Enterprise Gateway server), since, prior to AsyncKernelManagers, the …
Our service is based on JupyterHub; each user's server runs continuously in a Docker container for a long time. Users' servers sometimes return 502 errors. When I entered a user's container to view the call stack of the Jupyter process, I found a lot of (300+) threads. I guessed it might be related to this issue, so I commented here.
I don't know if this thread pool bug is the cause of this issue. I'll upgrade the Python version to 3.8 in our Docker image and give feedback here. Forgive my English; there may be some problems with the phrasing.
Thanks for the update and for giving Python 3.8 a try - that is greatly appreciated.
No issues there! Terminology is difficult when dealing with software in general.
I see. Yeah, if the users are living in the same container, I suppose the thread pool (given this Python issue) could grow. But even if the threads are all idle, I wouldn't expect significant performance degradation (at least until the pool is exhausted).
@kevin-bates Upgrading the Python version solves the thread-creation problem, but it's not related to this issue.
Related issue: jtpio/jupyterlab-system-monitor#87
Hello everyone, my lab gets laggy when many kernels are spawned.
The whole lab is very slow to respond; something seems to be blocking the main thread of the ioloop.
I'm not sure whether AsyncMappingKernelManager can solve this problem. Can anyone help me?
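For completeness, here is a hedged sketch of what trying AsyncMappingKernelManager could look like, assuming the jupyter_server 1.x class path. As the discussion above suggests, it mainly helps when slow kernel starts block the event loop; it does not address the ThreadPoolExecutor growth tied to the Python bug.

```python
# jupyter_server_config.py -- sketch only; assumes jupyter_server 1.x exposes
# AsyncMappingKernelManager at this path. Verify against your installed version.
c = get_config()  # noqa: F821

c.ServerApp.kernel_manager_class = (
    "jupyter_server.services.kernels.kernelmanager.AsyncMappingKernelManager"
)
```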