
lab get lag when many kernels are spawned #530

Closed
icankeep opened this issue May 31, 2021 · 21 comments

@icankeep
Contributor

jupyterlab==3.0.14
jupyter-server==1.6.4
notebook==6.2.0

Hello everyone, my Lab gets laggy when many kernels are spawned.
The whole Lab is very slow to respond; it seems something is blocking the main thread of the ioloop.

I'm not sure whether AsyncMappingKernelManager can solve this problem. Can anyone help me?

@icankeep icankeep added the bug label May 31, 2021
@welcome

welcome bot commented May 31, 2021

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@kevin-bates
Member

Jupyter Server has used AsyncMappingKernelManager by default since its 1.3.0 release, so that should already be in use.

How many kernels have been spawned when you begin to see degradation?
To what degree are each of the kernels busy executing cells during this time?

Just trying to get some more understanding.

Since you also opened the Notebook issue jupyter/notebook#6077 (which really applies here), have you tried to switch to using async contents management as well?
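For reference, a minimal sketch of what opting in to async contents management could look like, assuming a jupyter_server_config.py and the AsyncLargeFileManager class shipped with jupyter_server:

# jupyter_server_config.py -- opt in to the async contents manager so file
# operations don't block the event loop (the async kernel manager is already
# the default since jupyter_server 1.3.0).
c.ServerApp.contents_manager_class = (
    "jupyter_server.services.contents.largefilemanager.AsyncLargeFileManager"
)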

@icankeep
Contributor Author

icankeep commented May 31, 2021

jupyter/notebook#6077
I tried to set AsyncLargeFileManager two days ago, but it didn't seem to solve the problem.

I'll try it again tomorrow.

How many kernels have been spawned when you begin to see degradation?
To what degree are each of the kernels busy executing cells during this time?

Probably 20 kernels, and each of them is idle (doing nothing). Maybe it's because I configured a subclass of MappingKernelManager; I'll try to reproduce it with the default config.

@icankeep
Contributor Author

icankeep commented Jun 2, 2021

I found a problem that might be related to the issue.

Reproduction steps

  1. pip install py-spy (used to dump the Python thread stacks)
  2. Start a plain JupyterLab instance: jupyter lab
  3. Execute py-spy dump -p {jupyter-pid}

the dump result:

Process 98632: /Users/passer/conda/envs/python3.6.7/bin/python3.6 /Users/passer/conda/envs/python3.6.7/bin/jupyter-lab
Python v3.6.7 (/Users/passer/conda/envs/python3.6.7/bin/python3.6)

Thread 0x20524FE00 (idle): "MainThread"
    select (selectors.py:577)
    _run_once (asyncio/base_events.py:1404)
    run_forever (asyncio/base_events.py:427)
    start (tornado/platform/asyncio.py:199)
    start_ioloop (jupyter_server/serverapp.py:2148)
    start (jupyter_server/serverapp.py:2160)
    launch_instance (jupyter_server/extension/application.py:528)
    <module> (jupyter-lab:8)
Thread 0x30B41A000 (idle): "ThreadPoolExecutor-0_0"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30B91D000 (idle): "ThreadPoolExecutor-2_0"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30BE20000 (idle): "ThreadPoolExecutor-2_1"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30C323000 (idle): "ThreadPoolExecutor-2_2"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
  4. Spawn some kernels, then dump the process stack again.

dump result:

Process 98632: /Users/passer/conda/envs/python3.6.7/bin/python3.6 /Users/passer/conda/envs/python3.6.7/bin/jupyter-lab
Python v3.6.7 (/Users/passer/conda/envs/python3.6.7/bin/python3.6)

Thread 0x20524FE00 (idle): "MainThread"
    select (selectors.py:577)
    _run_once (asyncio/base_events.py:1404)
    run_forever (asyncio/base_events.py:427)
    start (tornado/platform/asyncio.py:199)
    start_ioloop (jupyter_server/serverapp.py:2148)
    start (jupyter_server/serverapp.py:2160)
    launch_instance (jupyter_server/extension/application.py:528)
    <module> (jupyter-lab:8)
Thread 0x30B41A000 (idle): "ThreadPoolExecutor-0_0"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30B91D000 (idle): "ThreadPoolExecutor-2_0"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30BE20000 (idle): "ThreadPoolExecutor-2_1"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30C323000 (idle): "ThreadPoolExecutor-2_2"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30C92C000 (idle): "ThreadPoolExecutor-2_3"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30CE2F000 (idle): "ThreadPoolExecutor-2_4"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30D332000 (idle): "ThreadPoolExecutor-2_5"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)

... omitting a lot

A lot of ThreadPoolExecutor worker threads were created, and after I shut down all of the kernels, these threads were not destroyed.

I don't know where these thread pools are created.
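To make the growth easier to track, here is a hypothetical helper script (not part of jupyter_server or py-spy) that counts the ThreadPoolExecutor worker threads reported by py-spy dump for a given PID; run it before and after starting/stopping kernels (py-spy may need sudo on macOS):

import re
import subprocess
import sys

def count_executor_threads(pid):
    # Run `py-spy dump -p <pid>` and count thread names that look like
    # "ThreadPoolExecutor-<pool>_<worker>" in its output.
    proc = subprocess.run(
        ["py-spy", "dump", "-p", str(pid)],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    return len(re.findall(r'"ThreadPoolExecutor-\d+_\d+"', proc.stdout))

if __name__ == "__main__":
    print(count_executor_threads(int(sys.argv[1])))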

@icankeep
Contributor Author

icankeep commented Jun 2, 2021

@kevin-bates Oh sorry, I confused the two issues. Let me move the problem over here.

@icankeep
Contributor Author

icankeep commented Jun 2, 2021

If you did not "warm-up" the server by starting 1 kernel, I would suggest doing that. I suspect at least one of these is to monitor for culling - which won't start until the first kernel starts. So perhaps you could dump the threads following the start of the first kernel as well.

Same problem. Can you reproduce it? I'm confused about why these thread pools are not recycled after the kernels are shut down.

@kevin-bates
Member

I'm not sure this is related. I see the same THREE thread-pool entries both before and after starting up to three kernels...

$ sudo py-spy dump -p 56829
Process 56829: /opt/anaconda3/envs/jupyter-server-dev/bin/python /opt/anaconda3/envs/jupyter-server-dev/bin/jupyter-lab --debug --notebook-dir=~/notebooks/
Python v3.7.3 (/opt/anaconda3/envs/jupyter-server-dev/bin/python3.7)

Thread 0x1045A0E00 (idle): "MainThread"
    select (selectors.py:558)
    _run_once (asyncio/base_events.py:1739)
    run_forever (asyncio/base_events.py:539)
    start (tornado/platform/asyncio.py:199)
    start_ioloop (jupyter_server/serverapp.py:2165)
    start (jupyter_server/serverapp.py:2177)
    launch_instance (jupyter_server/extension/application.py:528)
    <module> (jupyter-lab:8)
Thread 0x700009E2D000 (idle): "ThreadPoolExecutor-0_0"
    _worker (concurrent/futures/thread.py:78)
    run (threading.py:865)
    _bootstrap_inner (threading.py:917)
    _bootstrap (threading.py:885)
Thread 0x70000AD36000 (idle): "ThreadPoolExecutor-1_0"
    _worker (concurrent/futures/thread.py:78)
    run (threading.py:865)
    _bootstrap_inner (threading.py:917)
    _bootstrap (threading.py:885)

That is, I don't see a relationship between thread pool executors and kernels. Each kernel will be represented by a separate process of course, but the server's thread pool executor count remains constant.

You truncated your output. Are you seeing a one-to-one correlation between thread pool executors and kernels?

Can you try using the default MappingKernelManager to see if that changes behavior? If not, can you share your MappingKernelManager here (or privately via email associated with my GH profile)?

@icankeep
Contributor Author

icankeep commented Jun 2, 2021

Just try starting a few more kernels (20 or more), and then look at the stack.

@icankeep
Contributor Author

icankeep commented Jun 2, 2021

Process 62020: /Users/passer/conda/envs/python3.6.7/bin/python3.6 /Users/passer/conda/envs/python3.6.7/bin/jupyter-lab
Python v3.6.7 (/Users/passer/conda/envs/python3.6.7/bin/python3.6)

Thread 0x2029DFE00 (idle): "MainThread"
    select (selectors.py:577)
    _run_once (asyncio/base_events.py:1404)
    run_forever (asyncio/base_events.py:427)
    start (tornado/platform/asyncio.py:199)
    start_ioloop (jupyter_server/serverapp.py:2148)
    start (jupyter_server/serverapp.py:2160)
    launch_instance (jupyter_server/extension/application.py:528)
    <module> (jupyter-lab:8)
Thread 0x30641A000 (idle): "ThreadPoolExecutor-0_0"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30691D000 (idle): "ThreadPoolExecutor-2_0"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x306E20000 (idle): "ThreadPoolExecutor-2_1"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x307323000 (idle): "ThreadPoolExecutor-2_2"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x307826000 (idle): "ThreadPoolExecutor-2_3"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x307D29000 (idle): "ThreadPoolExecutor-2_4"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x308332000 (idle): "ThreadPoolExecutor-2_5"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x308835000 (idle): "ThreadPoolExecutor-2_6"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x308D38000 (idle): "ThreadPoolExecutor-2_7"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30923B000 (idle): "ThreadPoolExecutor-2_8"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30973E000 (idle): "ThreadPoolExecutor-2_9"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x309C41000 (idle): "ThreadPoolExecutor-2_10"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30A144000 (idle): "ThreadPoolExecutor-2_11"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30A647000 (idle): "ThreadPoolExecutor-2_12"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30AB4A000 (idle): "ThreadPoolExecutor-2_13"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30B04D000 (idle): "ThreadPoolExecutor-2_14"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30B550000 (idle): "ThreadPoolExecutor-2_15"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30BA53000 (idle): "ThreadPoolExecutor-2_16"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30BF56000 (idle): "ThreadPoolExecutor-2_17"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30C459000 (idle): "ThreadPoolExecutor-2_18"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30C95C000 (idle): "ThreadPoolExecutor-2_19"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30CE5F000 (idle): "ThreadPoolExecutor-2_20"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30D362000 (idle): "ThreadPoolExecutor-2_21"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30D865000 (idle): "ThreadPoolExecutor-2_22"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30DD68000 (idle): "ThreadPoolExecutor-2_23"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)
Thread 0x30E26B000 (idle): "ThreadPoolExecutor-2_24"
    wait (threading.py:295)
    get (queue.py:164)
    _worker (concurrent/futures/thread.py:67)
    run (threading.py:864)
    _bootstrap_inner (threading.py:916)
    _bootstrap (threading.py:884)

@kevin-bates
Member

Interesting. I don't see that at all. I started 20 kernels and the thread-executor counts remained constant before startup, during, and after shutdown. Where I do see a possible leak is closing the Lab tab and opening Lab in a new tab. Each tab closure does nothing to the thread-executor count, while each new tab (with Lab) adds one to the count.

I'm running Python 3.7 on macOS; you're running Python 3.6 on ???

Also, please provide responses to:

Can you try using the default MappingKernelManager to see if that changes behavior? If not, can you share your MappingKernelManager here (or privately via email associated with my GH profile)?

We need to eliminate as many differences as possible.

@kevin-bates
Member

FWIW, I ran Lab on my RHEL cluster and see the same "static" behavior relative to kernels. In addition, I don't see the kind of Lab tab leak I described previously when closing/reopening the Lab tab.

This server also enables kernel culling, and I don't see any relation to that and the executor count either (contrary to my previous thoughts).

Your MappingKernelManager isn't using a thread per KernelManager it manages by any chance, is it? 😄

@icankeep
Contributor Author

icankeep commented Jun 2, 2021

I'm running Python 3.7 on MacOS, you're running Python 3.6 on ???

macOS Big Sur 11.2

Can you try using the default MappingKernelManager to see if that changes behavior? If not, can you share your MappingKernelManager here (or privately via email associated with my GH profile)?

I just ran Lab with the default config.

log

$ jupyter lab --debug                                                           
[D 2021-06-03 07:49:27.979 ServerApp] Searching ['/Users/passer/jupyter', '/Users/passer/.jupyter', '/Users/passer/conda/envs/python3.6.7/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[D 2021-06-03 07:49:27.980 ServerApp] Looking for jupyter_config in /etc/jupyter
[D 2021-06-03 07:49:27.980 ServerApp] Looking for jupyter_config in /usr/local/etc/jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_config in /Users/passer/conda/envs/python3.6.7/etc/jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_config in /Users/passer/.jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_config in /Users/passer/jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_server_config in /etc/jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_server_config in /usr/local/etc/jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_server_config in /Users/passer/conda/envs/python3.6.7/etc/jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_server_config in /Users/passer/.jupyter
[D 2021-06-03 07:49:27.981 ServerApp] Looking for jupyter_server_config in /Users/passer/jupyter
[D 2021-06-03 07:49:27.983 ServerApp] Paths used for configuration of jupyter_server_config: 
    	/etc/jupyter/jupyter_server_config.json
[D 2021-06-03 07:49:27.983 ServerApp] Paths used for configuration of jupyter_server_config: 
    	/usr/local/etc/jupyter/jupyter_server_config.json
[D 2021-06-03 07:49:27.983 ServerApp] Paths used for configuration of jupyter_server_config: 
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_server_config.d/jupyter_server_mathjax.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_server_config.d/jupyterlab.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_server_config.d/jupyterlab_git.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_server_config.d/jupyterlab_snapshot.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_server_config.d/nbclassic.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_server_config.d/nbdime.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_server_config.json
[D 2021-06-03 07:49:27.984 ServerApp] Paths used for configuration of jupyter_server_config: 
    	/Users/passer/.jupyter/jupyter_server_config.json
/Users/passer/conda/envs/python3.6.7/lib/python3.6/site-packages/jupyter_server/transutils.py:13: FutureWarning: The alias `_()` will be deprecated. Use `_i18n()` instead.
  warnings.warn(warn_msg, FutureWarning)
[D 2021-06-03 07:49:28.016 MathJaxExtension] Config changed:
[D 2021-06-03 07:49:28.016 MathJaxExtension] {'ServerApp': {'log_level': 'DEBUG', 'jpserver_extensions': <traitlets.config.loader.LazyConfigValue object at 0x7fa319b2bdd8>}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[D 2021-06-03 07:49:28.017 ServerApp] Config changed:
[D 2021-06-03 07:49:28.017 ServerApp] {'ServerApp': {'log_level': 'DEBUG', 'jpserver_extensions': <traitlets.config.loader.LazyConfigValue object at 0x7fa319b2bd68>}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[I 2021-06-03 07:49:28.018 ServerApp] jupyter_server_mathjax | extension was successfully linked.
[D 2021-06-03 07:49:28.023 LabApp] Config changed:
[D 2021-06-03 07:49:28.023 LabApp] {'NotebookApp': {}, 'ServerApp': {'log_level': 'DEBUG', 'jpserver_extensions': <traitlets.config.loader.LazyConfigValue object at 0x7fa319b3c470>}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[I 2021-06-03 07:49:28.024 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-06-03 07:49:28.024 ServerApp] jupyterlab-snapshot | extension was successfully linked.
[I 2021-06-03 07:49:28.024 ServerApp] jupyterlab_git | extension was successfully linked.
[D 2021-06-03 07:49:28.028 NotebookApp] Config changed:
[D 2021-06-03 07:49:28.028 NotebookApp] {'NotebookApp': {}, 'ServerApp': {'log_level': 'DEBUG', 'jpserver_extensions': <traitlets.config.loader.LazyConfigValue object at 0x7fa319b2b8d0>}, 'ExtensionApp': {'log_level': 'DEBUG'}}
[D 2021-06-03 07:49:28.256 ServerApp] Paths used for configuration of jupyter_notebook_config: 
    	/Users/passer/.jupyter/jupyter_notebook_config.json
[D 2021-06-03 07:49:28.256 ServerApp] Paths used for configuration of jupyter_notebook_config: 
    	/etc/jupyter/jupyter_notebook_config.json
[D 2021-06-03 07:49:28.256 ServerApp] Paths used for configuration of jupyter_notebook_config: 
    	/usr/local/etc/jupyter/jupyter_notebook_config.json
[D 2021-06-03 07:49:28.256 ServerApp] Paths used for configuration of jupyter_notebook_config: 
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_notebook_config.d/jupyterlab.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_notebook_config.d/jupyterlab_git.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_notebook_config.d/jupyterlab_snapshot.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_notebook_config.d/nbdime.json
    	/Users/passer/conda/envs/python3.6.7/etc/jupyter/jupyter_notebook_config.json
[D 2021-06-03 07:49:28.259 ServerApp] Paths used for configuration of jupyter_notebook_config: 
    	/Users/passer/.jupyter/jupyter_notebook_config.json
[I 2021-06-03 07:49:28.259 ServerApp] nbclassic | extension was successfully linked.
[I 2021-06-03 07:49:28.259 ServerApp] nbdime | extension was successfully linked.
[D 2021-06-03 07:49:28.260 ServerApp] Config changed:
[D 2021-06-03 07:49:28.260 ServerApp] {'ExtensionApp': {'log_level': 'DEBUG'}, 'NotebookApp': {}, 'ServerApp': {'log_level': 'DEBUG', 'jpserver_extensions': <traitlets.config.loader.LazyConfigValue object at 0x7fa319b38a58>}}
[D 2021-06-03 07:49:28.260 ServerApp] Raising open file limit: soft 256->4096; hard 9223372036854775807->9223372036854775807
[I 2021-06-03 07:49:28.296 ServerApp] jupyter_server_mathjax | extension was successfully loaded.
[I 2021-06-03 07:49:28.297 LabApp] JupyterLab extension loaded from /Users/passer/conda/envs/python3.6.7/lib/python3.6/site-packages/jupyterlab
[I 2021-06-03 07:49:28.297 LabApp] JupyterLab application directory is /Users/passer/conda/envs/python3.6.7/share/jupyter/lab
[I 2021-06-03 07:49:28.299 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-06-03 07:49:28.299 ServerApp] Registered snapshot extension at URL path /snapshot
[I 2021-06-03 07:49:28.299 ServerApp] jupyterlab-snapshot | extension was successfully loaded.
[I 2021-06-03 07:49:28.301 ServerApp] jupyterlab_git | extension was successfully loaded.
[I 2021-06-03 07:49:28.304 ServerApp] nbclassic | extension was successfully loaded.
[D 2021-06-03 07:49:28.395 ServerApp] Using default logger
[D 2021-06-03 07:49:28.395 ServerApp] Using default logger
[D 2021-06-03 07:49:28.395 ServerApp] Using default logger
[D 2021-06-03 07:49:28.395 ServerApp] Using default logger
[D 2021-06-03 07:49:28.395 ServerApp] Using default logger
[I 2021-06-03 07:49:28.397 ServerApp] nbdime | extension was successfully loaded.
[I 2021-06-03 07:49:28.397 ServerApp] Serving notebooks from local directory: /Users/passer/jupyter
[I 2021-06-03 07:49:28.397 ServerApp] Jupyter Server 1.6.4 is running at:
[I 2021-06-03 07:49:28.397 ServerApp] http://localhost:8888/lab?token=9f0df823fb48cc8ed5aa5df5e41d625d16c7e2ed1846fac0
[I 2021-06-03 07:49:28.397 ServerApp]     http://127.0.0.1:8888/lab?token=9f0df823fb48cc8ed5aa5df5e41d625d16c7e2ed1846fac0
[I 2021-06-03 07:49:28.397 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2021-06-03 07:49:28.400 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///Users/passer/Library/Jupyter/runtime/jpserver-64759-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/lab?token=9f0df823fb48cc8ed5aa5df5e41d625d16c7e2ed1846fac0
        http://127.0.0.1:8888/lab?token=9f0df823fb48cc8ed5aa5df5e41d625d16c7e2ed1846fac0

I also tried the following configuration and found the same problem:

# e.g. in jupyter_server_config.py
from jupyter_server.services.kernels.kernelmanager import MappingKernelManager

c.ServerApp.kernel_manager_class = MappingKernelManager

@icankeep
Contributor Author

icankeep commented Jun 3, 2021

I found the steps to reproduce: just copy a notebook and paste it.

(screenshots attached)

@icankeep
Contributor Author

icankeep commented Jun 3, 2021

Reproduction GIF: (animation attached)

@icankeep
Contributor Author

icankeep commented Jun 3, 2021

(second reproduction GIF attached)

@fcollonval
Member

fcollonval commented Jun 3, 2021

This is a Python issue: https://bugs.python.org/issue24882 - you need to upgrade your Python. But I'm not sure this is the reason for the lag you are seeing. I would say that 20+ kernels eat a lot of memory, especially if the associated notebooks are large.

It has been corrected in 3.8 (python/cpython#6375), but I don't know if it was backported.
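For context, a minimal sketch of the behavior bpo-24882 describes: even though the tasks below run strictly one at a time (so an idle worker is always available), the executor keeps adding worker threads on Python 3.6/3.7:

import concurrent.futures
import threading
import time

pool = concurrent.futures.ThreadPoolExecutor()  # default max_workers

for i in range(10):
    # Submit one task and wait for it, so at most one worker is ever busy.
    pool.submit(time.sleep, 0.01).result()
    workers = [t for t in threading.enumerate()
               if t.name.startswith("ThreadPoolExecutor")]
    print("after task %d: %d worker thread(s)" % (i, len(workers)))

# On 3.6/3.7 the worker count typically climbs toward max_workers;
# on 3.8+ (python/cpython#6375) it stays at 1.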

@kevin-bates
Member

Thank you @fcollonval - this is beginning to make more sense. I should also point out that the RHEL scenario I ran in #530 (comment) was on Python 3.8, and it showed no growth in the thread pool.

@icankeep - are you in a position in which you could try this out on Python 3.8?

I think the ThreadPoolExecutor "leak" is a bit of a red herring. I'm not saying the lag introduced when multiple kernels are running (but idle) doesn't exist, but I'm not seeing evidence of it in this issue or in my own experience. Where I've seen substantial lags in the past is when attempting to start many long-starting kernels simultaneously (typically on an Enterprise Gateway server), since, prior to AsyncKernelManagers, the start_kernel() method was blocking and caused other start requests to lag (and build up). But you're saying the lag isn't during kernel startup, nor are the kernels active (and you're just using local, quick-starting kernels), so I'm at a loss here.
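As an aside, for anyone who wants to make the kernel manager choice explicit while testing, something like the following in jupyter_server_config.py should do it (a sketch; AsyncMappingKernelManager is already the default since jupyter_server 1.3.0):

# jupyter_server_config.py -- pin the async kernel manager explicitly so that
# slow kernel starts don't block the event loop.
c.ServerApp.kernel_manager_class = (
    "jupyter_server.services.kernels.kernelmanager.AsyncMappingKernelManager"
)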

@icankeep
Contributor Author

icankeep commented Jun 3, 2021

Where I've seen substantial lags in the past is when attempting to start long-starting kernels simultaneously (typically on an Enterprise Gateway server) since

Our service is based on JupyterHub; each user's server runs continuously in a Docker container for a long time. Users' servers sometimes returned 502s. When I entered a user's container to view the call stack of the jupyter process, I found a lot of (300+) threads. I guessed it might be related to this issue, so I commented here.

you're just using local quick-starting kernels

I use the local setup just to reduce differences in the environment, so that you can reproduce it faster.

I don't know if this thread-pool bug is the cause of this issue. I'll upgrade the Python version to 3.8 in our Docker image and give feedback here. Forgive me for not being good at English; there may be some problems with my expression.

@kevin-bates
Member

Thanks for the update and giving Python 3.8 a try - that is greatly appreciated.

Forgive me for not being good at English, there may be some problems with the expression

No issues there! Terminology is difficult when dealing with software in general.

Our service is based on JupyterHub; each user's server runs continuously in a Docker container for a long time. Users' servers sometimes returned 502s. When I entered a user's container to view the call stack of the jupyter process, I found a lot of (300+) threads. I guessed it might be related to this issue, so I commented here.

I see. Yeah, if the users are living in the same container, I suppose the thread pool (given this Python issue) could grow. But even if the threads are all idle, I wouldn't expect significant performance degradation (at least until the pool is exhausted).

@icankeep
Contributor Author

icankeep commented Jun 7, 2021

@kevin-bates Upgrading the Python version solves the problem of extra threads being created, but it turned out not to be related to this issue.
Maybe it's just because we use a subclass of MappingKernelManager. I will investigate further when I'm less busy and reopen this issue if needed.

@icankeep icankeep closed this as completed Jun 7, 2021
@icankeep
Contributor Author

related issue: jtpio/jupyterlab-system-monitor#87
