
jupyter_ai import time is too slow #1115

Open
krassowski opened this issue Nov 20, 2024 · 14 comments
Assignees
Labels
bug Something isn't working

Comments

@krassowski
Member

Description

jupyter-ai is the slowest extension of the ones that I ship in various deployments. Even worse, it blocks jupyter-server startup. It would be lovely if we could reduce the startup time.

Reproduce

  1. Install jupyter-ai
  2. Run jupyter server extension list to get detailed timings; on my laptop it reports:
Package jupyter_ai took 2.9546s to import

compared to all other extensions, which are ready almost instantly:

 Package jupyterlab took 0.0398s to import
 Package nbclassic took 0.0014s to import
 Package jupyter_server_fileid took 0.0000s to import
 Package jupyter_server_ydoc took 0.0655s to import
 Package jupyter_lsp took 0.0290s to import
  3. Run jupyter lab or jupyter server
  4. See that jupyter-ai blocks jupyter-server startup for 2-3 seconds
  5. See that it prints lots of messages about dependencies not being installed, even when other providers are blocked (so it needlessly attempts some imports)

Expected behavior

  1. An async startup sequence is used
  2. No imports are performed for blocked providers

Context

  • Operating System and version: Ubuntu
  • JupyterLab version: 4.3.1
@dlqqq
Member

dlqqq commented Nov 25, 2024

@krassowski Thank you for opening an issue about this! I've also noticed that the startup time has slowly worsened over the last year, and I wish I had the time to fix that.

Could you / another contributor explore adding debug/info logs that trace the time it takes for the extension to initialize? For example, it'd be great to see what % of the time is spent loading the entry points & entry point groups. We need that data to determine where the slowdown is coming from, and to determine the best fix available.

My ability to contribute will be limited, as I need to focus full-time on Jupyter AI v3 very soon. Hope you understand, and I sincerely appreciate your patience as you deal with Jupyter AI issues. 🙏

@Darshan808
Member

I investigated this issue and found that the startup delay in Jupyter AI is largely due to heavy package imports. For instance, importing several components from langchain.prompts takes around 600ms:

import time

start_time = time.time()
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    PromptTemplate,
    SystemMessagePromptTemplate,
)
end_time = time.time()
print(f"langchain.prompts import took {end_time - start_time:.3f}s")

This delay, multiplied by similar imports from other heavy packages, significantly slows down the startup.

A recent PR added an async start hook to the Jupyter server extension API (jupyter-server/jupyter_server#1417), which I attempted to leverage. For example:

Retriever = None

async def _start_jupyter_server_extension(self, serverapp):
    global Retriever
    # Performing the heavy import asynchronously
    from jupyter_ai.chat_handlers.learn import Retriever
    self.Retriever = Retriever

However, given the number of heavy imports, applying this approach to each one individually may not be feasible. I'm currently exploring better options. Any suggestions or alternative approaches would be greatly appreciated.
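One option worth evaluating is the lazy-loading recipe from the stdlib importlib docs, which defers executing a module's body until its attributes are first accessed. This is a sketch, not existing jupyter-ai code, and lazy_import is a helper name I made up:

```python
import importlib.util
import sys

def lazy_import(name: str):
    """Return `name` as a lazily-loaded module: its body only executes
    on first attribute access (recipe from the importlib docs)."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

# prompts = lazy_import("langchain.prompts")  # near-instant
# prompts.PromptTemplate                      # the real import happens here
```

This keeps the import statements where they are today, while moving the actual cost to first use.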

CC: @krassowski @dlqqq

@krassowski
Member Author

I think the "heavy" imports should mostly happen in entry points by definition, including any specific langchain vendor extensions for LLMs, context providers, and all slash command handlers.

Currently, entry points are already used for context providers:

def _init_context_providers(self):
    eps = entry_points()
    context_providers_eps = eps.select(group="jupyter_ai.context_providers")

and models:

# Fetch LM & EM providers
self.settings["lm_providers"] = get_lm_providers(
    log=self.log, restrictions=restrictions
)
self.settings["em_providers"] = get_em_providers(
    log=self.log, restrictions=restrictions
)
and custom slash commands:

slash_command_pattern = r"^[a-zA-Z0-9_]+$"
for chat_handler_ep in chat_handler_eps:
    try:
        chat_handler = chat_handler_ep.load()
    except Exception as err:
        self.log.error(
            f"Unable to load chat handler class from entry point `{chat_handler_ep.name}`: "
            + f"Unexpected {err=}, {type(err)=}"
        )
        continue

If we also moved the default slash commands, which are currently hard-coded:

default_chat_handler = DefaultChatHandler(**chat_handler_kwargs)
generate_chat_handler = GenerateChatHandler(
    **chat_handler_kwargs,
    log_dir=self.error_logs_dir,
)
learn_chat_handler = LearnChatHandler(**chat_handler_kwargs)
retriever = Retriever(learn_chat_handler=learn_chat_handler)
ask_chat_handler = AskChatHandler(**chat_handler_kwargs, retriever=retriever)
chat_handlers["default"] = default_chat_handler
chat_handlers["/ask"] = ask_chat_handler
chat_handlers["/generate"] = generate_chat_handler
chat_handlers["/learn"] = learn_chat_handler

to be loaded from entry points, we would not only gain finer-grained control over import timing, we would also solve #507 (which was a hurdle in customizing jupyter-ai).

I suspect moving entry point loading to happen async should remove >80% of the problematic blocking imports. Then we could add a test to verify on CI that importing and initializing the extension (but not awaiting the async method) takes <1s, and move the remaining imports to happen only when needed.
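Such a CI check could be as simple as timing a cold import in a fresh interpreter. A sketch (the 1s budget follows the suggestion above; the test name is illustrative):

```python
import subprocess
import sys
import time

def cold_import_time(module: str) -> float:
    """Time importing `module` in a fresh interpreter (no warm sys.modules)."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start

def test_jupyter_ai_import_budget():
    # interpreter startup is included in the measurement,
    # so the budget is deliberately generous
    assert cold_import_time("jupyter_ai") < 1.0
```

Running the import in a subprocess avoids measuring a warm cache from other tests in the same session.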

@Darshan808 Darshan808 self-assigned this Feb 17, 2025
@dlqqq
Member

dlqqq commented Feb 17, 2025

@Darshan808 Thank you for doing the research & profiling the performance impact. This is super helpful!

Regarding your suggested solution: I'm not sure that it would improve performance. Python imports are inherently synchronous, i.e. they block the Python process & event loop. As far as I can tell, doing an import inside of an async function is the same as doing it synchronously.

It may be better to investigate if we can improve performance by changing our import paths. We import StrOutputParser via the declaration:

from langchain_core.output_parsers import StrOutputParser

However, take a look at the langchain_core.output_parsers module definition (click to expand):

langchain_core/output_parsers/__init__.py
"""**OutputParser** classes parse the output of an LLM call.

**Class hierarchy:**

.. code-block::

    BaseLLMOutputParser --> BaseOutputParser --> <name>OutputParser  # ListOutputParser, PydanticOutputParser

**Main helpers:**

.. code-block::

    Serializable, Generation, PromptValue
"""  # noqa: E501

from langchain_core.output_parsers.base import (
    BaseGenerationOutputParser,
    BaseLLMOutputParser,
    BaseOutputParser,
)
from langchain_core.output_parsers.json import JsonOutputParser, SimpleJsonOutputParser
from langchain_core.output_parsers.list import (
    CommaSeparatedListOutputParser,
    ListOutputParser,
    MarkdownListOutputParser,
    NumberedListOutputParser,
)
from langchain_core.output_parsers.openai_tools import (
    JsonOutputKeyToolsParser,
    JsonOutputToolsParser,
    PydanticToolsParser,
)
from langchain_core.output_parsers.pydantic import PydanticOutputParser
from langchain_core.output_parsers.string import StrOutputParser
from langchain_core.output_parsers.transform import (
    BaseCumulativeTransformOutputParser,
    BaseTransformOutputParser,
)
from langchain_core.output_parsers.xml import XMLOutputParser

__all__ = [
    "BaseLLMOutputParser",
    "BaseGenerationOutputParser",
    "BaseOutputParser",
    "ListOutputParser",
    "CommaSeparatedListOutputParser",
    "NumberedListOutputParser",
    "MarkdownListOutputParser",
    "StrOutputParser",
    "BaseTransformOutputParser",
    "BaseCumulativeTransformOutputParser",
    "SimpleJsonOutputParser",
    "XMLOutputParser",
    "JsonOutputParser",
    "PydanticOutputParser",
    "JsonOutputToolsParser",
    "JsonOutputKeyToolsParser",
    "PydanticToolsParser",
]

Whenever we import from langchain_core.output_parsers, it seems that we are importing every output parser just to get the StrOutputParser we want. I believe this may be the main reason that imports are taking so long.

I would love to see a proof-of-concept where we import smaller modules with longer names to improve performance. For example, it may be better to make changes like the one below throughout the codebase, and see if it improves performance:

- from langchain_core.output_parsers import StrOutputParser
+ from langchain_core.output_parsers.string import StrOutputParser

We can enforce import rules using importlinter, which I had implemented previously in #546.

@dlqqq
Member

dlqqq commented Feb 17, 2025

@krassowski Thanks for the suggestions! Yes I agree, loading entry points asynchronously would definitely bring performance benefits.

Right now, I think we're loading all entry points in giant code blocks in the main extension.py file. It seems like there's an opportunity to migrate this responsibility to our ConfigManager class, given that it already provides the API for getting the list of model providers. It seems good to also allow developers to wait for the ConfigManager to load all model providers, e.g. via await config_manager.ready.
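A `ready` awaitable could be sketched as below. This is hypothetical: `ConfigManager`'s real interface may differ, and `load_providers` merely stands in for the entry-point loading:

```python
import asyncio

class ConfigManager:
    """Sketch of a config manager with an awaitable `ready` (illustrative)."""

    def __init__(self):
        self._ready_event = asyncio.Event()
        self.lm_providers = None

    @property
    def ready(self):
        # an awaitable that resolves once providers are loaded
        return self._ready_event.wait()

    async def load_providers(self):
        # stand-in for loading LM/EM provider entry points
        await asyncio.sleep(0)
        self.lm_providers = {"openai": object()}
        self._ready_event.set()

async def handler_example():
    cm = ConfigManager()
    asyncio.get_running_loop().create_task(cm.load_providers())
    await cm.ready  # e.g. at the top of a request handler
    return sorted(cm.lm_providers)
```

Handlers that need providers would simply `await config_manager.ready` before touching them, so early requests wait instead of failing.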

@krassowski
Member Author

I'm not sure that it would improve performance. Python imports are inherently synchronous, i.e. they block the Python process & event loop. As far as I can tell, doing an import inside of an async function is the same as doing it synchronously

I do not understand what you mean.

[screenshot]

@krassowski
Member Author

In case there is some misunderstanding: the issue does not aim to magically make jupyter-ai ready faster; it merely aims to delay the loading of its heavy dependencies until later.

@krassowski
Member Author

I would love to see a proof-of-concept where we import smaller modules with longer names to improve performance. For example, it may be better to make changes like the one below throughout the codebase, and see if it improves performance:

- from langchain_core.output_parsers import StrOutputParser
+ from langchain_core.output_parsers.string import StrOutputParser

Nope, no significant difference:

[screenshot]

[screenshot]

@dlqqq
Member

dlqqq commented Feb 17, 2025

@krassowski The example on the left is much faster because it's not doing anything besides creating a coroutine object. The body of import_later() isn't run until await coroutine or loop.create_task(coroutine) is called.

Suppose we do move our imports to an async function (i.e. a coroutine). As soon as Jupyter Server awaits that coroutine, the Python process & server event loop are blocked until the imports are complete. This is because the coroutine would not be doing anything actually asynchronous; each import statement is a synchronous call that occupies the Python process. This ends up providing basically the same performance as the existing behavior.

This will happen even if loop.create_task() is used instead of await. As soon as await or loop.create_task() is called with the coroutine object, the server will freeze until the import is done, because the import coroutine never yields; it is not doing anything actually asynchronous.
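The blocking behavior is easy to demonstrate: a synchronous call inside a coroutine (here time.sleep stands in for a heavy import) stalls every other task on the loop. A minimal sketch:

```python
import asyncio
import time

async def heavy_import():
    time.sleep(0.2)  # stands in for a synchronous `import big_package`

async def heartbeat(ticks):
    # a task that wants to run every 10ms, like a server handling requests
    for _ in range(5):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)

async def main():
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)   # let the heartbeat record its first tick
    await heavy_import()     # the loop freezes here for the full 0.2s
    await hb
    # the largest gap between ticks spans the whole "import"
    return max(b - a for a, b in zip(ticks, ticks[1:]))
```

With the sleep in place, asyncio.run(main()) reports a gap of roughly the full 0.2s rather than the 10ms interval the heartbeat asked for.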

In case there is some misunderstanding: the issue does not aim to magically make jupyter-ai ready faster; it merely aims to delay the loading of its heavy dependencies until later.

I agree that deferring our imports & entry point loading would improve the UX! I'm not trying to shut down that idea; I'm just stating that I have good reasons why the suggested approach might not work. Let me see if I can produce an example for this.

@dlqqq
Member

dlqqq commented Feb 17, 2025

Actually I just realized that jupyter-server/jupyter_server#1417 needs to be released first, before I can provide an example demonstrating my point.

@dlqqq
Member

dlqqq commented Feb 17, 2025

We should be mindful of the difference between concurrency (interleaving multiple tasks on the same thread of execution) and parallelism (running multiple tasks on separate threads of execution). Async/await only provides concurrency, so it's not obvious how it can mitigate the performance impact of import statements, which are synchronous (i.e. import statements are never interrupted).

Async/await will be useful if we find a way to asynchronously import Python modules / load entry points. However, I don't know if that is even possible in Python. I'll do some more research later.

@krassowski
Member Author

Yes, I see the problem, which is made worse by the async entry point for server extensions being awaited in the very next cycle of the IO loop. This behaves differently from what is implemented in jupyter-lsp.

This is super promising: https://discuss.python.org/t/async-imports-to-reduce-startup-times/69732/25

transformers has their own _LazyModule implementation that they use to reduce import times until attributes are accessed. [...] They have been using this implementation for about 4 years. When they first began using this solution, it took importing transformers from 2.3s to 239ms.

@krassowski
Member Author

krassowski commented Feb 17, 2025

Here is the thing: we do not actually care about a single import that much, right? As long as we can call await asyncio.sleep(epsilon) in between blocking tasks, we unblock the server to handle requests from the browser. I verified that this works with the new server extension method (with a sufficiently large epsilon; otherwise there is not enough time for handling the requests). Entry points are loaded in loops, which seem a natural place to introduce this await on epsilon. The epsilon could be configurable. What do you think? Nah, wrong again: it will not work unless we put it behind a future.
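Putting the imports "behind a future" could mean offloading them to the default thread pool, so the event loop keeps serving requests while the import runs. A sketch (the function name is made up; note that the import lock and the GIL mean CPU-bound module bodies can still cause some stalls):

```python
import asyncio
import importlib

async def import_off_loop(names):
    """Import each module in a worker thread, behind a Future that the
    event loop awaits without being blocked (sketch)."""
    loop = asyncio.get_running_loop()
    modules = {}
    for name in names:
        modules[name] = await loop.run_in_executor(
            None, importlib.import_module, name
        )
    return modules

# modules = asyncio.run(import_off_loop(["langchain.prompts"]))
```

Because run_in_executor returns a Future, the loop genuinely yields while the worker thread executes the module body, unlike a bare import inside a coroutine.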

@dlqqq dlqqq changed the title Slow and blocking startup of jupyter-ai server extension jupyter_ai import time is too slow Feb 19, 2025
@dlqqq dlqqq added this to Jupyter AI Feb 19, 2025
@dlqqq dlqqq moved this to Active in Jupyter AI Feb 19, 2025
@dlqqq
Member

dlqqq commented Feb 19, 2025

I've renamed the issue to focus on the speed of importing the jupyter_ai package. This isn't the only performance concern, but others can be tracked in separate issues. I've added this to the Jupyter AI project board, since this would be a great issue to tackle before JAI v3.0.0 official.
