
jupyter_ai import time is too slow #1115

Open
krassowski opened this issue Nov 20, 2024 · 14 comments
Assignees
Labels
bug Something isn't working

Comments

@krassowski
Member

Description

jupyter-ai is the slowest extension of the ones that I ship in various deployments. Even worse, it blocks jupyter-server startup. It would be lovely if we could reduce the startup time.

Reproduce

  1. Install jupyter-ai
  2. Run jupyter server extension list to get detailed timings; on my laptop it reports:
Package jupyter_ai took 2.9546s to import

compared to all other extensions, which are ready almost instantly:

 Package jupyterlab took 0.0398s to import
 Package nbclassic took 0.0014s to import
 Package jupyter_server_fileid took 0.0000s to import
 Package jupyter_server_ydoc took 0.0655s to import
 Package jupyter_lsp took 0.0290s to import
  3. Run jupyter lab or jupyter server
  4. See that jupyter-ai blocks jupyter-server startup for 2-3 seconds
  5. See that it prints lots of messages about dependencies not being installed, even when other providers are blocked (so it needlessly attempts some imports)

Expected behavior

  1. An async startup sequence is used
  2. No imports are performed for blocked providers

Context

  • Operating System and version: Ubuntu
  • JupyterLab version: 4.3.1
@dlqqq
Member

dlqqq commented Nov 25, 2024

@krassowski Thank you for opening an issue about this! I've also noticed that the startup time has slowly worsened over the last year, and I wish I had the time to fix that.

Could you / another contributor explore adding debug/info logs that trace the time it takes for the extension to initialize? For example, it'd be great to see what % of the time is spent loading the entry points & entry point groups. We need that data to determine where the slowdown is coming from, and to determine the best fix available.

My ability to contribute will be limited, as I need to focus full-time on Jupyter AI v3 very soon. Hope you understand, and I sincerely appreciate your patience as you deal with Jupyter AI issues. 🙏

@Darshan808
Member

I investigated this issue and found that the startup delay in Jupyter AI is largely due to heavy package imports. For instance, importing several components from langchain.prompts takes around 600ms:

import time

start_time = time.time()
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    PromptTemplate,
    SystemMessagePromptTemplate,
)
end_time = time.time()
print(f"langchain.prompts import took {end_time - start_time:.3f}s")

This delay, multiplied by similar imports from other heavy packages, significantly slows down the startup.

A recent PR added an async start hook to the Jupyter server extension API (jupyter-server/jupyter_server#1417), which I attempted to leverage. For example:

Retriever = None

async def _start_jupyter_server_extension(self, serverapp):
    global Retriever
    # Performing the heavy import asynchronously
    from jupyter_ai.chat_handlers.learn import Retriever
    self.Retriever = Retriever

However, given the number of heavy imports, applying this approach to each one individually may not be feasible. I'm currently exploring better options. Any suggestions or alternative approaches would be greatly appreciated.
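One option worth evaluating is the lazy-loading recipe from the stdlib importlib docs, which defers executing a module's body until its attributes are first accessed. This is a sketch, not existing jupyter-ai code, and lazy_import is a helper name I made up:

```python
import importlib.util
import sys

def lazy_import(name: str):
    """Return `name` as a lazily-loaded module: its body only executes
    on first attribute access (recipe from the importlib docs)."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

# prompts = lazy_import("langchain.prompts")  # near-instant
# prompts.PromptTemplate                      # the real import happens here
```

This keeps the import statements where they are today, while moving the actual cost to first use.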

CC: @krassowski @dlqqq

@krassowski
Member Author

I think the "heavy" imports should mostly happen in entry points by definition, including any specific langchain vendor extensions for LLMs, context providers, and all slash command handlers.

Currently, entry points are already used for context providers:

def _init_context_providers(self):
    eps = entry_points()
    context_providers_eps = eps.select(group="jupyter_ai.context_providers")

and models:

# Fetch LM & EM providers
self.settings["lm_providers"] = get_lm_providers(
    log=self.log, restrictions=restrictions
)
self.settings["em_providers"] = get_em_providers(
    log=self.log, restrictions=restrictions
)
and custom slash commands:

slash_command_pattern = r"^[a-zA-Z0-9_]+$"
for chat_handler_ep in chat_handler_eps:
    try:
        chat_handler = chat_handler_ep.load()
    except Exception as err:
        self.log.error(
            f"Unable to load chat handler class from entry point `{chat_handler_ep.name}`: "
            + f"Unexpected {err=}, {type(err)=}"
        )
        continue

If we also moved the default slash commands, which are currently hard-coded:

default_chat_handler = DefaultChatHandler(**chat_handler_kwargs)
generate_chat_handler = GenerateChatHandler(
    **chat_handler_kwargs,
    log_dir=self.error_logs_dir,
)
learn_chat_handler = LearnChatHandler(**chat_handler_kwargs)
retriever = Retriever(learn_chat_handler=learn_chat_handler)
ask_chat_handler = AskChatHandler(**chat_handler_kwargs, retriever=retriever)
chat_handlers["default"] = default_chat_handler
chat_handlers["/ask"] = ask_chat_handler
chat_handlers["/generate"] = generate_chat_handler
chat_handlers["/learn"] = learn_chat_handler

to be loaded from entry points, we would not only gain finer-grained control over import timing, we would also solve #507 (which was a hurdle in customizing jupyter-ai).

I suspect moving entry point loading to happen async should remove >80% of the problematic blocking imports. Then we could add a test to verify on CI that importing and initializing the extension (but not awaiting the async method) takes <1s, and move the remaining imports to happen only when needed.
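Such a CI check could be as simple as timing a cold import in a fresh interpreter. A sketch (the 1s budget follows the suggestion above; the test name is illustrative):

```python
import subprocess
import sys
import time

def cold_import_time(module: str) -> float:
    """Time importing `module` in a fresh interpreter (no warm sys.modules)."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start

def test_jupyter_ai_import_budget():
    # interpreter startup is included in the measurement,
    # so the budget is deliberately generous
    assert cold_import_time("jupyter_ai") < 1.0
```

Running the import in a subprocess avoids measuring a warm cache from other tests in the same session.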

@Darshan808 Darshan808 self-assigned this Feb 17, 2025
@dlqqq
Member

dlqqq commented Feb 17, 2025

@Darshan808 Thank you for doing the research & profiling the performance impact. This is super helpful!

Regarding your suggested solution: I'm not sure that it would improve performance. Python imports are inherently synchronous, i.e. they block the Python process & event loop. As far as I can tell, doing an import inside of an async function is the same as doing it synchronously.

It may be better to investigate if we can improve performance by changing our import paths. We import StrOutputParser via the declaration:

from langchain_core.output_parsers import StrOutputParser

However, take a look at the langchain_core.output_parsers module definition (click to expand):

langchain_core/output_parsers/__init__.py
"""**OutputParser** classes parse the output of an LLM call.

**Class hierarchy:**

.. code-block::

    BaseLLMOutputParser --> BaseOutputParser --> <name>OutputParser  # ListOutputParser, PydanticOutputParser

**Main helpers:**

.. code-block::

    Serializable, Generation, PromptValue
"""  # noqa: E501

from langchain_core.output_parsers.base import (
    BaseGenerationOutputParser,
    BaseLLMOutputParser,
    BaseOutputParser,
)
from langchain_core.output_parsers.json import JsonOutputParser, SimpleJsonOutputParser
from langchain_core.output_parsers.list import (
    CommaSeparatedListOutputParser,
    ListOutputParser,
    MarkdownListOutputParser,
    NumberedListOutputParser,
)
from langchain_core.output_parsers.openai_tools import (
    JsonOutputKeyToolsParser,
    JsonOutputToolsParser,
    PydanticToolsParser,
)
from langchain_core.output_parsers.pydantic import PydanticOutputParser
from langchain_core.output_parsers.string import StrOutputParser
from langchain_core.output_parsers.transform import (
    BaseCumulativeTransformOutputParser,
    BaseTransformOutputParser,
)
from langchain_core.output_parsers.xml import XMLOutputParser

__all__ = [
    "BaseLLMOutputParser",
    "BaseGenerationOutputParser",
    "BaseOutputParser",
    "ListOutputParser",
    "CommaSeparatedListOutputParser",
    "NumberedListOutputParser",
    "MarkdownListOutputParser",
    "StrOutputParser",
    "BaseTransformOutputParser",
    "BaseCumulativeTransformOutputParser",
    "SimpleJsonOutputParser",
    "XMLOutputParser",
    "JsonOutputParser",
    "PydanticOutputParser",
    "JsonOutputToolsParser",
    "JsonOutputKeyToolsParser",
    "PydanticToolsParser",
]

Whenever we import from langchain_core.output_parsers, it seems that we are importing every output parser just to get the StrOutputParser we want. I believe this may be the main reason that imports are taking so long.

I would love to see a proof-of-concept where we import smaller modules with longer names to improve performance. For example, it may be better to make changes like the one below throughout the codebase, and see if it improves performance:

- from langchain_core.output_parsers import StrOutputParser
+ from langchain_core.output_parsers.string import StrOutputParser

We can enforce import rules using importlinter, which I had implemented previously in #546.

@dlqqq
Member

dlqqq commented Feb 17, 2025

@krassowski Thanks for the suggestions! Yes I agree, loading entry points asynchronously would definitely bring performance benefits.

Right now, I think we're loading all entry points in giant code blocks in the main extension.py file. It seems like there's an opportunity to migrate this responsibility to our ConfigManager class, given that it already provides the API for getting the list of model providers. It seems good to also allow developers to wait for the ConfigManager to load all model providers, e.g. via await config_manager.ready.
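A `ready` awaitable could be sketched as below. This is hypothetical: `ConfigManager`'s real interface may differ, and `load_providers` merely stands in for the entry-point loading:

```python
import asyncio

class ConfigManager:
    """Sketch of a config manager with an awaitable `ready` (illustrative)."""

    def __init__(self):
        self._ready_event = asyncio.Event()
        self.lm_providers = None

    @property
    def ready(self):
        # an awaitable that resolves once providers are loaded
        return self._ready_event.wait()

    async def load_providers(self):
        # stand-in for loading LM/EM provider entry points
        await asyncio.sleep(0)
        self.lm_providers = {"openai": object()}
        self._ready_event.set()

async def handler_example():
    cm = ConfigManager()
    asyncio.get_running_loop().create_task(cm.load_providers())
    await cm.ready  # e.g. at the top of a request handler
    return sorted(cm.lm_providers)
```

Handlers that need providers would simply `await config_manager.ready` before touching them, so early requests wait instead of failing.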

@krassowski
Member Author

I'm not sure that it would improve performance. Python imports are inherently synchronous, i.e. they block the Python process & event loop. As far as I can tell, doing an import inside of an async function is the same as doing it synchronously

I do not understand what you mean.

[screenshot]

@krassowski
Member Author

In case there is some misunderstanding: the issue does not aim to magically make jupyter-ai ready faster; it merely aims to delay the loading of its heavy dependencies until later.

@krassowski
Member Author

I would love to see a proof-of-concept where we import smaller modules with longer names to improve performance. For example, it may be better to make changes like the one below throughout the codebase, and see if it improves performance:

- from langchain_core.output_parsers import StrOutputParser
+ from langchain_core.output_parsers.string import StrOutputParser

Nope, no significant difference:

[screenshot]

[screenshot]

@dlqqq
Member

dlqqq commented Feb 17, 2025

@krassowski The example on the left is much faster because it's not doing anything besides creating a coroutine object. The body of import_later() isn't run until await coroutine or loop.create_task(coroutine) is called.

Suppose we do move our imports to an async function (i.e. a coroutine). As soon as Jupyter Server awaits that coroutine, the Python process & server event loop are blocked until the imports are complete. This is because the coroutine would not be doing anything actually asynchronous; each import statement is a synchronous call that occupies the Python process. This ends up providing basically the same performance as the existing behavior.

This will happen even if loop.create_task() is used instead of await. As soon as await or loop.create_task() is called with the coroutine object, the server will freeze until the import is done, because the import coroutine never yields; it is not doing anything actually asynchronous.
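The blocking behavior is easy to demonstrate: a synchronous call inside a coroutine (here time.sleep stands in for a heavy import) stalls every other task on the loop. A minimal sketch:

```python
import asyncio
import time

async def heavy_import():
    time.sleep(0.2)  # stands in for a synchronous `import big_package`

async def heartbeat(ticks):
    # a task that wants to run every 10ms, like a server handling requests
    for _ in range(5):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.01)

async def main():
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)   # let the heartbeat record its first tick
    await heavy_import()     # the loop freezes here for the full 0.2s
    await hb
    # the largest gap between ticks spans the whole "import"
    return max(b - a for a, b in zip(ticks, ticks[1:]))
```

With the sleep in place, asyncio.run(main()) reports a gap of roughly the full 0.2s rather than the 10ms interval the heartbeat asked for.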

In case there is some misunderstanding: the issue does not aim to magically make jupyter-ai ready faster; it merely aims to delay the loading of its heavy dependencies until later.

I agree that deferring our imports & entry point loading would improve the UX! I'm not trying to shut down that idea; I'm just stating that I have good reasons why the suggested approach might not work. Let me see if I can produce an example for this.

@dlqqq
Member

dlqqq commented Feb 17, 2025

Actually I just realized that jupyter-server/jupyter_server#1417 needs to be released first, before I can provide an example demonstrating my point.

@dlqqq
Member

dlqqq commented Feb 17, 2025

We should be mindful of the difference between concurrency (interleaving multiple tasks on the same thread of execution) and parallelism (running multiple tasks on separate threads of execution). Async/await only provides concurrency, so it's not obvious how it can mitigate the performance impact of import statements, which are synchronous (i.e. import statements are never interrupted).

Async/await will be useful if we find a way to asynchronously import Python modules / load entry points. However, I don't know if that is even possible in Python. I'll do some more research later.

@krassowski
Member Author

Yes, I see the problem, which is made worse by the async entry point for server extensions being awaited in the very next cycle of the IO loop. This behaves differently from what is implemented in jupyter-lsp.

This is super promising: https://discuss.python.org/t/async-imports-to-reduce-startup-times/69732/25

transformers has their own _LazyModule implementation that they use to reduce import times until attributes are accessed. [...] They have been using this implementation for about 4 years. When they first began using this solution, it took importing transformers from 2.3s to 239ms.

@krassowski
Member Author

krassowski commented Feb 17, 2025

Here is the thing: we do not actually care about a single import that much, right? As long as we can call await asyncio.sleep(epsilon) in between blocking tasks, we unblock the server to handle requests from the browser. I verified that this works with the new server extension method (with a sufficiently large epsilon; otherwise there is not enough time for handling the requests). Entry points are loaded in loops, which seem a natural place to introduce this await on epsilon. The epsilon could be configurable. What do you think? Nah, wrong again: it will not work unless we put it behind a future.
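Putting the imports "behind a future" could mean offloading them to the default thread pool, so the event loop keeps serving requests while the import runs. A sketch (the function name is made up; note that the import lock and the GIL mean CPU-bound module bodies can still cause some stalls):

```python
import asyncio
import importlib

async def import_off_loop(names):
    """Import each module in a worker thread, behind a Future that the
    event loop awaits without being blocked (sketch)."""
    loop = asyncio.get_running_loop()
    modules = {}
    for name in names:
        modules[name] = await loop.run_in_executor(
            None, importlib.import_module, name
        )
    return modules

# modules = asyncio.run(import_off_loop(["langchain.prompts"]))
```

Because run_in_executor returns a Future, the loop genuinely yields while the worker thread executes the module body, unlike a bare import inside a coroutine.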

@dlqqq dlqqq changed the title Slow and blocking startup of jupyter-ai server extension jupyter_ai import time is too slow Feb 19, 2025
@dlqqq dlqqq added this to Jupyter AI Feb 19, 2025
@dlqqq dlqqq moved this to Active in Jupyter AI Feb 19, 2025
@dlqqq
Member

dlqqq commented Feb 19, 2025

I've renamed the issue to focus on the speed of importing the jupyter_ai package. This isn't the only performance concern, but others can be tracked in separate issues. I've added this to the Jupyter AI project board, since this would be a great issue to tackle before JAI v3.0.0 official.
