backend/app/services/embedding_service/service.py: 11 additions & 2 deletions
@@ -1,3 +1,4 @@
+import asyncio
 import logging
 import config
 from typing import List, Dict, Any, Optional
@@ -66,6 +67,10 @@ def llm(self) -> ChatGoogleGenerativeAI:
             raise
         return self._llm
 
+
+    def _encode_sync(self, *args, **kwargs):
+        return self.model.encode(*args, **kwargs)
+
     async def get_embedding(self, text: str) -> List[float]:
         """Generate embedding for a single text input"""
         try:
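A note on the pattern this hunk introduces: `SentenceTransformer.encode` is a synchronous, compute-bound call, so running it directly inside an `async def` stalls the event loop for the entire encode. Routing it through `asyncio.to_thread` hands the call to a worker thread and lets other coroutines keep running. Below is a minimal, self-contained sketch of that pattern; `blocking_encode` and `heartbeat` are hypothetical stand-ins so the snippet runs without sentence-transformers installed.

```python
import asyncio
import time

def blocking_encode(texts):
    # Hypothetical stand-in for SentenceTransformer.encode: synchronous and slow.
    time.sleep(0.5)
    return [[0.0, 0.0, 0.0] for _ in texts]

async def heartbeat():
    # Keeps ticking while the encode runs, showing the loop is not blocked.
    for _ in range(4):
        print("event loop still responsive")
        await asyncio.sleep(0.1)

async def main():
    embeddings, _ = await asyncio.gather(
        asyncio.to_thread(blocking_encode, ["hello", "world"]),  # offloaded to a thread
        heartbeat(),
    )
    print(embeddings)

asyncio.run(main())
```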
@@ -74,12 +79,14 @@ async def get_embedding(self, text: str) -> List[float]:
                 text = [text]
 
             # Generate embeddings
-            embeddings = self.model.encode(
+            embeddings = await asyncio.to_thread(
+                self._encode_sync,
                 text,
                 convert_to_tensor=True,
                 show_progress_bar=False
             )
 
+
             # Convert to standard Python list and return
             embedding_list = embeddings[0].cpu().tolist()
             logger.debug(f"Generated embedding with dimension: {len(embedding_list)}")
@@ -92,13 +99,15 @@ async def get_embeddings(self, texts: List[str]) -> List[List[float]]:
         """Generate embeddings for multiple text inputs in batches"""
         try:
             # Generate embeddings
-            embeddings = self.model.encode(
+            embeddings = await asyncio.to_thread(
+                self._encode_sync,
                 texts,
                 convert_to_tensor=True,
                 batch_size=MAX_BATCH_SIZE,
                 show_progress_bar=len(texts) > 10
             )
 
+
             # Convert to standard Python list
             embedding_list = embeddings.cpu().tolist()
             logger.info(f"Generated {len(embedding_list)} embeddings")