Enable embedding caching on all vectorizers #320
Conversation
This looks great! Nice culmination of a lot of work. Had one non-blocking suggestion to consider, totally optional. 👍
```python
try:
    # Efficient batch cache lookup
    cache_results = await self.cache.amget(texts=texts, model_name=self.model)
```
The whole amget / efficiency aspect of this work is nice. 🔥
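The batch flow the reviewer is praising can be sketched end to end. Everything below is a hypothetical stand-in for the real `EmbeddingsCache`: the in-memory store, the `amset` helper, and `embed_many` are illustrative, keeping only the `amget(texts=..., model_name=...)` shape visible in the diff.

```python
import asyncio
from typing import Dict, List, Optional


class InMemoryEmbeddingsCache:
    """Minimal stand-in for an embeddings cache with a batch amget interface."""

    def __init__(self) -> None:
        self._store: Dict[str, List[float]] = {}

    def _key(self, text: str, model_name: str) -> str:
        return f"{model_name}:{text}"

    async def amget(
        self, texts: List[str], model_name: str
    ) -> List[Optional[List[float]]]:
        # One round trip for the whole batch; None marks a cache miss.
        return [self._store.get(self._key(t, model_name)) for t in texts]

    async def amset(self, items: Dict[str, List[float]], model_name: str) -> None:
        for text, vector in items.items():
            self._store[self._key(text, model_name)] = vector


async def embed_many(
    texts: List[str], cache: InMemoryEmbeddingsCache, model: str
) -> List[List[float]]:
    # Efficient batch cache lookup, then embed only the misses.
    cache_results = await cache.amget(texts=texts, model_name=model)
    misses = [t for t, hit in zip(texts, cache_results) if hit is None]
    # Toy embedding; a real vectorizer would call the provider's batch API here.
    fresh = {t: [float(len(t))] for t in misses}
    await cache.amset(fresh, model_name=model)
    return [hit if hit is not None else fresh[t] for t, hit in zip(texts, cache_results)]
```

The point of `amget` over per-text `get` calls is a single round trip to the cache backend per batch, so cache hits cost O(1) network trips regardless of batch size.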
```diff
 model: str
 dtype: str = "float32"
-dims: Optional[int] = None
+dims: Annotated[Optional[int], Field(strict=True, gt=0)] = None
 cache: Optional[EmbeddingsCache] = Field(default=None)
```
Passing a cache object in to get caching feels elegant.
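A standalone sketch of what the `dims` change above buys, assuming Pydantic v2: `strict=True` rejects type coercion (a `"384"` string no longer slips through) and `gt=0` rejects non-positive dimensions. The `VectorizerConfig` model name and example values are hypothetical.

```python
from typing import Annotated, Optional

from pydantic import BaseModel, Field, ValidationError


class VectorizerConfig(BaseModel):
    model: str
    dtype: str = "float32"
    # strict=True forbids coercion (e.g. the string "384"); gt=0 forbids 0 and negatives.
    dims: Annotated[Optional[int], Field(strict=True, gt=0)] = None


# A real int passes validation.
ok = VectorizerConfig(model="all-MiniLM-L6-v2", dims=384)

# A string is rejected in strict mode, even though lax mode would coerce it.
rejected = False
try:
    VectorizerConfig(model="all-MiniLM-L6-v2", dims="384")
except ValidationError:
    rejected = True
```

Without `strict=True`, Pydantic's default lax mode would silently coerce `"384"` to `384`, which is exactly the kind of surprise this annotation closes off.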
```python
async def _aembed(self, text: str, **kwargs) -> List[float]:
    """Asynchronously generate a vector embedding for a single text.

    Note: This implementation falls back to the synchronous version as
```
This could be worth logging when these methods are called.
done!
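For illustration, the sync fallback with the requested log line might look like the sketch below. `FallbackVectorizer` and its trivial `_embed` are hypothetical; only the fall-back-and-log pattern reflects the discussion above.

```python
import asyncio
import logging
from typing import List

logger = logging.getLogger(__name__)


class FallbackVectorizer:
    """Sketch: async embed that logs before delegating to the sync path."""

    def _embed(self, text: str) -> List[float]:
        # Hypothetical sync embedding; a real vectorizer calls its client here.
        return [float(len(text))]

    async def _aembed(self, text: str, **kwargs) -> List[float]:
        # Make the silent fallback visible to operators, per the review comment.
        logger.warning(
            "%s has no native async client; falling back to the sync embed path",
            type(self).__name__,
        )
        return self._embed(text)
```

Logging at `warning` (or `debug`, depending on how noisy the call site is) makes it obvious in production traces that the "async" call is not actually overlapping I/O.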
Adds support to the `BaseVectorizer` class to have an optional `EmbeddingsCache` attached.

- Refactored the subclass vectorizers to implement private embed methods and let the base class handle the cache wrapper logic.
- Fixed some circular imports.
- Fixed async client handling in the cache subclasses (caught during testing).
- Handled some typing checks and pydantic details related to private attrs and custom attrs.

TODO in a separate PR:

- Add embeddings caching to our testing suite (CI/CD speed-up?)
- Add embeddings caching to our SemanticRouter
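The "base class handles the cache wrapper logic" design in the description can be sketched as follows. `SimpleCache`, `LengthVectorizer`, and the method names on the cache are hypothetical stand-ins; only the division of labor (public `embed` in the base class, private `_embed` in subclasses) mirrors the PR.

```python
from typing import Dict, List, Optional


class SimpleCache:
    """Hypothetical stand-in for EmbeddingsCache, keyed on (model, text)."""

    def __init__(self) -> None:
        self._store: Dict[str, List[float]] = {}

    def get(self, text: str, model_name: str) -> Optional[List[float]]:
        return self._store.get(f"{model_name}:{text}")

    def set(self, text: str, model_name: str, vector: List[float]) -> None:
        self._store[f"{model_name}:{text}"] = vector


class BaseVectorizer:
    def __init__(self, model: str, cache: Optional[SimpleCache] = None) -> None:
        self.model = model
        self.cache = cache

    def _embed(self, text: str) -> List[float]:
        raise NotImplementedError  # subclasses implement the provider call

    def embed(self, text: str) -> List[float]:
        # The base class owns check-cache / compute / store-in-cache;
        # subclasses only implement the private _embed.
        if self.cache is not None:
            hit = self.cache.get(text, model_name=self.model)
            if hit is not None:
                return hit
        vector = self._embed(text)
        if self.cache is not None:
            self.cache.set(text, model_name=self.model, vector=vector)
        return vector


class LengthVectorizer(BaseVectorizer):
    """Toy subclass: embeds a text as its character count."""

    def _embed(self, text: str) -> List[float]:
        return [float(len(text))]
```

Centralizing the wrapper means every vectorizer subclass gets caching for free, and a `cache=None` default keeps the behavior opt-in.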