community[minor]: add mongodb byte store #23876

Merged: 14 commits, Jul 19, 2024


pprados
Contributor

@pprados pprados commented Jul 4, 2024

The MongoDBStore can only store documents,
so it's not possible to use MongoDB as the cache for a CacheBackedEmbeddings.

With this new implementation, it's possible to use:

CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings=embeddings,
    document_embedding_cache=MongoDBByteStore(
        connection_string=db_uri,
        db_name=db_name,
        collection_name=collection_name,
    ),
)

and use MongoDB to cache the embeddings!
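To make the caching idea concrete, here is a minimal standard-library sketch of what a byte store enables. It is not the LangChain implementation: `InMemoryByteStore`, `CountingEmbedder`, and `CachedEmbedder` are hypothetical names, the dict-backed store only stands in for a MongoDB collection, and the JSON serialization and SHA-256 keys are assumptions chosen for illustration.

```python
import hashlib
import json
from typing import Dict, List, Optional, Sequence, Tuple


class InMemoryByteStore:
    """Hypothetical stand-in for a byte store such as MongoDBByteStore."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def mget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        # Missing keys come back as None, in the same order as `keys`.
        return [self._data.get(key) for key in keys]

    def mset(self, pairs: Sequence[Tuple[str, bytes]]) -> None:
        for key, value in pairs:
            self._data[key] = value


class CountingEmbedder:
    """Toy embedder (an assumption) that counts the texts it really embeds."""

    def __init__(self) -> None:
        self.calls = 0

    def embed_documents(self, texts: Sequence[str]) -> List[List[float]]:
        self.calls += len(texts)
        return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


class CachedEmbedder:
    """Looks vectors up in the byte store before calling the embedder."""

    def __init__(self, underlying: CountingEmbedder, store: InMemoryByteStore) -> None:
        self.underlying = underlying
        self.store = store

    @staticmethod
    def key_for(text: str) -> str:
        # Hash the text so the key has a fixed length (illustrative choice).
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed_documents(self, texts: Sequence[str]) -> List[List[float]]:
        keys = [self.key_for(t) for t in texts]
        blobs = self.store.mget(keys)
        missing = [i for i, blob in enumerate(blobs) if blob is None]
        if missing:
            # Only texts without a cached entry reach the underlying embedder.
            fresh = self.underlying.embed_documents([texts[i] for i in missing])
            for i, vector in zip(missing, fresh):
                blobs[i] = json.dumps(vector).encode("utf-8")
            self.store.mset([(keys[i], blobs[i]) for i in missing])
        return [json.loads(blob.decode("utf-8")) for blob in blobs]
```

Embedding the same text twice through `CachedEmbedder` should call the underlying embedder only once. The store only ever sees `bytes` values, which is why a document-only store cannot serve as this cache.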


@pprados pprados marked this pull request as ready for review July 4, 2024 14:47
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. community Related to langchain-community 🔌: mongo Primarily related to Mongo integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Jul 4, 2024
@eyurtsev eyurtsev changed the title common[small]: add mongodb byte store community[minor]: add mongodb byte store Jul 5, 2024
Collaborator

eyurtsev commented Jul 5, 2024

@pprados could you add an integration test using langchain_standard_tests?

Look at the following unit tests as examples:

Essentially, you only need to import the test suite, inherit from it, and then provide a single fixture:

class TestInMemoryStore(BaseStoreSyncTests):
    @pytest.fixture
    def three_values(self) -> Tuple[bytes, bytes, bytes]:  # <-- Provide 3 
        return b"foo", b"bar", b"buzz"

    @pytest.fixture
    def kv_store(self) -> Store:
        # Yield an implementation using MongoDB that has no keys stored in it.
        ...
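The inherit-and-provide-a-fixture pattern can be sketched with the standard library alone. This is only an analogy to the suite mentioned above, not its actual contents: `StoreContract`, `DictByteStore`, `make_store`, and `run_contract` are hypothetical names, and the three checks are assumptions about what a store contract might verify.

```python
import unittest
from typing import Dict, Iterator, List, Optional, Sequence, Tuple


class DictByteStore:
    """Hypothetical store under test; a MongoDB-backed store would go here."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def mget(self, keys: Sequence[str]) -> List[Optional[bytes]]:
        return [self._data.get(key) for key in keys]

    def mset(self, pairs: Sequence[Tuple[str, bytes]]) -> None:
        self._data.update(pairs)

    def mdelete(self, keys: Sequence[str]) -> None:
        for key in keys:
            self._data.pop(key, None)

    def yield_keys(self, prefix: str = "") -> Iterator[str]:
        return (key for key in list(self._data) if key.startswith(prefix))


class StoreContract:
    """Reusable checks (assumed, not the real suite); subclasses supply a store."""

    three_values: Tuple[bytes, bytes, bytes] = (b"foo", b"bar", b"buzz")

    def make_store(self):
        # Plays the role of the kv_store fixture: must return an empty store.
        raise NotImplementedError

    def test_starts_empty(self) -> None:
        self.assertEqual(list(self.make_store().yield_keys()), [])

    def test_set_then_get(self) -> None:
        store = self.make_store()
        foo, bar, _ = self.three_values
        store.mset([("a", foo), ("b", bar)])
        self.assertEqual(store.mget(["a", "b", "missing"]), [foo, bar, None])

    def test_delete(self) -> None:
        store = self.make_store()
        store.mset([("a", self.three_values[2])])
        store.mdelete(["a"])
        self.assertEqual(store.mget(["a"]), [None])


class TestDictByteStore(StoreContract, unittest.TestCase):
    # The only store-specific code: how to build an empty instance.
    def make_store(self) -> DictByteStore:
        return DictByteStore()


def run_contract() -> unittest.TestResult:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDictByteStore)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the checks live in the mixin, a new store implementation only needs a small subclass that returns an empty instance, which mirrors the single-fixture approach requested in the review.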

@eyurtsev eyurtsev self-assigned this Jul 5, 2024
@pprados pprados marked this pull request as draft July 12, 2024 13:27
@pprados pprados marked this pull request as ready for review July 12, 2024 14:03
Contributor Author

pprados commented Jul 18, 2024

@eyurtsev
The new version uses the generic test suite.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jul 19, 2024
@eyurtsev eyurtsev merged commit f585668 into langchain-ai:master Jul 19, 2024
43 checks passed
olgamurraft pushed a commit to olgamurraft/langchain that referenced this pull request Aug 16, 2024
@pprados pprados deleted the pprados/add-mongodb-byte-store branch October 7, 2024 07:08