+MongoDB Vector Support #2367
@Dev-Khant could you please take a look at this one?
Please add the relevant docs and tests for it. Thanks.
@Dev-Khant will add this soon! Sorry for the delay.
Adding initial tests, updating use of get_embeddings
Remove accidental formatting to unrelated files
Fixing tests
Fixing docs, adding comment
Remove get_embeddings
Hi @Dev-Khant, this PR is ready for re-review when you get a second!
When would this be closed?
Hey @ranfysvalle02 Can you please resolve the merge conflicts so we can run the CI tests?
@Dev-Khant -- merge conflicts resolved!!!
@Dev-Khant - Just following up here, let me know if there are any issues.
@Dev-Khant --- We are about to have more merge conflicts -- Can you please advise if there is anything wrong with the MongoDB aspect of it, or help get this merged please?
Hey, sorry for the delay here. I left a few comments; once they're resolved we can merge this.
Also @ranfysvalle02 Can you please check why the MongoDB tests are failing?
@Dev-Khant give me a bit to tackle all this -- will report back soon!
@Dev-Khant can you elaborate a bit more here? I'm trying to figure out what is going on here.
The provided test script errors out? Am I missing something here?
Spin up a Docker container, then try to run the script below -- it should work... unless I'm doing something wrong here?

import logging

import numpy as np
from mongodb import MongoVector  # Ensure this matches the filename where MongoVector is saved

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def test_mongo_vector():
    # Configuration
    db_name = 'test_db'
    collection_name = 'test_vectors'
    embedding_dims = 128  # Example embedding dimension

    # Initialize the MongoVector instance
    mongo_vector = MongoVector(
        mongo_uri='mongodb://localhost:27017/?directConnection=true&serverSelectionTimeoutMS=2000',  # Adjust as needed
        db_name=db_name,
        collection_name=collection_name,
        embedding_model_dims=embedding_dims
    )
    logger.info('Initialized MongoVector instance.')

    # Generate sample vectors and payloads
    num_vectors = 10
    vectors = [np.random.rand(embedding_dims).tolist() for _ in range(num_vectors)]
    payloads = [{'payload_info': f'document_{i}'} for i in range(num_vectors)]
    ids = [f'vector_{i}' for i in range(num_vectors)]

    # Insert vectors into MongoDB
    mongo_vector.insert(vectors=vectors, payloads=payloads, ids=ids)
    logger.info('Inserted vectors into MongoDB.')

    # Perform a vector search
    query_vector = np.random.rand(embedding_dims).tolist()
    search_results = mongo_vector.search(query='', query_vector=query_vector, limit=5)
    logger.info('Search results:')
    for result in search_results:
        logger.info(f'ID: {result.id}, Score: {result.score}, Payload: {result.payload}')

    # Update a vector
    update_vector_id = ids[0]
    new_vector = np.random.rand(embedding_dims).tolist()
    new_payload = {'payload_info': 'updated_document'}
    mongo_vector.update(vector_id=update_vector_id, vector=new_vector, payload=new_payload)
    logger.info(f'Updated vector {update_vector_id}.')

    # Retrieve an updated vector
    retrieved_vector = mongo_vector.get(vector_id=update_vector_id)
    logger.info(f'Retrieved vector: {retrieved_vector}')

    # Delete a vector
    delete_vector_id = ids[1]
    mongo_vector.delete(vector_id=delete_vector_id)
    logger.info(f'Deleted vector {delete_vector_id}.')

    # List vectors in the collection
    vector_list = mongo_vector.list(limit=5)
    logger.info('List of vectors:')
    for v in vector_list:
        logger.info(f'ID: {v.id}, Payload: {v.payload}')

    # Get collection information
    collection_info = mongo_vector.col_info()
    logger.info(f'Collection info: {collection_info}')

    # Clean up by deleting the collection
    mongo_vector.delete_col()
    logger.info('Deleted the collection.')

    # Drop the reference so the client can be garbage-collected
    del mongo_vector
    logger.info('MongoDB connection closed.')


if __name__ == '__main__':
    test_mongo_vector()
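For anyone who wants to sanity-check the ordering returned by the search step without a running MongoDB instance, a brute-force reference ranking may help. The sketch below is not part of this PR: the helper names `cosine_similarity` and `brute_force_top_k` are hypothetical, and it assumes the index uses cosine similarity, which may not match the metric actually configured. It uses only the standard library.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def brute_force_top_k(query_vector, vectors, ids, limit=5):
    """Return up to `limit` (id, score) pairs, highest score first.

    Hypothetical helper: it scans every vector, so it is only meant as a
    reference for small test sets, not as a replacement for an index.
    """
    scored = [
        (vector_id, cosine_similarity(query_vector, vector))
        for vector_id, vector in zip(ids, vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    random.seed(0)  # fixed seed so repeated runs rank the same data
    dims = 128
    vectors = [[random.random() for _ in range(dims)] for _ in range(10)]
    ids = [f"vector_{i}" for i in range(10)]
    query = [random.random() for _ in range(dims)]
    for vector_id, score in brute_force_top_k(query, vectors, ids):
        print(f"ID: {vector_id}, Score: {score:.4f}")
```

A vector store may report scores on a different scale and may use approximate search, so it is probably safer to compare the order of IDs than the raw score values.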
Hey @ranfysvalle02 We have a separate issue for the CI tests and will fix it. But your changes look good to me. Thanks for the contribution!
Description
Add MongoDB as a VectorDB.
Fixes #1166
How Has This Been Tested?
There is working code that shows how to use it, along with the output.