
Add Pinecone support for archival storage #635

Closed
sahusiddharth opened this issue Dec 17, 2023 · 5 comments

Comments

@sahusiddharth
Contributor

Having support for Pinecone could be really helpful. It is a cloud-native vector database and has some of the best performance among vector stores.

@sahusiddharth
Contributor Author

I have started working on this, but there are a couple of roadblocks I'm running into:

  1. To make a connection to the Pinecone client we need to specify an API key and an environment. I think we have to take them as config values; I need suggestions on how to take this further. (A minimal connection sketch follows this comment.)
  2. Can you please explain the rationale behind the abstract function get_all_paginated?

I would love to get your inputs as well @cpacker, @vivi, @sarahwooders
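For context, a minimal sketch of what that connection looks like with the pinecone-client v2 API of the time; the index name and embedding dimension here are hypothetical:

```python
import pinecone

# The API key and environment are the two values discussed above; in MemGPT
# they would come from the config rather than being hard-coded.
pinecone.init(api_key="YOUR_API_KEY", environment="us-west1-gcp")

# Create the index on first use (dimension must match the embedding model).
if "memgpt-archival" not in pinecone.list_indexes():
    pinecone.create_index("memgpt-archival", dimension=1536, metric="cosine")

index = pinecone.Index("memgpt-archival")
```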

@cpacker
Collaborator

cpacker commented Dec 17, 2023

@sahusiddharth I think you can follow the Chroma integration for the most part (https://github.com/cpacker/MemGPT/blob/main/memgpt/connectors/chroma.py) and implement a parallel class for Pinecone.
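To illustrate, a rough sketch of such a parallel class; the method names and signatures below are assumptions loosely modeled on the Chroma connector, not the actual MemGPT base-class interface:

```python
from typing import List

import pinecone


class PineconeStorageConnector:
    """Archival storage backed by a Pinecone index (sketch; names hypothetical)."""

    def __init__(self, index_name: str, api_key: str, environment: str):
        pinecone.init(api_key=api_key, environment=environment)
        self.index = pinecone.Index(index_name)

    def insert(self, record_id: str, embedding: List[float], text: str) -> None:
        # Pinecone stores vectors with optional metadata; keep the raw text there.
        self.index.upsert(vectors=[(record_id, embedding, {"text": text})])

    def query(self, query_embedding: List[float], top_k: int = 10) -> List[dict]:
        results = self.index.query(
            vector=query_embedding, top_k=top_k, include_metadata=True
        )
        return [match["metadata"] for match in results["matches"]]
```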

AFAIK get_all_paginated is used primarily (only?) for the /attach command, @sarahwooders can confirm.

For API keys, yes we should store them in the base ~/.memgpt/config. The plan is to eventually split config into a config and credentials file, but this hasn't happened yet.

@sahusiddharth
Contributor Author

I have the basic structure down, using the ChromaDB connector as a reference.

I asked about get_all_paginated because, going through the Pinecone documentation, it seems you cannot query documents from Pinecone in fixed-size steps.

About the API keys, how should I proceed?

  1. Take the API key via kwargs?
  2. Read it from a .txt file at a given path?
  3. Other suggestions?

@sarahwooders
Collaborator

@sahusiddharth it would be great to have a Pinecone integration! However, we are actually currently in the middle of refactoring some of the storage backends - could you please work off the storage-refactor branch instead of main? The postgres and chroma integrations are mostly complete, so you can model your changes off of them, but I still need to finish migrating a few more things before I can merge the refactored code in.

For API keys, I recommend placing them into the ~/.memgpt/config file, which you can do by adding a field to MemGPTConfig. We will probably eventually move towards having a separate credentials file, but for the time being we are using the config file for everything.
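As a sketch of the config side, assuming MemGPTConfig is a dataclass persisted to ~/.memgpt/config (the field names below are hypothetical additions, not the existing schema):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemGPTConfig:
    # ... existing fields ...

    # Hypothetical Pinecone credentials, loaded/saved like the other fields.
    pinecone_api_key: Optional[str] = None
    pinecone_environment: Optional[str] = None
```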

We currently use get_all_paginated to copy data from data sources into agent archival memory. However, I think we may deprecate this function in the future, since I'd like to avoid copying data into agent archival memory when connecting to external data sources. I would recommend just faking pagination for now, by calling get_all() and then paginating the results -- and we can add a warning about using Pinecone with large datasets.
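A minimal sketch of that "fake pagination" approach, assuming get_all() returns every stored record (note that this loads the full dataset into memory, hence the warning about large datasets):

```python
from typing import Iterator, List


def get_all_paginated(self, page_size: int = 1000) -> Iterator[List[dict]]:
    """Fetch everything once, then yield it back in fixed-size pages."""
    records = self.get_all()  # assumed bulk-fetch method on the connector
    for start in range(0, len(records), page_size):
        yield records[start : start + page_size]
```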


github-actions bot commented Dec 6, 2024

This issue has been automatically closed due to 60 days of inactivity.

@github-actions github-actions bot closed this as completed Dec 6, 2024
mattzh72 pushed a commit that referenced this issue Jan 16, 2025
Co-authored-by: Mindy Long <mindy@letta.com>