softpass test when keys are missing (#369)
* softpass test when keys are missing

* update to use local model

* both openai and local

* typo

* fix

* Specify model inference and embedding endpoint separately  (#286)

* Fix config tests (#343)

Co-authored-by: Vivian Fang <hi@vivi.sh>

* Avoid throwing error for older `~/.memgpt/config` files due to missing section `archival_storage` (#344)

* avoid error if has old config type

* Dependency management  (#337)

* Divides dependencies into `pip install pymemgpt[legacy,local,postgres,dev]`. 
* Update docs

* Relax verify_first_message_correctness to accept any function call (#340)

* Relax verify_first_message_correctness to accept any function call

* Also allow missing internal monologue if request_heartbeat

* Cleanup

* get instead of raw dict access

* Update `poetry.lock` (#346)

* mark deprecated API section

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* Add autogen example that lets you chat with docs (#342)

* Relax verify_first_message_correctness to accept any function call

* Also allow missing internal monologue if request_heartbeat

* Cleanup

* get instead of raw dict access

* Support attach in memgpt autogen agent

* Add docs example

* Add documentation, cleanup

* add gpt-4-turbo (#349)

* add gpt-4-turbo

* add in another place

* change to 3.5 16k

* Revert relaxing verify_first_message_correctness, still add archival_memory_search as an exception (#350)

* Revert "Relax verify_first_message_correctness to accept any function call (#340)"

This reverts commit 407d6bf.

* add archival_memory_search as an exception for verify

* Bump version to 0.1.18 (#351)

* Remove `requirements.txt` and `requirements_local.txt` (#358)

* update requirements to match poetry

* update with extras

* remove requirements

* disable pretty exceptions (#367)

* Updated documentation for users (#365)


---------

Co-authored-by: Vivian Fang <hi@vivi.sh>

* Create pull_request_template.md (#368)

* Create pull_request_template.md

* Add pymemgpt-nightly workflow (#373)

* Add pymemgpt-nightly workflow

* change token name

* Update lmstudio.md (#382)

* Update lmstudio.md

* Update lmstudio.md

* Update lmstudio.md to show the Prompt Formatting Option (#384)

* Update lmstudio.md to show the Prompt Formatting Option

* Update lmstudio.md Update the screenshot

* Swap asset location from #384 (#385)

* Update poetry with `pg8000` and include `pgvector` in docs  (#390)

* Allow overriding config location with `MEMGPT_CONFIG_PATH` (#383)

* Always default to local embeddings if not OpenAI or Azure  (#387)

* Add support for larger archival memory stores (#359)

* Replace `memgpt run` flags error with warning + remove custom embedding endpoint option + add agent create time (#364)

* Update webui.md (#397)

turn emoji warning into markdown warning

* Update webui.md (#398)

* dont hard code embeddings

* formatting

* black

* add full deps

* remove changes

* update poetry

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
Co-authored-by: Vivian Fang <hi@vivi.sh>
Co-authored-by: MSZ-MGS <65172063+MSZ-MGS@users.noreply.github.com>
4 people authored Nov 9, 2023
1 parent 887efa3 commit c9c1074
Showing 14 changed files with 535 additions and 320 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -42,7 +42,7 @@ jobs:
        PGVECTOR_TEST_DB_URL: ${{ secrets.PGVECTOR_TEST_DB_URL }}
        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      run: |
-       poetry install -E postgres -E dev
+       poetry install -E dev -E postgres -E local -E legacy
    - name: Set Poetry config
      env:
4 changes: 4 additions & 0 deletions .pre-commit-config.yaml
@@ -3,10 +3,14 @@ repos:
    rev: v2.3.0
    hooks:
      - id: check-yaml
+       exclude: ^docs/
      - id: end-of-file-fixer
+       exclude: ^docs/
      - id: trailing-whitespace
+       exclude: ^docs/
  - repo: https://github.com/psf/black
    rev: 22.10.0
    hooks:
      - id: black
+       exclude: ^docs/
        args: ['--line-length', '140']
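For context on the `exclude: ^docs/` lines added above: pre-commit treats `exclude` as a Python regular expression matched (via `re.search`) against each staged file's path, so anything under `docs/` is skipped by that hook. A minimal sketch of that filtering — the file names here are hypothetical:

```python
import re

# pre-commit matches the exclude pattern against each candidate file path
exclude = re.compile(r"^docs/")
files = ["docs/index.md", "memgpt/main.py", "README.md"]
# Only files NOT matching the exclude pattern are passed to the hook
checked = [f for f in files if not exclude.search(f)]
```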
10 changes: 4 additions & 6 deletions docs/README.md
@@ -1,15 +1,13 @@
# Building the docs

Run the following from the MemGPT directory.

1. Install requirements:
```
pip install -r docs/requirements.txt
```

2. Serve docs:
```
mkdocs serve
```
4 changes: 2 additions & 2 deletions docs/autogen.md
@@ -87,7 +87,7 @@ config_list_memgpt = [
},
]
```
`config_list` is used by non-MemGPT agents, which expect an OpenAI-compatible API.

`config_list_memgpt` is used by MemGPT agents. Currently, MemGPT interfaces with the LLM backend by exporting `OPENAI_API_BASE` and `BACKEND_TYPE` as described above. Note that MemGPT does not use the OpenAI-compatible API (it uses the direct API).
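As a concrete sketch of the paragraph above — the endpoint URL and backend name below are hypothetical examples, not values from this commit — the environment variables would be set before constructing the agents:

```python
import os

# Hypothetical local-backend settings; MemGPT reads these from the environment
os.environ["OPENAI_API_BASE"] = "http://localhost:5000"
os.environ["BACKEND_TYPE"] = "webui"
```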

@@ -147,4 +147,4 @@ Virtual context management is a technique used in large language models like Mem
--------------------------------------------------------------------------------
...
```
29 changes: 14 additions & 15 deletions docs/data_sources.md
@@ -2,7 +2,7 @@
MemGPT supports pre-loading data into archival memory. In order to make data accessible to your agent, you must load data in with `memgpt load`, then attach the data source to your agent. You can configure where archival memory is stored by configuring the [storage backend](storage.md).

### Viewing available data sources
You can view available data sources with:
```
memgpt list sources
```
@@ -15,12 +15,12 @@ memgpt list sources
| memgpt-docs | local | agent_1 |
+----------------+----------+----------+
```
The `Agents` column indicates which agents have access to the data, while `Location` indicates what storage backend the data has been loaded into.

### Attaching data to agents
Attaching a data source to your agent loads the data into your agent's archival memory to access. You can attach data to your agent in two ways:

*[Option 1]* From the CLI, run:
```
memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>
```
@@ -41,15 +41,15 @@ memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>


### Loading a file or directory
You can load a file, list of files, or directory into MemGPT with the following command:
```sh
memgpt load directory --name <NAME> \
[--input-dir <DIRECTORY>] [--input-files <FILE1> <FILE2>...] [--recursive]
```


### Loading a database dump
You can load a database into MemGPT, either from a database dump or a database connection, with the following command:
```sh
memgpt load database --name <NAME> \
--query <QUERY> \ # Query to run on database to get data
@@ -62,25 +62,24 @@ memgpt load database --name <NAME> \
--dbname <DB_NAME> # Database name
```

### Loading a vector database
If you already have a vector database containing passages and embeddings, you can load them into MemGPT by specifying the table name, database URI, and the columns containing the passage text and embeddings.
```sh
memgpt load vector-database --name <NAME> \
--uri <URI> \ # Database URI
--table_name <TABLE-NAME> \ # Name of table containing data
--text_column <TEXT-COL> \ # Name of column containing text
--embedding_column <EMBEDDING-COL> # Name of column containing embedding
```
Since embeddings are already provided, MemGPT will not re-compute the embeddings.
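The skip-recompute behavior can be sketched as follows; this is an illustrative stand-in, not MemGPT's actual loader, and the row shape and `embed` helper are hypothetical:

```python
def embed(text):
    # Stand-in embedding function for illustration: one dimension, the text length
    return [float(len(text))]

def ingest(rows):
    """Store rows, computing an embedding only when one is missing (sketch)."""
    stored = []
    for row in rows:
        if row.get("embedding") is None:
            row = {**row, "embedding": embed(row["text"])}
        stored.append(row)
    return stored

rows = [
    {"text": "alpha", "embedding": [0.1, 0.2]},  # already embedded: kept as-is
    {"text": "beta", "embedding": None},         # missing: computed on ingest
]
out = ingest(rows)
```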

### Loading a LlamaIndex dump
If you have a Llama Index `VectorIndex` which was saved to disk, you can load it into MemGPT by specifying the directory the index was saved to:
```sh
memgpt load index --name <NAME> --dir <INDEX-DIR>
```
Since Llama Index has already computed the embeddings, MemGPT will not re-compute them.


### Loading other types of data
We highly encourage contributions for new data sources, which can be added as a new [CLI data load command](https://github.com/cpacker/MemGPT/blob/main/memgpt/cli/cli_load.py). We recommend checking for [Llama Index connectors](https://gpt-index.readthedocs.io/en/v0.6.3/how_to/data_connectors.html) that may support ingesting the data you're interested in loading.
2 changes: 1 addition & 1 deletion docs/example_chat.md
@@ -81,4 +81,4 @@ memgpt run --persona chaz --human bob
💭 Career crisis detected. Commence motivational dialogue and initiate discussions to understand user's aspirations and insecurities. Validate feelings and offer hope. Also, determine interest in exploring alternatives outside the tech field.
🤖 It's perfectly okay to feel uncertain, Bob. Life is a journey and it's never a straight path. If you feel tech isn't your calling, we can explore your passions and look for alternatives. But remember, there's a reason you've come this far in tech. Let's uncover your true potential together, shall we?
> Enter your message:
```
2 changes: 1 addition & 1 deletion docs/example_data.md
@@ -68,4 +68,4 @@ Now that the data has been loaded into the chatbot's memory, we can start to ask

### Loading other data types

In this example, we loaded a single PDF into a chatbot's external memory. However, MemGPT supports various types of data, such as full directories of files and even databases - [see the full data sources list](../data_sources).
2 changes: 1 addition & 1 deletion docs/index.md
@@ -16,4 +16,4 @@ You can read more about the research behind MemGPT at [https://memgpt.ai](https:

## Join the community!

MemGPT is an open source project under active development. If you'd like to help make MemGPT even better, you can come chat with the community on [our Discord server](https://discord.gg/9GEQrxmVyE) or on our [GitHub](https://github.com/cpacker/MemGPT).
2 changes: 1 addition & 1 deletion docs/webui_runpod.md
@@ -1 +1 @@
TODO
2 changes: 1 addition & 1 deletion memgpt/autogen/README.md
@@ -20,7 +20,7 @@ config_list_memgpt = [
},
]
```
`config_list` is used by non-MemGPT agents, which expect an OpenAI-compatible API.

`config_list_memgpt` is used by MemGPT agents. Currently, MemGPT interfaces with the LLM backend by exporting `OPENAI_API_BASE` and `BACKEND_TYPE` as described in [Local LLM support](../local_llm). Note that MemGPT does not use the OpenAI-compatible API (it uses the direct API).

25 changes: 13 additions & 12 deletions memgpt/connectors/db.py
@@ -23,23 +23,24 @@
Base = declarative_base()


-class PassageModel(Base):
-    """Defines data model for storing Passages (consisting of text, embedding)"""
+def get_db_model(table_name: str):
+    config = MemGPTConfig.load()

-    __abstract__ = True  # this line is necessary
+    class PassageModel(Base):
+        """Defines data model for storing Passages (consisting of text, embedding)"""

-    # Assuming passage_id is the primary key
-    id = Column(BIGINT, primary_key=True, nullable=False, autoincrement=True)
-    doc_id = Column(String)
-    text = Column(String, nullable=False)
-    embedding = mapped_column(Vector(1536))  # TODO: don't hard-code
-    # metadata_ = Column(JSON(astext_type=Text()))
+        __abstract__ = True  # this line is necessary

-    def __repr__(self):
-        return f"<Passage(passage_id='{self.id}', text='{self.text}', embedding='{self.embedding})>"
+        # Assuming passage_id is the primary key
+        id = Column(BIGINT, primary_key=True, nullable=False, autoincrement=True)
+        doc_id = Column(String)
+        text = Column(String, nullable=False)
+        embedding = mapped_column(Vector(config.embedding_dim))
+        # metadata_ = Column(JSON(astext_type=Text()))

-def get_db_model(table_name: str):
+        def __repr__(self):
+            return f"<Passage(passage_id='{self.id}', text='{self.text}', embedding='{self.embedding})>"
+
    """Create database model for table_name"""
    class_name = f"{table_name.capitalize()}Model"
    Model = type(class_name, (PassageModel,), {"__tablename__": table_name, "__table_args__": {"extend_existing": True}})
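The `db.py` change above moves the model definition inside a factory so that each table gets its own dynamically named class via `type()`, with the embedding dimension read from config instead of hard-coded to 1536. A self-contained sketch of the same dynamic-class technique — plain Python rather than SQLAlchemy, with hypothetical names:

```python
def get_model(table_name: str):
    class PassageBase:
        """Abstract base holding the shared fields."""
        embedding_dim = 1536  # stand-in for config.embedding_dim

    # Build a named subclass at runtime, mirroring the SQLAlchemy factory above
    class_name = f"{table_name.capitalize()}Model"
    return type(class_name, (PassageBase,), {"__tablename__": table_name})

Model = get_model("memgpt_docs")
```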
