softpass test when keys are missing (#369)
* softpass test when keys are missing

* update to use local model

* both openai and local

* typo

* fix

* Specify model inference and embedding endpoint separately  (#286)

* Fix config tests (#343)

Co-authored-by: Vivian Fang <hi@vivi.sh>

* Avoid throwing error for older `~/.memgpt/config` files due to missing section `archival_storage` (#344)

* avoid error if has old config type

* Dependency management  (#337)

* Divides dependencies into `pip install pymemgpt[legacy,local,postgres,dev]`. 
* Update docs

* Relax verify_first_message_correctness to accept any function call (#340)

* Relax verify_first_message_correctness to accept any function call

* Also allow missing internal monologue if request_heartbeat

* Cleanup

* get instead of raw dict access

* Update `poetry.lock` (#346)

* mark deprecated API section

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* Add autogen example that lets you chat with docs (#342)

* Relax verify_first_message_correctness to accept any function call

* Also allow missing internal monologue if request_heartbeat

* Cleanup

* get instead of raw dict access

* Support attach in memgpt autogen agent

* Add docs example

* Add documentation, cleanup

* add gpt-4-turbo (#349)

* add gpt-4-turbo

* add in another place

* change to 3.5 16k

* Revert relaxing verify_first_message_correctness, still add archival_memory_search as an exception (#350)

* Revert "Relax verify_first_message_correctness to accept any function call (#340)"

This reverts commit 407d6bf.

* add archival_memory_search as an exception for verify

* Bump version to 0.1.18 (#351)

* Remove `requirements.txt` and `requirements_local.txt` (#358)

* update requirements to match poetry

* update with extras

* remove requirements

* disable pretty exceptions (#367)

* Updated documentation for users (#365)


---------

Co-authored-by: Vivian Fang <hi@vivi.sh>

* Create pull_request_template.md (#368)

* Create pull_request_template.md

* Add pymemgpt-nightly workflow (#373)

* Add pymemgpt-nightly workflow

* change token name

* Update lmstudio.md (#382)

* Update lmstudio.md

* Update lmstudio.md

* Update lmstudio.md to show the Prompt Formatting Option (#384)

* Update lmstudio.md to show the Prompt Formatting Option

* Update lmstudio.md Update the screenshot

* Swap asset location from #384 (#385)

* Update poetry with `pg8000` and include `pgvector` in docs  (#390)

* Allow overriding config location with `MEMGPT_CONFIG_PATH` (#383)

* Always default to local embeddings if not OpenAI or Azure  (#387)

* Add support for larger archival memory stores (#359)

* Replace `memgpt run` flags error with warning + remove custom embedding endpoint option + add agent create time (#364)

* Update webui.md (#397)

turn emoji warning into markdown warning

* Update webui.md (#398)

* dont hard code embeddings

* formatting

* black

* add full deps

* remove changes

* update poetry

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
Co-authored-by: Vivian Fang <hi@vivi.sh>
Co-authored-by: MSZ-MGS <65172063+MSZ-MGS@users.noreply.github.com>
4 people authored Nov 9, 2023
1 parent 887efa3 commit c9c1074
Showing 14 changed files with 535 additions and 320 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -42,7 +42,7 @@ jobs:
        PGVECTOR_TEST_DB_URL: ${{ secrets.PGVECTOR_TEST_DB_URL }}
        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      run: |
-       poetry install -E postgres -E dev
+       poetry install -E dev -E postgres -E local -E legacy
    - name: Set Poetry config
      env:
4 changes: 4 additions & 0 deletions .pre-commit-config.yaml
@@ -3,10 +3,14 @@ repos:
    rev: v2.3.0
    hooks:
      - id: check-yaml
+       exclude: ^docs/
      - id: end-of-file-fixer
+       exclude: ^docs/
      - id: trailing-whitespace
+       exclude: ^docs/
  - repo: https://github.com/psf/black
    rev: 22.10.0
    hooks:
      - id: black
+       exclude: ^docs/
        args: ['--line-length', '140']
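For context on the `exclude: ^docs/` lines added above: pre-commit treats `exclude` as a Python regular expression matched (via `re.search`) against each staged file's path, so anything under `docs/` is skipped by that hook. A minimal sketch of that filtering — the file names here are hypothetical:

```python
import re

# pre-commit matches the exclude pattern against each candidate file path
exclude = re.compile(r"^docs/")
files = ["docs/index.md", "memgpt/main.py", "README.md"]
# Only files NOT matching the exclude pattern are passed to the hook
checked = [f for f in files if not exclude.search(f)]
```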
10 changes: 4 additions & 6 deletions docs/README.md
@@ -1,15 +1,13 @@
# Building the docs

Run the following from the MemGPT directory.

1. Install requirements:
```
pip install -r docs/requirements.txt
```

2. Serve docs:
```
mkdocs serve
```
4 changes: 2 additions & 2 deletions docs/autogen.md
@@ -87,7 +87,7 @@ config_list_memgpt = [
},
]
```
`config_list` is used by non-MemGPT agents, which expect an OpenAI-compatible API.

`config_list_memgpt` is used by MemGPT agents. Currently, MemGPT interfaces with the LLM backend by exporting `OPENAI_API_BASE` and `BACKEND_TYPE` as described above. Note that MemGPT does not use the OpenAI-compatible API (it uses the direct API).
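As a concrete sketch of the paragraph above — the endpoint URL and backend name below are hypothetical examples, not values from this commit — the environment variables would be set before constructing the agents:

```python
import os

# Hypothetical local-backend settings; MemGPT reads these from the environment
os.environ["OPENAI_API_BASE"] = "http://localhost:5000"
os.environ["BACKEND_TYPE"] = "webui"
```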

@@ -147,4 +147,4 @@ Virtual context management is a technique used in large language models like Mem
--------------------------------------------------------------------------------
...
```
29 changes: 14 additions & 15 deletions docs/data_sources.md
@@ -2,7 +2,7 @@
MemGPT supports pre-loading data into archival memory. In order to make data accessible to your agent, you must load data in with `memgpt load`, then attach the data source to your agent. You can configure where archival memory is stored by configuring the [storage backend](storage.md).

### Viewing available data sources
You can view available data sources with:
```
memgpt list sources
```
@@ -15,12 +15,12 @@ memgpt list sources
| memgpt-docs | local | agent_1 |
+----------------+----------+----------+
```
The `Agents` column indicates which agents have access to the data, while `Location` indicates what storage backend the data has been loaded into.

### Attaching data to agents
Attaching a data source to your agent loads the data into your agent's archival memory to access. You can attach data to your agent in two ways:

*[Option 1]* From the CLI, run:
```
memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>
```
@@ -41,15 +41,15 @@ memgpt attach --agent <AGENT-NAME> --data-source <DATA-SOURCE-NAME>


### Loading a file or directory
You can load a file, list of files, or directory into MemGPT with the following command:
```sh
memgpt load directory --name <NAME> \
[--input-dir <DIRECTORY>] [--input-files <FILE1> <FILE2>...] [--recursive]
```


### Loading a database dump
You can load a database into MemGPT, either from a database dump or a database connection, with the following command:
```sh
memgpt load database --name <NAME> \
--query <QUERY> \ # Query to run on database to get data
@@ -62,25 +62,24 @@ memgpt load database --name <NAME> \
--dbname <DB_NAME> # Database name
```

### Loading a vector database
If you already have a vector database containing passages and embeddings, you can load them into MemGPT by specifying the table name, database URI, and the columns containing the passage text and embeddings.
```sh
memgpt load vector-database --name <NAME> \
--uri <URI> \ # Database URI
--table_name <TABLE-NAME> \ # Name of table containing data
--text_column <TEXT-COL> \ # Name of column containing text
--embedding_column <EMBEDDING-COL> # Name of column containing embedding
```
Since embeddings are already provided, MemGPT will not re-compute the embeddings.
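The skip-recompute behavior can be sketched as follows; this is an illustrative stand-in, not MemGPT's actual loader, and the row shape and `embed` helper are hypothetical:

```python
def embed(text):
    # Stand-in embedding function for illustration: one dimension, the text length
    return [float(len(text))]

def ingest(rows):
    """Store rows, computing an embedding only when one is missing (sketch)."""
    stored = []
    for row in rows:
        if row.get("embedding") is None:
            row = {**row, "embedding": embed(row["text"])}
        stored.append(row)
    return stored

rows = [
    {"text": "alpha", "embedding": [0.1, 0.2]},  # already embedded: kept as-is
    {"text": "beta", "embedding": None},         # missing: computed on ingest
]
out = ingest(rows)
```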

### Loading a LlamaIndex dump
If you have a Llama Index `VectorIndex` which was saved to disk, you can load it into MemGPT by specifying the directory the index was saved to:
```sh
memgpt load index --name <NAME> --dir <INDEX-DIR>
```
Since Llama Index has already computed the embeddings, MemGPT will not re-compute them.


### Loading other types of data
We highly encourage contributions for new data sources, which can be added as a new [CLI data load command](https://github.com/cpacker/MemGPT/blob/main/memgpt/cli/cli_load.py). We recommend checking for [Llama Index connectors](https://gpt-index.readthedocs.io/en/v0.6.3/how_to/data_connectors.html) that may support ingesting the data you're interested in loading.
2 changes: 1 addition & 1 deletion docs/example_chat.md
@@ -81,4 +81,4 @@ memgpt run --persona chaz --human bob
💭 Career crisis detected. Commence motivational dialogue and initiate discussions to understand user's aspirations and insecurities. Validate feelings and offer hope. Also, determine interest in exploring alternatives outside the tech field.
🤖 It's perfectly okay to feel uncertain, Bob. Life is a journey and it's never a straight path. If you feel tech isn't your calling, we can explore your passions and look for alternatives. But remember, there's a reason you've come this far in tech. Let's uncover your true potential together, shall we?
> Enter your message:
```
2 changes: 1 addition & 1 deletion docs/example_data.md
@@ -68,4 +68,4 @@ Now that the data has been loaded into the chatbot's memory, we can start to ask

### Loading other data types

In this example, we loaded a single PDF into a chatbot's external memory. However, MemGPT supports various types of data, such as full directories of files and even databases - [see the full data sources list](../data_sources).
2 changes: 1 addition & 1 deletion docs/index.md
@@ -16,4 +16,4 @@ You can read more about the research behind MemGPT at [https://memgpt.ai](https:

## Join the community!

MemGPT is an open source project under active development. If you'd like to help make MemGPT even better, you can come chat with the community on [our Discord server](https://discord.gg/9GEQrxmVyE) or on our [GitHub](https://github.com/cpacker/MemGPT).
2 changes: 1 addition & 1 deletion docs/webui_runpod.md
@@ -1 +1 @@
TODO
2 changes: 1 addition & 1 deletion memgpt/autogen/README.md
@@ -20,7 +20,7 @@ config_list_memgpt = [
},
]
```
`config_list` is used by non-MemGPT agents, which expect an OpenAI-compatible API.

`config_list_memgpt` is used by MemGPT agents. Currently, MemGPT interfaces with the LLM backend by exporting `OPENAI_API_BASE` and `BACKEND_TYPE` as described in [Local LLM support](../local_llm). Note that MemGPT does not use the OpenAI-compatible API (it uses the direct API).

25 changes: 13 additions & 12 deletions memgpt/connectors/db.py
@@ -23,23 +23,24 @@
Base = declarative_base()


-class PassageModel(Base):
-    """Defines data model for storing Passages (consisting of text, embedding)"""
+def get_db_model(table_name: str):
+    config = MemGPTConfig.load()

-    __abstract__ = True  # this line is necessary
+    class PassageModel(Base):
+        """Defines data model for storing Passages (consisting of text, embedding)"""

-    # Assuming passage_id is the primary key
-    id = Column(BIGINT, primary_key=True, nullable=False, autoincrement=True)
-    doc_id = Column(String)
-    text = Column(String, nullable=False)
-    embedding = mapped_column(Vector(1536))  # TODO: don't hard-code
-    # metadata_ = Column(JSON(astext_type=Text()))
+        __abstract__ = True  # this line is necessary

-    def __repr__(self):
-        return f"<Passage(passage_id='{self.id}', text='{self.text}', embedding='{self.embedding})>"
+        # Assuming passage_id is the primary key
+        id = Column(BIGINT, primary_key=True, nullable=False, autoincrement=True)
+        doc_id = Column(String)
+        text = Column(String, nullable=False)
+        embedding = mapped_column(Vector(config.embedding_dim))
+        # metadata_ = Column(JSON(astext_type=Text()))

-def get_db_model(table_name: str):
+        def __repr__(self):
+            return f"<Passage(passage_id='{self.id}', text='{self.text}', embedding='{self.embedding})>"
+
    """Create database model for table_name"""
    class_name = f"{table_name.capitalize()}Model"
    Model = type(class_name, (PassageModel,), {"__tablename__": table_name, "__table_args__": {"extend_existing": True}})
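The `db.py` change above moves the model definition inside a factory so that each table gets its own dynamically named class via `type()`, with the embedding dimension read from config instead of hard-coded to 1536. A self-contained sketch of the same dynamic-class technique — plain Python rather than SQLAlchemy, with hypothetical names:

```python
def get_model(table_name: str):
    class PassageBase:
        """Abstract base holding the shared fields."""
        embedding_dim = 1536  # stand-in for config.embedding_dim

    # Build a named subclass at runtime, mirroring the SQLAlchemy factory above
    class_name = f"{table_name.capitalize()}Model"
    return type(class_name, (PassageBase,), {"__tablename__": table_name})

Model = get_model("memgpt_docs")
```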
