Merge pull request #79 from cpacker/azure-support-pls

azure support
letta-ai · Oct 21, 2023 · 26de00c
2 parents e50220e + aa6136e
commit 26de00c

Showing 3 changed files with 61 additions and 16 deletions.
diff --git a/README.md b/README.md
@@ -5,7 +5,7 @@
 <div align="center">
 
  <strong>Try out our MemGPT chatbot on <a href="https://discord.gg/9GEQrxmVyE">Discord</a>!</strong>
- 
+
 [![Discord](https://img.shields.io/discord/1161736243340640419?label=Discord&logo=discord&logoColor=5865F2&style=flat-square&color=5865F2)](https://discord.gg/9GEQrxmVyE)
 [![arXiv 2310.08560](https://img.shields.io/badge/arXiv-2310.08560-B31B1B?logo=arxiv&style=flat-square)](https://arxiv.org/abs/2310.08560)
 
@@ -45,9 +45,9 @@
   </details>
 </details>
 
-## Quick setup 
+## Quick setup
 
-Join <a href="https://discord.gg/9GEQrxmVyE">Discord</a></strong> and message the MemGPT bot (in the `#memgpt` channel). Then run the following commands (messaged to "MemGPT Bot"): 
+Join <a href="https://discord.gg/9GEQrxmVyE">Discord</a></strong> and message the MemGPT bot (in the `#memgpt` channel). Then run the following commands (messaged to "MemGPT Bot"):
 * `/profile` (to create your profile)
 * `/key` (to enter your OpenAI key)
 * `/create` (to create a MemGPT chatbot)
@@ -58,14 +58,14 @@ MemGPT → Privacy Settings → Direct Messages set to ON
  <img src="https://memgpt.ai/assets/img/discord/dm_settings.png" alt="set DMs settings on MemGPT server to be open in MemGPT so that MemGPT Bot can message you" width="400">
 </div>
 
-You can see the full list of available commands when you enter `/` into the message box. 
+You can see the full list of available commands when you enter `/` into the message box.
 <div align="center">
  <img src="https://memgpt.ai/assets/img/discord/slash_commands.png" alt="MemGPT Bot slash commands" width="400">
 </div>
 
-## What is MemGPT? 
+## What is MemGPT?
 
-Memory-GPT (or MemGPT in short) is a system that intelligently manages different memory tiers in LLMs in order to effectively provide extended context within the LLM's limited context window. For example, MemGPT knows when to push critical information to a vector database and when to retrieve it later in the chat, enabling perpetual conversations. Learn more about MemGPT in our [paper](https://arxiv.org/abs/2310.08560). 
+Memory-GPT (or MemGPT in short) is a system that intelligently manages different memory tiers in LLMs in order to effectively provide extended context within the LLM's limited context window. For example, MemGPT knows when to push critical information to a vector database and when to retrieve it later in the chat, enabling perpetual conversations. Learn more about MemGPT in our [paper](https://arxiv.org/abs/2310.08560).
 
 ## Running MemGPT locally
 
@@ -100,6 +100,19 @@ To run MemGPT for as a conversation agent in CLI mode, simply run `main.py`:
 python3 main.py
 ```
 
+If you're using Azure OpenAI, set these variables instead:
+
+```sh
+# see https://github.com/openai/openai-python#microsoft-azure-endpoints
+export AZURE_OPENAI_KEY=...
+export AZURE_OPENAI_ENDPOINT=...
+export AZURE_OPENAI_VERSION=...
+export AZURE_OPENAI_DEPLOYMENT=...
+
+# then use the --use_azure_openai flag
+python main.py --use_azure_openai
+```
+
 To create a new starter user or starter persona (that MemGPT gets initialized with), create a new `.txt` file in [/memgpt/humans/examples](/memgpt/humans/examples) or [/memgpt/personas/examples](/memgpt/personas/examples), then use the `--persona` or `--human` flag when running `main.py`. For example:
 
 ```sh
@@ -168,7 +181,7 @@ While using MemGPT via the CLI you can run various commands:
 <details open>
 <summary><h3>Use MemGPT to talk to your Database!</h3></summary>
 
-MemGPT's archival memory let's you load your database and talk to it! To motivate this use-case, we have included a toy example. 
+MemGPT's archival memory lets you load your database and talk to it! To motivate this use-case, we have included a toy example.
 
 Consider the `test.db` already included in the repository.
 
@@ -221,7 +234,7 @@ This will generate embeddings, stick them into a FAISS index, and write the inde
     --archival_storage_faiss_path=<DIRECTORY_WITH_EMBEDDINGS> (if your files haven't changed).
 ```
 
-If you want to reuse these embeddings, run 
+If you want to reuse these embeddings, run
 ```bash
 python3 main.py --archival_storage_faiss_path="<DIRECTORY_WITH_EMBEDDINGS>" --persona=memgpt_doc --human=basic
 ```
@@ -233,17 +246,17 @@ python3 main.py --archival_storage_faiss_path="<DIRECTORY_WITH_EMBEDDINGS>" --pe
 
 MemGPT also enables you to chat with docs -- try running this example to talk to the LlamaIndex API docs!
 
-1. 
+1.
     a. Download LlamaIndex API docs and FAISS index from [Hugging Face](https://huggingface.co/datasets/MemGPT/llamaindex-api-docs).
    ```bash
    # Make sure you have git-lfs installed (https://git-lfs.com)
    git lfs install
    git clone https://huggingface.co/datasets/MemGPT/llamaindex-api-docs
    mv llamaindex-api-docs
    ```
-   
+
     **-- OR --**
-   
+
    b. Build the index:
     1. Build `llama_index` API docs with `make text`. Instructions [here](https://github.com/run-llama/llama_index/blob/main/docs/DOCS_README.md). Copy over the generated `_build/text` folder to `memgpt/personas/docqa`.
     2. Generate embeddings and FAISS index.

diff --git a/main.py b/main.py
@@ -31,6 +31,8 @@
 flags.DEFINE_string("archival_storage_files", default="", required=False, help="Specify files to pre-load into archival memory (glob pattern)")
 flags.DEFINE_string("archival_storage_files_compute_embeddings", default="", required=False, help="Specify files to pre-load into archival memory (glob pattern), and compute embeddings over them")
 flags.DEFINE_string("archival_storage_sqldb", default="", required=False, help="Specify SQL database to pre-load into archival memory")
+# Support for Azure OpenAI (see: https://github.com/openai/openai-python#microsoft-azure-endpoints)
+flags.DEFINE_boolean("use_azure_openai", default=False, required=False, help="Use Azure OpenAI (requires additional environment variables)")
 
 
 def clear_line():
@@ -48,6 +50,28 @@ async def main():
         logging.getLogger().setLevel(logging.DEBUG)
     print("Running... [exit by typing '/exit']")
 
+    # Azure OpenAI support
+    if FLAGS.use_azure_openai:
+        azure_openai_key = os.getenv('AZURE_OPENAI_KEY')
+        azure_openai_endpoint = os.getenv('AZURE_OPENAI_ENDPOINT')
+        azure_openai_version = os.getenv('AZURE_OPENAI_VERSION')
+        azure_openai_deployment = os.getenv('AZURE_OPENAI_DEPLOYMENT')
+        if None in [azure_openai_key, azure_openai_endpoint, azure_openai_version, azure_openai_deployment]:
+            print(f"Error: missing Azure OpenAI environment variables. Please see README section on Azure.")
+            return
+
+        import openai
+        openai.api_type = "azure"
+        openai.api_key = azure_openai_key
+        openai.api_base = azure_openai_endpoint
+        openai.api_version = azure_openai_version
+        # deployment gets passed into chatcompletion
+    else:
+        azure_openai_deployment = os.getenv('AZURE_OPENAI_DEPLOYMENT')
+        if azure_openai_deployment is not None:
+            print(f"Error: AZURE_OPENAI_DEPLOYMENT should not be set if --use_azure_openai is False")
+            return
+
     if FLAGS.model != constants.DEFAULT_MEMGPT_MODEL:
       interface.important_message(f"Warning - you are running MemGPT with {FLAGS.model}, which is not officially supported (yet). Expect bugs!")
 
@@ -94,7 +118,7 @@ async def main():
                 await memgpt_agent.persistence_manager.archival_memory.insert(row)
             print(f"Database loaded into archival memory.")
 
-    # auto-exit for 
+    # auto-exit for
     if "GITHUB_ACTIONS" in os.environ:
         return
 
@@ -187,10 +211,10 @@ async def main():
                         except Exception as e:
                             print(f"Loading {filename} failed with: {e}")
                     else:
-                        # Load the latest file 
+                        # Load the latest file
                         print(f"/load warning: no checkpoint specified, loading most recent checkpoint instead")
                         json_files = glob.glob("saved_state/*.json")  # This will list all .json files in the saved_state directory.
-        
+
                         # Check if there are any json files.
                         if not json_files:
                             print(f"/load error: no .json checkpoint files found")
@@ -295,4 +319,4 @@ def run(argv):
         loop = asyncio.get_event_loop()
         loop.run_until_complete(main())
 
-    app.run(run)
+    app.run(run)
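
Review note on the `main.py` change: the `None in [...]` check reports that something is missing but not which variable. A minimal sketch of one way to name the missing variables is shown here. The variable names come from the diff; the helper `missing_azure_env_vars` and the `AZURE_ENV_VARS` tuple are hypothetical and not part of this commit.

```python
import os

# Variable names are taken from the diff above; this tuple and the
# helper below are hypothetical and do not exist in the repository.
AZURE_ENV_VARS = (
    "AZURE_OPENAI_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_VERSION",
    "AZURE_OPENAI_DEPLOYMENT",
)


def missing_azure_env_vars(environ=None):
    """Return the names of the required Azure variables that are unset."""
    env = os.environ if environ is None else environ
    return [name for name in AZURE_ENV_VARS if env.get(name) is None]


if __name__ == "__main__":
    missing = missing_azure_env_vars()
    if missing:
        print("Error: missing Azure OpenAI environment variables: " + ", ".join(missing))
```

Passing a plain dict as `environ` keeps the check testable without touching the real process environment.
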
diff --git a/memgpt/openai_tools.py b/memgpt/openai_tools.py
@@ -1,5 +1,6 @@
 import asyncio
 import random
+import os
 import time
 
 import openai
@@ -101,18 +102,25 @@ async def wrapper(*args, **kwargs):
 
 @aretry_with_exponential_backoff
 async def acompletions_with_backoff(**kwargs):
+    azure_openai_deployment = os.getenv('AZURE_OPENAI_DEPLOYMENT')
+    if azure_openai_deployment is not None:
+        kwargs['deployment_id'] = azure_openai_deployment
     return await openai.ChatCompletion.acreate(**kwargs)
 
 
 @aretry_with_exponential_backoff
 async def acreate_embedding_with_backoff(**kwargs):
     """Wrapper around Embedding.acreate w/ backoff"""
+    azure_openai_deployment = os.getenv('AZURE_OPENAI_DEPLOYMENT')
+    if azure_openai_deployment is not None:
+        kwargs['deployment_id'] = azure_openai_deployment
     return await openai.Embedding.acreate(**kwargs)
 
+
 async def async_get_embedding_with_backoff(text, model="text-embedding-ada-002"):
     """To get text embeddings, import/call this function
     It specifies defaults + handles rate-limiting + is async"""
     text = text.replace("\n", " ")
     response = await acreate_embedding_with_backoff(input = [text], model=model)
     embedding = response['data'][0]['embedding']
-    return embedding
+    return embedding
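
Review note on the `memgpt/openai_tools.py` change: both wrappers repeat the same three lines to inject `deployment_id`, and both read the same `AZURE_OPENAI_DEPLOYMENT` variable, so chat and embedding requests are sent to one deployment. The shared step could be factored out roughly as follows. This is only a sketch; `with_azure_deployment` is a hypothetical helper, not something this commit adds.

```python
import os


def with_azure_deployment(kwargs, environ=None):
    """Return a copy of kwargs with deployment_id added when
    AZURE_OPENAI_DEPLOYMENT is set; otherwise return an unchanged copy.

    Hypothetical helper: it mirrors the logic in the diff above but
    does not mutate the caller's dict.
    """
    env = os.environ if environ is None else environ
    deployment = env.get("AZURE_OPENAI_DEPLOYMENT")
    if deployment is None:
        return dict(kwargs)
    return {**kwargs, "deployment_id": deployment}
```

Each wrapper would then pass `**with_azure_deployment(kwargs)` to the underlying `acreate` call, assuming the behavior in the diff is what is wanted.
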