feat(models)!: openAI 1.0 (#1716)

* git output via openai migrate
* WIP
* git output via openai migrate WIP
* get pytests running
* get pytests running
* completions changes
* fix llm_classify tests
* fix llm generate tests
* minimal test for openai
* fix llama_index tests
* add back model
* fix openai
* fix openai tests
* Add explanations section for relevance
* run GPT4-Turbo
* update notebooks
* migrate find clusters notebook
* update more notebooks
* fix relevance
* fix tutorials/llm_generative_gpt_4.ipynb
* add more tutorials
* Start refactoring httpx tests to be order-agnostic
* Finish refactoring classification tests
* Refactor `generate` tests to be order-agnostic
* Migrate to respx for langchain tracer tests
* Update openai tracing tests to use respx
* feat(evals): azure openai support
* address pr comments
* return early for azure
* fix tiktoken failures
* remove model name aliasing
* make gpt-4 point to latest
* ruff format
* docs: update documentation
* feat: update OpenAIInstrumentor to support openai>=1.0.0 and deprecate support for openai<1.0.0 (#1723)
* correct unit test to mock completions endpoint (#1730)
* Update langchain_pinecone_search_and_retrieval_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update llama_index_search_and_retrieval_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update llm_generative_gpt_4.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update milvus_llamaindex_search_and_retrieval_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update ragas_retrieval_evals_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update langchain_agent_tracing_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update langchain_tracing_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update llama_index_openai_agent_tracing_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update llama_index_tracing_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update langchain_retrieval_qa_with_sources_chain_tracing_tutorial.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update find_cluster_export_and_explore_with_gpt.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update evaluate_toxicity_classifications.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update evaluate_summarization_classifications.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update evaluate_QA_classifications.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update evaluate_hallucination_classifications.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* Update evaluate_code_readability_classifications.ipynb
  Co-authored-by: Xander Song <axiomofjoy@gmail.com>
* document breaking changes, bump langchain
* add more details to the breaking changes
* langchain bump
* fix qa evals notebook
* fix style
* update major version
* remove logging token count warnings

---------

Co-authored-by: Dustin Ngo <dustin@arize.com>
Co-authored-by: Xander Song <axiomofjoy@gmail.com>
Arize-ai · Nov 13, 2023 · 36dcdd1 · 36dcdd1
1 parent eb1bbaf
commit 36dcdd1

Showing 34 changed files with 1,998 additions and 1,873 deletions.
diff --git a/README.md b/README.md
@@ -47,6 +47,7 @@ Phoenix provides MLOps and LLMOps insights at lightning speed with zero-config o
     -   [Exportable Clusters](#exportable-clusters)
 -   [Retrieval-Augmented Generation Analysis](#retrieval-augmented-generation-analysis)
 -   [Structured Data Analysis](#structured-data-analysis)
+-   [Breaking Changes](#breaking-changes)
 -   [Community](#community)
 -   [Thanks](#thanks)
 -   [Copyright, Patent, and License](#copyright-patent-and-license)
@@ -364,6 +365,10 @@ train_ds = px.Dataset(dataframe=train_df, schema=schema, name="training")
 session = px.launch_app(primary=prod_ds, reference=train_ds)
 ```
 
+## Breaking Changes
+
+-   **v1.0.0** - Phoenix now exclusively supports the `openai>=1.0.0` SDK. If you are using an older version of the OpenAI SDK, you can continue to use `arize-phoenix==0.1.1`. However, we recommend upgrading to the latest version of the OpenAI SDK, as it contains many improvements. If you are using Phoenix with LlamaIndex and LangChain, you will also have to upgrade to the versions of these packages that support the OpenAI `1.0.0` SDK (`llama-index>=0.8.64`, `langchain>=0.0.334`).
+
 ## Community
 
 Join our community to connect with thousands of machine learning practitioners and ML observability enthusiasts.
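
The version floors named in the Breaking Changes entry of the README diff above can be checked locally before upgrading. The following is a minimal sketch using only the standard library. The helper names (`release_tuple`, `meets_minimum`, `check_installed`) are hypothetical and not part of Phoenix, and the comparison only looks at the leading numeric release segment, so pre-release and post-release suffixes are ignored.

```python
import re
from importlib import metadata
from typing import Dict, Optional, Tuple

# Minimum versions copied from the Breaking Changes entry in the README diff.
MINIMUMS = {
    "openai": "1.0.0",
    "llama-index": "0.8.64",
    "langchain": "0.0.334",
}


def release_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '0.8.63.post2' -> (0, 8, 63)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # non-numeric component such as 'post2'
        parts.append(int(match.group()))
        if match.group() != piece:
            break  # numeric prefix followed by a suffix such as 'rc1'
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release segments, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(minimums: Dict[str, str] = MINIMUMS) -> Dict[str, Optional[bool]]:
    """Map each package to True/False, or None when it is not installed."""
    report: Dict[str, Optional[bool]] = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

Calling `check_installed()` returns one entry per package. A `False` value suggests the package predates the floor above, and `None` means the package is not installed in the current environment.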

diff --git a/cspell.json b/cspell.json
@@ -11,14 +11,19 @@
         "Evals",
         "gitbook",
         "HDBSCAN",
+        "httpx",
         "Instrumentor",
         "langchain",
         "llamaindex",
         "NDJSON",
         "numpy",
+        "openai",
+        "pydantic",
         "quickstart",
         "RERANKER",
+        "respx",
         "rgba",
+        "tiktoken",
         "tracedataset",
         "UMAP"
     ],

diff --git a/docs/api/evaluation-models.md b/docs/api/evaluation-models.md
@@ -16,49 +16,90 @@ Need to install the extra dependencies `openai>=0.26.4` and `tiktoken`
 
 ```python
 class OpenAIModel:
-    openai_api_key: Optional[str] = None
-    openai_api_base: Optional[str] = None
-    openai_api_type: Optional[str] = None
-    openai_api_version: Optional[str] = None
-    openai_organization: Optional[str] = None
-    engine: str = ""
+    api_key: Optional[str] = field(repr=False, default=None)
+    """Your OpenAI key. If not provided, will be read from the environment variable"""
+    organization: Optional[str] = field(repr=False, default=None)
+    """
+    The organization to use for the OpenAI API. If not provided, will default
+    to what's configured in OpenAI
+    """
+    base_url: Optional[str] = field(repr=False, default=None)
+    """
+    An optional base URL to use for the OpenAI API. If not provided, will default
+    to what's configured in OpenAI
+    """
     model_name: str = "gpt-4"
+    """Model name to use. In the case of Azure, this is the deployment name such as gpt-35-instant"""
     temperature: float = 0.0
+    """What sampling temperature to use."""
     max_tokens: int = 256
+    """The maximum number of tokens to generate in the completion.
+    -1 returns as many tokens as possible given the prompt and
+    the model's maximal context size."""
     top_p: float = 1
+    """Total probability mass of tokens to consider at each step."""
     frequency_penalty: float = 0
+    """Penalizes repeated tokens according to frequency."""
     presence_penalty: float = 0
+    """Penalizes repeated tokens."""
     n: int = 1
-    model_kwargs: Dict[str, Any] = {}
-    batch_size: int = 20
+    """How many completions to generate for each prompt."""
+    model_kwargs: Dict[str, Any] = field(default_factory=dict)
+    """Holds any model parameters valid for `create` call not explicitly specified."""
     request_timeout: Optional[Union[float, Tuple[float, float]]] = None
-    max_retries: int = 6
+    """Timeout for requests to OpenAI completion API. Default is 600 seconds."""
+    max_retries: int = 20
+    """Maximum number of retries to make when generating."""
     retry_min_seconds: int = 10
+    """Minimum number of seconds to wait when retrying."""
     retry_max_seconds: int = 60
+    """Maximum number of seconds to wait when retrying."""
 ```
 
-To authenticate with OpenAI you will need, at a minimum, an API key. Our classes will look for it in your environment, or you can pass it via argument as shown above. In addition, you can choose the specific name of the model you want to use and its configuration parameters. The default values specified above are common default values from OpenAI. Quickly instantiate your model as follows:
+To authenticate with OpenAI you will need, at a minimum, an API key. The model class will look for it in your environment, or you can pass it via argument as shown above. In addition, you can choose the specific name of the model you want to use and its configuration parameters. The default values specified above are common default values from OpenAI. Quickly instantiate your model as follows:
 
 ```python
 model = OpenAIModel()
-model("Hello there, this is a tesst if you are working?")
+model("Hello there, this is a test if you are working?")
 # Output: "Hello! I'm working perfectly. How can I assist you today?"
 ```
 
 #### Azure OpenAI
 
 The code snippet below shows how to initialize `OpenAIModel` for Azure. Refer to the Azure [docs](https://microsoftlearning.github.io/mslearn-openai/Instructions/Labs/02-natural-language-azure-openai.html) on how to obtain these values from your Azure deployment.
 
+Here is an example of how to initialize `OpenAIModel` for Azure:
+
 ```python
 model = OpenAIModel(
-    openai_api_key=YOUR_AZURE_OPENAI_API_KEY,
-    openai_api_base="https://YOUR_RESOURCE_NAME.openai.azure.com",
-    openai_api_type="azure",
-    openai_api_version="2023-05-15",  # See Azure docs for more
-    engine="YOUR_MODEL_DEPLOYMENT_NAME",
+    model_name="gpt-4-32k",
+    azure_endpoint="https://YOUR_SUBDOMAIN.openai.azure.com/",
+    api_version="2023-03-15-preview",
 )
 ```
 
+Azure OpenAI supports specific options:
+
+```python
+api_version: Optional[str] = field(default=None)
+"""
+The version of the API that is provisioned
+https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#rest-api-versioning
+"""
+azure_endpoint: Optional[str] = field(default=None)
+"""
+The endpoint to use for azure openai. Available in the azure portal.
+https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource
+"""
+azure_deployment: Optional[str] = field(default=None)
+azure_ad_token: Optional[str] = field(default=None)
+azure_ad_token_provider: Optional[Callable[[], str]] = field(default=None)
+
+```
+
+For full details on Azure OpenAI, check out the [OpenAI Documentation](https://github.com/openai/openai-python#microsoft-azure-openai).
+
 Find more about the functionality available in our EvalModels in the [#usage](evaluation-models.md#usage "mention") section.
 
 ### phoenix.experimental.evals.VertexAI
@@ -95,7 +136,7 @@ model("Hello there, this is a tesst if you are working?")
 ### phoenix.experimental.evals.BedrockModel
 
 ```python
-class BedrockModel:    
+class BedrockModel:
     model_id: str = "anthropic.claude-v2"
     """The model name to use."""
     temperature: float = 0.0
@@ -219,15 +260,15 @@ responses = await model.agenerate(
 )
 print(responses)
 # Output: [
-#     "As an artificial intelligence, I don't have feelings, but I'm here and ready 
+#     "As an artificial intelligence, I don't have feelings, but I'm here and ready
 #         to assist you. How can I help you today?",
-#     "The Mediterranean region is known for its hot, dry summers and mild, wet 
+#     "The Mediterranean region is known for its hot, dry summers and mild, wet
 #         winters. This climate is characterized by warm temperatures throughout the
-#         year, with the highest temperatures usually occurring in July and August. 
-#         Rainfall is scarce during the summer months but more frequent during the 
-#         winter months. The region also experiences a lot of sunshine, with some 
+#         year, with the highest temperatures usually occurring in July and August.
+#         Rainfall is scarce during the summer months but more frequent during the
+#         winter months. The region also experiences a lot of sunshine, with some
 #         areas receiving about 300 sunny days per year.",
-#     "You're welcome! Don't hesitate to reach out if you need anything else. 
+#     "You're welcome! Don't hesitate to reach out if you need anything else.
 #         Goodbye!"
 #    ]
 ```
@@ -252,7 +293,7 @@ print(text)
 
 ### `model.max_context_size`
 
-Furthermore, LLM models have a limited number of tokens that they can pay attention to. We call this limit the _context size_ or _context window_. You can access the context size of your model via the  property `max_context_size`. In the following example, we used the model `gpt-4-0613` and the context size is
+Furthermore, LLM models have a limited number of tokens that they can pay attention to. We call this limit the _context size_ or _context window_. You can access the context size of your model via the property `max_context_size`. In the following example, we used the model `gpt-4-0613` and the context size is
 
 ```python
 print(model.max_context_size)
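
The `OpenAIModel` field renames in the `docs/api/evaluation-models.md` diff above can be summarized as a keyword translation. This is a minimal sketch based on one reading of the removed and added fields, not a helper that Phoenix ships. The function name `migrate_openai_model_kwargs` is hypothetical, and two mappings are assumptions: an Azure `openai_api_base` is sent to `azure_endpoint` (as in the updated Azure example), and `engine` is sent to `model_name` (following the new docstring, which says the model name is the deployment name on Azure).

```python
from typing import Any, Dict

# Old keyword -> new keyword, as read from the removed/added fields in the diff.
_RENAMED = {
    "openai_api_key": "api_key",
    "openai_organization": "organization",
    "openai_api_version": "api_version",
}


def migrate_openai_model_kwargs(old: Dict[str, Any]) -> Dict[str, Any]:
    """Translate pre-1.0 `OpenAIModel` keyword arguments to the post-1.0 names."""
    new = dict(old)  # leave the caller's dict untouched

    # `openai_api_type` no longer appears in the documented fields; it is only
    # used here to decide where the base URL should go.
    is_azure = new.pop("openai_api_type", None) == "azure"

    for old_name, new_name in _RENAMED.items():
        if old_name in new:
            new[new_name] = new.pop(old_name)

    if "openai_api_base" in new:
        # Assumption: Azure base URLs become `azure_endpoint`, others `base_url`.
        target = "azure_endpoint" if is_azure else "base_url"
        new[target] = new.pop("openai_api_base")

    if "engine" in new:
        engine = new.pop("engine")
        if engine:
            # Assumption: the old Azure deployment name moves to `model_name`.
            new["model_name"] = engine

    return new
```

For example, the old Azure arguments shown in the removed lines (`openai_api_key`, `openai_api_base`, `openai_api_type="azure"`, `openai_api_version`, `engine`) would come out as `api_key`, `azure_endpoint`, `api_version`, and `model_name`. Keywords that the diff leaves unchanged, such as `temperature`, pass through as-is.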

diff --git a/pyproject.toml b/pyproject.toml
@@ -55,8 +55,8 @@ dev = [
   "strawberry-graphql[debug-server]==0.208.2",
   "pre-commit",
   "arize[AutoEmbeddings, LLM_Evaluation]",
-  "llama-index>=0.8.29",
-  "langchain>=0.0.324",
+  "llama-index>=0.8.64",
+  "langchain>=0.0.334",
 ]
 experimental = [
   "tenacity",
@@ -91,9 +91,9 @@ dependencies = [
   "pytest-cov",
   "pytest-lazy-fixture",
   "arize",
-  "langchain>=0.0.324",
-  "llama-index>=0.8.29",
-  "openai<1.0.0",
+  "langchain>=0.0.334",
+  "llama-index>=0.8.63.post2",
+  "openai>=1.0.0",
   "tenacity",
   "nltk==3.8.1",
   "sentence-transformers==2.2.2",
@@ -103,18 +103,20 @@ dependencies = [
   "responses",
   "tiktoken",
   "typing-extensions<4.6.0",  # for Colab
+  "httpx", # For OpenAI testing
+  "respx", # For OpenAI testing
 ]
 
 [tool.hatch.envs.type]
 dependencies = [
   "mypy==1.5.1",
-  "llama-index>=0.8.29",
+  "llama-index>=0.8.64",
   "pandas-stubs<=2.0.2.230605",  # version 2.0.3.230814 is causing a dependency conflict.
   "types-psutil",
   "types-tqdm",
   "types-requests",
   "types-protobuf",
-  "openai<1.0.0",
+  "openai>=1.0.0",
 ]
 
 [tool.hatch.envs.style]

diff --git a/scripts/rag/llama_index_w_evals_and_qa.py b/scripts/rag/llama_index_w_evals_and_qa.py
@@ -11,7 +11,6 @@
 
 import cohere
 import numpy as np
-import openai
 import pandas as pd
 import phoenix.experimental.evals.templates.default_templates as templates
 import requests
@@ -380,8 +379,7 @@ def process_row(row, formatted_evals_column, k):
 
 
 def check_keys() -> None:
-    openai.api_key = os.getenv("OPENAI_API_KEY")
-    if openai.api_key is None:
+    if os.getenv("OPENAI_API_KEY") is None:
         raise RuntimeError(
             "OpenAI API key missing. Please set it up in your environment as OPENAI_API_KEY"
         )
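
The `check_keys` change above drops the assignment to `openai.api_key` and only verifies that the environment variable is present, which fits the `openai>=1.0.0` client reading `OPENAI_API_KEY` from the environment by default. The following is a minimal standalone sketch of that check. The optional `env` parameter is an addition for illustration (it is not in the script) so that the function can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def check_keys(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise if OPENAI_API_KEY is absent from the given mapping (default: os.environ)."""
    # `env` is a hypothetical hook for testing; the script reads os.environ directly.
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY") is None:
        raise RuntimeError(
            "OpenAI API key missing. Please set it up in your environment as OPENAI_API_KEY"
        )
```

Passing a plain dict, such as `check_keys({"OPENAI_API_KEY": "sk-..."})`, exercises the happy path, while `check_keys({})` raises `RuntimeError`.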

diff --git a/src/phoenix/__init__.py b/src/phoenix/__init__.py
@@ -5,7 +5,7 @@
 from .trace.fixtures import load_example_traces
 from .trace.trace_dataset import TraceDataset
 
-__version__ = "0.1.1"
+__version__ = "1.0.0"
 
 # module level doc-string
 __doc__ = """