Commit

Merge branch 'main' into docs
mikeldking authored Feb 7, 2024
2 parents 3310ba6 + ee4ced3 commit d7d7f97
Showing 71 changed files with 9,147 additions and 8,525 deletions.
1 change: 1 addition & 0 deletions .prettierignore
@@ -0,0 +1 @@
docs
73 changes: 73 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,78 @@
# Changelog

## [2.9.4](https://github.com/Arize-ai/phoenix/compare/v2.9.3...v2.9.4) (2024-02-06)


### Bug Fixes

* disregard active session if endpoint is provided to px.Client ([#2206](https://github.com/Arize-ai/phoenix/issues/2206)) ([6ec0d23](https://github.com/Arize-ai/phoenix/commit/6ec0d2344ffb7f40534730160f10d99f266788da))

## [2.9.3](https://github.com/Arize-ai/phoenix/compare/v2.9.2...v2.9.3) (2024-02-05)


### Bug Fixes

* absolute path for eval exporter ([#2202](https://github.com/Arize-ai/phoenix/issues/2202)) ([2ac39e9](https://github.com/Arize-ai/phoenix/commit/2ac39e93de3f437c5cf3f092bd6de437d75337ce))

## [2.9.2](https://github.com/Arize-ai/phoenix/compare/v2.9.1...v2.9.2) (2024-02-05)


### Bug Fixes

* localhost address for px.Client ([#2200](https://github.com/Arize-ai/phoenix/issues/2200)) ([e56b66a](https://github.com/Arize-ai/phoenix/commit/e56b66adea734693a82f49b415e093a07a9f0ff1))

## [2.9.1](https://github.com/Arize-ai/phoenix/compare/v2.9.0...v2.9.1) (2024-02-05)


### Bug Fixes

* absolute path for urljoin in px.Client ([#2199](https://github.com/Arize-ai/phoenix/issues/2199)) ([ba30a30](https://github.com/Arize-ai/phoenix/commit/ba30a30d1312af042b81b631b5d0b6cc0e14d411))


### Documentation

* update readme with a deployment guide ([#2194](https://github.com/Arize-ai/phoenix/issues/2194)) ([bf67775](https://github.com/Arize-ai/phoenix/commit/bf6777569c764392d72d4ccf3c71738079957901))

## [2.9.0](https://github.com/Arize-ai/phoenix/compare/v2.8.0...v2.9.0) (2024-02-05)


### Features

* phoenix client `get_evaluations()` and `get_trace_dataset()` ([#2154](https://github.com/Arize-ai/phoenix/issues/2154)) ([29800e4](https://github.com/Arize-ai/phoenix/commit/29800e4ed4a901ad19874ba049638e13d8c67b87))
* phoenix client `get_spans_dataframe()` and `query_spans()` ([#2151](https://github.com/Arize-ai/phoenix/issues/2151)) ([e44b948](https://github.com/Arize-ai/phoenix/commit/e44b948301b28b22d5f578de686dc29c1cf84ad0))

## [2.8.0](https://github.com/Arize-ai/phoenix/compare/v2.7.0...v2.8.0) (2024-02-02)


### Features

* Remove model-level tenacity retries ([#2176](https://github.com/Arize-ai/phoenix/issues/2176)) ([66d452c](https://github.com/Arize-ai/phoenix/commit/66d452c45a676ee5dbac43b25df43df32bdb71bc))


### Bug Fixes

* broken link and openinference links ([#2144](https://github.com/Arize-ai/phoenix/issues/2144)) ([01fb046](https://github.com/Arize-ai/phoenix/commit/01fb0464d023e1494c22f80b10ed840eef47fce8))
* databricks check crashes in python console ([#2152](https://github.com/Arize-ai/phoenix/issues/2152)) ([5aeeeff](https://github.com/Arize-ai/phoenix/commit/5aeeeff9fa8c2d697374686552b35127238dce44))
* default collector endpoint breaks on windows ([#2161](https://github.com/Arize-ai/phoenix/issues/2161)) ([f1a2007](https://github.com/Arize-ai/phoenix/commit/f1a200713c44ffcf2506ff54429715ef7171ecd1))
* Do not retry when context window has been exceeded ([#2126](https://github.com/Arize-ai/phoenix/issues/2126)) ([ff6df1f](https://github.com/Arize-ai/phoenix/commit/ff6df1fc01f0986357a9e20e0441a3c15697a5fa))
* remove hyphens from span_id in legacy evaluation fixtures ([#2153](https://github.com/Arize-ai/phoenix/issues/2153)) ([fae859d](https://github.com/Arize-ai/phoenix/commit/fae859d8831669f92a368e979caa81a778948432))


### Documentation

* add docker badge ([e584ed8](https://github.com/Arize-ai/phoenix/commit/e584ed87960eba61c0e5165e3c0d08cf0d11e672))
* Add terminal running steps (GITBOOK-441) ([91c6b24](https://github.com/Arize-ai/phoenix/commit/91c6b24b411bd2d447c7c2c4453bb57320bff325))
* No subject (GITBOOK-442) ([5c4eb6c](https://github.com/Arize-ai/phoenix/commit/5c4eb6c93a284e06907582b3b80dc70cbfd3d0e6))
* No subject (GITBOOK-443) ([11f46cb](https://github.com/Arize-ai/phoenix/commit/11f46cbbb442dbbbc7d84779915ecc537461b80c))
* No subject (GITBOOK-444) ([fcf2bc9](https://github.com/Arize-ai/phoenix/commit/fcf2bc927c24cfb7cba3eda8e7589f59af2dfcf1))
* update badge ([ddcecea](https://github.com/Arize-ai/phoenix/commit/ddcecea23bc9998f361f3cb41427688f84314295))
* update prompt to reflect rails (GITBOOK-445) ([dea6dd6](https://github.com/Arize-ai/phoenix/commit/dea6dd6ce2f179cf200eaef5f77ba958140355a2))


### Miscellaneous Chores

* change release to 2.8.0 ([#2181](https://github.com/Arize-ai/phoenix/issues/2181)) ([0b7b524](https://github.com/Arize-ai/phoenix/commit/0b7b524d8cbd05bf1f8652a648145ed94d72af90))

## [2.7.0](https://github.com/Arize-ai/phoenix/compare/v2.6.0...v2.7.0) (2024-01-24)


55 changes: 40 additions & 15 deletions README.md
@@ -22,6 +22,9 @@
<a target="_blank" href="https://pypi.org/project/arize-phoenix/">
<img src="https://img.shields.io/pypi/pyversions/arize-phoenix">
</a>
<a target="_blank" href="https://hub.docker.com/repository/docker/arizephoenix/phoenix/general">
<img src="https://img.shields.io/docker/v/arizephoenix/phoenix?sort=semver&logo=docker&label=image&color=blue">
</a>
</p>

![a rotating UMAP point cloud of a computer vision model](https://github.com/Arize-ai/phoenix-assets/blob/main/gifs/image_classification_10mb.gif?raw=true)
@@ -36,21 +39,22 @@ Phoenix provides MLOps and LLMOps insights at lightning speed with zero-config o

**Table of Contents**

- [Installation](#installation)
- [LLM Traces](#llm-traces)
- [Tracing with LlamaIndex](#tracing-with-llamaindex)
- [Tracing with LangChain](#tracing-with-langchain)
- [LLM Evals](#llm-evals)
- [Embedding Analysis](#embedding-analysis)
- [UMAP-based Exploratory Data Analysis](#umap-based-exploratory-data-analysis)
- [Cluster-driven Drift and Performance Analysis](#cluster-driven-drift-and-performance-analysis)
- [Exportable Clusters](#exportable-clusters)
- [Retrieval-Augmented Generation Analysis](#retrieval-augmented-generation-analysis)
- [Structured Data Analysis](#structured-data-analysis)
- [Breaking Changes](#breaking-changes)
- [Community](#community)
- [Thanks](#thanks)
- [Copyright, Patent, and License](#copyright-patent-and-license)
- [Installation](#installation)
- [LLM Traces](#llm-traces)
- [Tracing with LlamaIndex](#tracing-with-llamaindex)
- [Tracing with LangChain](#tracing-with-langchain)
- [LLM Evals](#llm-evals)
- [Embedding Analysis](#embedding-analysis)
- [UMAP-based Exploratory Data Analysis](#umap-based-exploratory-data-analysis)
- [Cluster-driven Drift and Performance Analysis](#cluster-driven-drift-and-performance-analysis)
- [Exportable Clusters](#exportable-clusters)
- [Retrieval-Augmented Generation Analysis](#retrieval-augmented-generation-analysis)
- [Structured Data Analysis](#structured-data-analysis)
- [Deploying Phoenix](#deploying-phoenix)
- [Breaking Changes](#breaking-changes)
- [Community](#community)
- [Thanks](#thanks)
- [Copyright, Patent, and License](#copyright-patent-and-license)

## Installation

@@ -365,6 +369,27 @@ train_ds = px.Dataset(dataframe=train_df, schema=schema, name="training")
session = px.launch_app(primary=prod_ds, reference=train_ds)
```

## Deploying Phoenix

<a target="_blank" href="https://hub.docker.com/repository/docker/arizephoenix/phoenix/general">
<img src="https://img.shields.io/docker/v/arizephoenix/phoenix?sort=semver&logo=docker&label=image&color=blue">
</a>

<img src="https://storage.googleapis.com/arize-assets/phoenix/assets/images/deployment.png" title="How phoenix can collect traces from an LLM application"/>

Phoenix's notebook-first approach to observability makes it a great tool during experimentation and pre-production. At some point, however, you will want to ship your application to production and continue to monitor it as it runs. Phoenix is made up of two components that can be deployed independently:

- **Trace Instrumentation**: a set of plugins that can be added to your application's startup process. These plugins (known as instrumentors) automatically collect spans for your application and export them for collection and visualization. For Phoenix, all instrumentors are managed in a single repository called [OpenInference](https://github.com/Arize-ai/openinference) (see the sketch after this list).
- **Trace Collector**: the Phoenix server acts as a trace collector and as an application that helps you troubleshoot your application in real time. You can pull the latest Phoenix images from [Docker Hub](https://hub.docker.com/repository/docker/arizephoenix/phoenix/general).
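
As an illustration, here is a minimal sketch of wiring an instrumentor to a Phoenix collector. It assumes a LlamaIndex application with the `openinference-instrumentation-llama-index` and OpenTelemetry SDK packages installed; the endpoint and package names follow OpenInference's conventions and may differ for your setup:

```python
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Assumed default: a local Phoenix server accepting OTLP traces on port 6006.
endpoint = "http://localhost:6006/v1/traces"

# Route every span produced by the application to the Phoenix collector.
tracer_provider = TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))

# Auto-instrument LlamaIndex so spans are emitted without further code changes.
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)
```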

To run Phoenix tracing in production, follow these steps:

- **Set up a server**: run your LLM application on a server ([examples](https://github.com/Arize-ai/openinference/tree/main/python/examples))
- **Instrument**: add [OpenInference](https://github.com/Arize-ai/openinference) instrumentation to your server
- **Observe**: run the Phoenix server as a side-car or a standalone instance and point your tracing instrumentation to it (see the sketch below)
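
Once the Phoenix server is running, your application or a notebook can also talk to it directly through the Phoenix client. A minimal sketch, assuming the server is reachable at `http://phoenix:6006` (a hypothetical address for illustration):

```python
import phoenix as px

# Connect to a remote Phoenix server rather than an in-notebook session.
# The endpoint below is a placeholder; substitute your deployment's address.
client = px.Client(endpoint="http://phoenix:6006")

# Pull the collected spans back as a dataframe for offline analysis.
spans_df = client.get_spans_dataframe()
```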

For more information on deploying Phoenix, see the [Phoenix Deployment Guide](https://docs.arize.com/phoenix/deployment/deploying-phoenix).

## Breaking Changes

- **v1.0.0** - Phoenix now exclusively supports the `openai>=1.0.0` SDK. If you are using an older version of the OpenAI SDK, you can continue to use `arize-phoenix==0.1.1`. However, we recommend upgrading to the latest version of the OpenAI SDK, as it contains many improvements. If you are using Phoenix with LlamaIndex or LangChain, you will also have to upgrade to versions of those packages that support the OpenAI `1.0.0` SDK (`llama-index>=0.8.64`, `langchain>=0.0.334`)
2 changes: 1 addition & 1 deletion app/package.json
@@ -78,7 +78,7 @@
"build:relay": "relay-compiler",
"watch": "./esbuild.config.mjs dev",
"test": "jest --config ./jest.config.js",
"dev": "npm run dev:server:image & npm run build:static && npm run watch",
"dev": "npm run dev:server:traces:llama_index_rag & npm run build:static && npm run watch",
"dev:server:mnist": "python3 -m phoenix.server.main --umap_params 0,30,550 fixture fashion_mnist",
"dev:server:mnist:single": "python3 -m phoenix.server.main fixture fashion_mnist --primary-only true",
"dev:server:sentiment": "python3 -m phoenix.server.main fixture sentiment_classification_language_drift",
2 changes: 1 addition & 1 deletion app/src/pages/trace/TracePage.tsx
@@ -216,7 +216,7 @@ export function TracePage() {
<DialogContainer
type="slideOver"
isDismissable
onDismiss={() => navigate(-1)}
onDismiss={() => navigate("/tracing")}
>
<Dialog size="XL" title="Trace Details">
<main
4 changes: 2 additions & 2 deletions docs/api/evals.md
@@ -60,7 +60,7 @@ from phoenix.experimental.evals import (
)

api_key = None # set your api key here or with the OPENAI_API_KEY environment variable
eval_model = OpenAIModel(model_name="gpt-4-1106-preview", api_key=api_key)
eval_model = OpenAIModel(model_name="gpt-4-turbo-preview", api_key=api_key)

hallucination_evaluator = HallucinationEvaluator(eval_model)
qa_correctness_evaluator = QAEvaluator(eval_model)
@@ -264,7 +264,7 @@ capitals_df = llm_generate(
dataframe=countries_df,
template=template,
model=OpenAIModel(
model_name="gpt-4-1106-preview",
model_name="gpt-4-turbo-preview",
model_kwargs={
"response_format": {"type": "json_object"}
}
6 changes: 3 additions & 3 deletions docs/llm-evals/quickstart-retrieval-evals/README.md
@@ -37,7 +37,7 @@ from phoenix.experimental.evals import (
# Creating Hallucination Eval which checks if the application hallucinated
hallucination_eval = llm_classify(
dataframe=queries_df,
model=OpenAIModel("gpt-4-1106-preview", temperature=0.0),
model=OpenAIModel("gpt-4-turbo-preview", temperature=0.0),
template=HALLUCINATION_PROMPT_TEMPLATE,
rails=list(HALLUCINATION_PROMPT_RAILS_MAP.values()),
provide_explanation=True, # Makes the LLM explain its reasoning
@@ -50,7 +50,7 @@ hallucination_eval["score"] = (
# Creating Q&A Eval which checks if the application answered the question correctly
qa_correctness_eval = llm_classify(
dataframe=queries_df,
model=OpenAIModel("gpt-4-1106-preview", temperature=0.0),
model=OpenAIModel("gpt-4-turbo-preview", temperature=0.0),
template=QA_PROMPT_TEMPLATE,
rails=list(QA_PROMPT_RAILS_MAP.values()),
provide_explanation=True, # Makes the LLM explain its reasoning
@@ -90,7 +90,7 @@ from phoenix.experimental.evals import (

retrieved_documents_eval = llm_classify(
dataframe=retrieved_documents_df,
model=OpenAIModel("gpt-4-1106-preview", temperature=0.0),
model=OpenAIModel("gpt-4-turbo-preview", temperature=0.0),
template=RAG_RELEVANCY_PROMPT_TEMPLATE,
rails=list(RAG_RELEVANCY_PROMPT_RAILS_MAP.values()),
provide_explanation=True,
2 changes: 1 addition & 1 deletion docs/quickstart/evals.md
@@ -71,7 +71,7 @@ Install the OpenAI SDK with `pip install openai` and instantiate your model.
from phoenix.experimental.evals import OpenAIModel

api_key = None # set your api key here or with the OPENAI_API_KEY environment variable
eval_model = OpenAIModel(model_name="gpt-4-1106-preview", api_key=api_key)
eval_model = OpenAIModel(model_name="gpt-4-turbo-preview", api_key=api_key)
```

You'll next define your evaluators. Evaluators are built on top of language models and prompt the LLM to assess the quality of responses, the relevance of retrieved documents, etc., providing a quality signal even in the absence of human-labeled data. Pick an evaluator type and instantiate it with the language model you want to use; evaluations are performed using our battle-tested evaluation templates.
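
For example, the hallucination and Q&A correctness evaluators can be instantiated with the model defined above; a minimal sketch:

```python
from phoenix.experimental.evals import HallucinationEvaluator, QAEvaluator

# Both evaluators reuse the eval_model instantiated above.
hallucination_evaluator = HallucinationEvaluator(eval_model)
qa_correctness_evaluator = QAEvaluator(eval_model)
```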
6 changes: 3 additions & 3 deletions docs/use-cases/rag-evaluation.md
@@ -372,7 +372,7 @@ from phoenix.experimental.evals import (
run_evals,
)

relevance_evaluator = RelevanceEvaluator(OpenAIModel(model_name="gpt-4-1106-preview"))
relevance_evaluator = RelevanceEvaluator(OpenAIModel(model_name="gpt-4-turbo-preview"))

retrieved_documents_relevance_df = run_evals(
evaluators=[relevance_evaluator],
@@ -530,8 +530,8 @@ from phoenix.experimental.evals import (
run_evals,
)

qa_evaluator = QAEvaluator(OpenAIModel(model_name="gpt-4-1106-preview"))
hallucination_evaluator = HallucinationEvaluator(OpenAIModel(model_name="gpt-4-1106-preview"))
qa_evaluator = QAEvaluator(OpenAIModel(model_name="gpt-4-turbo-preview"))
hallucination_evaluator = HallucinationEvaluator(OpenAIModel(model_name="gpt-4-turbo-preview"))

qa_correctness_eval_df, hallucination_eval_df = run_evals(
evaluators=[qa_evaluator, hallucination_evaluator],
2 changes: 1 addition & 1 deletion examples/using_llamaindex_with_huggingface_models.ipynb
@@ -278,7 +278,7 @@
"metadata": {},
"outputs": [],
"source": [
"trace_df = px.active_session().get_spans_dataframe('span_kind == \"RETRIEVER\"')\n",
"trace_df = px.Client().get_spans_dataframe('span_kind == \"RETRIEVER\"')\n",
"trace_df"
]
},
1 change: 1 addition & 0 deletions pyproject.toml
@@ -124,6 +124,7 @@ dependencies = [
[tool.hatch.envs.type]
dependencies = [
"mypy==1.5.1",
"pydantic==v1.10.14", # for mypy
"llama-index>=0.9.14",
"pandas-stubs<=2.0.2.230605", # version 2.0.3.230814 is causing a dependency conflict.
"types-psutil",
2 changes: 2 additions & 0 deletions src/phoenix/__init__.py
@@ -1,6 +1,7 @@
from .datasets.dataset import Dataset
from .datasets.fixtures import ExampleDatasets, load_example
from .datasets.schema import EmbeddingColumnNames, RetrievalEmbeddingColumnNames, Schema
from .session.client import Client
from .session.evaluation import log_evaluations
from .session.session import NotebookEnvironment, Session, active_session, close_app, launch_app
from .trace.fixtures import load_example_traces
@@ -39,4 +40,5 @@
"TraceDataset",
"NotebookEnvironment",
"log_evaluations",
"Client",
]
14 changes: 1 addition & 13 deletions src/phoenix/core/traces.py
@@ -1,7 +1,6 @@
import weakref
from collections import defaultdict
from datetime import datetime, timezone
from enum import Enum
from queue import SimpleQueue
from threading import RLock, Thread
from types import MethodType
@@ -32,6 +31,7 @@
ATTRIBUTE_PREFIX,
COMPUTED_PREFIX,
CONTEXT_PREFIX,
ComputedAttributes,
Span,
SpanAttributes,
SpanID,
@@ -55,18 +55,6 @@
LLM_TOKEN_COUNT_COMPLETION = ATTRIBUTE_PREFIX + semantic_conventions.LLM_TOKEN_COUNT_COMPLETION


class ComputedAttributes(Enum):
# Enum value must be string prefixed by COMPUTED_PREFIX
LATENCY_MS = (
COMPUTED_PREFIX + "latency_ms"
) # The latency (or duration) of the span in milliseconds
CUMULATIVE_LLM_TOKEN_COUNT_TOTAL = COMPUTED_PREFIX + "cumulative_token_count.total"
CUMULATIVE_LLM_TOKEN_COUNT_PROMPT = COMPUTED_PREFIX + "cumulative_token_count.prompt"
CUMULATIVE_LLM_TOKEN_COUNT_COMPLETION = COMPUTED_PREFIX + "cumulative_token_count.completion"
ERROR_COUNT = COMPUTED_PREFIX + "error_count"
CUMULATIVE_ERROR_COUNT = COMPUTED_PREFIX + "cumulative_error_count"


class ReadableSpan(ObjectProxy): # type: ignore
"""
A wrapped a protobuf Span, with access methods and ability to decode to
24 changes: 8 additions & 16 deletions src/phoenix/experimental/evals/models/anthropic.py
@@ -45,12 +45,6 @@ def __post_init__(self) -> None:
self._init_client()
self._init_tiktoken()
self._init_rate_limiter()
self.retry = self._retry(
error_types=[], # default to catching all errors
min_seconds=self.retry_min_seconds,
max_seconds=self.retry_max_seconds,
max_retries=self.max_retries,
)

def _init_environment(self) -> None:
try:
@@ -128,18 +122,17 @@ def _generate(self, prompt: str, **kwargs: Dict[str, Any]) -> str:
kwargs.pop("instruction", None)
invocation_parameters = self.invocation_parameters()
invocation_parameters.update(kwargs)
response = self._generate_with_retry(
response = self._rate_limited_completion(
model=self.model,
prompt=self._format_prompt_for_claude(prompt),
**invocation_parameters,
)

return str(response)

def _generate_with_retry(self, **kwargs: Any) -> Any:
@self.retry
def _rate_limited_completion(self, **kwargs: Any) -> Any:
@self._rate_limiter.limit
def _completion_with_retry(**kwargs: Any) -> Any:
def _completion(**kwargs: Any) -> Any:
try:
response = self.client.completions.create(**kwargs)
return response.completion
@@ -149,24 +142,23 @@ def _completion_with_retry(**kwargs: Any) -> Any:
raise PhoenixContextLimitExceeded(exception_message) from e
raise e

return _completion_with_retry(**kwargs)
return _completion(**kwargs)

async def _async_generate(self, prompt: str, **kwargs: Dict[str, Any]) -> str:
# instruction is an invalid input to Anthropic models, it is passed in by
# BaseEvalModel.__call__ and needs to be removed
kwargs.pop("instruction", None)
invocation_parameters = self.invocation_parameters()
invocation_parameters.update(kwargs)
response = await self._async_generate_with_retry(
response = await self._async_rate_limited_completion(
model=self.model, prompt=self._format_prompt_for_claude(prompt), **invocation_parameters
)

return str(response)

async def _async_generate_with_retry(self, **kwargs: Any) -> Any:
@self.retry
async def _async_rate_limited_completion(self, **kwargs: Any) -> Any:
@self._rate_limiter.alimit
async def _async_completion_with_retry(**kwargs: Any) -> Any:
async def _async_completion(**kwargs: Any) -> Any:
try:
response = await self.async_client.completions.create(**kwargs)
return response.completion
@@ -176,7 +168,7 @@ async def _async_completion_with_retry(**kwargs: Any) -> Any:
raise PhoenixContextLimitExceeded(exception_message) from e
raise e

return await _async_completion_with_retry(**kwargs)
return await _async_completion(**kwargs)

def _format_prompt_for_claude(self, prompt: str) -> str:
# Claude requires prompt in the format of Human: ... Assistant: