docs: filtering spans tutorial #5214

Merged 2 commits on Oct 28, 2024
283 changes: 283 additions & 0 deletions tutorials/tracing/span_filtering_tutorial.ipynb
@@ -0,0 +1,283 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<center>\n",
" <p style=\"text-align:center\">\n",
" <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
" <br>\n",
" <a href=\"https://docs.arize.com/phoenix/\">Docs</a>\n",
" |\n",
" <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
" |\n",
" <a href=\"https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q\">Community</a>\n",
" </p>\n",
"</center>\n",
"<h1 align=\"center\">Filter OpenTelemetry Spans</h1>\n",
"\n",
"This tutorial shows how to filter OpenTelemetry spans based on a condition.\n",
"\n",
"If you're using multiple OTEL-compatible libraries, you may want to filter spans based on a condition. This can prevent you from sending spans to multiple backends unnecessarily.\n",
"\n",
"There are multiple approaches to do this, in this tutorial we'll show how to do this using a custom SpanProcessor."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install Dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -q uv\n",
"!uv pip install --system -q opentelemetry-sdk opentelemetry-exporter-otlp-proto-http openinference-instrumentation-openai openai python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from dotenv import load_dotenv\n",
"from opentelemetry import trace as trace_api\n",
"from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter\n",
"from opentelemetry.sdk.trace import TracerProvider\n",
"from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SpanExporter, SpanProcessor\n",
"\n",
"load_dotenv()\n",
"\n",
"from getpass import getpass\n",
"\n",
"import openai\n",
"from openinference.instrumentation.openai import OpenAIInstrumentor"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Define a custom SpanProcessor\n",
"\n",
"We'll extend the SpanProcessor class to add a condition that determines whether a span should be exported."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class ConditionalSpanProcessor(SpanProcessor):\n",
" def __init__(self, exporter: SpanExporter, condition: callable):\n",
" self.exporter = exporter\n",
" self.condition = condition\n",
"\n",
" def on_start(self, span, parent_context):\n",
" pass\n",
"\n",
" def on_end(self, span):\n",
" # Only export spans that meet the condition\n",
" if self.condition(span):\n",
" self.exporter.export([span])\n",
"\n",
" def shutdown(self):\n",
" self.exporter.shutdown()\n",
"\n",
" def force_flush(self, timeout_millis=None):\n",
" self.exporter.force_flush(timeout_millis)"
]
},
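{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that this processor exports each span synchronously inside on_end, much like the SDK's SimpleSpanProcessor. If you'd prefer batched exports, one option (a minimal sketch, not used in the rest of this tutorial) is to subclass BatchSpanProcessor and drop non-matching spans before they are queued:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from opentelemetry.sdk.trace.export import BatchSpanProcessor\n",
"\n",
"\n",
"class ConditionalBatchSpanProcessor(BatchSpanProcessor):\n",
"    # Sketch of a variant: same filtering idea, but spans are queued and\n",
"    # exported in batches by the underlying BatchSpanProcessor\n",
"    def __init__(self, exporter, condition):\n",
"        super().__init__(exporter)\n",
"        self.condition = condition\n",
"\n",
"    def on_end(self, span):\n",
"        if self.condition(span):\n",
"            super().on_end(span)"
]
},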
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Along with this class, we'll define two conditions: one for sending spans to the console, and one for sending spans to Phoenix."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Define conditions for sending spans to specific exporters\n",
"def console_condition(span):\n",
" return \"console\" in span.name # Example: send to Console if \"console\" is in the span name\n",
"\n",
"\n",
"def phoenix_condition(span):\n",
" # return \"phoenix\" in span.name # Example: send to Phoenix if \"phoenix\" is in the span name\n",
" return not console_condition(\n",
" span\n",
" ) # Example: send to Phoenix if \"console\" is not in the span name"
]
},
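{
"cell_type": "markdown",
"metadata": {},
"source": [
"These conditions only look at the span name, but on_end receives the full span, so you can filter on anything it exposes. For example, a hypothetical condition (not wired into this tutorial's setup) that keeps only spans marked as LLM calls by the OpenInference instrumentation could look like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def llm_only_condition(span):\n",
"    # Hypothetical example: export only spans whose OpenInference span kind is \"LLM\"\n",
"    # span.attributes can be empty, so fall back to an empty dict before calling .get()\n",
"    return (span.attributes or {}).get(\"openinference.span.kind\") == \"LLM\""
]
},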
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use our custom SpanProcessor to set up instrumentation\n",
"In this example, we'll use Phoenix as one of our destinations, and the console as the other. You could instead add any other exporters you'd like in this approach.\n",
"\n",
"If you need to set up an API key for Phoenix, you can do so [here](https://app.phoenix.arize.com/)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if not (phoenix_api_key := os.getenv(\"PHOENIX_API_KEY\")):\n",
" phoenix_api_key = getpass(\"Enter your Phoenix API Key: \")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define the console exporter"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tracer_provider = TracerProvider()\n",
"\n",
"# Create the Console exporter\n",
"console_exporter = ConsoleSpanExporter()\n",
"\n",
"# Add the Console exporter to the tracer provider\n",
"tracer_provider.add_span_processor(ConditionalSpanProcessor(console_exporter, console_condition))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define the Phoenix exporter"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Add Phoenix API Key to the headers for tracing and API access\n",
"os.environ[\"PHOENIX_CLIENT_HEADERS\"] = f\"api_key={phoenix_api_key}\"\n",
"\n",
"# Create the Phoenix exporter\n",
"otlp_exporter = OTLPSpanExporter(endpoint=\"https://app.phoenix.arize.com/v1/traces\")\n",
"\n",
"# Add the Phoenix exporter to the tracer provider\n",
"tracer_provider.add_span_processor(ConditionalSpanProcessor(otlp_exporter, phoenix_condition))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add the exporters to the tracer provider"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set the tracer provider\n",
"trace_api.set_tracer_provider(tracer_provider)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Run our app and view traces"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if not (openai_api_key := os.getenv(\"OPENAI_API_KEY\")):\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass(\"Enter your OpenAI API Key: \")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This first span will be exported to the console. Here we're using manual instrumentation to create the span."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a tracer\n",
"tracer = trace_api.get_tracer(__name__)\n",
"\n",
"# Example of creating and exporting spans\n",
"with tracer.start_as_current_span(\"console-span\"):\n",
" print(\"This span will be exported to Console only.\")"
]
},
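{
"cell_type": "markdown",
"metadata": {},
"source": [
"Conversely, because phoenix_condition matches any span without \"console\" in its name, a manually created span with a different name (the name below is just an example) would be routed to Phoenix instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Any span whose name doesn't contain \"console\" is exported by the Phoenix processor\n",
"with tracer.start_as_current_span(\"phoenix-span\"):\n",
"    print(\"This span will be exported to Phoenix only.\")"
]
},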
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This next span will be exported only to Phoenix. In this case, we're using our auto-instrumentor for OpenAI to generate the span."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Auto-instrumentors can still be used with this setup\n",
"OpenAIInstrumentor().instrument(tracer_provider=tracer_provider, skip_dep_check=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"client = openai.OpenAI()\n",
"client.chat.completions.create(\n",
" model=\"gpt-4o-mini\", messages=[{\"role\": \"user\", \"content\": \"Hello, world!\"}]\n",
")"
]
},
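{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once you're done generating spans, you can optionally shut down the tracer provider. This calls the shutdown method on each ConditionalSpanProcessor we registered, which in turn shuts down its exporter."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional cleanup: shuts down both conditional span processors and their exporters\n",
"tracer_provider.shutdown()"
]
}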
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}