From 070fcc5b1bb58e2a51671127eabc298e06fe0489 Mon Sep 17 00:00:00 2001 From: Alexander Song Date: Thu, 10 Oct 2024 11:48:05 -0700 Subject: [PATCH 1/3] update dspy notebook --- tutorials/tracing/dspy_tracing_tutorial.ipynb | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/tutorials/tracing/dspy_tracing_tutorial.ipynb b/tutorials/tracing/dspy_tracing_tutorial.ipynb index 34b1e5cbde..f5a4259db1 100644 --- a/tutorials/tracing/dspy_tracing_tutorial.ipynb +++ b/tutorials/tracing/dspy_tracing_tutorial.ipynb @@ -47,8 +47,8 @@ "metadata": {}, "outputs": [], "source": [ - "!pip install \"regex~=2023.10.3\" dspy-ai # DSPy requires an old version of regex that conflicts with the installed version on Colab\n", - "!pip install arize-phoenix openinference-instrumentation-dspy opentelemetry-exporter-otlp" + "!pip install \"regex~=2023.10.3\" \"dspy-ai>=2.5.0\" # DSPy requires an old version of regex that conflicts with the installed version on Colab\n", + "!pip install arize-phoenix \"openinference-instrumentation-dspy>1.1.12\" opentelemetry-exporter-otlp" ] }, { @@ -116,7 +116,7 @@ "metadata": {}, "outputs": [], "source": [ - "turbo = dspy.OpenAI(model=\"gpt-3.5-turbo\")\n", + "turbo = dspy.LM(\"openai/gpt-4\")\n", "colbertv2_wiki17_abstracts = dspy.ColBERTv2(\n", " url=\"http://20.102.90.50:2017/wiki17_abstracts\" # endpoint for a hosted ColBERTv2 service\n", ")\n", From 59c6c3b69020b22000aad1470f6b35eb0909d33c Mon Sep 17 00:00:00 2001 From: Alexander Song Date: Thu, 10 Oct 2024 11:51:35 -0700 Subject: [PATCH 2/3] don't cache on LM --- tutorials/tracing/dspy_tracing_tutorial.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tutorials/tracing/dspy_tracing_tutorial.ipynb b/tutorials/tracing/dspy_tracing_tutorial.ipynb index f5a4259db1..26005c61e2 100644 --- a/tutorials/tracing/dspy_tracing_tutorial.ipynb +++ b/tutorials/tracing/dspy_tracing_tutorial.ipynb @@ -116,7 +116,7 @@ "metadata": {}, "outputs": [], "source": [ - "turbo = dspy.LM(\"openai/gpt-4\")\n", + "turbo = dspy.LM(\"openai/gpt-4\", cache=False)\n", "colbertv2_wiki17_abstracts = dspy.ColBERTv2(\n", " url=\"http://20.102.90.50:2017/wiki17_abstracts\" # endpoint for a hosted ColBERTv2 service\n", ")\n", From 582286853989aed33f3d0fdf932c015d348dd0fd Mon Sep 17 00:00:00 2001 From: Alexander Song Date: Thu, 10 Oct 2024 14:47:25 -0700 Subject: [PATCH 3/3] check notebook runs in colab --- tutorials/tracing/dspy_tracing_tutorial.ipynb | 112 ++++++++++++------ 1 file changed, 75 insertions(+), 37 deletions(-) diff --git a/tutorials/tracing/dspy_tracing_tutorial.ipynb b/tutorials/tracing/dspy_tracing_tutorial.ipynb index 26005c61e2..8982f71f1c 100644 --- a/tutorials/tracing/dspy_tracing_tutorial.ipynb +++ b/tutorials/tracing/dspy_tracing_tutorial.ipynb @@ -2,7 +2,9 @@ "cells": [ { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "ugOiyLQRScii" + }, "source": [ "
\n", @@ -21,7 +23,7 @@ "\n", "- Composable and declarative APIs that allow developers to describe the architecture of their LLM application in the form of a \"module\" (inspired by PyTorch's `nn.Module`),\n", "- Compilers known as \"teleprompters\" that optimize a user-defined module for a particular task. The term \"teleprompter\" is meant to evoke \"prompting at a distance,\" and could involve selecting few-shot examples, generating prompts, or fine-tuning language models.\n", - " \n", + "\n", "Phoenix makes your DSPy applications *observable* by visualizing the underlying structure of each call to your compiled DSPy module and surfacing problematic spans of execution based on latency, token count, or other evaluation metrics.\n", "\n", "In this tutorial, you will:\n", @@ -34,7 +36,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "9PzyTdXkScij" + }, "source": [ "## 1. Install Dependencies and Import Libraries\n", "\n", @@ -47,20 +51,14 @@ "metadata": {}, "outputs": [], "source": [ - "!pip install \"regex~=2023.10.3\" \"dspy-ai>=2.5.0\" # DSPy requires an old version of regex that conflicts with the installed version on Colab\n", - "!pip install arize-phoenix \"openinference-instrumentation-dspy>1.1.12\" opentelemetry-exporter-otlp" + "!pip install arize-phoenix \"dspy-ai>=2.5.0\" \"openinference-instrumentation-dspy>=0.1.13\" openinference-instrumentation-litellm opentelemetry-exporter-otlp" ] }, { "cell_type": "markdown", - "metadata": {}, - "source": [ - "⚠️ DSPy conflicts with the default version of the `regex` module that comes pre-installed on Google Colab. If you are running this notebook in Google Colab, you will likely need to restart the kernel after running the installation step above and before proceeding to the rest of the notebook, otherwise, your instrumentation will fail." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "--ju_0z3Scik" + }, "source": [ "Import libraries." ] @@ -82,7 +80,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "skhq25K-Scil" + }, "source": [ "## 2. Configure Your OpenAI API Key\n", "\n", @@ -103,11 +103,13 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "9koncSAzScil" + }, "source": [ "## 3. Configure Module Components\n", "\n", - "A module consists of components such as a language model (in this case, OpenAI's GPT 3.5 turbo), akin to the layers of a PyTorch module and a retriever (in this case, ColBERTv2)." + "A module consists of components such as a language model (in this case, OpenAI's GPT-4), akin to the layers of a PyTorch module and a retriever (in this case, ColBERTv2)." ] }, { @@ -116,17 +118,19 @@ "metadata": {}, "outputs": [], "source": [ - "turbo = dspy.LM(\"openai/gpt-4\", cache=False)\n", + "lm = dspy.LM(\"openai/gpt-4\", cache=False)\n", "colbertv2_wiki17_abstracts = dspy.ColBERTv2(\n", " url=\"http://20.102.90.50:2017/wiki17_abstracts\" # endpoint for a hosted ColBERTv2 service\n", ")\n", "\n", - "dspy.settings.configure(lm=turbo, rm=colbertv2_wiki17_abstracts)" + "dspy.settings.configure(lm=lm, rm=colbertv2_wiki17_abstracts)" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "iN-4OqMyScil" + }, "source": [ "## 4. Load Data\n", "\n", @@ -154,7 +158,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "VgIFspM0Scil" + }, "source": [ "Each example in our training set has a question and a human-annotated answer." 
] @@ -171,7 +177,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "T3ylXcAdScil" + }, "source": [ "Examples in the dev set have a third field containing titles of relevant Wikipedia articles." ] @@ -188,7 +196,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "3m301eQXScil" + }, "source": [ "## 5. Define Your RAG Module\n", "\n", @@ -215,7 +225,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "Do8HeVZ3Scim" + }, "source": [ "Define your module by subclassing `dspy.Module` and overriding the `forward` method." ] @@ -240,21 +252,27 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "mGWLBPeIScim" + }, "source": [ "This module uses retrieval-augmented generation (using the previously configured ColBERTv2 retriever) in tandem with chain of thought in order to generate the final answer to the user." ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "fh71fUzPScim" + }, "source": [ "## 6. Compile Your RAG Module" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "6BlkKG1RScim" + }, "source": [ "In this case, we'll use the default `BootstrapFewShot` teleprompter that selects good demonstrations from the the training dataset for inclusion in the final prompt." ] @@ -283,14 +301,18 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "1-vGu_aDScim" + }, "source": [ "## 7. Instrument DSPy and Launch Phoenix" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "teUVpEjtScim" + }, "source": [ "Now that we've compiled our RAG program, let's see what's going on under the hood.\n", "\n", @@ -308,9 +330,13 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "-fIfJ9AOScim" + }, "source": [ - "Then instrument your application with [OpenInference](https://github.com/Arize-ai/openinference/tree/main/spec), an open standard build atop [OpenTelemetry](https://opentelemetry.io/) that captures and stores LLM application executions. OpenInference provides telemetry data to help you understand the invocation of your LLMs and the surrounding application context, including retrieval from vector stores, the usage of external tools or APIs, etc." + "Then instrument your application with [OpenInference](https://github.com/Arize-ai/openinference/tree/main/spec), an open standard build atop [OpenTelemetry](https://opentelemetry.io/) that captures and stores LLM application executions. OpenInference provides telemetry data to help you understand the invocation of your LLMs and the surrounding application context, including retrieval from vector stores, the usage of external tools or APIs, etc.\n", + "\n", + "DSPy uses LiteLLM under the hood to invoke LLMs. We add the `LiteLLMInstrumentor` here so we can get token counts for LLM spans." ] }, { @@ -320,23 +346,29 @@ "outputs": [], "source": [ "from openinference.instrumentation.dspy import DSPyInstrumentor\n", + "from openinference.instrumentation.litellm import LiteLLMInstrumentor\n", "\n", "from phoenix.otel import register\n", "\n", "register(endpoint=\"http://127.0.0.1:6006/v1/traces\")\n", - "DSPyInstrumentor().instrument(skip_dep_check=True)" + "DSPyInstrumentor().instrument(skip_dep_check=True)\n", + "LiteLLMInstrumentor().instrument(skip_dep_check=True)" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "quSMbL5DScim" + }, "source": [ "## 8. 
Run Your Application" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "XmXs-qUWScim" + }, "source": [ "Let's run our DSPy application on the dev set." ] @@ -366,7 +398,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "RBye2l4EScin" + }, "source": [ "Check the Phoenix UI to inspect the architecture of your DSPy module." ] @@ -382,7 +416,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "deaC5uIsScin" + }, "source": [ "A few things to note:\n", "\n", @@ -395,7 +431,9 @@ }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "id": "lIqk4MMXScin" + }, "source": [ "Congrats! You've used DSPy to bootstrap a multishot prompt with hard negative passages and chain of thought, and you've used Phoenix to observe the inner workings of DSPy and understand the internals of the forward pass." ] @@ -407,5 +445,5 @@ } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 0 }
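
Taken together, the three commits leave the notebook's tracing setup looking roughly like the sketch below. This is a minimal, non-authoritative summary for review, not notebook content: it assumes a Phoenix instance is already running at 127.0.0.1:6006 and that OPENAI_API_KEY is set in the environment (both are handled by earlier cells in the notebook).

```python
import dspy
from openinference.instrumentation.dspy import DSPyInstrumentor
from openinference.instrumentation.litellm import LiteLLMInstrumentor
from phoenix.otel import register

# Point the OpenTelemetry tracer at the local Phoenix collector, then instrument
# both DSPy and LiteLLM. DSPy invokes LLMs through LiteLLM, so instrumenting
# LiteLLM is what surfaces token counts on LLM spans (the reason PATCH 3/3 adds it).
register(endpoint="http://127.0.0.1:6006/v1/traces")
DSPyInstrumentor().instrument(skip_dep_check=True)
LiteLLMInstrumentor().instrument(skip_dep_check=True)

# New-style LM configuration from PATCH 1/3; PATCH 2/3 sets cache=False,
# presumably so repeated runs hit the LM (and emit fresh spans) instead of
# returning cached completions.
lm = dspy.LM("openai/gpt-4", cache=False)
colbertv2_wiki17_abstracts = dspy.ColBERTv2(
    url="http://20.102.90.50:2017/wiki17_abstracts"  # hosted ColBERTv2 service used by the notebook
)
dspy.settings.configure(lm=lm, rm=colbertv2_wiki17_abstracts)
```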