Azure-Samples · kdestin · Oct 17, 2024 · Sep 26, 2024 · Sep 26, 2024 · Sep 27, 2024
@@ -5,7 +5,18 @@
    "id": "2e932e4c-5d55-461e-a313-3a087d8983b5",
    "metadata": {},
    "source": [
-    "# Standard evaluators and target functions.\n"
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "# Evaluate app using Azure AI Evaluation APIs\n"
    ]
   },
   {
@@ -14,7 +25,7 @@
    "metadata": {},
    "source": [
     "## Objective\n",
-    "In this notebook we will demonstrate how to use the target functions with the standard evaluators.\n",
+    "In this notebook we will demonstrate how to use the target functions with the standard evaluators to evaluate an app.\n",
     "\n",
     "This tutorial provides a step-by-step guide on how to evaluate a function\n",
     "\n",

@@ -1,13 +1,12 @@
 # ---------------------------------------------------------
 # Copyright (c) Microsoft Corporation. All rights reserved.
 # ---------------------------------------------------------
-from typing import List, Dict
 
 
 class BlocklistEvaluator:
-    def __init__(self: "BlocklistEvaluator", blocklist: List[str]) -> None:
+    def __init__(self, blocklist) -> None:
         self._blocklist = blocklist
 
-    def __call__(self: "BlocklistEvaluator", *, answer: str) -> Dict[str, bool]:
-        score = any(word in answer for word in self._blocklist)
+    def __call__(self, *, response: str):
+        score = any(word in response for word in self._blocklist)
         return {"score": score}
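As a rough illustration of what this evaluator does, the sketch below repeats the class from the diff and calls it on a few made-up responses. The sample sentences are assumptions chosen for illustration. Note that `word in response` is a case-sensitive substring test, so it can match inside longer words and can miss capitalized variants.

```python
class BlocklistEvaluator:
    def __init__(self, blocklist) -> None:
        self._blocklist = blocklist

    def __call__(self, *, response: str):
        # True when any blocklisted string occurs anywhere in the response.
        score = any(word in response for word in self._blocklist)
        return {"score": score}


evaluator = BlocklistEvaluator(blocklist=["bad", "worst", "terrible"])

# Illustrative inputs (not taken from the sample data).
clean = evaluator(response="Paris is the capital of France")
flagged = evaluator(response="That was the worst answer")
substring_hit = evaluator(response="She earned a badge")  # "bad" is inside "badge"
case_miss = evaluator(response="Bad idea")  # the match is case-sensitive

print(clean, flagged, substring_hit, case_miss)
```

If whole-word or case-insensitive matching is wanted, the response would need to be normalized or tokenized first; the class in this diff does neither.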
@@ -0,0 +1,3 @@
+{"query":"When was the United States founded?", "response":"1776"}
+{"query":"What is the capital of France?", "response":"Paris"}
+{"query":"Who is the best tennis player of all time?", "response":"Roger Federer"}
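Each line of this file is one JSON object whose `response` key lines up with the evaluator's `response` keyword argument. The sketch below shows that mapping by hand, using only the standard library. It is not how `evaluate` is implemented; the second row, the `contains_blocked` helper, and the `blocklist_score` key are hypothetical names introduced for illustration.

```python
import json

BLOCKLIST = ["bad", "worst", "terrible"]


def contains_blocked(*, response: str) -> dict:
    # Hypothetical stand-in with the same call shape as BlocklistEvaluator.
    return {"score": any(word in response for word in BLOCKLIST)}


# Rows shaped like data.jsonl. The second row is made up so that one row is flagged.
jsonl_text = "\n".join(
    [
        '{"query":"What is the capital of France?", "response":"Paris"}',
        '{"query":"How was the service?", "response":"It was terrible"}',
    ]
)

# Parse one JSON object per non-empty line.
rows = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

# Pass each row's "response" column to the evaluator and keep the score next to the row.
scored = [
    {**row, "blocklist_score": contains_blocked(response=row["response"])["score"]}
    for row in rows
]

for row in scored:
    print(row)
```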
@@ -0,0 +1,254 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "2e932e4c-5d55-461e-a313-3a087d8983b5",
+   "metadata": {},
+   "source": [
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "\n",
+    "# Evaluate using Azure AI Evaluation custom evaluators\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0dd3cfd4",
+   "metadata": {},
+   "source": [
+    "## Objective\n",
+    "In this notebook we will demonstrate how to use a target function with a custom evaluator to evaluate an endpoint.\n",
+    "\n",
+    "This tutorial provides a step-by-step guide on how to evaluate a function.\n",
+    "\n",
+    "This tutorial uses the following Azure AI services:\n",
+    "\n",
+    "- [azure-ai-evaluation](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/develop/evaluate-sdk)\n",
+    "\n",
+    "## Time\n",
+    "\n",
+    "You should expect to spend 20 minutes running this sample. \n",
+    "\n",
+    "## About this example\n",
+    "\n",
+    "This example demonstrates evaluating a target function using `azure-ai-evaluation`.\n",
+    "\n",
+    "## Before you begin\n",
+    "\n",
+    "### Installation\n",
+    "\n",
+    "Install the following packages required to execute this notebook. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "08bf820e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install azure-ai-evaluation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "784be308",
+   "metadata": {},
+   "source": [
+    "### Parameters and imports"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "257fd898-7ef2-4d89-872e-da9e426aaf0b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import os\n",
+    "\n",
+    "from pprint import pprint\n",
+    "from azure.ai.evaluation import evaluate\n",
+    "from openai import AzureOpenAI"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8352b517-70b0-4f4f-a3ad-bc99eae67b2e",
+   "metadata": {},
+   "source": [
+    "## Target function\n",
+    "We will use a simple `endpoint_callback` to get answers to questions from our model, and the `evaluate` API to evaluate those answers.\n",
+    "\n",
+    "`endpoint_callback` needs the following environment variables to be set:\n",
+    "\n",
+    "- AZURE_OPENAI_API_KEY\n",
+    "- AZURE_OPENAI_API_VERSION\n",
+    "- AZURE_OPENAI_DEPLOYMENT\n",
+    "- AZURE_OPENAI_ENDPOINT"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "fbfc3a3b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Use the following code to set the environment variables if not already set. If set, you can skip this step.\n",
+    "\n",
+    "os.environ[\"AZURE_OPENAI_API_KEY\"] = \"<your-api-key>\"\n",
+    "os.environ[\"AZURE_OPENAI_API_VERSION\"] = \"<api version>\"\n",
+    "os.environ[\"AZURE_OPENAI_DEPLOYMENT\"] = \"<your-deployment>\"\n",
+    "os.environ[\"AZURE_OPENAI_ENDPOINT\"] = \"<your-endpoint>\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cd9bb466-324f-42ce-924a-56e1bc52471e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "async def endpoint_callback(query: str) -> dict:\n",
+    "    deployment = os.environ.get(\"AZURE_OPENAI_DEPLOYMENT\")\n",
+    "\n",
+    "    oai_client = AzureOpenAI(\n",
+    "        azure_endpoint=os.environ.get(\"AZURE_OPENAI_ENDPOINT\"),\n",
+    "        api_version=os.environ.get(\"AZURE_OPENAI_API_VERSION\"),\n",
+    "        api_key=os.environ.get(\"AZURE_OPENAI_API_KEY\"),\n",
+    "    )\n",
+    "\n",
+    "    response_from_oai_chat_completions = oai_client.chat.completions.create(\n",
+    "        messages=[{\"content\": query, \"role\": \"user\"}], model=deployment, max_tokens=500\n",
+    "    )\n",
+    "\n",
+    "    response_result = response_from_oai_chat_completions.to_dict()\n",
+    "    return {\"query\": query, \"response\": response_result[\"choices\"][0][\"message\"][\"content\"]}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0641385d-12d8-4ec2-b477-3b1aeed6e86c",
+   "metadata": {},
+   "source": [
+    "## Data\n",
+    "Read the existing dataset, which contains a few questions and their responses."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b47e777f-3889-49c2-bc53-25488dade7dc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df = pd.read_json(\"data.jsonl\", lines=True)\n",
+    "print(df.head())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "44181407",
+   "metadata": {},
+   "source": [
+    "## Running Blocklist Evaluator to understand its input and output"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f6f56605",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from blocklist import BlocklistEvaluator\n",
+    "\n",
+    "blocklist_evaluator = BlocklistEvaluator(blocklist=[\"bad\", \"worst\", \"terrible\"])\n",
+    "\n",
+    "blocklist_evaluator(response=\"New Delhi is the capital of India\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5c9b63dd-031d-469d-8232-84affd517f0f",
+   "metadata": {},
+   "source": [
+    "## Run the evaluation"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "04d1dd39-f0a3-4392-bf99-14eecda3e2da",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "results = evaluate(\n",
+    "    data=\"data.jsonl\",\n",
+    "    target=endpoint_callback,\n",
+    "    evaluators={\n",
+    "        \"blocklist\": blocklist_evaluator,\n",
+    "    },\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "851d4569-4e1b-4b44-92ed-9063eccb68ae",
+   "metadata": {},
+   "source": [
+    "View the results"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "72fa51e3",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pprint(results)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bcec6443-14a7-410e-9fc2-1411461dc44b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pd.DataFrame(results[\"rows\"])"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "pf-test-record",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
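The notebook's markdown lists four environment variables that `endpoint_callback` needs. A minimal sketch of checking them up front, before any request is made, might look like the following. The `missing_env_vars` helper and the `example_env` mapping are hypothetical and are not part of the notebook or the SDK.

```python
import os

# Variable names as listed in the notebook's "Target function" section.
REQUIRED_VARS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_DEPLOYMENT",
    "AZURE_OPENAI_ENDPOINT",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper: pass a mapping to check it instead of os.environ.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Illustrative mapping with only two of the four variables set.
example_env = {
    "AZURE_OPENAI_API_KEY": "<your-api-key>",
    "AZURE_OPENAI_ENDPOINT": "<your-endpoint>",
}
missing = missing_env_vars(example_env)
print(missing)
```

Calling `missing_env_vars()` with no argument checks the real process environment, which makes a misnamed variable visible before the first request fails.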
@@ -0,0 +1,4 @@
+{"query":"What is the capital of France?","context":"France is a country in Europe.","ground_truth":"Paris"}
+{"query": "Which tent is the most waterproof?", "context": "#TrailMaster X4 Tent, price $250,## BrandOutdoorLiving## CategoryTents## Features- Polyester material for durability- Spacious interior to accommodate multiple people- Easy setup with included instructions- Water-resistant construction to withstand light rain- Mesh panels for ventilation and insect protection- Rainfly included for added weather protection- Multiple doors for convenient entry and exit- Interior pockets for organizing small ite- Reflective guy lines for improved visibility at night- Freestanding design for easy setup and relocation- Carry bag included for convenient storage and transportatio## Technical Specs**Best Use**: Camping  **Capacity**: 4-person  **Season Rating**: 3-season  **Setup**: Freestanding  **Material**: Polyester  **Waterproof**: Yes  **Rainfly**: Included  **Rainfly Waterproof Rating**: 2000mm", "ground_truth": "The TrailMaster X4 tent has a rainfly waterproof rating of 2000mm"}
+{"query": "Which camping table is the lightest?", "context": "#BaseCamp Folding Table, price $60,## BrandCampBuddy## CategoryCamping Tables## FeaturesLightweight and durable aluminum constructionFoldable design with a compact size for easy storage and transport## Technical Specifications- **Weight**: 15 lbs- **Maximum Weight Capacity**: Up to a certain weight limit (specific weight limit not provided)", "ground_truth": "The BaseCamp Folding Table has a weight of 15 lbs"}
+{"query": "How much does TrailWalker Hiking Shoes cost? ", "context": "#TrailWalker Hiking Shoes, price $110## BrandTrekReady## CategoryHiking Footwear", "ground_truth": "The TrailWalker Hiking Shoes are priced at $110"}