sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_task_adherence/task_adherence.prompty
+45 -1
@@ -38,26 +38,70 @@ system:
 **Definition:**
 Response completely ignores instructions or deviates significantly
 
+**Example:**
+**Query:** What is a recommended weekend itinerary in Paris?
+**Response:** Paris is a lovely city with a rich history.
+
+Explanation: This response completely misses the task by not providing any itinerary details. It offers a generic statement about Paris rather than a structured travel plan.
+
+
 **Level 2: Barely adherent**
 
 **Definition:**
 Response partially aligns with instructions but has critical gaps.
 
+**Example:**
+**Query:** What is a recommended weekend itinerary in Paris?
+**Response:** Spend your weekend visiting famous places in Paris.
+
+Explanation: While the response hints at visiting well-known sites, it is extremely vague and lacks specific details, such as which sites to visit or any order of activities, leaving major gaps in the instructions.
+
+
 **Level 3: Moderately adherent**
 
 **Definition:**
 Response meets the core requirements but lacks precision or clarity.
 
+**Example:**
+**Query:** What is a recommended weekend itinerary in Paris?
+**Response:** Visit the Eiffel Tower and the Louvre on Saturday, and stroll through Montmartre on Sunday.
+
+Explanation: This answer meets the basic requirement by naming a few key attractions and assigning them to specific days. However, it lacks additional context, such as timings, additional activities, or details to make the itinerary practical and clear.
+
+
 **Level 4: Mostly adherent**
 
 **Definition:**
 Response is clear, accurate, and aligns with instructions with minor issues.
 
-**Level 5: Fully adnerent**
+**Example:**
+**Query:** What is a recommended weekend itinerary in Paris?
+**Response:** For a weekend in Paris, start Saturday with a morning visit to the Eiffel Tower, then head to the Louvre in the early afternoon. In the evening, enjoy a leisurely walk along the Seine. On Sunday, begin with a visit to Notre-Dame Cathedral, followed by exploring the art and cafés in Montmartre. This plan offers a mix of cultural visits and relaxing experiences.
+
+Explanation: This response is clear, structured, and provides a concrete itinerary with specific attractions and a suggested order of activities. It is accurate and useful, though it might benefit from a few more details like exact timings or restaurant suggestions to be perfect.
+
+
+**Level 5: Fully adherent**
 
 **Definition:**
 Response is flawless, accurate, and follows instructions to the letter.
 
+**Example:**
+**Query:** What is a recommended weekend itinerary in Paris?
+**Response:** Here is a detailed weekend itinerary in Paris:
+Saturday:
+Morning: Begin your day with a visit to the Eiffel Tower to admire the views from the top.
+Early Afternoon: Head to the Louvre for a guided tour of its most famous exhibits.
+Late Afternoon: Take a relaxing walk along the Seine, stopping at local boutiques.
+Evening: Enjoy dinner at a classic Parisian bistro near the river.
+Sunday:
+Morning: Visit the Notre-Dame Cathedral to explore its architecture and history.
+Midday: Wander the charming streets of Montmartre, stopping by art galleries and cafés.
+Afternoon: Finish your trip with a scenic boat tour on the Seine.
+This itinerary balances cultural immersion, leisure, and local dining experiences, ensuring a well-rounded visit.
+
+Explanation: This response is comprehensive and meticulously follows the instructions. It provides detailed steps, timings, and a variety of activities that fully address the query, leaving no critical gaps.
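
The five definitions in the rubric above can be held in a small lookup table, which is handy when turning a numeric score back into its meaning. This is a minimal sketch and not part of the PR: `ADHERENCE_DEFINITIONS` and `describe_score` are hypothetical names, and only the definition strings are copied from the prompt text.

```python
# Hypothetical helper, not part of the azure-ai-evaluation package.
# The definition strings are copied verbatim from the rubric in the prompt.
ADHERENCE_DEFINITIONS = {
    1: "Response completely ignores instructions or deviates significantly",
    2: "Response partially aligns with instructions but has critical gaps.",
    3: "Response meets the core requirements but lacks precision or clarity.",
    4: "Response is clear, accurate, and aligns with instructions with minor issues.",
    5: "Response is flawless, accurate, and follows instructions to the letter.",
}


def describe_score(score: int) -> str:
    """Return the rubric definition for an integer score from 1 to 5."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int):
        raise TypeError("score must be an integer")
    if score not in ADHERENCE_DEFINITIONS:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    return ADHERENCE_DEFINITIONS[score]


print(describe_score(3))
```

Rejecting out-of-range values early keeps a malformed model output from being silently treated as a valid score.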
sdk/evaluation/azure-ai-evaluation/samples/bug_bash/response_completeness.ipynb
+49 -3
@@ -123,17 +123,63 @@
 "result"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"# Batch run for response completeness"
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {},
 "outputs": [],
-"source": []
+"source": [
+"import json\n",
+"\n",
+"data = [\n",
+" {\n",
+"\"response\": \"The temperature of Seattle now is 70 degrees. Based on the temperature, having an outdoor office party is recommended.\",\n",
+"\"ground_truth\": \"The temperature of Seattle now is 50 degrees. It will be recommended to bring a jacket in the evening.\",\n",
+" },\n",
+" {\n",
+"\"response\": \"The email draft \\\"Project Plan\\\" is attached. Please review and provide feedback.\",\n",
+"\"ground_truth\": \"The email draft \\\"Project Plan\\\" is attached. Please review and provide feedback by EOD.\",\n",
+" },\n",
+" {\n",
+"\"response\": \"Based on the retrieved documents, the shareholder meeting discussed the operational efficiency of the company and financing options.\",\n",
+"\"ground_truth\": \"The shareholder meeting discussed the compensation package of the company CEO.\",\n",
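
The notebook cell above imports `json` and builds a list of `response` / `ground_truth` records, but the diff is cut off before it shows how `data` is used. One plausible next step, assumed here rather than confirmed by the source, is to serialize the records as JSON Lines so they can be fed to a batch run. The sketch below uses the first two records from the cell; `write_jsonl`, `read_jsonl`, and the file name are hypothetical.

```python
import json
import os
import tempfile

# First two records from the notebook cell in the diff.
data = [
    {
        "response": "The temperature of Seattle now is 70 degrees. Based on the temperature, having an outdoor office party is recommended.",
        "ground_truth": "The temperature of Seattle now is 50 degrees. It will be recommended to bring a jacket in the evening.",
    },
    {
        "response": "The email draft \"Project Plan\" is attached. Please review and provide feedback.",
        "ground_truth": "The email draft \"Project Plan\" is attached. Please review and provide feedback by EOD.",
    },
]


def write_jsonl(records, path):
    """Write one JSON object per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Assumed file name; the notebook's real output path is not visible in the diff.
path = os.path.join(tempfile.mkdtemp(), "response_completeness_data.jsonl")
write_jsonl(data, path)

# Round-trip check: the file contains exactly the records that were written.
assert read_jsonl(path) == data
```

Writing one object per line keeps each record independent, so a single malformed row can be located and fixed without re-parsing the whole file.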