-
Notifications
You must be signed in to change notification settings - Fork 3k
Converter from AI Service threads/runs to evaluator-compatible schema #40047
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
03af2ba
e24d77f
73e3939
c53b72f
149a4cc
8d87168
b3f5ef2
465c1c7
5219616
eed1375
3758eae
c33e4f7
1741980
55294ae
a30abdf
7ec8c35
50e819f
7637357
8bb6cd3
6deb358
cc8df22
85def50
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# --------------------------------------------------------- | ||
# Copyright (c) Microsoft Corporation. All rights reserved. | ||
# --------------------------------------------------------- |
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,259 @@ | ||
import datetime | ||
import json | ||
|
||
from pydantic import BaseModel | ||
|
||
from azure.ai.projects.models import RunStepFunctionToolCall | ||
|
||
from typing import List, Optional, Union | ||
|
||
# Message roles constants. | ||
_SYSTEM = "system" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Newer APIs use "developer" here (they're essentially interchangeable, just depends on API version). So might want to make sure that's supported too. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Interesting, didn't know that. What API version should we be bound to or set it to developer by default? I'll look into versioning. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think for output, we can just use system, at least for now. But when reading in messages from threads/etc., you might need to cover the case where it could say "developer". Or maybe we just default to use whatever the thread itself is using, actually. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, for this converter's purpose I don't think it matters. I take it back. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's write |
||
_USER = "user" | ||
_AGENT = "assistant" | ||
_TOOL = "tool" | ||
|
||
# Constant definitions for what tool details include. | ||
_TOOL_CALL = "tool_call" | ||
_TOOL_RESULT = "tool_result" | ||
_FUNCTION = "function" | ||
|
||
# This is returned by AI services in the API to filter against tool invocations. | ||
_TOOL_CALLS = "tool_calls" | ||
|
||
|
||
class Message(BaseModel): | ||
"""Represents a message in a conversation with agents, assistants, and tools. We need to export these structures | ||
to JSON for evaluators and we have custom fields such as createdAt, run_id, and tool_call_id, so we cannot use | ||
the standard pydantic models provided by OpenAI. | ||
|
||
:param createdAt: The timestamp when the message was created. | ||
:type createdAt: datetime.datetime | ||
:param run_id: The ID of the run associated with the message. Optional. | ||
:type run_id: Optional[str] | ||
:param role: The role of the message sender (e.g., system, user, tool, assistant). | ||
:type role: str | ||
:param content: The content of the message, which can be a string or a list of dictionaries. | ||
:type content: Union[str, List[dict]] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not for this PR, it can be a quick follown so we don't need to be blocked, but I've suggested we include There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can user provide a name ? I do not see it being generated right now by service. |
||
""" | ||
|
||
createdAt: Optional[Union[datetime.datetime, int]] = None # SystemMessage wouldn't have this | ||
run_id: Optional[str] = None | ||
tool_call_id: Optional[str] = None # see ToolMessage | ||
role: str | ||
content: Union[str, List[dict]] | ||
|
||
|
||
class SystemMessage(Message): | ||
"""Represents a system message in a conversation with agents, assistants, and tools. | ||
|
||
:param role: The role of the message sender, which is always 'system'. | ||
:type role: str | ||
""" | ||
|
||
role: str = _SYSTEM | ||
|
||
|
||
class UserMessage(Message): | ||
"""Represents a user message in a conversation with agents, assistants, and tools. | ||
|
||
:param role: The role of the message sender, which is always 'user'. | ||
:type role: str | ||
""" | ||
|
||
role: str = _USER | ||
|
||
|
||
class ToolMessage(Message): | ||
"""Represents a tool message in a conversation with agents, assistants, and tools. | ||
|
||
:param run_id: The ID of the run associated with the message. | ||
:type run_id: str | ||
:param role: The role of the message sender, which is always 'tool'. | ||
:type role: str | ||
:param tool_call_id: The ID of the tool call associated with the message. Optional. | ||
:type tool_call_id: Optional[str] | ||
""" | ||
|
||
run_id: str | ||
role: str = _TOOL | ||
tool_call_id: Optional[str] = None | ||
|
||
|
||
class AssistantMessage(Message): | ||
"""Represents an assistant message. | ||
|
||
:param run_id: The ID of the run associated with the message. | ||
:type run_id: str | ||
:param role: The role of the message sender, which is always 'assistant'. | ||
:type role: str | ||
""" | ||
|
||
run_id: str | ||
role: str = _AGENT | ||
|
||
|
||
class ToolDefinition(BaseModel): | ||
"""Represents a tool definition that will be used in the agent. | ||
|
||
:param name: The name of the tool. | ||
:type name: str | ||
:param description: A description of the tool. | ||
:type description: str | ||
:param parameters: The parameters required by the tool. | ||
:type parameters: dict | ||
""" | ||
|
||
name: str | ||
description: Optional[str] = None | ||
parameters: dict | ||
|
||
|
||
class ToolCall: | ||
"""Represents a tool call, used as an intermediate step in the conversion process. | ||
|
||
:param created: The timestamp when the tool call was created. | ||
:type created: datetime.datetime | ||
:param completed: The timestamp when the tool call was completed. | ||
:type completed: datetime.datetime | ||
:param details: The details of the tool call. | ||
:type details: RunStepFunctionToolCall | ||
""" | ||
|
||
def __init__(self, created: datetime.datetime, completed: datetime.datetime, details: RunStepFunctionToolCall): | ||
self.created = created | ||
self.completed = completed | ||
self.details = details | ||
|
||
|
||
class EvaluatorData(BaseModel): | ||
"""Represents the result of a conversion. | ||
|
||
:param query: A list of messages representing the system message, chat history, and user query. | ||
:type query: List[Message] | ||
:param response: A list of messages representing the assistant's response, including tool calls and results. | ||
:type response: List[Message] | ||
:param tool_definitions: A list of tool definitions used in the agent. | ||
:type tool_definitions: List[ToolDefinition] | ||
""" | ||
|
||
query: List[Message] | ||
response: List[Message] | ||
tool_definitions: List[ToolDefinition] | ||
|
||
def to_json(self): | ||
"""Converts the result to a JSON string. | ||
|
||
:return: The JSON representation of the result. | ||
:rtype: str | ||
""" | ||
return self.model_dump_json(exclude={}, exclude_none=True) | ||
|
||
|
||
def break_tool_call_into_messages(tool_call: ToolCall, run_id: str) -> List[Message]: | ||
""" | ||
Breaks a tool call into a list of messages, including the tool call and its result. | ||
|
||
:param tool_call: The tool call to be broken into messages. | ||
:type tool_call: ToolCall | ||
:param run_id: The ID of the run associated with the messages. | ||
:type run_id: str | ||
:return: A list of messages representing the tool call and its result. | ||
:rtype: List[Message] | ||
""" | ||
# We will use this as our accumulator. | ||
messages: List[Message] = [] | ||
|
||
# As of March 17th, 2025, we only support custom functions due to built-in code interpreters and bing grounding | ||
# tooling not reporting their function calls in the same way. Code interpreters don't include the tool call at | ||
# all in most of the cases, and bing would only show the API URL, without arguments or results. | ||
# Bing grounding would have "bing_grounding" in details with "requesturl" that will just be the API path with query. | ||
# TODO: Work with AI Services to add converter support for BingGrounding and CodeInterpreter. | ||
if not hasattr(tool_call.details, _FUNCTION): | ||
return messages | ||
|
||
# This is the internals of the content object that will be included with the tool call. | ||
tool_call_id = tool_call.details.id | ||
content_tool_call = { | ||
"type": _TOOL_CALL, | ||
_TOOL_CALL: { | ||
"id": tool_call_id, | ||
"type": _FUNCTION, | ||
_FUNCTION: { | ||
"name": tool_call.details.function.name, | ||
"arguments": safe_loads(tool_call.details.function.arguments), | ||
}, | ||
}, | ||
} | ||
|
||
# We format it into an assistant message, where the content is a singleton list of the content object. | ||
# It should be a tool message, since this is the call, but the given schema treats this message as | ||
# assistant's action of calling the tool. | ||
messages.append(AssistantMessage(run_id=run_id, content=[to_dict(content_tool_call)], createdAt=tool_call.created)) | ||
|
||
# Now, onto the tool result, which only includes the result of the function call. | ||
content_tool_call_result = {"type": _TOOL_RESULT, _TOOL_RESULT: safe_loads(tool_call.details.function.output)} | ||
|
||
# Since this is a tool's action of returning, we put it as a tool message. | ||
messages.append( | ||
ToolMessage( | ||
run_id=run_id, | ||
tool_call_id=tool_call_id, | ||
content=[to_dict(content_tool_call_result)], | ||
createdAt=tool_call.completed, | ||
) | ||
) | ||
return messages | ||
|
||
|
||
def to_dict(obj) -> dict: | ||
""" | ||
Converts an object to a dictionary. | ||
|
||
:param obj: The object to be converted. | ||
:type obj: Any | ||
:return: The dictionary representation of the object. | ||
:rtype: dict | ||
""" | ||
return json.loads(json.dumps(obj)) | ||
|
||
|
||
def safe_loads(data: str) -> Union[dict, str]: | ||
""" | ||
Safely loads a JSON string into a Python dictionary or returns the original string if loading fails. | ||
:param data: The JSON string to be loaded. | ||
:type data: str | ||
:return: The loaded dictionary or the original string. | ||
:rtype: Union[dict, str] | ||
""" | ||
try: | ||
return json.loads(data) | ||
except json.JSONDecodeError: | ||
return data | ||
|
||
|
||
def convert_message(msg: dict) -> Message: | ||
""" | ||
Converts a dictionary to the appropriate Message subclass. | ||
|
||
:param msg: The message dictionary. | ||
:type msg: dict | ||
:return: The Message object. | ||
:rtype: Message | ||
""" | ||
role = msg["role"] | ||
if role == "system": | ||
return SystemMessage(content=str(msg["content"])) | ||
elif role == "user": | ||
return UserMessage(content=msg["content"], createdAt=msg["createdAt"]) | ||
elif role == "assistant": | ||
return AssistantMessage(run_id=str(msg["run_id"]), content=msg["content"], createdAt=msg["createdAt"]) | ||
elif role == "tool": | ||
return ToolMessage( | ||
run_id=str(msg["run_id"]), | ||
tool_call_id=str(msg["tool_call_id"]), | ||
content=msg["content"], | ||
createdAt=msg["createdAt"], | ||
) | ||
else: | ||
raise ValueError(f"Unknown role: {role}") |
Uh oh!
There was an error while loading. Please reload this page.