feat: Add AzureOpenAIResponsesChatGenerator (#10019)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Changes from all commits (67 commits)
- a0ccca9 Add working ChatGenerator (Amnah199)
- 8363ae6 rename (Amnah199)
- eba8123 Merge branch 'main' of https://github.com/deepset-ai/haystack into op… (Amnah199)
- 18ce0e0 Improve and add live tests (Amnah199)
- ba14b18 Updates (Amnah199)
- eeec152 Update the tests (Amnah199)
- 64052f7 Fix errors (Amnah199)
- ef4d0a7 Add release notes (Amnah199)
- 149a47e Merge branch 'main' into openai-responses (Amnah199)
- f3b41f2 Merge branch 'main' into openai-responses (Amnah199)
- 50d5feb Add support for openai tools (Amnah199)
- d1bea24 Merge branch 'openai-responses' of https://github.com/deepset-ai/hays… (Amnah199)
- f2ba387 Remove openai tools test that times out (Amnah199)
- 830d086 fix tool calls (Amnah199)
- 2c866f9 Update release notes (Amnah199)
- 228a21b PR comments (Amnah199)
- 1b0ac65 remove edits to chat message (Amnah199)
- 96e9343 Add a test (Amnah199)
- 515474a PR comments (Amnah199)
- 9d8aa42 Send back reasoning to model (Amnah199)
- b1d6e80 Merge branch 'main' of https://github.com/deepset-ai/haystack into op… (Amnah199)
- 9e414a9 Merge branch 'main' of https://github.com/deepset-ai/haystack into op… (Amnah199)
- 00e6013 Fix reasoning support (Amnah199)
- 419ec36 Add reasoning support (Amnah199)
- 2a7f342 Fix tests (Amnah199)
- 76db039 Refactor (Amnah199)
- 0bab968 Simplify methods (Amnah199)
- 8c8e031 Fix mypy (Amnah199)
- 9107989 Stream responses, tool calls etc (Amnah199)
- fe07300 Update docstrings (Amnah199)
- c8083ba Fix errors while using in Agent (Amnah199)
- 40406aa Fix call_id and fc_id (Amnah199)
- 88760c0 Merge branch 'main' into openai-responses (Amnah199)
- d8949fc Update tests (Amnah199)
- 150b94a Merge branch 'openai-responses' of https://github.com/deepset-ai/hays… (Amnah199)
- ed299f3 Updates (Amnah199)
- c973c8d Add extra in ToolCall and ToolCallDelta (Amnah199)
- a0bb425 Update streaming chunk (Amnah199)
- 412c26d Fix tests and linting (Amnah199)
- 41164ec Merge branch 'main' into openai-responses (Amnah199)
- eb04021 Update api key resolve (Amnah199)
- d5c1717 Merge branch 'openai-responses' of https://github.com/deepset-ai/hays… (Amnah199)
- 0ba664e PR comments (Amnah199)
- 90f8da5 PR comments (Amnah199)
- 85064ee Updates (sjrl)
- c78e972 some type fixes and also make sure to use flatten_tools_or_toolsets (sjrl)
- 6cae697 fix docs (sjrl)
- ac2cc2f Fix streaming chunks so assistant header is properly captured (sjrl)
- 59f2064 Add finish_reason and update test (sjrl)
- b8d4f02 Skip streaming + pydantic model test b/c of known issue in openai pyt… (sjrl)
- 334db3d Fix pylint (sjrl)
- 3327831 Initial commit adding AzureOpenAIResponsesChatGenerator support (sjrl)
- 50ffe4a fix unit test (sjrl)
- 3d51008 Starting to refactor to use new recommended way to connect to Azure O… (sjrl)
- 275dcae Updates (sjrl)
- f53ae87 Fix tests (sjrl)
- eb4f3b0 More tests (sjrl)
- 17c0595 fix integration tests (sjrl)
- a1e9a95 Add to docs (sjrl)
- 2954d3c Don't need warm_up method anymore (sjrl)
- 94508be fix unit test (sjrl)
- da712b7 Fix pylint (sjrl)
- dfba282 fix docstrings (sjrl)
- 839aaec fix mypy typing (sjrl)
- a18284b Merge branch 'main' of github.com:deepset-ai/haystack into azure-open… (sjrl)
- 3463a8d fix reno (sjrl)
- 559a8fe Add another unit test (sjrl)
New file (234 added lines):

````python
# SPDX-FileCopyrightText: 2022-present deepset GmbH <info@deepset.ai>
#
# SPDX-License-Identifier: Apache-2.0

import os
from typing import Any, Awaitable, Callable, Optional, Union

from openai.lib._pydantic import to_strict_json_schema
from pydantic import BaseModel

from haystack import component, default_from_dict, default_to_dict
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses.streaming_chunk import StreamingCallbackT
from haystack.tools import ToolsType, deserialize_tools_or_toolset_inplace, serialize_tools_or_toolset
from haystack.utils import Secret, deserialize_callable, deserialize_secrets_inplace, serialize_callable


@component
class AzureOpenAIResponsesChatGenerator(OpenAIResponsesChatGenerator):
    """
    Completes chats using OpenAI's Responses API on Azure.

    It works with the gpt-5 and o-series models and supports streaming responses
    from the OpenAI API. It uses the [ChatMessage](https://docs.haystack.deepset.ai/docs/chatmessage)
    format as input and output.

    You can customize how the text is generated by passing parameters to the
    OpenAI API. Use the `**generation_kwargs` argument when you initialize
    the component or when you run it. Any parameter that works with
    `openai.Responses.create` will work here too.

    For details on OpenAI API parameters, see the
    [OpenAI documentation](https://platform.openai.com/docs/api-reference/responses).

    ### Usage example

    ```python
    from haystack.components.generators.chat import AzureOpenAIResponsesChatGenerator
    from haystack.dataclasses import ChatMessage

    messages = [ChatMessage.from_user("What's Natural Language Processing?")]

    client = AzureOpenAIResponsesChatGenerator(
        azure_endpoint="https://example-resource.azure.openai.com/",
        generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}}
    )
    response = client.run(messages)
    print(response)
    ```
    """

    # ruff: noqa: PLR0913
    def __init__(
        self,
        *,
        api_key: Union[Secret, Callable[[], str], Callable[[], Awaitable[str]]] = Secret.from_env_var(
            "AZURE_OPENAI_API_KEY", strict=False
        ),
        azure_endpoint: Optional[str] = None,
        azure_deployment: str = "gpt-5-mini",
        streaming_callback: Optional[StreamingCallbackT] = None,
        organization: Optional[str] = None,
        generation_kwargs: Optional[dict[str, Any]] = None,
        timeout: Optional[float] = None,
        max_retries: Optional[int] = None,
        tools: Optional[ToolsType] = None,
        tools_strict: bool = False,
        http_client_kwargs: Optional[dict[str, Any]] = None,
    ):
        """
        Initialize the AzureOpenAIResponsesChatGenerator component.

        :param api_key: The API key to use for authentication. Can be:
            - A `Secret` object containing the API key.
            - A `Secret` object containing the [Azure Active Directory token](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id).
            - A function that returns an Azure Active Directory token.
        :param azure_endpoint: The endpoint of the deployed model, for example `"https://example-resource.azure.openai.com/"`.
        :param azure_deployment: The deployment of the model, usually the model name.
        :param organization: Your organization ID, defaults to `None`. For help, see
            [Setting up your organization](https://platform.openai.com/docs/guides/production-best-practices/setting-up-your-organization).
        :param streaming_callback: A callback function called when a new token is received from the stream.
            It accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
            as an argument.
        :param timeout: Timeout for OpenAI client calls. If not set, it defaults to either the
            `OPENAI_TIMEOUT` environment variable or 30 seconds.
        :param max_retries: Maximum number of retries to contact OpenAI after an internal error.
            If not set, it defaults to either the `OPENAI_MAX_RETRIES` environment variable or 5.
        :param generation_kwargs: Other parameters to use for the model. These parameters are sent
            directly to the OpenAI endpoint.
            See the OpenAI [documentation](https://platform.openai.com/docs/api-reference/responses) for
            more details.
            Some of the supported parameters:
            - `temperature`: What sampling temperature to use. Higher values like 0.8 make the output more random,
              while lower values like 0.2 make it more focused and deterministic.
            - `top_p`: An alternative to sampling with temperature, called nucleus sampling, where the model
              considers the results of the tokens with top_p probability mass. For example, 0.1 means only the tokens
              comprising the top 10% probability mass are considered.
            - `previous_response_id`: The ID of the previous response.
              Use this to create multi-turn conversations.
            - `text_format`: A JSON schema or a Pydantic model that enforces the structure of the model's response.
              If provided, the output will always be validated against this
              format (unless the model returns a tool call).
              For details, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs).
              Notes:
              - This parameter accepts Pydantic models and JSON schemas for the latest models starting from GPT-4o.
                Older models only support a basic version of structured outputs through `{"type": "json_object"}`.
                For detailed information on JSON mode, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs#json-mode).
              - For structured outputs with streaming,
                the `text_format` must be a JSON schema and not a Pydantic model.
            - `reasoning`: A dictionary of parameters for reasoning. For example:
              - `summary`: The summary of the reasoning.
              - `effort`: The level of effort to put into the reasoning. Can be `low`, `medium`, or `high`.
              - `generate_summary`: Whether to generate a summary of the reasoning.
              Note: OpenAI does not return the reasoning tokens, but the summary can be viewed if it is enabled.
              For details, see the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning).
        :param tools:
            A list of Tool and/or Toolset objects, or a single Toolset, for which the model can prepare calls.
        :param tools_strict:
            Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
            the schema provided in the `parameters` field of the tool definition, but this may increase latency.
        :param http_client_kwargs:
            A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`.
            For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/#client).
        """
        azure_endpoint = azure_endpoint or os.getenv("AZURE_OPENAI_ENDPOINT")
        if azure_endpoint is None:
            raise ValueError(
                "You must provide `azure_endpoint` or set the `AZURE_OPENAI_ENDPOINT` environment variable."
            )
        self._azure_endpoint = azure_endpoint
        self._azure_deployment = azure_deployment
        super(AzureOpenAIResponsesChatGenerator, self).__init__(
            api_key=api_key,  # type: ignore[arg-type]
            model=self._azure_deployment,
            streaming_callback=streaming_callback,
            api_base_url=f"{self._azure_endpoint.rstrip('/')}/openai/v1",
            organization=organization,
            generation_kwargs=generation_kwargs,
            timeout=timeout,
            max_retries=max_retries,
            tools=tools,
            tools_strict=tools_strict,
            http_client_kwargs=http_client_kwargs,
        )

    def to_dict(self) -> dict[str, Any]:
        """
        Serialize this component to a dictionary.

        :returns:
            The serialized component as a dictionary.
        """
        callback_name = serialize_callable(self.streaming_callback) if self.streaming_callback else None

        # API key can be a secret or a callable
        serialized_api_key = (
            serialize_callable(self.api_key)
            if callable(self.api_key)
            else self.api_key.to_dict()
            if isinstance(self.api_key, Secret)
            else None
        )

        # If the response format is a Pydantic model, it's converted to OpenAI's JSON schema format.
        # If it's already a JSON schema, it's left as is.
        generation_kwargs = self.generation_kwargs.copy()
        response_format = generation_kwargs.get("response_format")
        if response_format and issubclass(response_format, BaseModel):
            json_schema = {
                "type": "json_schema",
                "json_schema": {
                    "name": response_format.__name__,
                    "strict": True,
                    "schema": to_strict_json_schema(response_format),
                },
            }
            generation_kwargs["response_format"] = json_schema

        # OpenAI/MCP tools are passed as a list of dictionaries
        serialized_tools: Union[dict[str, Any], list[dict[str, Any]], None]
        if self.tools and isinstance(self.tools, list) and isinstance(self.tools[0], dict):
            # mypy can't infer that self.tools is list[dict] here
            serialized_tools = self.tools  # type: ignore[assignment]
        else:
            serialized_tools = serialize_tools_or_toolset(self.tools)  # type: ignore[arg-type]

        return default_to_dict(
            self,
            azure_endpoint=self._azure_endpoint,
            api_key=serialized_api_key,
            azure_deployment=self._azure_deployment,
            streaming_callback=callback_name,
            organization=self.organization,
            generation_kwargs=generation_kwargs,
            timeout=self.timeout,
            max_retries=self.max_retries,
            tools=serialized_tools,
            tools_strict=self.tools_strict,
            http_client_kwargs=self.http_client_kwargs,
        )

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "AzureOpenAIResponsesChatGenerator":
        """
        Deserialize this component from a dictionary.

        :param data: The dictionary representation of this component.
        :returns:
            The deserialized component instance.
        """
        serialized_api_key = data["init_parameters"].get("api_key")
        # If it's a dict, it's most likely a Secret
        if isinstance(serialized_api_key, dict):
            deserialize_secrets_inplace(data["init_parameters"], keys=["api_key"])
        # If it's a str, it's most likely a callable
        elif isinstance(serialized_api_key, str):
            data["init_parameters"]["api_key"] = deserialize_callable(serialized_api_key)

        # We only deserialize the tools if they are Haystack tools,
        # because OpenAI tools are not serialized in the same way.
        tools = data["init_parameters"].get("tools")
        if tools and (
            isinstance(tools, dict)
            and tools.get("type") == "haystack.tools.toolset.Toolset"
            or isinstance(tools, list)
            and tools[0].get("type") == "haystack.tools.tool.Tool"
        ):
            deserialize_tools_or_toolset_inplace(data["init_parameters"], key="tools")

        init_params = data.get("init_parameters", {})
        serialized_callback_handler = init_params.get("streaming_callback")
        if serialized_callback_handler:
            data["init_parameters"]["streaming_callback"] = deserialize_callable(serialized_callback_handler)
        return default_from_dict(cls, data)
````
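The `api_key` round-trip above (serialize a callable token provider to a dotted path, restore it in `from_dict`) can be sketched in isolation with the standard library. The helper names and the dotted-path scheme below are simplified stand-ins for Haystack's `serialize_callable`/`deserialize_callable`, assumed here only for illustration:

```python
import importlib
from typing import Any, Callable, Optional, Union


def serialize_callable(fn: Callable[..., Any]) -> str:
    # Store a callable as a dotted "module.qualname" path.
    return f"{fn.__module__}.{fn.__qualname__}"


def deserialize_callable(path: str) -> Callable[..., Any]:
    # Resolve the dotted path back to the original callable.
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def token_provider() -> str:
    # Stand-in for a function returning an Azure Active Directory token.
    return "fake-aad-token"


def restore_api_key(serialized: Union[dict, str, None]) -> Optional[Any]:
    # Mirrors the from_dict branching: a dict is treated as a serialized
    # Secret, a string as a dotted path to a token-provider callable.
    if isinstance(serialized, dict):
        return serialized  # a real implementation would rebuild a Secret here
    if isinstance(serialized, str):
        return deserialize_callable(serialized)
    return None


path = serialize_callable(token_provider)
restored = restore_api_key(path)
print(restored())  # the restored callable yields the same token
```

The dict-versus-string check is what lets one `api_key` field carry either a `Secret` or an AD token provider through serialization.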
releasenotes/notes/add-azure-responses-api-1b2c990a060b09f5.yaml (24 added lines):
````yaml
---
features:
  - |
    Added the `AzureOpenAIResponsesChatGenerator`, a new component that integrates Azure OpenAI's Responses API into Haystack.
    This unlocks several advanced capabilities from the Responses API:
    - Allowing retrieval of concise summaries of the model's reasoning process.
    - Allowing the use of native OpenAI or MCP tool formats, along with Haystack Tool objects and Toolset instances.

    Example with reasoning and web search tool:
    ```python
    from haystack.components.generators.chat import AzureOpenAIResponsesChatGenerator
    from haystack.dataclasses import ChatMessage

    chat_generator = AzureOpenAIResponsesChatGenerator(
        azure_endpoint="https://example-resource.azure.openai.com/",
        azure_deployment="gpt-5-mini",
        generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}},
    )

    response = chat_generator.run(
        messages=[ChatMessage.from_user("What's Natural Language Processing?")]
    )
    print(response["replies"][0].text)
    ```
````