Merged
67 commits
a0ccca9 Add working ChatGenerator (Amnah199, Sep 22, 2025)
8363ae6 rename (Amnah199, Sep 22, 2025)
eba8123 Merge branch 'main' of https://github.com/deepset-ai/haystack into op… (Amnah199, Sep 24, 2025)
18ce0e0 Improve and add live tests (Amnah199, Sep 24, 2025)
ba14b18 Updates (Amnah199, Oct 8, 2025)
eeec152 Update the tests (Amnah199, Oct 9, 2025)
64052f7 Fix errors (Amnah199, Oct 9, 2025)
ef4d0a7 Add release notes (Amnah199, Oct 10, 2025)
149a47e Merge branch 'main' into openai-responses (Amnah199, Oct 10, 2025)
f3b41f2 Merge branch 'main' into openai-responses (Amnah199, Oct 10, 2025)
50d5feb Add support for openai tools (Amnah199, Oct 10, 2025)
d1bea24 Merge branch 'openai-responses' of https://github.com/deepset-ai/hays… (Amnah199, Oct 10, 2025)
f2ba387 Remove openai tools test that times out (Amnah199, Oct 12, 2025)
830d086 fix tool calls (Amnah199, Oct 13, 2025)
2c866f9 Update release notes (Amnah199, Oct 13, 2025)
228a21b PR comments (Amnah199, Oct 14, 2025)
1b0ac65 remove edits to chat message (Amnah199, Oct 14, 2025)
96e9343 Add a test (Amnah199, Oct 14, 2025)
515474a PR comments (Amnah199, Oct 15, 2025)
9d8aa42 Send back reasoning to model (Amnah199, Oct 15, 2025)
b1d6e80 Merge branch 'main' of https://github.com/deepset-ai/haystack into op… (Amnah199, Oct 21, 2025)
9e414a9 Merge branch 'main' of https://github.com/deepset-ai/haystack into op… (Amnah199, Oct 22, 2025)
00e6013 Fix reasoning support (Amnah199, Oct 22, 2025)
419ec36 Add reasoning support (Amnah199, Oct 22, 2025)
2a7f342 Fix tests (Amnah199, Oct 23, 2025)
76db039 Refactor (Amnah199, Oct 23, 2025)
0bab968 Simplify methods (Amnah199, Oct 23, 2025)
8c8e031 Fix mypy (Amnah199, Oct 24, 2025)
9107989 Stream responses, tool calls etc (Amnah199, Oct 24, 2025)
fe07300 Update docstrings (Amnah199, Oct 24, 2025)
c8083ba Fix errors while using in Agent (Amnah199, Oct 30, 2025)
40406aa Fix call_id and fc_id (Amnah199, Oct 30, 2025)
88760c0 Merge branch 'main' into openai-responses (Amnah199, Oct 31, 2025)
d8949fc Update tests (Amnah199, Oct 31, 2025)
150b94a Merge branch 'openai-responses' of https://github.com/deepset-ai/hays… (Amnah199, Oct 31, 2025)
ed299f3 Updates (Amnah199, Nov 3, 2025)
c973c8d Add extra in ToolCall and ToolCallDelta (Amnah199, Nov 4, 2025)
a0bb425 Update streaming chunk (Amnah199, Nov 4, 2025)
412c26d Fix tests and linting (Amnah199, Nov 4, 2025)
41164ec Merge branch 'main' into openai-responses (Amnah199, Nov 5, 2025)
eb04021 Update api key resolve (Amnah199, Nov 5, 2025)
d5c1717 Merge branch 'openai-responses' of https://github.com/deepset-ai/hays… (Amnah199, Nov 5, 2025)
0ba664e PR comments (Amnah199, Nov 5, 2025)
90f8da5 PR comments (Amnah199, Nov 5, 2025)
85064ee Updates (sjrl, Nov 6, 2025)
c78e972 some type fixes and also make sure to use flatten_tools_or_toolsets (sjrl, Nov 6, 2025)
6cae697 fix docs (sjrl, Nov 6, 2025)
ac2cc2f Fix streaming chunks so assistant header is properly captured (sjrl, Nov 6, 2025)
59f2064 Add finish_reason and update test (sjrl, Nov 6, 2025)
b8d4f02 Skip streaming + pydantic model test b/c of known issue in openai pyt… (sjrl, Nov 6, 2025)
334db3d Fix pylint (sjrl, Nov 6, 2025)
3327831 Initial commit adding AzureOpenAIResponsesChatGenerator support (sjrl, Nov 4, 2025)
50ffe4a fix unit test (sjrl, Nov 4, 2025)
3d51008 Starting to refactor to use new recommended way to connect to Azure O… (sjrl, Nov 4, 2025)
275dcae Updates (sjrl, Nov 5, 2025)
f53ae87 Fix tests (sjrl, Nov 5, 2025)
eb4f3b0 More tests (sjrl, Nov 5, 2025)
17c0595 fix integration tests (sjrl, Nov 5, 2025)
a1e9a95 Add to docs (sjrl, Nov 5, 2025)
2954d3c Don't need warm_up method anymore (sjrl, Nov 6, 2025)
94508be fix unit test (sjrl, Nov 6, 2025)
da712b7 Fix pylint (sjrl, Nov 6, 2025)
dfba282 fix docstrings (sjrl, Nov 6, 2025)
839aaec fix mypy typing (sjrl, Nov 6, 2025)
a18284b Merge branch 'main' of github.com:deepset-ai/haystack into azure-open… (sjrl, Nov 6, 2025)
3463a8d fix reno (sjrl, Nov 6, 2025)
559a8fe Add another unit test (sjrl, Nov 6, 2025)
1 change: 1 addition & 0 deletions docs/pydoc/config/generators_api.yml
@@ -9,6 +9,7 @@ loaders:
"openai",
"openai_dalle",
"chat/azure",
"chat/azure_responses",
"chat/hugging_face_local",
"chat/hugging_face_api",
"chat/openai",
1 change: 1 addition & 0 deletions docs/pydoc/config_docusaurus/generators_api.yml
@@ -9,6 +9,7 @@ loaders:
"openai",
"openai_dalle",
"chat/azure",
"chat/azure_responses",
"chat/hugging_face_local",
"chat/hugging_face_api",
"chat/openai",
2 changes: 2 additions & 0 deletions haystack/components/generators/chat/__init__.py
@@ -11,13 +11,15 @@
"openai": ["OpenAIChatGenerator"],
"openai_responses": ["OpenAIResponsesChatGenerator"],
"azure": ["AzureOpenAIChatGenerator"],
"azure_responses": ["AzureOpenAIResponsesChatGenerator"],
"hugging_face_local": ["HuggingFaceLocalChatGenerator"],
"hugging_face_api": ["HuggingFaceAPIChatGenerator"],
"fallback": ["FallbackChatGenerator"],
}

if TYPE_CHECKING:
from .azure import AzureOpenAIChatGenerator as AzureOpenAIChatGenerator
from .azure_responses import AzureOpenAIResponsesChatGenerator as AzureOpenAIResponsesChatGenerator
from .fallback import FallbackChatGenerator as FallbackChatGenerator
from .hugging_face_api import HuggingFaceAPIChatGenerator as HuggingFaceAPIChatGenerator
from .hugging_face_local import HuggingFaceLocalChatGenerator as HuggingFaceLocalChatGenerator
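The `__init__.py` hunk above registers the new generator both in a name-to-module table and in a `TYPE_CHECKING` block. A minimal sketch of how such a table typically backs lazy attribute resolution (PEP 562 style); the real package maps names like `AzureOpenAIResponsesChatGenerator` to `haystack.components.generators.chat.azure_responses`, but stdlib modules are used here so the sketch stays self-contained:

```python
import importlib
from typing import Any

# Hypothetical lazy-import table: attribute name -> module that defines it.
# Stdlib modules stand in for the real haystack submodules.
_LAZY_IMPORTS = {
    "sqrt": "math",
    "OrderedDict": "collections",
}


def lazy_getattr(name: str) -> Any:
    """Resolve `name` by importing its defining module on first access."""
    try:
        module = importlib.import_module(_LAZY_IMPORTS[name])
    except KeyError:
        raise AttributeError(f"module has no attribute {name!r}") from None
    return getattr(module, name)
```

In a real package this function is named `__getattr__` at module top level, so the heavy submodule import only happens when the attribute is first looked up, while the `TYPE_CHECKING` imports keep static type checkers and IDEs informed.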
234 changes: 234 additions & 0 deletions haystack/components/generators/chat/azure_responses.py
@@ -0,0 +1,234 @@
# SPDX-FileCopyrightText: 2022-present deepset GmbH <info@deepset.ai>
#
# SPDX-License-Identifier: Apache-2.0

import os
from typing import Any, Awaitable, Callable, Optional, Union

from openai.lib._pydantic import to_strict_json_schema
from pydantic import BaseModel

from haystack import component, default_from_dict, default_to_dict
from haystack.components.generators.chat import OpenAIResponsesChatGenerator
from haystack.dataclasses.streaming_chunk import StreamingCallbackT
from haystack.tools import ToolsType, deserialize_tools_or_toolset_inplace, serialize_tools_or_toolset
from haystack.utils import Secret, deserialize_callable, deserialize_secrets_inplace, serialize_callable


@component
class AzureOpenAIResponsesChatGenerator(OpenAIResponsesChatGenerator):
"""
Completes chats using OpenAI's Responses API on Azure.

It works with the gpt-5 and o-series models and supports streaming responses
from the OpenAI API. It uses the [ChatMessage](https://docs.haystack.deepset.ai/docs/chatmessage)
format for input and output.

You can customize how the text is generated by passing parameters to the
OpenAI API. Use the `**generation_kwargs` argument when you initialize
the component or when you run it. Any parameter that works with the
OpenAI `responses.create` endpoint will also work here.

For details on OpenAI API parameters, see
[OpenAI documentation](https://platform.openai.com/docs/api-reference/responses).

### Usage example

```python
from haystack.components.generators.chat import AzureOpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

messages = [ChatMessage.from_user("What's Natural Language Processing?")]

client = AzureOpenAIResponsesChatGenerator(
azure_endpoint="https://example-resource.azure.openai.com/",
generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}}
)
response = client.run(messages)
print(response)
```
"""

# ruff: noqa: PLR0913
def __init__(
self,
*,
api_key: Union[Secret, Callable[[], str], Callable[[], Awaitable[str]]] = Secret.from_env_var(
"AZURE_OPENAI_API_KEY", strict=False
),
azure_endpoint: Optional[str] = None,
azure_deployment: str = "gpt-5-mini",
streaming_callback: Optional[StreamingCallbackT] = None,
organization: Optional[str] = None,
generation_kwargs: Optional[dict[str, Any]] = None,
timeout: Optional[float] = None,
max_retries: Optional[int] = None,
tools: Optional[ToolsType] = None,
tools_strict: bool = False,
http_client_kwargs: Optional[dict[str, Any]] = None,
):
"""
Initialize the AzureOpenAIResponsesChatGenerator component.

:param api_key: The API key to use for authentication. Can be:
- A `Secret` object containing the API key.
- A `Secret` object containing the [Azure Active Directory token](https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id).
- A function that returns an Azure Active Directory token.
:param azure_endpoint: The endpoint of the deployed model, for example `"https://example-resource.azure.openai.com/"`.
:param azure_deployment: The deployment of the model, usually the model name.
:param organization: Your organization ID, defaults to `None`. For help, see
[Setting up your organization](https://platform.openai.com/docs/guides/production-best-practices/setting-up-your-organization).
:param streaming_callback: A callback function called when a new token is received from the stream.
It accepts [StreamingChunk](https://docs.haystack.deepset.ai/docs/data-classes#streamingchunk)
as an argument.
:param timeout: Timeout for OpenAI client calls. If not set, it defaults to either the
`OPENAI_TIMEOUT` environment variable, or 30 seconds.
:param max_retries: Maximum number of retries to contact OpenAI after an internal error.
If not set, it defaults to the `OPENAI_MAX_RETRIES` environment variable, or 5.
:param generation_kwargs: Other parameters to use for the model. These parameters are sent
directly to the OpenAI endpoint.
See OpenAI [documentation](https://platform.openai.com/docs/api-reference/responses) for
more details.
Some of the supported parameters:
- `temperature`: What sampling temperature to use. Higher values like 0.8 will make the output more random,
while lower values like 0.2 will make it more focused and deterministic.
- `top_p`: An alternative to sampling with temperature, called nucleus sampling, where the model
considers the results of the tokens with top_p probability mass. For example, 0.1 means only the tokens
comprising the top 10% probability mass are considered.
- `previous_response_id`: The ID of the previous response.
Use this to create multi-turn conversations.
- `text_format`: A JSON schema or a Pydantic model that enforces the structure of the model's response.
If provided, the output will always be validated against this
format (unless the model returns a tool call).
For details, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs).
Notes:
- This parameter accepts Pydantic models and JSON schemas for the latest models, starting from GPT-4o.
Older models only support a basic version of structured outputs through `{"type": "json_object"}`.
For detailed information on JSON mode, see the [OpenAI Structured Outputs documentation](https://platform.openai.com/docs/guides/structured-outputs#json-mode).
- For structured outputs with streaming,
the `text_format` must be a JSON schema and not a Pydantic model.
- `reasoning`: A dictionary of parameters for reasoning. For example:
- `summary`: The summary of the reasoning.
- `effort`: The level of effort to put into the reasoning. Can be `low`, `medium` or `high`.
- `generate_summary`: Whether to generate a summary of the reasoning.
Note: OpenAI does not return the reasoning tokens, but a summary of the reasoning can be viewed if it's enabled.
For details, see the [OpenAI Reasoning documentation](https://platform.openai.com/docs/guides/reasoning).
:param tools:
A list of Tool and/or Toolset objects, or a single Toolset for which the model can prepare calls.
:param tools_strict:
Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
:param http_client_kwargs:
A dictionary of keyword arguments to configure a custom `httpx.Client` or `httpx.AsyncClient`.
For more information, see the [HTTPX documentation](https://www.python-httpx.org/api/#client).
"""
azure_endpoint = azure_endpoint or os.getenv("AZURE_OPENAI_ENDPOINT")
if azure_endpoint is None:
raise ValueError(
"You must provide `azure_endpoint` or set the `AZURE_OPENAI_ENDPOINT` environment variable."
)
self._azure_endpoint = azure_endpoint
self._azure_deployment = azure_deployment
super().__init__(
api_key=api_key, # type: ignore[arg-type]
model=self._azure_deployment,
streaming_callback=streaming_callback,
api_base_url=f"{self._azure_endpoint.rstrip('/')}/openai/v1",
organization=organization,
generation_kwargs=generation_kwargs,
timeout=timeout,
max_retries=max_retries,
tools=tools,
tools_strict=tools_strict,
http_client_kwargs=http_client_kwargs,
)

def to_dict(self) -> dict[str, Any]:
"""
Serialize this component to a dictionary.

:returns:
The serialized component as a dictionary.
"""
callback_name = serialize_callable(self.streaming_callback) if self.streaming_callback else None

# API key can be a secret or a callable
serialized_api_key = (
serialize_callable(self.api_key)
if callable(self.api_key)
else self.api_key.to_dict()
if isinstance(self.api_key, Secret)
else None
)

# If the response format is a Pydantic model, it's converted to OpenAI's JSON schema format
# If it's already a JSON schema, it's left as is
generation_kwargs = self.generation_kwargs.copy()
response_format = generation_kwargs.get("response_format")
if response_format and isinstance(response_format, type) and issubclass(response_format, BaseModel):
json_schema = {
"type": "json_schema",
"json_schema": {
"name": response_format.__name__,
"strict": True,
"schema": to_strict_json_schema(response_format),
},
}
generation_kwargs["response_format"] = json_schema

# OpenAI/MCP tools are passed as list of dictionaries
serialized_tools: Union[dict[str, Any], list[dict[str, Any]], None]
if self.tools and isinstance(self.tools, list) and isinstance(self.tools[0], dict):
# mypy can't infer that self.tools is list[dict] here
serialized_tools = self.tools # type: ignore[assignment]
else:
serialized_tools = serialize_tools_or_toolset(self.tools) # type: ignore[arg-type]

return default_to_dict(
self,
azure_endpoint=self._azure_endpoint,
api_key=serialized_api_key,
azure_deployment=self._azure_deployment,
streaming_callback=callback_name,
organization=self.organization,
generation_kwargs=generation_kwargs,
timeout=self.timeout,
max_retries=self.max_retries,
tools=serialized_tools,
tools_strict=self.tools_strict,
http_client_kwargs=self.http_client_kwargs,
)

@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AzureOpenAIResponsesChatGenerator":
"""
Deserialize this component from a dictionary.

:param data: The dictionary representation of this component.
:returns:
The deserialized component instance.
"""
serialized_api_key = data["init_parameters"].get("api_key")
# If it's a dict, it's most likely a Secret
if isinstance(serialized_api_key, dict):
deserialize_secrets_inplace(data["init_parameters"], keys=["api_key"])
# If it's a str, it's most likely a callable
elif isinstance(serialized_api_key, str):
data["init_parameters"]["api_key"] = deserialize_callable(serialized_api_key)

# we only deserialize the tools if they are haystack tools
# because openai tools are not serialized in the same way
tools = data["init_parameters"].get("tools")
if tools and (
(isinstance(tools, dict) and tools.get("type") == "haystack.tools.toolset.Toolset")
or (isinstance(tools, list) and tools[0].get("type") == "haystack.tools.tool.Tool")
):
deserialize_tools_or_toolset_inplace(data["init_parameters"], key="tools")

init_params = data.get("init_parameters", {})
serialized_callback_handler = init_params.get("streaming_callback")
if serialized_callback_handler:
data["init_parameters"]["streaming_callback"] = deserialize_callable(serialized_callback_handler)
return default_from_dict(cls, data)
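The `from_dict` method above only deserializes tools that are Haystack `Tool`/`Toolset` dictionaries, passing native OpenAI/MCP tool dictionaries through untouched. A minimal self-contained sketch of that type-dispatch predicate (the function name is hypothetical; the `type` strings are taken from the diff):

```python
from typing import Any, Optional, Union


def is_haystack_tools(tools: Union[dict[str, Any], list[dict[str, Any]], None]) -> bool:
    """Return True when the serialized `tools` payload is a Haystack Toolset
    dict or a list of Haystack Tool dicts; native OpenAI/MCP tool dicts have
    a different shape and should be passed through unchanged."""
    if not tools:
        return False
    return (
        (isinstance(tools, dict) and tools.get("type") == "haystack.tools.toolset.Toolset")
        or (isinstance(tools, list) and tools[0].get("type") == "haystack.tools.tool.Tool")
    )
```

Note the explicit parentheses around each branch: `and` binds tighter than `or` in Python, so the unparenthesized form in the diff behaves the same, but grouping makes the intent unambiguous.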
@@ -0,0 +1,24 @@
---
features:
- |
Added the `AzureOpenAIResponsesChatGenerator`, a new component that integrates Azure OpenAI's Responses API into Haystack.
This unlocks several advanced capabilities from the Responses API:
- Retrieval of concise summaries of the model's reasoning process.
- Use of native OpenAI or MCP tool formats, alongside Haystack Tool objects and Toolset instances.

Example with reasoning and web search tool:
```python
from haystack.components.generators.chat import AzureOpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

chat_generator = AzureOpenAIResponsesChatGenerator(
azure_endpoint="https://example-resource.azure.openai.com/",
azure_deployment="gpt-5-mini",
generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}},
)

response = chat_generator.run(
messages=[ChatMessage.from_user("What's Natural Language Processing?")]
)
print(response["replies"][0].text)
```
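The `azure_endpoint` shown in the example is resolved by the component's `__init__` from the argument or the `AZURE_OPENAI_ENDPOINT` environment variable, and the OpenAI-compatible base URL is derived by appending `/openai/v1`. A stdlib sketch of that resolution logic (the helper name is hypothetical; the behavior mirrors the diff):

```python
import os
from typing import Optional


def resolve_azure_base_url(azure_endpoint: Optional[str] = None) -> str:
    """Resolve the Azure endpoint (argument first, then env var) and derive
    the OpenAI-compatible base URL, as the generator's __init__ does."""
    endpoint = azure_endpoint or os.getenv("AZURE_OPENAI_ENDPOINT")
    if endpoint is None:
        raise ValueError(
            "You must provide `azure_endpoint` or set the `AZURE_OPENAI_ENDPOINT` environment variable."
        )
    # Normalize trailing slashes before appending the API path.
    return f"{endpoint.rstrip('/')}/openai/v1"
```

Routing through the `/openai/v1` path is what lets the Azure component reuse the parent `OpenAIResponsesChatGenerator` via `api_base_url` instead of a separate Azure client.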