
Add native responses#622

Merged
xingyaoww merged 40 commits into main from response
Oct 4, 2025

Conversation

@enyst
Collaborator

@enyst enyst commented Oct 2, 2025

Native responses API, take two (or four 😅)

  • responses() stateless, with its reasoning sent back to the model and displayed
  • some preparation for stateful, for another PR

This PR proposes an approach where:

  • the Agent is aware of the two APIs, completion() and responses()
  • but I've tried to keep its changes minimal, refactoring out low-level details of parsing LLM responses, and LLM tool calls
  • llm.py implements both (but we will probably change this)
    • the code is not fully clean atm, but generally I've tried to sort of keep it small and "symmetrical", so it's easy to refactor out
  • message.py, small new interface:
    • from_llm_chat_message
    • from_llm_responses_output
    • to_chat_dict
    • to_responses_dict
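The four methods above keep the two wire formats symmetrical. A minimal sketch of the shape of that interface (illustrative stand-ins only; the SDK's message.py handles far more than plain text):

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Toy model of the symmetrical conversion interface described above."""

    role: str
    content: str

    @classmethod
    def from_llm_chat_message(cls, msg: dict) -> "Message":
        # Chat Completions shape: {"role": ..., "content": ...}
        return cls(role=msg["role"], content=msg.get("content") or "")

    @classmethod
    def from_llm_responses_output(cls, output: list[dict]) -> "Message":
        # Responses API returns a list of output items; join assistant text parts.
        text = "".join(
            part.get("text", "")
            for item in output
            if item.get("type") == "message"
            for part in item.get("content", [])
        )
        return cls(role="assistant", content=text)

    def to_chat_dict(self) -> dict:
        return {"role": self.role, "content": self.content}

    def to_responses_dict(self) -> dict:
        return {
            "role": self.role,
            "content": [{"type": "input_text", "text": self.content}],
        }
```

The point of the split is that each API gets one parse method and one serialize method, so neither format leaks into the other's code path.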

Agent Server images for this PR

GHCR package: https://github.com/All-Hands-AI/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Base Image Docs / Tags
golang golang:1.21-bookworm Link
java eclipse-temurin:17-jdk Link
python nikolaik/python-nodejs:python3.12-nodejs22 Link

Pull (multi-arch manifest)

docker pull ghcr.io/all-hands-ai/agent-server:400c58d-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-400c58d-python \
  ghcr.io/all-hands-ai/agent-server:400c58d-python

All tags pushed for this build

ghcr.io/all-hands-ai/agent-server:400c58d-golang
ghcr.io/all-hands-ai/agent-server:v1.0.0_golang_tag_1.21-bookworm_golang
ghcr.io/all-hands-ai/agent-server:400c58d-java
ghcr.io/all-hands-ai/agent-server:v1.0.0_eclipse-temurin_tag_17-jdk_java
ghcr.io/all-hands-ai/agent-server:400c58d-python
ghcr.io/all-hands-ai/agent-server:v1.0.0_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_python

The 400c58d tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.
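To confirm which architectures sit behind the tag, the standard Docker CLI can inspect the manifest directly:

```shell
# Lists the platform entries (amd64/arm64) in the multi-arch manifest
docker manifest inspect ghcr.io/all-hands-ai/agent-server:400c58d-python
```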

@enyst
Collaborator Author

enyst commented Oct 2, 2025

@OpenHands Extract NonExecutableActionEvent from this PR, make a new branch from main, and apply it on it. Then make a PR.

Understand well what role it has, and understand that on main branch we don't have Responses path. Make sure that NonExecutableActionEvent fulfills its role correctly on Completions path. Add unit tests to that new PR.

@openhands-ai

openhands-ai bot commented Oct 2, 2025

I'm on it! enyst can track my progress at all-hands.dev

@enyst enyst marked this pull request as draft October 3, 2025 00:34
@enyst
Collaborator Author

enyst commented Oct 3, 2025

This works again! GPT-5-mini thinks for a while when deleting a file
(this also shows the other issue)
(screenshot omitted)

@OpenHands OpenHands deleted a comment from openhands-ai bot Oct 3, 2025
@enyst enyst marked this pull request as ready for review October 3, 2025 08:43
@enyst
Collaborator Author

enyst commented Oct 3, 2025

This works! View and send back reasoning on the stateless variant, and it cleared up some of the code. At a price (llm.py is still a bit unhappy, and telemetry.py will need another pass), but IMHO it's mostly easier to read and follow the execution path for one API or the other.

I kinda like that the Agent only chooses; all the rest of its code is now without ifs and without litellm/completions code.

message.py saw a heavier lift, @xingyaoww I'd love to know what you think. I admit I like to look at a random screen of code and know exactly what belongs where. e.g. tiny lost bits like _add_tool_call_keys were not obvious (once we have responses, are they executed on responses?), so this PR took them out. I still want to reintroduce the types for parameters, but I feel it's probably best done separately.

The agent is working on porting tests.

I think llm.py needs a heavier refactoring... I'm thinking of making a facade for normalize_kwargs and format_messages_for(), and taking them out; not sure it'd reduce redundancies, but it's more for readability and more reliance on types. 🤔

This PR doesn't do a conversion; no conversion the other way around either

Collaborator

@xingyaoww xingyaoww left a comment


Did a quick skim through! I actually think the structure LGTM and makes sense! (and agree that we might need to do a good refactor afterwards)

Happy to take another look when we merge that NonExecutableActionEvent and when tests are in and passed!

🙏

enyst and others added 2 commits October 4, 2025 01:42
Co-authored-by: openhands <openhands@all-hands.dev>
@OpenHands OpenHands deleted a comment from openhands-ai bot Oct 4, 2025
@OpenHands OpenHands deleted a comment from openhands-ai bot Oct 4, 2025
@enyst
Collaborator Author

enyst commented Oct 4, 2025

OpenHands-GPT-5 here. Concise summary of test coverage for the Responses API work:

  • Agent.step routing/gating
    • Override gate: uses_responses_api() True -> responses(); False -> completion(); assistant MessageEvent emitted
    • Model gate: gpt-5-mini-2025-08-07 -> responses(); gpt-4o-mini -> completion()
  • Responses API parsing & kwargs
    • from_llm_responses_output: assistant text join, function call normalization (fc_ prefix), reasoning summary mapping
    • to_responses_dict: fc_ prefix and arg stringification
    • _normalize_responses_kwargs: temperature=1.0 for responses; include adds reasoning.encrypted_content when enabled; store defaults False; max_output_tokens passthrough
    • End-to-end responses() (patched) verifies Telemetry records a usage entry
  • Message & tool coverage
    • from_llm_chat_message edge cases; _coerce_content/content_to_str; system to_responses_value
    • ToolBase.to_responses_tool: security_risk gating tests
  • Status
    • Pre-commit clean (ruff, pyright); full pytest green: 1507 passed, 35 warnings
  • Follow-ups (optional)
    • Telemetry tests for cached_tokens and reasoning_tokens mapping; audit remaining openai vs litellm imports

@neubig
Contributor

neubig commented Oct 4, 2025

Thanks for this! I'll try to take a look sometime in these two days!

enyst and others added 2 commits October 4, 2025 16:40
…ol calls

- Remove NonExecutableActionEvent usage and exports
- Represent non-exec tool calls as ActionEvent(action=None)
- Keep ActionEvent.visualize minimal for action=None; preserve __str__
- Update View.filter_unmatched_tool_calls to pair by tool_call_id
- Agent emits ActionEvent(action=None) then AgentErrorEvent for missing tool/invalid args
- Update tests and visualizer accordingly

Co-authored-by: openhands <openhands@all-hands.dev>
@enyst
Collaborator Author

enyst commented Oct 4, 2025

> Did a quick skim through! I actually think the structure LGTM and makes sense! (and agree that we might need to do a good refactor afterwards)
>
> Happy to take another look when we merge that NonExecutableActionEvent and when tests are in and passed!
>
> 🙏

Merged NEA and tests!

Collaborator

@xingyaoww xingyaoww left a comment


I made some changes along the way but this PR LGTM! There are still some changes for simplification/cleanup, but we can leave them for future PRs.

One last issue I'm trying to figure out now: https://community.openai.com/t/schema-additionalproperties-must-be-false-when-strict-is-true/929996

# Keep the tasks short for demo purposes
^^^^^^^^^^^^^^^^^^

File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/openhands/sdk/conversation/impl/local_conversation.py", line 244, in run
self.agent.step(self._state, on_event=self._on_event)
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/openhands/sdk/agent/agent.py", line 209, in step
raise e
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/openhands/sdk/agent/agent.py", line 178, in step
llm_response = self.llm.responses(
^^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/openhands/sdk/llm/llm.py", line 593, in responses
resp: ResponsesAPIResponse = _one_attempt()
^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 338, in wrapped_f
return copy(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 477, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 378, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 400, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.12/3.12.11/Frameworks/Python.framework/Versions/3.12/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/tenacity/__init__.py", line 480, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/openhands/sdk/llm/llm.py", line 569, in _one_attempt
ret = litellm_responses(
^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/litellm/utils.py", line 1343, in wrapper
raise e
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/litellm/utils.py", line 1218, in wrapper
result = original_function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/litellm/responses/main.py", line 655, in responses
raise litellm.exception_type(
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2301, in exception_type
raise e
File "/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 391, in exception_type
raise BadRequestError(
litellm.exceptions.BadRequestError: litellm.BadRequestError: OpenAIException - {"error":{"message":"litellm.BadRequestError: OpenAIException - {\n "error": {\n "message": "Invalid schema for function 'execute_bash': In context=(), 'additionalProperties' is required to be supplied and to be false.",\n "type": "invalid_request_error",\n "param": "tools[0].parameters",\n "code": "invalid_function_parameters"\n }\n}. Received Model Group=gpt-5-mini\nAvailable Model Group Fallbacks=None","type":null,"param":null,"code":"400"}}
/Users/xingyaow/Projects/All-Hands-AI/openhands-v1-dev/agent-sdk.worktree/worktree2/.venv/lib/python3.12/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:66: DeprecationWarning: There is no current event loop
loop = asyncio.get_event_loop()
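The BadRequestError at the bottom is OpenAI's strict function-calling rule: when a tool is sent with "strict": true, every object node in its parameters schema must carry "additionalProperties": false. A hypothetical recursive fixer (not part of the SDK; `close_schema` is an illustrative helper name) would walk the schema and close every object:

```python
def close_schema(node):
    """Recursively set additionalProperties: false on every object node,
    which OpenAI requires for strict-mode function tools."""
    if isinstance(node, dict):
        if node.get("type") == "object":
            node.setdefault("additionalProperties", False)
        for value in node.values():
            close_schema(value)
    elif isinstance(node, list):
        for item in node:
            close_schema(item)
    return node
```

Running this over each tool's parameters before the request would satisfy the `'additionalProperties' is required to be supplied and to be false` check for `execute_bash`.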


return message_dict

# Backward-compatible alias
Collaborator


Let's clean this up in future PRs? We can just remove this?

)

@classmethod
def from_litellm_message(cls, message: LiteLLMMessage) -> "Message":
Collaborator


Let's remove this too in follow up PR

Collaborator


@OpenHands create a github issue that we should remove all definition and usage of to_llm_dict and from_litellm_message in favor of the new API.


I'm on it! xingyaoww can track my progress at all-hands.dev


I've successfully created a GitHub issue to track the removal of deprecated API methods in favor of the new responses API.

Summary

GitHub Issue Created: #639 - Remove deprecated to_llm_dict() and from_litellm_message() in favor of new responses API

The issue documents:

  1. The new API introduced in PR Add native responses #622:

    • from_llm_chat_message() - Parse Chat Completions messages
    • from_llm_responses_output() - Parse Responses API output
    • to_chat_dict() - Serialize to Chat Completions format
    • to_responses_dict() - Serialize to Responses API format
  2. Deprecated methods to remove:

    • from_litellm_message() - Backward-compatible alias
    • to_llm_dict() - Old serialization method
  3. Scope of work: Identified 5 files across production code, tests, and fixtures that need updating

  4. Migration tasks: Checklist of specific actions needed to complete the migration

The issue has been labeled with refactoring and technical-debt tags for easy filtering and prioritization.


@enyst
Collaborator Author

enyst commented Oct 4, 2025

@xingyaoww If it helps, actually we had it unspecified most of the time
Trying now 👀

@enyst
Collaborator Author

enyst commented Oct 4, 2025

@xingyaoww Works for me with strict=False!

I think that’s fine?
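For context, strict mode is an opt-in flag on the function-tool payload, so the workaround is simply to leave it off. A sketch of the tool shape (field names follow OpenAI's function-tool format for the Responses API; the schema itself is a toy example):

```python
# With "strict": False, OpenAI no longer requires every object node to
# declare "additionalProperties": false, which sidesteps the error above.
tool = {
    "type": "function",
    "name": "execute_bash",
    "strict": False,
    "parameters": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}
```

The trade-off is that non-strict tools lose the guarantee that model-produced arguments exactly match the schema.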

@xingyaoww
Collaborator

xingyaoww commented Oct 4, 2025

@enyst yeah, that's fine, I was trying to get it to work: 4661859

But it is VERY tricky to get the MCP schema to work :( In the meantime I discovered another bug where the "kind" field of the Action input schema somehow shows up in "properties" 😓 (unrelated to this PR)

Collaborator

@xingyaoww xingyaoww left a comment


Confirmed it works locally for me! Let's get this in!

@xingyaoww
Collaborator

@OpenHands create another github issue that Tool.to_mcp_schema() outputs an MCP schema whose "properties" contains "kind", which is NOT supposed to be in the tool schema we send to the LLM.

We should consider keeping kind as a Pydantic field, and just removing it in the to_mcp_schema() generation in openhands/sdk/tool/schema.py. Also, we can have Schema directly inherit from DiscriminatedUnionMixin instead of BaseModel, so that it makes sense to exclude the kind field in the same class.

@openhands-ai

openhands-ai bot commented Oct 4, 2025

I'm on it! xingyaoww can track my progress at all-hands.dev

@openhands-ai

openhands-ai bot commented Oct 4, 2025

Summary

I've successfully created a GitHub issue for the problem you described:

Issue #640: Tool.to_mcp_schema() includes 'kind' field in properties which should not be sent to LLMs

What the issue documents:

  1. Problem: The to_mcp_schema() method currently outputs schemas that include a kind field in the properties, which is an internal Pydantic discriminator field that should NOT be included in tool schemas sent to LLMs.

  2. Root Cause:

    • Action inherits from both Schema and DiscriminatedUnionMixin
    • DiscriminatedUnionMixin defines kind as a Pydantic field
    • to_mcp_schema() uses model_json_schema() which includes all fields
    • The _process_schema_node() function doesn't filter out kind
  3. Proposed Solutions:

    • Approach 1: Modify Schema.to_mcp_schema() in openhands/sdk/tool/schema.py to explicitly exclude the kind field from the output
    • Approach 2: Refactor Schema to directly inherit from DiscriminatedUnionMixin instead of BaseModel, making it clearer that all schemas have the discriminator and should exclude it
  4. Verification: I confirmed the issue exists by testing it - the MCP schema currently shows dict_keys(['kind', 'command']) when it should only show dict_keys(['command']).

The issue has been labeled with bug and tool tags for proper categorization.
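Approach 1 amounts to post-processing the generated JSON schema before it is sent as a tool definition. A small sketch of that (the `mcp_schema` dict mimics what `model_json_schema()` might emit for an Action with a discriminator; the helper name and shape are illustrative, not the SDK's code):

```python
def strip_kind(schema: dict) -> dict:
    """Remove the internal 'kind' discriminator from a generated tool schema."""
    schema.get("properties", {}).pop("kind", None)
    if "kind" in schema.get("required", []):
        schema["required"].remove("kind")
    return schema


# Roughly what to_mcp_schema() currently produces for an Action subclass:
mcp_schema = {
    "type": "object",
    "properties": {
        "kind": {"const": "ExecuteBashAction"},  # Pydantic discriminator field
        "command": {"type": "string"},
    },
    "required": ["kind", "command"],
}
```

After `strip_kind(mcp_schema)`, the properties contain only `command`, matching the expected `dict_keys(['command'])` from the verification step above.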


@xingyaoww xingyaoww merged commit 1a715ef into main Oct 4, 2025
14 checks passed
@xingyaoww xingyaoww deleted the response branch October 4, 2025 21:35
@simonrosenberg
Collaborator

@enyst it seems that gpt5-mini is crashing because of this fix, because GPT-5 does NOT natively support the Responses API.

@enyst
Collaborator Author

enyst commented Oct 8, 2025

I suspect LiteLLM has some wonky criteria where it doesn't recognize "litellm_proxy/openai/gpt-5" as supporting Responses, but it does recognize "openai/gpt-5". So it reroutes the first.

We can look more into it tomorrow! I'm pretty sure GPT-5 does natively support the Responses API. The call does go to OpenAI on the /responses endpoint, and OpenAI documents support, e.g.
https://platform.openai.com/docs/guides/migrate-to-responses#responses-benefits

@neubig neubig mentioned this pull request Oct 16, 2025


5 participants