[Frontend] OpenAI Responses API supports Tool/Function calling - non-harmony #26874
Conversation
Documentation preview: https://vllm--26874.org.readthedocs.build/en/26874/
I think we are quite aligned that we don't want the Responses API to only work for gpt-oss (#26703) ;) Could you share an e2e vllm serve command showing how you use this with other models?
Hi, @yeqcharlotte, I've provided an example in the examples directory.
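For readers following along, here is a minimal end-to-end sketch of the flow under discussion. It is not the PR's bundled example; the model choice and the hermes parser flag are assumptions and should be adjusted for whichever tool-parser-supported model you serve:

```python
# Assumed serve command (adjust model and parser for your setup):
#   vllm serve Qwen/Qwen3-8B --enable-auto-tool-choice --tool-call-parser hermes
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Responses-API function tools are flat: "name" sits at the top level,
# unlike the nested Chat Completions tool schema.
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = client.responses.create(
    model="Qwen/Qwen3-8B",
    input="What's the weather in Paris?",
    tools=tools,
)

# Tool calls come back as function_call items in response.output.
for item in response.output:
    if item.type == "function_call":
        print(item.name, item.arguments)
```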
alecsolder left a comment:
I think this is great; it really gets us started with supporting other models on the Responses API.
Comparing this to gpt-oss: for gpt-oss we handle the conversion of the output tokens to "messages" in entrypoints/context.py. The Harmony library happens to be a tool_parser, reasoning_parser, and tokenizer all at the same time, but it is nice that it all happens in one location. This matches the need here to convert from Responses types to Completions types so we can use the parsers for other models.
I think the implementation in this PR works as-is, but once we want to support server-side tool calling with models besides gpt-oss, we will likely need to move the parsing logic to entrypoints/context.py as well, so it can happen in the tool-calling "loop" from _generate_with_builtin_tools() in serving_engine.
Ideally we'd be able to start pulling conversion logic out of serving_responses into its own files, as was done with harmony_utils.py, and continue to standardize around something like creating the right context object for serving_engine.
Hi, @alecsolder, your suggestions are excellent, and I completely agree. However, the main goal of this PR is to enable non-gpt-oss models to use tool calling with the Responses API. Essentially, this PR constructs chat-completion-style messages with tool calling, and then extracts tool calls from the output using the existing tool parser. You can think of it as similar to how the Responses API currently handles Harmony, with the new conversion and parsing code playing the analogous role for non-harmony models. Regarding what you mentioned about server-side tool calling and MCP, I believe that should be handled in follow-up work along the lines you describe. However, I still think the parsing approach in this PR is the right starting point.
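As a rough illustration of the conversion described above (a hypothetical helper sketch, not the PR's actual code), the flat Responses-API tool shape maps onto the nested Chat Completions shape that the existing tool parsers expect:

```python
def responses_tool_to_chat_tool(tool: dict) -> dict:
    """Hypothetical sketch: map a flat Responses-API function tool onto
    the nested Chat Completions shape the existing tool parsers expect."""
    assert tool.get("type") == "function"
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool.get("parameters", {}),
        },
    }

# Example: {"type": "function", "name": "get_weather", ...} becomes
# {"type": "function", "function": {"name": "get_weather", ...}}
```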
alecsolder left a comment:
Hey @chaunceyjiang, completely agree with everything in your comment. I am fully aligned on moving things to the context classes, and none of it needs to happen in this PR. I was mostly saying these things to make sure we are aligned, which we are!
Other models can definitely support MCP, and the implementation from here is basically just us deciding when to handle the parsed FunctionCall within vLLM vs. when to return it to the client (see the sketch below for the idea). So we are right on track!
I'll try to put together the list of things that still need to be done to support the same arbitrary MCP integration as in #26704 for other models :)
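A sketch of the dispatch decision described above. The names here are illustrative only, not vLLM's actual internals:

```python
# Illustrative only: once a FunctionCall has been parsed from model output,
# the serving loop either executes it server-side (built-in / MCP tools)
# or yields it back to the client.
def dispatch_function_call(call, server_side_tools: dict):
    handler = server_side_tools.get(call.name)
    if handler is not None:
        # Built-in or MCP tool: run it inside vLLM and feed the result
        # back into the conversation for another generation turn.
        return "continue_loop", handler(call.arguments)
    # Unknown to the server: surface it to the client, which executes it
    # and replies with a function_call_output item.
    return "return_to_client", call
```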
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
@yeqcharlotte Ready for review.
yeqcharlotte left a comment:
Thanks for the change. Let's keep iterating on this to also support minimax-m2. cc: @qandrew
Purpose
Rebase of #20874.
OpenAI Responses API supports Tool/Function calling.
Follow-up to #20504.
Test Plan
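A minimal round-trip sketch of what a test here could exercise (not the PR's actual test plan; the model name, endpoint, and weather payload are placeholder assumptions):

```python
# Hedged sketch; assumes a server started with a tool-parser-enabled model,
# e.g.: vllm serve Qwen/Qwen3-8B --enable-auto-tool-choice --tool-call-parser hermes
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "Qwen/Qwen3-8B"  # placeholder model name

tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

question = "What's the weather in Paris?"
first = client.responses.create(model=MODEL, input=question, tools=tools)

# Expect at least one function_call item in the output.
call = next(item for item in first.output if item.type == "function_call")

# Execute the tool client-side (stubbed here) and return the result via a
# function_call_output item so the model can produce a final answer.
second = client.responses.create(
    model=MODEL,
    tools=tools,
    input=[
        {"role": "user", "content": question},
        {"type": "function_call", "call_id": call.call_id,
         "name": call.name, "arguments": call.arguments},
        {"type": "function_call_output", "call_id": call.call_id,
         "output": '{"temperature_c": 14, "condition": "cloudy"}'},
    ],
)
print(second.output_text)  # final natural-language answer
```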
Test Result
Essential Elements of an Effective PR Description Checklist
Update supported_models.md and examples for a new model.