
Conversation

@ExtReMLapin
Contributor

@ExtReMLapin ExtReMLapin commented Sep 2, 2025

Purpose

Fixed reasoning content not being sent to the client when tool_choice="required".

closes #14429

Test Plan

Added one test to ensure reasoning is returned in the streamed data: pytest ./tests/entrypoints/openai/test_completion_with_function_calling.py
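
A rough sketch of the kind of streaming assertion this adds (the base URL, model name, and tool schema are illustrative assumptions, not the committed test):

    import openai
    import pytest

    @pytest.mark.asyncio
    async def test_streaming_reasoning_with_required_tool_choice():
        # Assumes a running vLLM OpenAI-compatible server with a reasoning
        # parser enabled; endpoint, model, and tool are placeholders.
        client = openai.AsyncOpenAI(base_url="http://localhost:8000/v1",
                                    api_key="EMPTY")
        stream = await client.chat.completions.create(
            model="Qwen/Qwen3-8B",
            messages=[{"role": "user",
                       "content": "What is the weather like in Paris today?"}],
            tools=[{
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }],
            tool_choice="required",
            stream=True,
        )
        reasoning, tool_calls = [], []
        async for chunk in stream:
            if not chunk.choices:
                continue
            delta = chunk.choices[0].delta
            if getattr(delta, "reasoning_content", None):
                reasoning.append(delta.reasoning_content)
            if delta.tool_calls:
                tool_calls.extend(delta.tool_calls)
        # Both the tool call and the reasoning should arrive in the stream.
        assert len(tool_calls) > 0
        assert len(reasoning) > 0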

Test Result



@mergify mergify bot added the frontend label Sep 2, 2025
@ExtReMLapin ExtReMLapin force-pushed the streaming_tool_required_true branch from a0c7ec3 to e95416c on September 3, 2025 13:12
@ExtReMLapin ExtReMLapin marked this pull request as ready for review September 3, 2025 13:13
@ExtReMLapin ExtReMLapin requested a review from aarnphm as a code owner September 3, 2025 13:13
@ExtReMLapin
Contributor Author

Tested on multiple Qwen models + tools:

  • Qwen 3 with reasoning
  • Qwen 3 with reasoning disabled
  • Qwen 2.5

Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
@ExtReMLapin ExtReMLapin force-pushed the streaming_tool_required_true branch from 58cfd8a to 17853a1 on September 4, 2025 13:47
@mergify

mergify bot commented Sep 8, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ExtReMLapin.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Sep 8, 2025
Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com>
CNE Pierre FICHEPOIL added 2 commits September 12, 2025 12:08
Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
@ExtReMLapin ExtReMLapin force-pushed the streaming_tool_required_true branch from 02a8dde to 4d8d81c on September 12, 2025 12:09
@ExtReMLapin
Contributor Author

@DarkLight1337 @heheda12345
@simon-mo

I'm not sure exactly who to ping to get this reviewed.

@DarkLight1337
Member

cc @aarnphm @chaunceyjiang

Collaborator

@chaunceyjiang chaunceyjiang left a comment

reasoning not being sent to client when tool_choice="required"

Could you provide a reproduction step?

The combination of stream + enable_thinking + required has been continuously tested in e2e.

https://github.com/vllm-project/vllm/blob/main/tests/entrypoints/openai/test_completion_with_function_calling.py#L165-L177

@ExtReMLapin
Contributor Author

ExtReMLapin commented Sep 12, 2025

@chaunceyjiang there is no assert/check/test for reasoning in stream mode:

https://github.com/vllm-project/vllm/blob/main/tests/entrypoints/openai/test_completion_with_function_calling.py#L218

master/HEAD:

(screenshots) See how it goes directly into the tool call, with something like 10 seconds between the first message and the start of the tool call.

query.js (attachment)

This branch:

(screenshot)
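
For readers without the screenshots, a minimal client-side check along the lines of the attached query.js could look like this (written in Python rather than JavaScript; the endpoint, model, and tool are assumptions):

    import time
    from openai import OpenAI

    # Placeholder endpoint, model, and tool; adapt to your deployment.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    start = time.time()
    stream = client.chat.completions.create(
        model="Qwen/Qwen3-8B",
        messages=[{"role": "user", "content": "What is the weather in Paris?"}],
        tools=[{"type": "function",
                "function": {"name": "get_weather",
                             "parameters": {"type": "object",
                                            "properties": {"city": {"type": "string"}}}}}],
        tool_choice="required",
        stream=True,
    )
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        if getattr(delta, "reasoning_content", None):
            print(f"[{time.time() - start:5.1f}s] reasoning: {delta.reasoning_content!r}")
        if delta.tool_calls:
            print(f"[{time.time() - start:5.1f}s] tool call delta: {delta.tool_calls}")
    # On main nothing is printed until the tool-call deltas show up; on this
    # branch the reasoning deltas appear as soon as the model starts thinking.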

@ExtReMLapin
Contributor Author

ExtReMLapin commented Sep 12, 2025

Also, this PR covers both forced-reasoning models (like Qwen3 2507, which doesn't output an opening reasoning tag) and the original ones that output both opening and closing reasoning tags.
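
To make that distinction concrete, here is a rough illustration (not vLLM's actual reasoning parser) of the two stream shapes, assuming Qwen-style <think>/</think> tags:

    # Illustration only, not vLLM's parser. Two stream shapes must be handled:
    #   1. "original" reasoning models emit <think> ... </think> explicitly;
    #   2. "forced" reasoning models (Qwen3 2507 style) start thinking right
    #      away and only ever emit the closing </think>.
    THINK_OPEN, THINK_CLOSE = "<think>", "</think>"

    def split_reasoning(text_so_far: str, forced_reasoning: bool) -> tuple[str, str]:
        """Return (reasoning, content) for the text streamed so far."""
        if THINK_CLOSE in text_so_far:
            reasoning, _, content = text_so_far.partition(THINK_CLOSE)
            return reasoning.removeprefix(THINK_OPEN), content
        if forced_reasoning or text_so_far.startswith(THINK_OPEN):
            # Still inside the reasoning block: everything so far is reasoning.
            return text_so_far.removeprefix(THINK_OPEN), ""
        return "", text_so_far

    # Both shapes resolve to the same split:
    assert split_reasoning("<think>plan</think>call tool", False) == ("plan", "call tool")
    assert split_reasoning("plan</think>call tool", True) == ("plan", "call tool")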

Collaborator

@chaunceyjiang chaunceyjiang left a comment

CNE Pierre FICHEPOIL added 2 commits September 15, 2025 15:20
Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
@ExtReMLapin
Contributor Author

Got it for the changes.

In the tests I'm having a weird issue where

        output = []
        reasoning = []
        async for chunk in output_stream:
            if chunk.choices:
                if enable_thinking and chunk.choices[0].delta.reasoning_content:
                    reasoning.append(chunk.choices[0].delta.reasoning_content)
                if chunk.choices[0].delta.tool_calls:
                    output.extend(chunk.choices[0].delta.tool_calls)

        assert len(output) > 0
        if enable_thinking:
            assert len(reasoning) > 0

This doesn't work because the OpenAI client class doesn't have this attribute, and I don't understand why it doesn't error in the non-stream part.

So instead I switched to checking if enable_thinking and getattr(chunk.choices[0].delta, "reasoning_content", None), though I'm not sure that's the right approach.

@ExtReMLapin ExtReMLapin marked this pull request as draft September 15, 2025 16:14
@ExtReMLapin
Contributor Author

And something's broken with non-reasoning models, so I'll fix it when I get back from vacation.

@ExtReMLapin
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes an issue where reasoning content was not streamed correctly when tool_choice="required". The fix involves using the correct streaming-aware function for extracting reasoning content. The associated test is also updated to verify this behavior.

My review focuses on the maintainability of the fix. While the fix is correct, it introduces code duplication for handling reasoning streaming across different tool_choice scenarios. I've suggested refactoring this duplicated logic into a helper function to improve code clarity and reduce the risk of future inconsistencies.
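
For illustration, such a helper could look roughly like the sketch below, assuming the reasoning parser exposes extract_reasoning_content_streaming and is_reasoning_end with roughly these signatures; the helper name and plumbing are hypothetical, not the refactor actually proposed:

    # Hypothetical helper; assumes the reasoning parser provides
    # extract_reasoning_content_streaming(...) and is_reasoning_end(...).
    def stream_reasoning_delta(
        reasoning_parser,
        previous_text: str,
        current_text: str,
        delta_text: str,
        previous_token_ids: list[int],
        current_token_ids: list[int],
        delta_token_ids: list[int],
    ):
        """Shared by the "auto" and "required" tool_choice branches so the
        streaming-aware reasoning extraction is called in exactly one place."""
        delta_message = reasoning_parser.extract_reasoning_content_streaming(
            previous_text,
            current_text,
            delta_text,
            previous_token_ids,
            current_token_ids,
            delta_token_ids,
        )
        reasoning_ended = reasoning_parser.is_reasoning_end(current_token_ids)
        return delta_message, reasoning_ended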

@ExtReMLapin
Contributor Author

Considering the pre-commit warning about the values of reasoning_end_arr in:

        if tool_choice_auto or self.reasoning_parser:
            # These are only required in "auto" tool choice case
            all_previous_token_ids = [[]] * num_choices
            # For reasoning parser and tool call all enabled
            added_content_delta_arr = [False] * num_choices
            reasoning_end_arr = [False] * num_choices
        else:
            all_previous_token_ids = None
            reasoning_end_arr = None

Would you be fine with reasoning_end_arr = [False] * num_choices being initialized either way outside of the if? @chaunceyjiang
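
Concretely, the proposal amounts to something like this sketch (not the exact diff), keeping the rest of the branch unchanged:

        # Sketch of the proposed shape: reasoning_end_arr is always created so
        # the tool_choice="required" path can track reasoning state too, instead
        # of being None outside the "auto" branch.
        reasoning_end_arr = [False] * num_choices
        if tool_choice_auto or self.reasoning_parser:
            # These are only required in "auto" tool choice case
            all_previous_token_ids = [[]] * num_choices
            # For reasoning parser and tool call all enabled
            added_content_delta_arr = [False] * num_choices
        else:
            all_previous_token_ids = None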

@chaunceyjiang
Collaborator

Would you be fine with reasoning_end_arr = [False] * num_choices being initialized either way outside of the if? @chaunceyjiang

I haven't reviewed your PR carefully yet, but my understanding is that reasoning_end_arr should only be used when self.reasoning_parser is set.

CNE Pierre FICHEPOIL and others added 3 commits September 18, 2025 07:11
@mergify

mergify bot commented Oct 8, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ExtReMLapin.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 8, 2025
Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com>
@mergify mergify bot removed the needs-rebase label Oct 8, 2025
CNE Pierre FICHEPOIL and others added 2 commits October 9, 2025 06:44
Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
@ExtReMLapin
Contributor Author

Could this be merged? It's honestly not that complicated to verify the fix, and right now, with a reasoning model, it's impossible to tell what's going on during a long generation: all you see in the server console is that tokens are being generated (Avg generation throughput: 51.6 tokens/s), but you can't tell whether those are legitimate reasoning tokens or the model is stuck in an infinite loop.

And obviously with non-streaming queries you don't know anything; you can't even tell whether generation has moved past the reasoning part.

cc @chaunceyjiang

Collaborator

@chaunceyjiang chaunceyjiang left a comment

Thanks~

@chaunceyjiang chaunceyjiang added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 21, 2025
@chaunceyjiang chaunceyjiang enabled auto-merge (squash) October 21, 2025 06:11
@chaunceyjiang
Collaborator

Hi @ExtReMLapin, there are currently some issues with the CI on the main branch. Let's wait for them to be fixed before proceeding with this PR.

@chaunceyjiang chaunceyjiang merged commit a4c29e6 into vllm-project:main Oct 22, 2025
48 checks passed
@ExtReMLapin
Contributor Author

Hooray!

usberkeley pushed a commit to usberkeley/vllm that referenced this pull request Oct 23, 2025
…4108)

Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com>
Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025
…4108)

Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com>
Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
kingsmad pushed a commit to kingsmad/vllm that referenced this pull request Oct 25, 2025
…4108)

Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com>
Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…4108)

Signed-off-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Signed-off-by: ExtReMLapin <3909752+ExtReMLapin@users.noreply.github.com>
Co-authored-by: CNE Pierre FICHEPOIL <pierre-1.fichepoil@gendarmerie.interieur.gouv.fr>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>

Labels

frontend, ready (ONLY add when PR is ready to merge/full CI is needed)

Development

Successfully merging this pull request may close these issues.

[Feature]: support tool and reasoning together

3 participants