Conversation

@cedonley (Contributor) commented Dec 7, 2024

REDO of PR #10782

FIX #10781
FIX #10589
FIX #10821
FIX #10900

Key fixed issues:

  • UTF-8 escaping mismatches result in argument corruption
  • Final delta often not returned when chunks are small
  • Short initial arguments can corrupt entire arguments return
  • Mistral tool call id generation is inconsistent between stream/non-stream and causes failures
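For context on the last item: Mistral validates tool call ids as exactly nine alphanumeric characters, so the streaming and non-streaming paths need to produce ids in the same format. A minimal sketch of such a generator (illustrative names only, not vLLM's actual implementation):

```python
import random
import string

ALPHANUMERIC = string.ascii_letters + string.digits

def generate_mistral_tool_call_id() -> str:
    """Generate a tool call id in the 9-character alphanumeric format
    Mistral expects. Sharing one generator between the streaming and
    non-streaming paths keeps the id format consistent."""
    return "".join(random.choices(ALPHANUMERIC, k=9))
```

Using a format-correct id in both paths avoids the validation failures described above when a streamed response is later fed back to the model.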

To be reviewed in future PRs:

  • Simplifying the overall tool streaming logic -- reduce reliance on JSON rewriting and diffs
  • Identify similar issues in other tool parsers
  • When using speculative decoding, final delta can sometimes still be dropped

@K-Mistele @gcalmettes

Signed-off-by: cedonley <clayton@donley.io>
github-actions bot commented Dec 7, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those from your fastcheck build in the Buildkite UI (linked in the PR checks section) by unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

@mergify mergify bot added the frontend label Dec 7, 2024
@cedonley (Contributor, Author) commented Dec 7, 2024

Regarding failure in entrypoints-test:

  • Pythonic tool test failure is not a regression
  • Appears to be an intermittent issue with that parser similar to the ones I've fixed with hermes and mistral
    • Final delta of arguments seems to be missing
  • The issue appears in fastcheck runs from other recent non-tool PRs, so it is not caused by this PR
  • My local run of pytest on this branch passes for the same tests

I looked at that parser to see if there was an easy fix, but the logic is somewhat different from the others and I'd rather not try to overload this PR.

@gcalmettes (Contributor) left a review comment

Thanks for your work on this.

@gcalmettes (Contributor)

I agree that the overall tool streaming extraction logic could/should be simplified, and should be addressed in another PR.

@K-Mistele (Contributor)

> I agree that the overall tool streaming extraction logic could/should be simplified, and should be addressed in another PR.

Yes, I have been of half a mind to rewrite it from scratch using finite state machines, which seems like the most appropriate and clean approach; I just haven't had time to dig into it yet.
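As a rough illustration of the FSM idea, a tiny state machine that extracts tool-call spans from streamed chunks without any JSON rewriting or diffing might look like the sketch below (illustrative only; the class, states, and tag names are assumptions, not vLLM code):

```python
class ToolCallFSM:
    """Minimal two-state machine: scan streamed text for
    <tool_call>...</tool_call> spans and collect their payloads."""

    NORMAL, IN_TOOL_CALL = "normal", "in_tool_call"
    OPEN, CLOSE = "<tool_call>", "</tool_call>"

    def __init__(self):
        self.state = self.NORMAL
        self.buffer = ""          # unconsumed streamed text
        self.tool_calls = []      # completed tool-call payloads

    def feed(self, delta: str) -> None:
        """Consume one streamed chunk and advance the state machine."""
        self.buffer += delta
        while True:
            if self.state == self.NORMAL:
                start = self.buffer.find(self.OPEN)
                if start == -1:
                    return  # no opening tag yet; wait for more text
                self.buffer = self.buffer[start + len(self.OPEN):]
                self.state = self.IN_TOOL_CALL
            else:
                end = self.buffer.find(self.CLOSE)
                if end == -1:
                    return  # tool call still in progress
                self.tool_calls.append(self.buffer[:end])
                self.buffer = self.buffer[end + len(self.CLOSE):]
                self.state = self.NORMAL
```

The appeal of this style is that each chunk only advances explicit states, so small chunk boundaries (the source of several bugs fixed here) cannot corrupt the extracted arguments.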

@K-Mistele K-Mistele mentioned this pull request Dec 9, 2024
@cedonley (Contributor, Author) commented Dec 9, 2024

@K-Mistele Yeah. I was thinking the same thing.

It doesn't help that the OpenAI tool interfaces have changed multiple times and the underlying models don't even stay consistent within families. Going to be hard to completely future-proof this, so may want to let things settle a bit before the rewrite.

@cedonley (Contributor, Author)

@DarkLight1337 @mgoin This is good to merge per reviews from @K-Mistele and @gcalmettes

Thanks!

@DarkLight1337 (Member) left a review comment

Thanks for fixing!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) December 11, 2024 22:37
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 11, 2024
@DarkLight1337 (Member)

Can you merge from main to fix the CI failures?

@DarkLight1337 DarkLight1337 merged commit 7439a8b into vllm-project:main Dec 12, 2024
51 checks passed
@cedonley cedonley deleted the fix_tool_streaming branch December 12, 2024 01:12
sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024
@erkintelnyx commented Jun 3, 2025

Hi @cedonley

Can I ask which of the issues this part solves?

This introduced a bug in the streaming parser for Qwen/Qwen3-235B-A22B, where:
current_text : <tool_call>{"name": "shuffle_deck", "arguments": {"deck_count": 1
delta_text : 1

prev_arguments : {}
cur_arguments : {"deck_count": 1}

and because slicing the cur_arguments JSON string with [:-2] removes the trailing 1, the code falls into that if statement and returns None.

I'm trying to understand what [:-2] is supposed to eliminate. Maybe it was meant to remove the "} that the partial JSON parser adds? If so, why not just check for "}?

Would you perhaps remember the context here, @DarkLight1337?

@cedonley (Contributor, Author) commented Jun 3, 2025

Good question. At the time, the surrounding code was always feeding "}, so I probably didn't check for it. Something seems to have changed since then, or perhaps I didn't test with a numeric (non-string) value in the final position.

@erkintelnyx

I see, so as far as I can tell it makes sense to check whether it ends with "} first.
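To make the proposed fix concrete, here is a minimal sketch of the suffix check (an illustrative function, not the actual parser code): guarding the slice with an endswith('"}') test strips only the closing that a partial-JSON completer appends to an unterminated string value, while leaving arguments whose final value is numeric, like {"deck_count": 1}, untouched.

```python
def strip_partial_json_closing(cur_arguments_json: str) -> str:
    """Remove the '"}' closing that a partial-JSON completer appends
    when the last value is an unterminated string, without corrupting
    arguments whose last value is numeric (e.g. {"deck_count": 1}),
    where an unconditional [:-2] would eat a digit."""
    if cur_arguments_json.endswith('"}'):
        return cur_arguments_json[:-2]
    return cur_arguments_json
```

With the guard in place, the Qwen3 case from the report above passes through unchanged instead of being truncated to {"deck_count": and dropped.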
