-
-
Notifications
You must be signed in to change notification settings - Fork 11.5k
[Model] Support SeedOss Reason Parser #24263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Model] Support SeedOss Reason Parser #24263
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run You ask your reviewers to trigger select CI tests on top of Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a BaseThinkingReasoningParser to abstract common logic for parsing reasoning content, which is a great refactoring. It also adds support for the SeedOss model.
The refactoring simplifies the DeepSeekR1ReasoningParser and Qwen3ReasoningParser by having them inherit from the new base class. However, I've found a critical issue in the implementation of BaseThinkingReasoningParser's streaming logic that would break existing functionality for deepseek_r1 and the new seed_oss parser. Please see my detailed comment.
After fixing the base class, the streaming behavior for qwen3 might change and become inconsistent with its non-streaming behavior. You may want to consider overriding extract_reasoning_content_streaming in Qwen3ReasoningParser to maintain its specific logic (treating everything as content if no start token is present).
Signed-off-by: Yan Lu <luyan@nvidia.com>
b04c83a to
32e4fd4
Compare
3bb52fc to
77a4ec1
Compare
|
git clone git@github.com:LuYanFCP/vllm.git VLLM_WORKER_MULTIPROC_METHOD=spawn \ (APIServer pid=20370) INFO: Started server process [20370] |
|
@WojtekMatula This Issue have been resolved. You can try it in latest commit.
|
245d6de to
3065687
Compare
…BaseThinkingReasoningParser base implementation. Signed-off-by: Yan Lu <luyan@nvidia.com>
a1e6c1e to
a733746
Compare
Signed-off-by: Yan Lu <luyan@nvidia.com>
bfaed5d to
110e1eb
Compare
|
curl -X POST http://localhost:1234/v1/chat/completions -H "Content-Type: application/json" -d '{ "model": "seed-oss", "max_tokens": 32000, "messages": [ { "role": "system", "content": "You are helpful assistant. Use tools to assist user. Answer concisely (<4 lines)." }, { "role": "user", "content": "execute ls -la" } ], "tools": [ { "type": "function", "function": { "name": "bash", "description": "Run bash commands. Quote paths with spaces. Prefer rg over grep. Describe command in 5-10 words.", "parameters": { "type": "object", "properties": { "command": {"type": "string", "description": "Command to execute"}, "timeout": {"type": "number", "description": "Timeout in ms"}, "description": {"type": "string", "description": "5-10 word description"} }, "required": ["command", "description"] } } } ], "tool_choice": "auto", "stream": true }' Looks like tools are not parsed only when reasoning is enabled and streaming is enabled. With streaming disabled everything is fine. |
|
Thinks for your reply, i will resolve this issue in current pr.
|
|
I think there is one more issue with the tool parsing. I told you that tool parsing works, even with streaming enabled, as long as reasoning is disabled but mow I think it is not 100% true. In my curl example: Tool invocation is always failing in stream mode because model output: First I thought that this was a problem with the model but when I was testing the model with curls, without "stream": true flag, the tool call json was fine. So to sum up: |
Signed-off-by: Yan Lu <luyan@nvidia.com>
chaunceyjiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM.
/cc @aarnphm @gaocegege PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you paste your local test results, especially with Stream=True?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,I will submit some case result in local
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unittest file already test this case
|
When
|
gaocegege
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It includes a refactor and a new feature (SeedOSS support), which adds complexity and modifies the Mistral, DeepSeek, and Qwen3 code paths. Still, we have reasoning tests for Mistral, DeepSeek, and Qwen3, so I think it should work.
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>
|
Hi @LuYanFCP
|
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Signed-off-by: gaojc <1055866782@qq.com>
|
When stream=True, the Seed-OSS tool call returns an incorrect structure It should be in tool_calls, but it ended up in content |
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Yan Lu <luyan@nvidia.com> Co-authored-by: Michael Goin <mgoin64@gmail.com>





Purpose
Test Plan
Test Result
All Pass
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.