[BugFix][gpt-oss] Fix Chat Completion with Multiple Output Message #23318
Conversation
@781574155 can you try to cherry-pick this PR and see whether this problem still exists?
Code Review
This pull request fixes a bug in chat completion for gpt-oss models where an error was raised when multiple reasoning messages were generated. The change correctly handles this by concatenating all reasoning messages. My review includes a suggestion to add a newline separator when joining these messages to improve the readability of the final reasoning content.
vllm/entrypoints/harmony_utils.py
Outdated
```python
reasoning_content = "".join(
    [msg.content[0].text for msg in reasoning_msg])
```
The multiple reasoning messages are being joined without any separator. If these messages represent distinct thoughts or steps, this will result in a concatenated string that is hard to read and may be grammatically incorrect. For example, ["Thought one.", "Thought two."] would become "Thought one.Thought two.". It would be better to join them with a newline character to preserve the separation and improve readability.
Suggested change:

```diff
-reasoning_content = "".join(
-    [msg.content[0].text for msg in reasoning_msg])
+reasoning_content = "\n".join(
+    [msg.content[0].text for msg in reasoning_msg])
```
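To illustrate the difference, here is a minimal, self-contained snippet; the reasoning texts are made up for demonstration only:

```python
# Hypothetical reasoning chunks, only to show the effect of the separator choice.
reasoning_texts = ["Thought one.", "Thought two."]

print("".join(reasoning_texts))    # -> Thought one.Thought two.
print("\n".join(reasoning_texts))  # -> two readable lines, one thought per line
```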
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can add 🚀
Sorry for hijacking this, but since you are working on fixes for GPT-OSS/harmony utils, and I know my PRs will never be looked at, I want to bring your attention to the following fix/tweak: #23167 (and #23155, but less important). I think enabling the commentary channel only when native tool calls are expected is a substantial usability improvement in certain scenarios. Sorry again, I just want to help improve GPT-OSS support in vLLM. Feel free to just cherry-pick from my commits.
WoosukKwon left a comment
thanks for the fix
@Ithanil I've reviewed your PRs. Please CC me when you create new PRs related to gpt-oss.
…llm-project#23318) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Signed-off-by: Duncan Moss <djm.moss@gmail.com>
When can we expect a new release with this fix?
…llm-project#23318) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Signed-off-by: root <xwq391974@alibaba-inc.com>
Hi @simon-mo, any plans to push this fix to https://hub.docker.com/r/vllm/vllm-openai/tags ?
…llm-project#23318) Signed-off-by: Chen Zhang <zhangch99@outlook.com>
…llm-project#23318) Signed-off-by: Chen Zhang <zhangch99@outlook.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>
…Output Message (#279) Upstream PR: vllm-project/vllm#23318 Closes: https://issues.redhat.com/browse/RHOAIENG-34181 Image build: https://github.com/neuralmagic/nm-cicd/actions/runs/17739898673 Resulting image (not published yet, but expected): quay.io/vllm/automation-vllm:cuda-17739898673
Purpose
Fix #22403 (comment)
Test Plan
Try this request locally.
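A request along these lines can reproduce the scenario against vLLM's OpenAI-compatible server; the base URL, model name, and prompt below are illustrative assumptions, not taken from the original report:

```python
# Sketch of a local reproduction; server address, model name, and prompt are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "Work through this step by step: ..."}],
    max_tokens=2048,
)
msg = resp.choices[0].message
print(getattr(msg, "reasoning_content", None))  # concatenated reasoning, if exposed
print(msg.content)                              # final answer
```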
Test Result
Previously this raised the error "Expected 2 output messages (reasoning and final), but got 48."; with this change the request works.
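For context, here is a rough, self-contained sketch of the behavior change (not the actual diff; the message/channel layout mirrors the harmony output format, and the helper and class names are made up for illustration):

```python
# Instead of asserting exactly two output messages (one reasoning, one final),
# collect every reasoning message and concatenate its text.
from dataclasses import dataclass


@dataclass
class TextPart:
    text: str


@dataclass
class Message:
    channel: str   # "analysis" for reasoning, "final" for the answer
    content: list  # list of TextPart


def split_reasoning_and_final(output_msgs):
    reasoning_msgs = [m for m in output_msgs if m.channel == "analysis"]
    final_msgs = [m for m in output_msgs if m.channel == "final"]
    # Join all reasoning chunks instead of expecting exactly one.
    reasoning_content = "\n".join(m.content[0].text for m in reasoning_msgs)
    final_content = final_msgs[-1].content[0].text if final_msgs else ""
    return reasoning_content, final_content


msgs = [Message("analysis", [TextPart("Step 1.")]),
        Message("analysis", [TextPart("Step 2.")]),
        Message("final", [TextPart("Answer.")])]
print(split_reasoning_and_final(msgs))  # ('Step 1.\nStep 2.', 'Answer.')
```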
(Optional) Documentation Update
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.