Conversation


@Zhikaiiii Zhikaiiii commented Sep 17, 2025

Purpose

Contribute the internal tool parser used by the Qwen API Service, which uses a standard XML parser to parse text in a streaming fashion and handles many corner cases:

  1. ensures that parameters are returned with their declared types
  2. handles malformed function output, such as a missing } in parameters
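The streaming approach can be sketched with Python's incremental expat parser. This is an illustrative sketch only: the class, handler, and tag names below are assumptions for demonstration, not the actual vLLM implementation.

```python
import xml.parsers.expat

# Illustrative sketch of streaming tool-call parsing with an incremental
# XML parser; class and handler names are hypothetical, not vLLM's code.
class StreamingToolCallSketch:
    def __init__(self):
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.parser.CharacterDataHandler = self._data
        self.current_param = None  # name of the parameter being streamed
        self.params = {}

    def _start(self, name, attrs):
        if name == "parameter":
            self.current_param = attrs.get("name")
            self.params[self.current_param] = ""

    def _data(self, text):
        # expat may deliver character data in several pieces; accumulate.
        if self.current_param is not None:
            self.params[self.current_param] += text

    def _end(self, name):
        if name == "parameter":
            self.current_param = None

    def feed(self, chunk: str):
        # isfinal=False keeps the parser open for the next streamed chunk.
        self.parser.Parse(chunk, False)

sketch = StreamingToolCallSketch()
for chunk in ['<tool_call><parameter name="city">', 'Par', 'is</parameter>']:
    sketch.feed(chunk)
print(sketch.params)  # {'city': 'Paris'}
```

Because the parser is fed incrementally, parameter text can be emitted as it arrives rather than waiting for the closing tag, which is what makes streaming tool-call deltas possible.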

Test Plan

We test both the original parser and the new parser in test_qwen3coder_tool_parser.py:

pytest -v -s tests/tool_use/test_qwen3coder_tool_parser.py

Test Result

============================ test session starts ==============================
platform linux -- Python 3.11.11, pytest-8.4.2, pluggy-1.6.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /mnt/workspace/dev_workspace/wuzhikai.wzk/code_repo/vllm
configfile: pyproject.toml
plugins: hydra-core-1.3.2, anyio-4.10.0, asyncio-1.2.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collecting ... collected 36 items

tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_no_tools[original] Downloading Model from https://www.modelscope.cn to directory: /mnt/workspace/.cache/modelscope/hub/models/Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8
INFO 09-17 11:33:58 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:33:58 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_no_tools[xml] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-single_tool] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-single_tool_with_content] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-single_tool_multiline_param] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-parallel_tools] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-tool_with_typed_params] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-single_tool] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-single_tool_with_content] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-single_tool_multiline_param] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-parallel_tools] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-tool_with_typed_params] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_fallback_no_tags[original] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_fallback_no_tags[xml] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_type_conversion[original] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_type_conversion[xml] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-no_tools] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-single_tool] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-single_tool_with_content] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-single_tool_multiline_param] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-parallel_tools] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-tool_with_typed_params] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-no_tools] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-single_tool] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-single_tool_with_content] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-single_tool_multiline_param] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-parallel_tools] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-tool_with_typed_params] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_missing_closing_parameter_tag[original] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_missing_closing_parameter_tag[xml] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_missing_closing_tag[original] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_missing_closing_tag[xml] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_incremental[original] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_incremental[xml] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_complex_type_with_single_quote[original] PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_complex_type_with_single_quote[xml] PASSED
======================= 36 passed, 4 warnings in 13.25s ========================


Signed-off-by: Zhikaiiii <1658973216@qq.com>
@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new XML-based tool parser, Qwen3CoderXMLToolParser, designed for streaming and handling various corner cases in tool call parsing. The existing tests have been refactored to use a parameterized fixture, which is a great approach to ensure both the original and the new parser are tested against the same comprehensive suite of tests.

My review focuses on the new parser implementation. I've identified a critical security vulnerability related to the use of ast.literal_eval which could lead to a Denial of Service. I've also found a couple of high-severity issues concerning silent exception handling and overly greedy regex patterns that could affect the parser's robustness and maintainability. Addressing these points will significantly improve the quality and security of the new parser.

        raw_for_parse = raw_text + '\n'
    else:
        raw_for_parse = raw_text
    parsed_value = ast.literal_eval(raw_for_parse)

critical

Using ast.literal_eval on input from an LLM without any size restrictions can expose the system to a Denial of Service (DoS) attack. A malicious or malformed model output could provide a very large or deeply nested structure that consumes excessive CPU or memory, or causes a stack overflow during parsing. This can crash the server process. It's critical to add a size limit check before evaluating the raw text.

                    # A reasonable limit to prevent DoS attacks.
                    # This can be made configurable if needed.
                    MAX_LITERAL_SIZE = 1_000_000
                    if len(raw_for_parse) > MAX_LITERAL_SIZE:
                        raise ValueError(
                            f"Parameter value size ({len(raw_for_parse)}) "
                            f"exceeds the limit for literal_eval "
                            f"({MAX_LITERAL_SIZE}).")
                    parsed_value = ast.literal_eval(raw_for_parse)
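The suggested guard can be exercised standalone. The sketch below is illustrative: MAX_LITERAL_SIZE and safe_literal_eval are hypothetical names, and note that a size cap bounds input length but does not bound nesting depth, so deeply nested literals under the cap can still be expensive.

```python
import ast

# Illustrative guard around ast.literal_eval; the constant and function
# names are hypothetical, not vLLM's actual code.
MAX_LITERAL_SIZE = 1_000_000

def safe_literal_eval(raw: str):
    """Parse a Python literal, rejecting oversized input up front."""
    if len(raw) > MAX_LITERAL_SIZE:
        raise ValueError(
            f"Parameter value size ({len(raw)}) exceeds the "
            f"literal_eval limit ({MAX_LITERAL_SIZE}).")
    return ast.literal_eval(raw)

print(safe_literal_eval("{'a': 1, 'b': [2, 3]}"))  # {'a': 1, 'b': [2, 3]}
```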

Comment on lines 203 to 204
except Exception:
pass

high

Silently ignoring all exceptions during XML parsing can hide critical bugs and make debugging extremely difficult. If self.parser.Parse fails due to malformed input from _find_next_complete_element or other issues, the error is swallowed. This can lead to incorrect or incomplete tool call generation without any warning. It's crucial to log these exceptions to aid in debugging and to improve the parser's robustness.

Suggested change:

    except Exception:
        pass

to:

    except Exception:
        logger.warning("Failed to parse XML chunk: '%s'",
                       preprocessed_element,
                       exc_info=True)

Comment on lines +393 to +397
processed = re.sub(r'<function=([^>]+)>', r'<function name="\1">',
chunk)
# Handle <parameter=name> format -> <parameter name="name">
processed = re.sub(r'<parameter=([^>]+)>', r'<parameter name="\1">',
processed)

high

The regex ([^>]+) used to capture function and parameter names is too greedy and can lead to invalid XML if the model generates a malformed name. For instance, an output like <function=my_func(arg="val")> would result in a broken XML tag <function name="my_func(arg="val")">. To improve robustness, the regex should be more restrictive, allowing only a specific set of characters that are valid for identifiers.

Suggested change:

    processed = re.sub(r'<function=([^>]+)>', r'<function name="\1">',
                       chunk)
    # Handle <parameter=name> format -> <parameter name="name">
    processed = re.sub(r'<parameter=([^>]+)>', r'<parameter name="\1">',
                       processed)

to:

    processed = re.sub(r'<function=([a-zA-Z0-9_.-]+)>',
                       r'<function name="\1">', chunk)
    # Handle <parameter=name> format -> <parameter name="name">
    processed = re.sub(r'<parameter=([a-zA-Z0-9_.-]+)>',
                       r'<parameter name="\1">', processed)
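The difference between the two patterns can be seen on a malformed tag; the function name in the example is hypothetical:

```python
import re

# Hypothetical malformed model output: parentheses and quotes in the name.
malformed = '<function=my_func(arg="val")>'

# Greedy pattern: captures everything up to '>', yielding a broken XML tag
# with nested double quotes.
greedy = re.sub(r'<function=([^>]+)>', r'<function name="\1">', malformed)
assert greedy == '<function name="my_func(arg="val")">'

# Restrictive pattern: only identifier characters match, so the malformed
# tag is left untouched instead of becoming invalid XML.
strict = re.sub(r'<function=([a-zA-Z0-9_.-]+)>', r'<function name="\1">',
                malformed)
assert strict == malformed

# Well-formed names are still rewritten as intended.
print(re.sub(r'<function=([a-zA-Z0-9_.-]+)>', r'<function name="\1">',
             '<function=my_func>'))  # <function name="my_func">
```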

Signed-off-by: Zhikaiiii <1658973216@qq.com>
@chaunceyjiang
Collaborator

chaunceyjiang commented Sep 17, 2025

Contribute the internal tool parser used at Qwen API Service,

Hi, @Zhikaiiii Thanks~, Are you from the Qwen team?

@simon-mo
Collaborator

Yes this is from the Qwen team

Signed-off-by: Zhikaiiii <1658973216@qq.com>
self.deferred_param_raw_value = ""


@ToolParserManager.register_module("qwen3_coder_xml")
Collaborator

Hi @Zhikaiiii, just a small question: the existing qwen3_coder was also contributed by the Qwen team's @ranpox.
From the unit tests, it looks like qwen3_coder_xml can completely replace qwen3_coder.

So should we deprecate the existing qwen3_coder and adopt the newer, more stable qwen3_coder_xml instead?

Collaborator

If we don't deprecate qwen3_coder, how can end users determine whether they should use qwen3_coder_xml or qwen3_coder?

Yes, we intend to use qwen3_coder_xml to replace qwen3_coder, but for a clearer review and because of some accuracy problems, we did not replace it directly. After the accuracy problem is fixed and all review comments have been resolved, we will rename it.


The accuracy problem is eliminated, so we can move on to reviewing and merging this PR @Zhikaiiii @chaunceyjiang

Signed-off-by: Zhikaiiii <1658973216@qq.com>
@mergify

mergify bot commented Sep 21, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Zhikaiiii.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Sep 21, 2025
Signed-off-by: Zhikaiiii <1658973216@qq.com>
@mergify mergify bot removed the needs-rebase label Sep 23, 2025
self.deferred_param_raw_value = ""


@ToolParserManager.register_module("qwen3_xml")
Collaborator

I suggest renaming it to qwen3_coder, and deprecating the original qwen3_coder.

Since both are intended for use with the qwen3-coder model, maintaining two different tool_parser implementations would incur high maintenance costs.

Contributor Author

@chaunceyjiang
This is because future qwen3-series models might also use this parser, not just the coder model. Therefore, we have named it qwen3_xml instead.

Collaborator

I see.

docs/features/tool_calling.md

Could you please update the documentation to recommend which models should use this parser?

Contributor Author

done

Signed-off-by: Zhikaiiii <1658973216@qq.com>
@mergify mergify bot added the documentation Improvements or additions to documentation label Sep 23, 2025
@chaunceyjiang chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 23, 2025
@chaunceyjiang chaunceyjiang enabled auto-merge (squash) September 23, 2025 08:05
Collaborator

@chaunceyjiang chaunceyjiang left a comment


Thanks~

@chaunceyjiang chaunceyjiang merged commit 9383cd6 into vllm-project:main Sep 23, 2025
55 checks passed
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
…ect#25028)

Signed-off-by: Zhikaiiii <1658973216@qq.com>
Signed-off-by: charlifu <charlifu@amd.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: Zhikaiiii <1658973216@qq.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
@zxgx

zxgx commented Oct 4, 2025

Hi, I tried this tool call parser in v0.11.0 with cmd: vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct --dtype auto --api-key token-abc123 --enable-auto-tool-choice --tool-call-parser qwen3_xml --max-model-len 131072

It will report error:

(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145] Error in chat completion stream generator.
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145] Traceback (most recent call last):
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145]   File "/home/v-kenanli/venvs/serve/lib/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 1033, in chat_completion_stream_generator
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145]     tool_parser.prev_tool_call_arr[index].get(
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145]     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145] IndexError: list index out of range

Do you have any suggestion on resolving this issue?

@Zhikaiiii
Contributor Author


Sorry, I will check this bug ASAP. You can try using the old parser qwen3_coder first.

@geraldthewes

geraldthewes commented Oct 6, 2025


Running into the exact same issue

docker run --runtime nvidia --shm-size=2g --gpus all -p 8000:8000 -v ~/.cache/huggingface:/root/.cache/huggingface --env "HUGGING_FACE_HUB_TOKEN=REMOVED" --env "PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True" --env "CUDA_DEVICE_ORDER=PCI_BUS_ID" --env "VLLM_LOGGING_LEVEL=DEBUG" --ipc=host vllm/vllm-openai:v0.11.0 --model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ --quantization awq --tensor-parallel-size 2 --max-model-len 65536 --gpu-memory-utilization 0.85 --dtype float16 --enable-auto-tool-choice --tool-call-parser qwen3_xml --host 0.0.0.0 --port 8000 --enable-log-requests

(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] Error in chat completion stream generator.
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] Traceback (most recent call last):
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 1033, in chat_completion_stream_generator
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] tool_parser.prev_tool_call_arr[index].get(
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] IndexError: list index out of range

Basically trying to tool call an MCP server using qwen code

qwen -v
0.0.14

@Zhikaiiii
Contributor Author

Zhikaiiii commented Oct 7, 2025

@geraldthewes @zxgx Hi, I have fixed this problem in #26345.
Could you please help double-check it in your scenario?

@geraldthewes

@Zhikaiiii Sorry about the delay, I had to learn how to rebuild the vLLM docker image and it took time to build it.

Short answer: it seems to have fixed the issue I reported, as I no longer see it, and the qwen code CLI seems to work fine at this point for common tasks.

But I seem to have run into a different issue. I'm not sure what your advice is, but some tool calls are failing; looking at the output from my model, it seems like a problem with the following model:

model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ

As I see it outputs

<|im_start|>user\nUse run_shell_command to run "jobforge submit-job deploy/build.yaml"<|im_end|>\n<|im_start|>assistant\nI'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call><|im_end|>

So it seems the model failed to include the starting <tool_call> token for some reason.
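One way such an omission could be tolerated is a pre-processing fallback that re-inserts the missing <tool_call> wrapper before a bare <function=...> tag. This is a hypothetical sketch of the idea, not the parser's actual fallback logic:

```python
# Hypothetical fallback (not the parser's actual logic): if the model
# emitted "<function=...>" without the "<tool_call>" wrapper token,
# insert the wrapper before the first function tag so downstream XML
# parsing can proceed.
def ensure_tool_call_wrapper(text: str) -> str:
    if '<function=' in text and '<tool_call>' not in text:
        idx = text.index('<function=')
        text = text[:idx] + '<tool_call>\n' + text[idx:]
    return text

fixed = ensure_tool_call_wrapper("I'll run it.\n<function=run_shell_command>")
print(fixed.count('<tool_call>'))  # 1
```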

gjc0824 pushed a commit to gjc0824/vllm that referenced this pull request Oct 10, 2025
…ect#25028)

Signed-off-by: Zhikaiiii <1658973216@qq.com>
Signed-off-by: gaojc <1055866782@qq.com>
@Zhikaiiii
Contributor Author


Thank you for the verification. The latter issue appears to be caused by the parser's fallback logic failing in this particular case. How frequently does this case occur? I'll try to fix this problem asap.

@Zhikaiiii
Contributor Author

@geraldthewes
Hi, I tried adding this case in vllm/tests/tool_use/test_qwen3coder_tool_parser.py:

I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>

It seems the parser returns the correct result:

tool_calls=[ToolCall(id='chatcmpl-tool-67620c18dccc425abafda6fb9172a493', type='function', function=FunctionCall(name='run_shell_command', arguments='{"command": "jobforge submit-job deploy/build.yaml\\n", "is_background": "False\\n", "description": "Running jobforge submit-job command to deploy build.yaml\\n\\n'))] content="I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n"

What does the parsed output on your server look like?
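For reference, the missing-`<tool_call>` case discussed above can be reduced to a small standalone sketch. This is NOT vLLM's actual parser (the real one is a streaming XML parser with type coercion); it is only a hedged illustration, under the assumption that a bare `<function=...>` tag marks an implicit tool-call start and each `<parameter=...>` value runs until the next tag:

```python
import re

# Hedged, standalone illustration -- NOT vLLM's actual implementation.
# Assumption: a bare <function=NAME> tag marks an implicit tool-call start
# (the fallback case when the model omits <tool_call>), and each
# <parameter=KEY> value runs until the next tag.
def parse_fallback(text):
    m = re.search(r"<function=([^>\n]+)>", text)
    if m is None:
        return None, {}
    name = m.group(1)
    params = {}
    for pm in re.finditer(
        r"<parameter=([^>\n]+)>\n(.*?)(?=<parameter=|</tool_call>|</function>|$)",
        text,
        re.S,
    ):
        params[pm.group(1)] = pm.group(2).strip()
    return name, params

# The problematic output from the bug report, with <tool_call> missing:
output = (
    "I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n"
    "<function=run_shell_command>\n"
    "<parameter=command>\njobforge submit-job deploy/build.yaml\n\n"
    "<parameter=is_background>\nFalse\n\n"
    "<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n"
    "</tool_call>"
)
print(parse_fallback(output))
```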

@kexinoh
Copy link

kexinoh commented Oct 10, 2025

Hi, I want to know why Qwen no longer uses JSON and instead uses XML?

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
…ect#25028)

Signed-off-by: Zhikaiiii <1658973216@qq.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
@geraldthewes
Copy link

geraldthewes commented Oct 10, 2025

I'm not sure what you are asking. I run qwen coder as follows

docker run --runtime nvidia --shm-size=2g --gpus all -p 8000:8000 -v /mnt/data1/huggingface:/root/.cache/huggingface --env "HUGGING_FACE_HUB_TOKEN=REMOVED" --env "PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True" --env "CUDA_DEVICE_ORDER=PCI_BUS_ID" --ipc=host qwen3_xml_parser_fix:latest --model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ --quantization awq --tensor-parallel-size 2 --max-model-len 65536 --gpu-memory-utilization 0.85 --dtype float16 --enable-auto-tool-choice --tool-call-parser qwen3_xml --host 0.0.0.0 --port 8000 --enable-log-requests

And look at the output logs. I had assumed the output was after your parsing tool, but I guess that is incorrect, since my understanding is that the whole point of your parser is to emit the JSON tool-calling format that is more widely understood by code agents. Let me know if there is an easy way to capture that output; otherwise I will set up a proxy to see what is really returned to the coding agent.

@Zhikaiiii
Copy link
Contributor Author

Zhikaiiii commented Oct 10, 2025

I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>

Normally, as you understand, the code shown above is the raw output from the model. The parser will convert this output into JSON tool format for downstream use like qwen-code or other scenarios.

I directly used the model output from your example as input to the parser. Although this output appears to be missing <tool_call>, the parser still successfully applied its fallback mechanism and produced the correct JSON format. Therefore I'd like to know: in your scenario, what is the final output returned by the outermost vLLM server?

geraldthewes pushed a commit to geraldthewes/vllm that referenced this pull request Oct 10, 2025
Fixes an issue where Qwen3-Coder models sometimes generate tool calls
starting directly with <function=...> instead of the expected
<tool_call><function=...> structure.

Changes:
- Enhanced _find_next_complete_element to detect <function= tags as
  potential tool call starts when not currently parsing a tool call
- Updated _should_skip_element to properly handle function tags that
  appear without tool_call wrappers
- Added logic to wait for complete <function= tags before processing
  to avoid treating partial tags as text

The parser now gracefully handles:
1. Missing opening <tool_call> tag (starts with <function=)
2. Missing both <tool_call> tags (only function wrapper)
3. Streaming mode with missing tags

Leverages existing fallback logic (line 628-630) that auto-creates
a tool_call when a function element is encountered without a parent
tool_call context.

Tests:
- test_extract_tool_calls_missing_opening_tool_call_tag: Tests the
  exact scenario from the bug report with run_shell_command
- test_extract_tool_calls_missing_both_tool_call_tags: Tests when
  both opening and closing tool_call tags are missing
- test_extract_tool_calls_streaming_missing_opening_tag: Validates
  streaming behavior with missing tags

Fixes: vllm-project#25028

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
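The "wait for complete `<function=` tags" point in the commit message above can be illustrated with a minimal, hypothetical helper (not the actual vLLM code): in streaming mode, a chunk that ends in a partial tag such as `<functi` must be held back rather than emitted as plain text.

```python
# Hypothetical helper (not the actual vLLM code) showing the buffering rule:
# if the buffer ends in a prefix of a tag we care about, hold that tail back
# so a partial "<functi" is never emitted to the client as plain text.
def hold_partial_tag(buffer, tags=("<function=", "<tool_call>", "</tool_call>")):
    idx = buffer.rfind("<")
    if idx != -1:
        tail = buffer[idx:]
        if any(t.startswith(tail) for t in tags):
            # Emit the text before the possible tag; keep the tail buffered.
            return buffer[:idx], tail
    return buffer, ""

print(hold_partial_tag("I'll run the command.\n\n<functi"))
```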
@geraldthewes
Copy link

geraldthewes commented Oct 10, 2025

@Zhikaiiii

OK, yesterday I asked Claude Code to make a fix for this. Unfortunately, building the docker images for me to test takes forever. I was able to test it today, and the example now works.

I have pushed my patch here

https://github.com/geraldthewes/vllm/tree/fix/qwen3_xml_missing_tool_call_tag

That said, multiple caveats: I know nothing about the parser setup, and I have not looked at the code changes made by Claude Code. While it did add a test, I am not sure it knew how to run the test.

But the good news is it at least fixed my symptoms. Let me know what makes more sense to do at this point.

I will test it more today to see if this finally gives me a working combination of qwen code with 4-bit QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ running locally using vllm.

But it is nice to see qwen code finally make the tool call in my example after asking my permission:

"I'll run the jobforge submit-job command to deploy the build.yaml file.

╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ ✓ Shell jobforge submit-job deploy/build.yaml (Running jobforge submit-job command to deploy build.yaml) │
│ │
│ { │
│ "job_id": "8cd4ca3d-3c0c-4d30-9d13-7ceb405813f8", │
│ "status": "BUILDING" │
│ } │
╰──────────────
"

@Zhikaiiii
Copy link
Contributor Author

Zhikaiiii commented Oct 11, 2025

Thank you very much for your reply. Based on the code you provided, I have fixed this issue and merged the fix into #26345.

choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
@geraldthewes
Copy link

@Zhikaiiii

Thank you so much, this version works too. Very appreciated, hope it gets merged into the main vllm branch.

-- gerald

Thank you very much for your reply. Based on the code you provided, I have fixed this issue and merged it into the #26345

@Zhikaiiii
ok, Yesterday I asked claude code to make a fix for this. Unfortunately creating the docker images for me to test takes forever. I was able to test it today and now the example I have now works.
I have pushed my patch here
https://github.com/geraldthewes/vllm/tree/fix/qwen3_xml_missing_tool_call_tag
That said multiple caveats. I know nothing on the parser setup, I have not looked at the code changes made by Claude code. While it did add a test, not sure it knew how to run the test.
But it is good news that at least fixed my symptoms. Let me know what makes more sense to do at this point.
I will test it more today to see if this finally gives me a working combination of qwen code with 4-bit QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ running locally using vllm.
But nice to see qwen code finally tool call my example after asking my permission:
"I'll run the jobforge submit-job command to deploy the build.yaml file.
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │ ✓ Shell jobforge submit-job deploy/build.yaml (Running jobforge submit-job command to deploy build.yaml) │ │ │ │ { │ │ "job_id": "8cd4ca3d-3c0c-4d30-9d13-7ceb405813f8", │ │ "status": "BUILDING" │ │ } │ ╰────────────── "

I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>

Normally, as you understand, the text shown above is the raw output from the model. The parser converts this output into the JSON tool-call format for downstream use by qwen-code or other clients.
I used the model output from your example directly as input to the parser. Although this output appears to be missing <tool_call>, the parser still successfully applied its fallback mechanism and produced the correct JSON format. So I'd like to know: in your scenario, what is the final output returned by the outermost vLLM server?
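The fallback idea described above can be sketched in plain Python. This is an illustrative reconstruction, not the actual vLLM `qwen3_xml` parser: it shows how a function block can still be recovered when the opening `<tool_call>` tag is missing, by anchoring on the `<function=...>` marker instead. The regexes and function name here are hypothetical.

```python
import re

# Match a function block even without an enclosing <tool_call> tag,
# stopping at a closing tag or end of text (illustrative, not vLLM's code).
FUNC_RE = re.compile(
    r"<function=([^>\n]+)>(.*?)(?:</function>|</tool_call>|\Z)", re.DOTALL
)
# Each parameter runs until the next parameter or a closing tag.
PARAM_RE = re.compile(
    r"<parameter=([^>\n]+)>\n?(.*?)(?=<parameter=|</function>|</tool_call>|\Z)",
    re.DOTALL,
)

def extract_tool_calls_with_fallback(text: str):
    """Return (content, tool_calls), tolerating a missing <tool_call> tag."""
    match = FUNC_RE.search(text)
    if match is None:
        return text, []
    content = text[: match.start()]  # prose before the function block
    params = {k: v for k, v in PARAM_RE.findall(match.group(2))}
    return content, [{"name": match.group(1), "arguments": params}]

raw = (
    "I'll run the command.\n\n"
    "<function=run_shell_command>\n"
    "<parameter=command>\njobforge submit-job deploy/build.yaml\n"
    "<parameter=is_background>\nFalse\n"
    "</tool_call>"
)
content, calls = extract_tool_calls_with_fallback(raw)
print(calls[0]["name"])  # run_shell_command
```

The key point is that the `<function=...>` marker, not the `<tool_call>` wrapper, carries the information needed to recover the call.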

I'm not sure what you are asking. I run qwen coder as follows:
docker run --runtime nvidia --shm-size=2g --gpus all -p 8000:8000 -v /mnt/data1/huggingface:/root/.cache/huggingface --env "HUGGING_FACE_HUB_TOKEN=<redacted>" --env "PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True" --env "CUDA_DEVICE_ORDER=PCI_BUS_ID" --ipc=host qwen3_xml_parser_fix:latest --model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ --quantization awq --tensor-parallel-size 2 --max-model-len 65536 --gpu-memory-utilization 0.85 --dtype float16 --enable-auto-tool-choice --tool-call-parser qwen3_xml --host 0.0.0.0 --port 8000 --enable-log-requests
I then look at the output logs. I had assumed what I see there is the output after your parsing tool, but I guess that is incorrect, since my understanding is that the whole point of your parser is to emit the JSON tool-calling format that is more widely understood by code agents. Let me know if there is an easy way to capture that output; otherwise I will set up a proxy to see what is really returned to the coding agent.
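One low-effort way to capture what the server actually returns is to call the OpenAI-compatible endpoint directly and inspect `choices[0].message.tool_calls`, rather than reading the request logs. The snippet below sketches the response shape; the payload is hand-written to illustrate the structure, not captured from a real run, and the id/values are made up.

```python
import json

# Hand-written example of the OpenAI-compatible response shape that a
# vLLM server with a tool-call parser enabled returns (illustrative only).
sample_response = json.dumps({
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "I'll run the jobforge submit-job command.",
            "tool_calls": [{
                "id": "chatcmpl-tool-abc123",
                "type": "function",
                "function": {
                    "name": "run_shell_command",
                    "arguments": "{\"command\": \"jobforge submit-job deploy/build.yaml\"}",
                },
            }],
        },
        "finish_reason": "tool_calls",
    }]
})

msg = json.loads(sample_response)["choices"][0]["message"]
for call in msg.get("tool_calls") or []:
    fn = call["function"]
    args = json.loads(fn["arguments"])  # "arguments" is itself a JSON string
    print(fn["name"], args["command"])
```

Against a live server, the same shape can be pulled out with something like `curl -s http://localhost:8000/v1/chat/completions -H 'Content-Type: application/json' -d @request.json | jq '.choices[0].message.tool_calls'`.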

@geraldthewes @zxgx hi, I have fixed this problem in #26345. Could you please help double check in your scenario?

@Zhikaiiii Sorry about the delay; I had to learn how to rebuild the vllm docker image, and it took time to build.
Short answer: it seems to have fixed the issue I reported. I no longer see it, and the qwen code CLI seems to work fine at this point for common tasks.
But I seem to have run into a different issue. Not sure what your advice is, but some tool calls are failing, and looking at the output it seems like a problem with the following model:
model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ
As I see it outputs
<|im_start|>user\nUse run_shell_command to run "jobforge submit-job deploy/build.yaml"<|im_end|>\n<|im_start|>assistant\nI'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call><|im_end|>
So somehow it seems the model failed to include the starting <tool_call> tag for some reason.

@geraldthewes Hi, I tried adding this case to vllm/tests/tool_use/test_qwen3coder_tool_parser.py.

I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>

It seems the parser returns the correct result:

tool_calls=[ToolCall(id='chatcmpl-tool-67620c18dccc425abafda6fb9172a493', type='function', function=FunctionCall(name='run_shell_command', arguments='{"command": "jobforge submit-job deploy/build.yaml\\n", "is_background": "False\\n", "description": "Running jobforge submit-job command to deploy build.yaml\\n\\n'))] content="I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n"

What does the parsed output on your server look like?

@russellb
Copy link
Member

@geraldthewes your huggingface token was included in multiple comments on this PR. I suggest revoking it.

@geraldthewes
Copy link

@geraldthewes your huggingface token was included in multiple comments on this PR. I suggest revoking it.

Yes, already done. Thanks.

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…ect#25028)

Signed-off-by: Zhikaiiii <1658973216@qq.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
