[Frontend] Add a new xml-based tool parser for qwen3-coder #25028

Zhikaiiii · 2025-09-17T03:36:17Z

Purpose

Contribute the internal tool parser used at the Qwen API Service, which uses a standard XML parser to parse text in a streaming fashion and handles many corner cases:

ensures that params are returned with their corresponding types
handles function format errors, such as a missing } for params
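
As an illustration of the first point, here is a minimal sketch of schema-driven type conversion. It is not the parser's actual code: the helper name `convert_param_value` and the set of accepted type names are assumptions made for this example.

```python
import ast
import json
from typing import Any


def convert_param_value(raw: str, param_type: str) -> Any:
    """Convert a raw parameter string to the type declared in the tool schema.

    Hypothetical helper for illustration only; it falls back to the raw
    string whenever conversion fails.
    """
    text = raw.strip()
    kind = param_type.lower()
    try:
        if kind in ("string", "str"):
            return raw
        if kind in ("integer", "int"):
            return int(text)
        if kind in ("number", "float"):
            return float(text)
        if kind in ("boolean", "bool"):
            if text.lower() in ("true", "false"):
                return text.lower() == "true"
            raise ValueError(f"not a boolean: {text!r}")
        if kind in ("object", "array"):
            try:
                return json.loads(text)
            except json.JSONDecodeError:
                # Models sometimes emit Python-style literals (single quotes).
                return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        pass
    # Unknown type or failed conversion: keep the text as the model wrote it.
    return raw
```

With this sketch, `convert_param_value("False\n", "boolean")` yields the boolean `False`, while a value that cannot be converted is passed through unchanged.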

Test Plan

We test both the original parser and the new parser in test_qwen3coder_tool_parser.py:

pytest -v -s tests/tool_use/test_qwen3coder_tool_parser.py

Test Result

============================ test session starts ==============================
platform linux -- Python 3.11.11, pytest-8.4.2, pluggy-1.6.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /mnt/workspace/dev_workspace/wuzhikai.wzk/code_repo/vllm
configfile: pyproject.toml
plugins: hydra-core-1.3.2, anyio-4.10.0, asyncio-1.2.0
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collecting ... collected 36 items

tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_no_tools[original] Downloading Model from https://www.modelscope.cn to directory: /mnt/workspace/.cache/modelscope/hub/models/Qwen/Qwen3-Coder-30B-A3B-Instruct-FP8
INFO 09-17 11:33:58 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:33:58 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_no_tools[xml] INFO 09-17 11:33:58 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:33:58 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-single_tool] INFO 09-17 11:33:59 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:33:59 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-single_tool_with_content] INFO 09-17 11:33:59 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:33:59 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-single_tool_multiline_param] INFO 09-17 11:33:59 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:33:59 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-parallel_tools] INFO 09-17 11:34:00 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:00 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[original-tool_with_typed_params] INFO 09-17 11:34:00 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:00 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-single_tool] INFO 09-17 11:34:00 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:00 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-single_tool_with_content] INFO 09-17 11:34:01 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:01 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-single_tool_multiline_param] INFO 09-17 11:34:01 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:01 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-parallel_tools] INFO 09-17 11:34:01 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:01 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls[xml-tool_with_typed_params] INFO 09-17 11:34:02 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:02 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_fallback_no_tags[original] INFO 09-17 11:34:02 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:02 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_fallback_no_tags[xml] INFO 09-17 11:34:02 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:02 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_type_conversion[original] INFO 09-17 11:34:03 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:03 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_type_conversion[xml] INFO 09-17 11:34:03 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:03 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-no_tools] INFO 09-17 11:34:03 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:03 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-single_tool] INFO 09-17 11:34:04 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:04 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-single_tool_with_content] INFO 09-17 11:34:04 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:04 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-single_tool_multiline_param] INFO 09-17 11:34:04 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:04 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-parallel_tools] INFO 09-17 11:34:05 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:05 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[original-tool_with_typed_params] INFO 09-17 11:34:05 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:05 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-no_tools] INFO 09-17 11:34:05 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:05 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-single_tool] INFO 09-17 11:34:06 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:06 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-single_tool_with_content] INFO 09-17 11:34:06 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:06 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-single_tool_multiline_param] INFO 09-17 11:34:06 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:06 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-parallel_tools] INFO 09-17 11:34:07 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:07 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming[xml-tool_with_typed_params] INFO 09-17 11:34:07 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:07 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_missing_closing_parameter_tag[original] INFO 09-17 11:34:07 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:07 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_missing_closing_parameter_tag[xml] INFO 09-17 11:34:08 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:08 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_missing_closing_tag[original] INFO 09-17 11:34:08 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:08 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_missing_closing_tag[xml] INFO 09-17 11:34:08 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:08 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_incremental[original] INFO 09-17 11:34:09 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:09 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_streaming_incremental[xml] INFO 09-17 11:34:09 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:09 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_complex_type_with_single_quote[original] INFO 09-17 11:34:09 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:09 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
tests/tool_use/test_qwen3coder_tool_parser.py::test_extract_tool_calls_complex_type_with_single_quote[xml] INFO 09-17 11:34:10 [qwen3coder_tool_parser.py:76] vLLM Successfully import tool parser Qwen3CoderToolParser !
INFO 09-17 11:34:10 [qwen3coder_xml_tool_parser.py:1044] vLLM Successfully import tool parser Qwen3CoderXMLToolParser !
PASSED
======================= 36 passed, 4 warnings in 13.25s ========================

Signed-off-by: Zhikaiiii <1658973216@qq.com>

github-actions · 2025-09-17T03:36:26Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors.

You ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

gemini-code-assist

Code Review

This pull request introduces a new XML-based tool parser, Qwen3CoderXMLToolParser, designed for streaming and handling various corner cases in tool call parsing. The existing tests have been refactored to use a parameterized fixture, which is a great approach to ensure both the original and the new parser are tested against the same comprehensive suite of tests.

My review focuses on the new parser implementation. I've identified a critical security vulnerability related to the use of ast.literal_eval which could lead to a Denial of Service. I've also found a couple of high-severity issues concerning silent exception handling and overly greedy regex patterns that could affect the parser's robustness and maintainability. Addressing these points will significantly improve the quality and security of the new parser.

gemini-code-assist · 2025-09-17T03:38:00Z

vllm/entrypoints/openai/tool_parsers/qwen3xml_tool_parser.py

+                        raw_for_parse = raw_text + '\n'
+                    else:
+                        raw_for_parse = raw_text
+                    parsed_value = ast.literal_eval(raw_for_parse)


Using ast.literal_eval on input from an LLM without any size restrictions can expose the system to a Denial of Service (DoS) attack. A malicious or malformed model output could provide a very large or deeply nested structure that consumes excessive CPU or memory, or causes a stack overflow during parsing. This can crash the server process. It's critical to add a size limit check before evaluating the raw text.

    # A reasonable limit to prevent DoS attacks.
    # This can be made configurable if needed.
    MAX_LITERAL_SIZE = 1_000_000
    if len(raw_for_parse) > MAX_LITERAL_SIZE:
        raise ValueError(
            f"Parameter value size ({len(raw_for_parse)}) "
            f"exceeds the limit for literal_eval "
            f"({MAX_LITERAL_SIZE}).")
    parsed_value = ast.literal_eval(raw_for_parse)
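
The size check can also be wrapped in a standalone helper. The following is a minimal sketch under assumptions: the name `safe_literal_eval` is hypothetical, and returning the raw string when evaluation fails is a design choice for this example, not necessarily what the PR does.

```python
import ast
from typing import Any

MAX_LITERAL_SIZE = 1_000_000  # assumed limit, same value as in the review comment


def safe_literal_eval(raw: str, max_size: int = MAX_LITERAL_SIZE) -> Any:
    """Evaluate a Python literal, rejecting oversized input up front.

    Returns the raw string unchanged if it is not a valid literal.
    """
    if len(raw) > max_size:
        raise ValueError(
            f"Parameter value size ({len(raw)}) exceeds the limit "
            f"for literal_eval ({max_size}).")
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError, MemoryError, RecursionError):
        # Deeply nested input can exhaust the parser even below the size limit.
        return raw
```

Catching `RecursionError` and `MemoryError` covers the deeply nested case that a length check alone does not rule out.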

gemini-code-assist · 2025-09-17T03:38:00Z

vllm/entrypoints/openai/tool_parsers/qwen3coder_xml_tool_parser.py

+            except Exception:
+                pass


Silently ignoring all exceptions during XML parsing can hide critical bugs and make debugging extremely difficult. If self.parser.Parse fails due to malformed input from _find_next_complete_element or other issues, the error is swallowed. This can lead to incorrect or incomplete tool call generation without any warning. It's crucial to log these exceptions to aid in debugging and to improve the parser's robustness.

Suggested change

-            except Exception:
-                pass
+            except Exception:
+                logger.warning("Failed to parse XML chunk: '%s'",
+                               preprocessed_element,
+                               exc_info=True)
+                pass
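
To show the logging pattern in isolation, here is a minimal sketch that feeds chunks to Python's `xml.parsers.expat` parser and logs a failure instead of swallowing it. The function `parse_chunks` is hypothetical and much simpler than the parser in this PR; it also narrows the handler to `ExpatError` rather than a bare `Exception`.

```python
import logging
import xml.parsers.expat

logger = logging.getLogger(__name__)


def parse_chunks(chunks):
    """Feed XML chunks to an expat parser, logging failures instead of hiding them.

    Returns the names of the elements whose start tags were seen.
    """
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    for chunk in chunks:
        try:
            parser.Parse(chunk, False)
        except xml.parsers.expat.ExpatError:
            logger.warning("Failed to parse XML chunk: %r", chunk,
                           exc_info=True)
            break  # stop feeding once the stream is known to be malformed
    return seen
```

With `exc_info=True`, the log record carries the expat error message and position, which is usually enough to locate the offending model output.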

gemini-code-assist · 2025-09-17T03:38:00Z

vllm/entrypoints/openai/tool_parsers/qwen3xml_tool_parser.py

+        processed = re.sub(r'<function=([^>]+)>', r'<function name="\1">',
+                           chunk)
+        # Handle <parameter=name> format -> <parameter name="name">
+        processed = re.sub(r'<parameter=([^>]+)>', r'<parameter name="\1">',
+                           processed)


The regex ([^>]+) used to capture function and parameter names is too greedy and can lead to invalid XML if the model generates a malformed name. For instance, an output like <function=my_func(arg="val")> would result in a broken XML tag <function name="my_func(arg="val")">. To improve robustness, the regex should be more restrictive, allowing only a specific set of characters that are valid for identifiers.

Suggested change

-        processed = re.sub(r'<function=([^>]+)>', r'<function name="\1">',
-                           chunk)
-        # Handle <parameter=name> format -> <parameter name="name">
-        processed = re.sub(r'<parameter=([^>]+)>', r'<parameter name="\1">',
-                           processed)
+        processed = re.sub(r'<function=([a-zA-Z0-9_.-]+)>', r'<function name="\1">',
+                           chunk)
+        # Handle <parameter=name> format -> <parameter name="name">
+        processed = re.sub(r'<parameter=([a-zA-Z0-9_.-]+)>', r'<parameter name="\1">',
+                           processed)
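
The difference between the two patterns can be checked with a short standalone script. This is a sketch: the `rewrite` helper and the constant names are made up for the example, and only the `<function=...>` pattern is shown.

```python
import re

GREEDY = re.compile(r"<function=([^>]+)>")
STRICT = re.compile(r"<function=([a-zA-Z0-9_.-]+)>")


def rewrite(pattern: re.Pattern, chunk: str) -> str:
    """Rewrite <function=name> into <function name="name"> using the pattern."""
    return pattern.sub(r'<function name="\1">', chunk)


malformed = '<function=my_func(arg="val")>'

# The permissive pattern copies the quotes into the attribute value,
# producing a tag that is no longer well-formed XML.
print(rewrite(GREEDY, malformed))

# The restrictive pattern does not match, so the chunk is left untouched.
print(rewrite(STRICT, malformed))

# A well-formed name is rewritten by the restrictive pattern as expected.
print(rewrite(STRICT, "<function=get_weather>"))
```

Note that the restrictive pattern only avoids producing a broken attribute; the unmatched tag is still not valid XML, so the parser needs some other path to deal with it.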

Signed-off-by: Zhikaiiii <1658973216@qq.com>

chaunceyjiang · 2025-09-17T03:53:25Z

Contribute the internal tool parser used at Qwen API Service,

Hi, @Zhikaiiii Thanks~, Are you from the Qwen team?

simon-mo · 2025-09-17T04:47:34Z

Yes this is from the Qwen team

vllm/entrypoints/openai/tool_parsers/qwen3coder_xml_tool_parser.py

Signed-off-by: Zhikaiiii <1658973216@qq.com>

chaunceyjiang · 2025-09-17T14:13:15Z

vllm/entrypoints/openai/tool_parsers/qwen3coder_xml_tool_parser.py

+        self.deferred_param_raw_value = ""
+
+
+@ToolParserManager.register_module("qwen3_coder_xml")


Hi @Zhikaiiii, just a small question: I believe the existing qwen3_coder was also contributed by the Qwen team’s @ranpox.
From the unit tests, it looks like qwen3_coder_xml can completely replace qwen3_coder.

So should we deprecate the existing qwen3_coder and adopt the newer, more stable qwen3_coder_xml instead?

If we don’t deprecate qwen3_coder, how can end users determine whether they should use qwen3_coder_xml or qwen3_coder?

/cc @simon-mo @DarkLight1337 @aarnphm WDYT?

Yes, we intend to replace qwen3_coder with qwen3_coder_xml, but to keep the review clear and because of some accuracy problems, we did not replace it directly. After we fix the accuracy problems and all review comments have been resolved, we will rename it.

The accuracy problem has been eliminated, so we can move on to reviewing and merging this PR. @Zhikaiiii @chaunceyjiang

Signed-off-by: Zhikaiiii <1658973216@qq.com>

mergify · 2025-09-21T23:17:18Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @Zhikaiiii.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: Zhikaiiii <1658973216@qq.com>

chaunceyjiang · 2025-09-23T02:15:49Z

vllm/entrypoints/openai/tool_parsers/qwen3xml_tool_parser.py

+        self.deferred_param_raw_value = ""
+
+
+@ToolParserManager.register_module("qwen3_xml")


I suggest renaming it to qwen3_coder, and deprecating the original qwen3_coder.

Since both are intended for use with the qwen3-coder model, maintaining two different tool_parser implementations would incur high maintenance costs.

@chaunceyjiang
This is because we might also use this parser for future qwen3-series models, not just the coder model. Therefore, we have named it qwen3_xml instead.

I see.

docs/features/tool_calling.md

Could you please update the documentation to recommend which models should use this parser?

Signed-off-by: Zhikaiiii <1658973216@qq.com>

chaunceyjiang

Thanks~

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com>

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: charlifu <charlifu@amd.com>

Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>

zxgx · 2025-10-04T10:59:53Z

Hi, I tried this tool call parser in v0.11.0 with cmd: vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct --dtype auto --api-key token-abc123 --enable-auto-tool-choice --tool-call-parser qwen3_xml --max-model-len 131072

It reports the following error:

(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145] Error in chat completion stream generator.
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145] Traceback (most recent call last):
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145]   File "/home/v-kenanli/venvs/serve/lib/python3.12/site-packages/vllm/entrypoints/openai/serving_chat.py", line 1033, in chat_completion_stream_generator
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145]     tool_parser.prev_tool_call_arr[index].get(
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145]     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
(APIServer pid=1904320) ERROR 10-04 10:52:12 [serving_chat.py:1145] IndexError: list index out of range

Do you have any suggestion on resolving this issue?

Zhikaiiii · 2025-10-05T04:16:29Z

Sorry, I will check this bug ASAP. In the meantime, you can use the old parser, qwen3_coder.

geraldthewes · 2025-10-06T22:32:16Z

Running into the exact same issue

docker run --runtime nvidia --shm-size=2g --gpus all -p 8000:8000 -v ~/.cache/huggingface:/root/.cache/huggingface --env "HUGGING_FACE_HUB_TOKEN=REMOVED" --env "PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True" --env "CUDA_DEVICE_ORDER=PCI_BUS_ID" --env "VLLM_LOGGING_LEVEL=DEBUG" --ipc=host vllm/vllm-openai:v0.11.0 --model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ --quantization awq --tensor-parallel-size 2 --max-model-len 65536 --gpu-memory-utilization 0.85 --dtype float16 --enable-auto-tool-choice --tool-call-parser qwen3_xml --host 0.0.0.0 --port 8000 --enable-log-requests

(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] Error in chat completion stream generator.
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] Traceback (most recent call last):
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/serving_chat.py", line 1033, in chat_completion_stream_generator
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] tool_parser.prev_tool_call_arr[index].get(
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
(APIServer pid=1) ERROR 10-06 15:16:11 [serving_chat.py:1145] IndexError: list index out of range

Basically trying to tool call an MCP server using qwen code

qwen -v
0.0.14

Zhikaiiii · 2025-10-07T12:54:03Z

@geraldthewes @zxgx hi, I have fixed this problem in #26345.
Could you please help double check in your scenario?

geraldthewes · 2025-10-09T14:42:35Z

@Zhikaiiii Sorry about the delay. I had to learn how to rebuild the vllm docker image, and it took time to build it.

Short answer: it seems to have fixed the issue I reported, as I no longer see it, and the qwen code CLI seems to work fine at this point for common tasks.

But I seem to have run into a different issue. I'm not sure what your advice is, but some tool calls are failing. Looking at the output, it seems to be a problem with the following model:

model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ

As far as I can see, it outputs:

<|im_start|>user\nUse run_shell_command to run "jobforge submit-job deploy/build.yaml"<|im_end|>\n<|im_start|>assistant\nI'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call><|im_end|>

So it seems the model somehow failed to include the opening <tool_call> token.

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: gaojc <1055866782@qq.com>

Zhikaiiii · 2025-10-10T02:22:01Z

Thank you for the verification. The latter issue appears to be caused by the parser's fallback logic failing in this particular case. How frequently does this case occur? I'll try to fix this problem asap.

Zhikaiiii · 2025-10-10T02:42:55Z

@geraldthewes
Hi, I tried adding this case to vllm/tests/tool_use/test_qwen3coder_tool_parser.py:

I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>

It seems that the parser returns the correct result:

tool_calls=[ToolCall(id='chatcmpl-tool-67620c18dccc425abafda6fb9172a493', type='function', function=FunctionCall(name='run_shell_command', arguments='{"command": "jobforge submit-job deploy/build.yaml\\n", "is_background": "False\\n", "description": "Running jobforge submit-job command to deploy build.yaml\\n\\n'))] content="I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n"

What does the parsed output on your server look like?

kexinoh · 2025-10-10T02:56:08Z

Hi, I want to know why Qwen no longer uses JSON and instead uses XML?

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

geraldthewes · 2025-10-10T12:59:02Z

I'm not sure what you are asking. I run qwen coder as follows

docker run --runtime nvidia --shm-size=2g --gpus all -p 8000:8000 -v /mnt/data1/huggingface:/root/.cache/huggingface --env "HUGGING_FACE_HUB_TOKEN=REMOVED" --env "PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True" --env "CUDA_DEVICE_ORDER=PCI_BUS_ID" --ipc=host qwen3_xml_parser_fix:latest --model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ --quantization awq --tensor-parallel-size 2 --max-model-len 65536 --gpu-memory-utilization 0.85 --dtype float16 --enable-auto-tool-choice --tool-call-parser qwen3_xml --host 0.0.0.0 --port 8000 --enable-log-requests

And I look at the output logs. I had assumed the logged output comes after your parser has run, but I guess that is incorrect, since it's my understanding that the whole point of your parser is to emit the JSON tool calling format that is more widely understood by code agents. Let me know if there is an easy way to capture that output; otherwise I will set up a proxy to see what is really returned to the coding agent.
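
One way to see what the coding agent receives, without a proxy, is to call the OpenAI-compatible chat completions endpoint directly and print the `tool_calls` field of the response. The sketch below uses only the Python standard library. The URL assumes the port from the docker command above, and the `run_shell_command` tool schema is a guess modeled on the call in the logs, not the schema that qwen code actually sends.

```python
import json
import urllib.request

# Assumed address: port 8000 as published by the docker command above.
BASE_URL = "http://localhost:8000/v1/chat/completions"


def build_request(prompt: str) -> dict:
    """Build a chat completion request that offers one tool to the model."""
    return {
        "model": "QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                # Hypothetical schema, modeled on the call seen in the logs.
                "name": "run_shell_command",
                "description": "Run a shell command.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "command": {"type": "string"},
                        "is_background": {"type": "boolean"},
                        "description": {"type": "string"},
                    },
                    "required": ["command"],
                },
            },
        }],
    }


def fetch_tool_calls(prompt: str, url: str = BASE_URL):
    """POST the request and return the parsed tool calls of the first choice."""
    data = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"].get("tool_calls")
```

Calling `fetch_tool_calls('Use run_shell_command to run "jobforge submit-job deploy/build.yaml"')` against the running server returns the tool calls as the parser emitted them, or `None` if the response contains no tool calls.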

@geraldthewes @zxgx hi, I have fixed this problem in #26345. Could you please help double check in your scenario?

@Zhikaiiii Sorry about the delay, I had to learn how to rebuild the vllm docker image, and it took time to build.
Short answer: it seems to have fixed the issue I reported. I no longer see it, and the qwen code CLI seems to work fine at this point for common tasks.
But I seem to have run into a different issue. I'm not sure what your advice is, but some tool calls are failing. Looking at the output from my model, it seems like a problem with the following model:
model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ
As I see it, it outputs:
<|im_start|>user\nUse run_shell_command to run "jobforge submit-job deploy/build.yaml"<|im_end|>\n<|im_start|>assistant\nI'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call><|im_end|>
So it seems the model somehow failed to include the opening <tool_call> token.

@geraldthewes Hi, I tried adding this case to vllm/tests/tool_use/test_qwen3coder_tool_parser.py.

Zhikaiiii · 2025-10-10T13:07:46Z

I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>

Normally, as you understand, the text shown above is the raw output from the model. The parser converts this output into the JSON tool-call format for downstream consumers such as qwen-code.

I used the model output from your example directly as the parser input. Although this output is missing the opening <tool_call>, the parser still applied its fallback mechanism and produced the correct JSON format. So I'd like to know: in your scenario, what is the final output returned by the outermost vLLM server?
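To make the raw-text-to-JSON conversion concrete, here is a rough, non-streaming sketch of the idea: treat either `<tool_call>` or a bare `<function=` as the start of a call, and end a parameter at the next tag even when its closing tag is missing. This is an illustration with regular expressions, not the XML-based implementation in this PR. The function names are hypothetical, and the trimming rule (drop one leading and one trailing newline) is inferred from the output quoted in this thread rather than taken from the parser source.

```python
import json
import re

# A function body runs until its closing tag, the closing wrapper,
# the next function, or the end of the text, whichever comes first.
FUNCTION_RE = re.compile(
    r"<function=([^>\s]+)>(.*?)(?=</function>|</tool_call>|<function=|\Z)",
    re.DOTALL,
)
# A parameter value likewise ends at the next tag, so a missing
# </parameter> does not swallow the following parameters.
PARAM_RE = re.compile(
    r"<parameter=([^>\s]+)>(.*?)"
    r"(?=</parameter>|<parameter=|</function>|</tool_call>|\Z)",
    re.DOTALL,
)


def _trim(value):
    """Drop one leading and one trailing newline (assumed rule, see lead-in)."""
    if value.startswith("\n"):
        value = value[1:]
    if value.endswith("\n"):
        value = value[:-1]
    return value


def parse_tool_calls(text):
    """Return (content, calls); calls hold a name and JSON-encoded arguments."""
    # Fallback: a bare <function= counts as a tool-call start too.
    start = re.search(r"<tool_call>|<function=", text)
    if start is None:
        return text, []
    content = text[: start.start()]
    calls = []
    for fn in FUNCTION_RE.finditer(text, start.start()):
        args = {p.group(1): _trim(p.group(2)) for p in PARAM_RE.finditer(fn.group(2))}
        calls.append({"name": fn.group(1), "arguments": json.dumps(args)})
    return content, calls
```

Applied to the example above (with the `\n` sequences as real newlines), this sketch yields the same content string, function name, and three argument values as the parser result quoted in this thread. All values stay strings here; converting them to the types declared in the tool schema is a separate step that this sketch does not attempt.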


Fixes an issue where Qwen3-Coder models sometimes generate tool calls starting directly with <function=...> instead of the expected <tool_call><function=...> structure.

Changes:
- Enhanced _find_next_complete_element to detect <function= tags as potential tool call starts when not currently parsing a tool call
- Updated _should_skip_element to properly handle function tags that appear without tool_call wrappers
- Added logic to wait for complete <function= tags before processing to avoid treating partial tags as text

The parser now gracefully handles:
1. Missing opening <tool_call> tag (starts with <function=)
2. Missing both <tool_call> tags (only function wrapper)
3. Streaming mode with missing tags

Leverages existing fallback logic (line 628-630) that auto-creates a tool_call when a function element is encountered without a parent tool_call context.

Tests:
- test_extract_tool_calls_missing_opening_tool_call_tag: Tests the exact scenario from the bug report with run_shell_command
- test_extract_tool_calls_missing_both_tool_call_tags: Tests when both opening and closing tool_call tags are missing
- test_extract_tool_calls_streaming_missing_opening_tag: Validates streaming behavior with missing tags

Fixes: vllm-project#25028

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
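The "wait for complete <function= tags" point in the commit message above can be pictured with a small buffering sketch: in streaming mode, any suffix that could still grow into `<tool_call>` or `<function=` is held back instead of being emitted as text. The helper names below are hypothetical and this is not the code from the commit, only an illustration of the hold-back idea.

```python
TAG_STARTS = ("<tool_call>", "<function=")


def split_safe_text(buffer):
    """Split buffer into (text safe to emit now, tail to hold back)."""
    for i, ch in enumerate(buffer):
        if ch != "<":
            continue
        tail = buffer[i:]
        for tag in TAG_STARTS:
            # Hold back a full tag start, or a tail that is still a
            # prefix of one (the buffer ended in the middle of the tag).
            if tail.startswith(tag) or tag.startswith(tail):
                return buffer[:i], tail
        # Otherwise this "<" is ordinary text; keep scanning.
    return buffer, ""


def has_complete_function_tag(tail):
    """True once a <function=NAME> opening tag has fully arrived."""
    return tail.startswith("<function=") and ">" in tail


def stream_text(chunks):
    """Feed chunks one by one; return (emitted text, held-back tail)."""
    emitted, buffer = [], ""
    for chunk in chunks:
        safe, buffer = split_safe_text(buffer + chunk)
        if safe:
            emitted.append(safe)
        if has_complete_function_tag(buffer):
            break  # from here on, tool-call parsing would take over
    return "".join(emitted), buffer
```

With the chunks `["I'll run it.\n\n<fun", "ction=run_shell_command>\n"]`, the text before the tag is emitted after the first chunk while `<fun` is held back, so the partial tag never leaks into the content stream.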

geraldthewes · 2025-10-10T13:49:09Z

@Zhikaiiii

OK, yesterday I asked Claude Code to make a fix for this. Unfortunately, building the docker images to test it takes forever. I was able to test it today, and the example I have now works.

I have pushed my patch here

https://github.com/geraldthewes/vllm/tree/fix/qwen3_xml_missing_tool_call_tag

That said, multiple caveats: I know nothing about the parser setup, and I have not looked at the code changes made by Claude Code. While it did add a test, I'm not sure it knew how to run the test.

But the good news is that it at least fixed my symptoms. Let me know what makes more sense to do at this point.

I will test it more today to see if this finally gives me a working combination of qwen code with 4-bit QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ running locally using vllm.

But it is nice to see qwen code finally make the tool call for my example after asking my permission:

"I'll run the jobforge submit-job command to deploy the build.yaml file.

╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ ✓ Shell jobforge submit-job deploy/build.yaml (Running jobforge submit-job command to deploy build.yaml) │
│ │
│ { │
│ "job_id": "8cd4ca3d-3c0c-4d30-9d13-7ceb405813f8", │
│ "status": "BUILDING" │
│ } │
╰──────────────
"


Zhikaiiii · 2025-10-11T02:59:26Z

Thank you very much for your reply. Based on the code you provided, I have fixed this issue and merged the fix into #26345.


…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com>

geraldthewes · 2025-10-12T17:03:19Z

@Zhikaiiii

Thank you so much, this version works too. Much appreciated; I hope it gets merged into the main vllm branch.

-- gerald

Thank you very much for your reply. Based on the code you provided, I have fixed this issue and merged it into the #26345
@Zhikaiiii
ok, Yesterday I asked claude code to make a fix for this. Unfortunately creating the docker images for me to test takes forever. I was able to test it today and now the example I have now works.
I have pushed my patch here
https://github.com/geraldthewes/vllm/tree/fix/qwen3_xml_missing_tool_call_tag
That said multiple caveats. I know nothing on the parser setup, I have not looked at the code changes made by Claude code. While it did add a test, not sure it knew how to run the test.
But it is good news that at least fixed my symptoms. Let me know what makes more sense to do at this point.
I will test it more today to see if this finally gives me a working combination of qwen code with 4-bit QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ running locally using vllm.
But nice to see qwen code finally tool call my example after asking my permission:
"I'll run the jobforge submit-job command to deploy the build.yaml file.
╭────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮ │ ✓ Shell jobforge submit-job deploy/build.yaml (Running jobforge submit-job command to deploy build.yaml) │ │ │ │ { │ │ "job_id": "8cd4ca3d-3c0c-4d30-9d13-7ceb405813f8", │ │ "status": "BUILDING" │ │ } │ ╰────────────── "
I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>
Normally, as you understand, the code shown above is the raw output from the model. The parser will convert this output into JSON tool format for downstream use like qwen-code or other scenarios.
I directly used the model output from your example as the input to the parser. Although this output appears to be missing <tool_call>, the parser stiil successfully applied its fallback mechanism and produced the correct JSON format. Therefore, I'd like to know, in your scenario, what is the final output returned by the outermost vLLM server?
I'm not sure what you are asking. I run qwen coder as follows
docker run --runtime nvidia --shm-size=2g --gpus all -p 8000:8000 -v /mnt/data1/huggingface:/root/.cache/huggingface --env "HUGGING_FACE_HUB_TOKEN=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" --env "PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True" --env "CUDA_DEVICE_ORDER=PCI_BUS_ID" --ipc=host qwen3_xml_parser_fix:latest --model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ --quantization awq --tensor-parallel-size 2 --max-model-len 65536 --gpu-memory-utilization 0.85 --dtype float16 --enable-auto-tool-choice --tool-call-parser qwen3_xml --host 0.0.0.0 --port 8000 --enable-log-requests
And look at the output logs - I had assumed the output is after your parsing tool, but I guess that is incorrect - since it's my understanding the whole point of your parser is to emit JSON tool calling format that is more widely understood by code agents. Let me know if there is an easy way to capture that output - otherwise I will setup a proxy to see what is really returned to the coding agent.
@geraldthewes @zxgx hi, I have fixed this problem in #26345. Could you please help double check in your scenario?

@Zhikaiiii Sorry about the delay, I had to learnhow to rebuild the vllm docker image and it took time to build it.
Sort answer is it seems to have fixed the issue I reported as I no longer see it and qwen code CLI seems to work fine at this point for common tasks.
But I seem to have run into a different issue - not sure what your advice is but some tool call are failing, looking at the output from my model seems like a problem with the following model:
model QuantTrio/Qwen3-Coder-30B-A3B-Instruct-AWQ
As I see it outputs
<|im_start|>user\nUse run_shell_command to run "jobforge submit-job deploy/build.yaml"<|im_end|>\n<|im_start|>assistant\nI'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call><|im_end|>
So it seems the model somehow failed to emit the opening <tool_call> token.
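To illustrate what tolerating a missing `<tool_call>` opener can look like, here is a minimal regex-based sketch. It is not the parser added in this PR (which is described as a streaming XML parser); `lenient_parse` and its behavior are assumptions made for illustration only. It anchors on `<function=...>` rather than on `<tool_call>`, and treats closing tags as optional.

```python
import json
import re

FUNC_RE = re.compile(r"<function=([^>\s]+)>")
PARAM_RE = re.compile(r"<parameter=([^>\s]+)>")
CLOSERS = ("</parameter>", "</function>", "</tool_call>")


def lenient_parse(text):
    """Return (content, tool_call); tool_call is None if no function tag."""
    match = FUNC_RE.search(text)
    if match is None:
        return text, None
    # Everything before the function tag is plain content; an opening
    # <tool_call> tag is dropped if present and simply ignored if missing.
    content = text[:match.start()].replace("<tool_call>", "")
    # re.split with one capture group yields [prefix, name1, value1, ...].
    parts = PARAM_RE.split(text[match.end():])
    arguments = {}
    for name, raw in zip(parts[1::2], parts[2::2]):
        for closer in CLOSERS:
            raw = raw.split(closer)[0]  # cut at the first closing tag
        arguments[name] = raw.strip("\n")
    return content, {"name": match.group(1),
                     "arguments": json.dumps(arguments)}


sample = (
    "I'll run the jobforge submit-job command to deploy the build.yaml file."
    "\n\n<function=run_shell_command>\n<parameter=command>\n"
    "jobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\n"
    "False\n\n</tool_call>"
)
content, call = lenient_parse(sample)
```

Note that this sketch returns every value as a string; mapping values to the types declared in the tool schema is a separate step.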

@geraldthewes Hi, I tried adding this case to vllm/tests/tool_use/test_qwen3coder_tool_parser.py:
I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n<function=run_shell_command>\n<parameter=command>\njobforge submit-job deploy/build.yaml\n\n<parameter=is_background>\nFalse\n\n<parameter=description>\nRunning jobforge submit-job command to deploy build.yaml\n\n\n</tool_call>
It seems that the parser returns the correct result:
tool_calls=[ToolCall(id='chatcmpl-tool-67620c18dccc425abafda6fb9172a493', type='function', function=FunctionCall(name='run_shell_command', arguments='{"command": "jobforge submit-job deploy/build.yaml\\n", "is_background": "False\\n", "description": "Running jobforge submit-job command to deploy build.yaml\\n\\n'))] content="I'll run the jobforge submit-job command to deploy the build.yaml file.\n\n"
What does the parsed output on your server look like?
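In the result above, `is_background` comes back as the string `"False\n"` rather than a boolean. The PR description lists returning parameters with their corresponding types as one of the parser's goals. A minimal sketch of schema-driven coercion follows; the function `coerce` and its fallback rules are assumptions for illustration and do not reproduce the parser's actual logic.

```python
import json


def coerce(raw, schema_type):
    """Convert a raw parameter string to the JSON Schema type, if possible.

    Falls back to the newline-stripped string when conversion fails.
    """
    value = raw.strip("\n")
    if schema_type == "boolean":
        lowered = value.strip().lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        return value
    if schema_type == "integer":
        try:
            return int(value)
        except ValueError:
            return value
    if schema_type == "number":
        try:
            return float(value)
        except ValueError:
            return value
    if schema_type in ("object", "array"):
        try:
            return json.loads(value)
        except json.JSONDecodeError:
            return value
    return value  # "string" and unknown types stay as text
```

With a schema that declares `is_background` as `boolean`, `coerce("False\n", "boolean")` yields the Python value `False`, which serializes to JSON `false`.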

russellb · 2025-10-15T14:20:09Z

@geraldthewes your huggingface token was included in multiple comments on this PR. I suggest revoking it.

geraldthewes · 2025-10-15T15:54:10Z

> @geraldthewes your huggingface token was included in multiple comments on this PR. I suggest revoking it.

Yes, already done. Thanks.

Zhikaiiii added 3 commits September 17, 2025 11:01

add qwen3_coder new xml tool parser

849902b

Signed-off-by: Zhikaiiii <1658973216@qq.com>

add single quote test case for qwen3-coder

dd59682

Signed-off-by: Zhikaiiii <1658973216@qq.com>

remove useless debug info

6b867cf

Signed-off-by: Zhikaiiii <1658973216@qq.com>

Zhikaiiii requested review from aarnphm and chaunceyjiang as code owners September 17, 2025 03:36

mergify bot added frontend qwen Related to Qwen models tool-calling labels Sep 17, 2025

github-project-automation bot added this to Tool Calling Sep 17, 2025

gemini-code-assist bot reviewed Sep 17, 2025

fix pre-commit lint

321031e

Signed-off-by: Zhikaiiii <1658973216@qq.com>

wenmengzhou reviewed Sep 17, 2025

vllm/entrypoints/openai/tool_parsers/qwen3coder_xml_tool_parser.py (outdated, resolved)

Zhikaiiii added 3 commits September 17, 2025 18:16

add new parser registry

593d1c1

Signed-off-by: Zhikaiiii <1658973216@qq.com>

fix mypy type error

d7dbf98

Signed-off-by: Zhikaiiii <1658973216@qq.com>

fix mypy error for py3.9

52afcd3

Signed-off-by: Zhikaiiii <1658973216@qq.com>

chaunceyjiang reviewed Sep 17, 2025

chaunceyjiang mentioned this pull request Sep 19, 2025

[Bug]: sometimes tool calling is not correctly parsed but remains in the plain content for qwen3 coder #22975

Open

1 task

fix badcase with missing tag

245ac05

Signed-off-by: Zhikaiiii <1658973216@qq.com>

mergify bot added the needs-rebase label Sep 21, 2025

Zhikaiiii added 2 commits September 23, 2025 09:52

rename parser to qwen3_xml

1c79ddd

Signed-off-by: Zhikaiiii <1658973216@qq.com>

Merge branch 'main' into feat/qwen3_coder_new_parser

a83fc3c

Signed-off-by: Zhikaiiii <1658973216@qq.com>

mergify bot removed the needs-rebase label Sep 23, 2025

chaunceyjiang reviewed Sep 23, 2025

update tool_calling docs

620b1aa

Signed-off-by: Zhikaiiii <1658973216@qq.com>

mergify bot added the documentation Improvements or additions to documentation label Sep 23, 2025

chaunceyjiang added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 23, 2025

chaunceyjiang enabled auto-merge (squash) September 23, 2025 08:05

chaunceyjiang approved these changes Sep 23, 2025

chaunceyjiang merged commit 9383cd6 into vllm-project:main Sep 23, 2025
55 checks passed

github-project-automation bot moved this to Done in Tool Calling Sep 23, 2025

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (vllm-proj…

3e39d41

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com>

charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (vllm-proj…

0e12972

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: charlifu <charlifu@amd.com>

yewentao256 pushed a commit that referenced this pull request Oct 3, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (#25028)

c4a15ee

Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>

Zhikaiiii mentioned this pull request Oct 7, 2025

[Bugfix]fix Qwen3 xml tool parser #26345

Merged

gjc0824 pushed a commit to gjc0824/vllm that referenced this pull request Oct 10, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (vllm-proj…

0b5178d

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: gaojc <1055866782@qq.com>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (vllm-proj…

dee38d3

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (vllm-proj…

15cd1c7

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com>

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (vllm-proj…

15e2199

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

[Frontend] Add a new xml-based tool parser for qwen3-coder (vllm-proj…

52410f1

…ect#25028) Signed-off-by: Zhikaiiii <1658973216@qq.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

-            except Exception:
-                pass
+            except Exception:
+                logger.warning("Failed to parse XML chunk: '%s'",
+                               preprocessed_element,
+                               exc_info=True)
+                pass
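The suggestion above replaces a silent `except ... pass` with a logged warning. A minimal, self-contained sketch of that pattern is shown below. It assumes the standard-library `xml.parsers.expat` parser and a hypothetical helper `feed_chunk`; the PR's own parsing loop is not reproduced here.

```python
import logging
import xml.parsers.expat

logger = logging.getLogger("tool_parser_sketch")  # hypothetical logger name


def feed_chunk(parser, chunk, final=False):
    """Feed one chunk to an expat parser; log instead of swallowing errors."""
    try:
        parser.Parse(chunk, final)
        return True
    except Exception:
        # exc_info=True attaches the traceback, so malformed model output
        # can be diagnosed from the server logs.
        logger.warning("Failed to parse XML chunk: '%s'", chunk,
                       exc_info=True)
        return False


good = feed_chunk(xml.parsers.expat.ParserCreate(), "<a>ok</a>", final=True)
bad = feed_chunk(xml.parsers.expat.ParserCreate(), "<a></b>", final=True)
```

Returning a boolean lets the caller decide whether to fall back to treating the chunk as plain content.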

         self.deferred_param_raw_value = ""


-@ToolParserManager.register_module("qwen3_coder_xml")
+@ToolParserManager.register_module("qwen3_xml")
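The registered name is what the `--tool-call-parser qwen3_xml` flag selects, which is why the rename matters to users. The sketch below shows the general decorator-registry pattern with a hypothetical `MiniRegistry` class; it is an assumption-based illustration and not vLLM's `ToolParserManager` implementation.

```python
from typing import Callable, Dict, Type


class MiniRegistry:
    """Hypothetical name-to-class registry, for illustration only."""

    _parsers: Dict[str, Type] = {}

    @classmethod
    def register_module(cls, name: str) -> Callable[[Type], Type]:
        def decorator(parser_cls: Type) -> Type:
            cls._parsers[name] = parser_cls  # later lookups use this key
            return parser_cls
        return decorator

    @classmethod
    def get(cls, name: str) -> Type:
        try:
            return cls._parsers[name]
        except KeyError:
            raise KeyError(f"unknown tool parser: {name!r}") from None


@MiniRegistry.register_module("qwen3_xml")
class SketchXMLParser:
    """Stand-in for a parser class."""
```

In this sketch, looking up the old name `qwen3_coder_xml` fails with a `KeyError`, while `qwen3_xml` resolves to the class.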
