Add xLAM tool parser support #17148
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited subset of checks runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
Looks reasonable to me, thanks for the clear code and testing. Just a few comments. It would be nice to add a dedicated format example to examples/offline_inference or online_serving.
|
Hi @mgoin , would you mind taking a look again for this PR? Thank you! |
|
Tried this parser on my hosted instance:

```
llm-1 | INFO:     172.18.0.1:37142 - "POST /v1/chat/completions HTTP/1.1" 200 OK
llm-1 | INFO 05-05 09:22:04 [async_llm.py:228] Added request chatcmpl-e543138e0c3647f197935cbc69e5234d.
llm-1 | Error in streaming tool calls
llm-1 | Traceback (most recent call last):
llm-1 |   File "/xlam_tool_parser.py", line 230, in extract_tool_calls_streaming
llm-1 |     function_name = current_tool_call.get("name")
llm-1 |                     ^^^^^^^^^^^^^^^^^^^^^
llm-1 | AttributeError: 'list' object has no attribute 'get'
```

A potential bug? |
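The traceback above comes from calling `.get` on a value that was parsed as a JSON list rather than a single object. A minimal defensive sketch of a fix (hypothetical helper name, not the actual vLLM parser code) that tolerates both shapes:

```python
def get_current_function_name(current_tool_call):
    """Return the function name from the current tool-call state.

    Hypothetical helper illustrating a fix for the AttributeError
    above: the model may emit either a single call object (dict) or
    an array of calls (list), so handle both before calling .get().
    """
    if isinstance(current_tool_call, list):
        # Use the most recently streamed call, if any.
        current_tool_call = current_tool_call[-1] if current_tool_call else {}
    if isinstance(current_tool_call, dict):
        return current_tool_call.get("name")
    return None  # unexpected shape: no name available
```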
|
Hi @dhaneshsabane , thanks for catching this. I have fixed the streaming issue. I used the following test scripts to test our xLAM models and they work well. Serving:
Testing scripts: Please let me know if you find any other issues. |
|
The error has disappeared, but the tool call itself is still incorrect. Here's the output of your test script: Notice the missing |
|
I tried it with a simple Langflow agent; it returns: rather than calling the tool |
|
Hi @dhaneshsabane and @vxtra1973 , sorry about the previous mistakes; I didn't understand the streaming function calling mode very well. Parallel function calls in streaming mode are more complex and difficult to implement than I expected.
After serving the model, the outcome is: This should be the expected behavior, right? Let me know your thoughts. Thanks. |
|
This pull request has merge conflicts that must be resolved before it can be merged.
@zuxin666 I am using the latest code and running inference as follows: It seems that every tool call prints `"[{"`, which should not appear in the output. Could you please check this issue? Thank you! |
|
Hi @loki369loki , this has been solved; it was caused by the function call prefix detection logic here. |
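For context, the kind of prefix detection being discussed can be sketched as follows (the `"[{"` prefix and helper name are illustrative assumptions, not the parser's actual code):

```python
# Hypothetical constant for illustration only: the substring that
# signals the start of a JSON tool-call array in the model output.
TOOL_PREFIX = "[{"

def split_content_and_calls(buffer):
    """Split buffered model output into (plain_text, tool_call_text).

    Everything before the first tool-call prefix is treated as plain
    content; the remainder is handed to the JSON tool-call parser
    instead of being streamed to the user (which is why the prefix
    should never leak into the visible response).
    """
    idx = buffer.find(TOOL_PREFIX)
    if idx == -1:
        return buffer, ""
    return buffer[:idx], buffer[idx:]
```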
|
Hi @mgoin , can you also please check this PR when you are available? Thx. |
This looks good to me, thanks for the tests and examples!
|
@mgoin Thanks! Seems like the CI failures above are not related to this PR? Any other blockers to merging it? |
|
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Zuxin Liu <zuxin.liu@salesforce.com>
This reverts commit 337486885aa0c28bcca123c1ac646afc14435ab7. Signed-off-by: Zuxin Liu <zuxin.liu@salesforce.com>
|
Re-triggered the CI. |
|
Thanks! Seems that it is good to be merged? @houseroad @aarnphm @mgoin |

Description
This PR adds support for xLAM-2 models in vLLM's tool calling feature. The xLAM tool parser is designed to support models that generate tool calls in various JSON formats, including Salesforce's Llama-xLAM and Qwen-xLAM models.
Key highlights:
- Implemented `xLAMToolParser` class that can detect function calls in multiple output styles:
  - `<think>...</think>` tags
  - `[TOOL_CALLS]` tags
  - `<tool_call>...</tool_call>` tags
- Added support for both streaming and non-streaming modes for tool calls
- Implemented robust JSON parsing with fallback mechanisms to handle various output formats
- Added support for parallel function calls with effective separation of text content from tool calls
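As an illustration of the fallback-style parsing described above, here is a minimal sketch (hypothetical helper, not the PR's actual implementation; the wrapper tags follow the output styles listed above):

```python
import json

def parse_tool_calls(text):
    """Best-effort extraction of tool calls from raw model output.

    Hypothetical sketch: strip known wrapper tags, then fall back to
    returning no calls when the remainder is not valid JSON.
    """
    for open_tag, close_tag in (("<tool_call>", "</tool_call>"),
                                ("[TOOL_CALLS]", "")):
        if open_tag in text:
            text = text.split(open_tag, 1)[1]
            if close_tag and close_tag in text:
                text = text.split(close_tag, 1)[0]
    try:
        parsed = json.loads(text.strip())
    except json.JSONDecodeError:
        return []  # fallback: treat as plain content, no tool calls
    # Normalize a single call object into a one-element list.
    return parsed if isinstance(parsed, list) else [parsed]
```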
Supported Models
- `Salesforce/Llama-xLAM-2-8B-fc-r`, `Salesforce/Llama-xLAM-2-70B-fc-r`
- `Salesforce/xLAM-1B-fc-r`, `Salesforce/xLAM-3B-fc-r`, `Salesforce/xLAM-32B-fc-r`
Fix
Enhances vLLM's tool calling capability by adding support for the xLAM-2 model family.
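For reference, serving one of these models with the new parser would presumably follow vLLM's usual tool-calling flags (a sketch under the assumption that this PR registers the parser under the name `xlam`; consult the merged docs for the exact invocation):

```
vllm serve Salesforce/Llama-xLAM-2-8B-fc-r \
  --enable-auto-tool-choice \
  --tool-call-parser xlam
```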