Commit c64f0d5
fix: Get builtin tool calling working in remote-vllm (#1236)
## What does this PR do?

This PR makes a couple of changes required to get the test `tests/client-sdk/agents/test_agents.py::test_builtin_tool_web_search` passing on the remote-vllm provider.

First, we adjust `agent_instance` to also pass in the description and parameters of builtin tools. We need these so that the tool's expected parameters can be passed on to vLLM. The meta-reference implementation may not have needed them for builtin tools, since it can take advantage of Llama-model-specific support for certain builtin tools. With vLLM, however, our server-side chat templates for tool calling treat all tools the same and do not distinguish Llama builtin tools from custom tools. So we need to pass the full set of parameter definitions, including the list of required parameters, for builtin tools as well.

Next, we adjust the vLLM streaming chat completion code to fix some edge cases where it was returning an extra `ChatCompletionResponseEvent` containing an empty `ToolCall` with empty-string `call_id`, `tool_name`, and `arguments` properties. This bug was discovered after the fix above: following a successful tool invocation, we were sending extra chunks back to the client carrying these empty `ToolCall`s.
## Test Plan

With these changes, the following test that previously failed now passes:

```
VLLM_URL="http://localhost:8000/v1" \
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
LLAMA_STACK_CONFIG=remote-vllm \
python -m pytest -v \
  tests/client-sdk/agents/test_agents.py::test_builtin_tool_web_search \
  --inference-model "meta-llama/Llama-3.2-3B-Instruct"
```

Additionally, I ran the remote-vllm client-sdk and provider inference tests as below to ensure they all still pass with this change:

```
VLLM_URL="http://localhost:8000/v1" \
INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct" \
LLAMA_STACK_CONFIG=remote-vllm \
python -m pytest -v \
  tests/client-sdk/inference/test_text_inference.py \
  --inference-model "meta-llama/Llama-3.2-3B-Instruct"
```

```
VLLM_URL="http://localhost:8000/v1" \
python -m pytest -s -v \
  llama_stack/providers/tests/inference/test_text_inference.py \
  --providers "inference=vllm_remote"
```

Signed-off-by: Ben Browning <bbrownin@redhat.com>
1 parent 2ed2c0b commit c64f0d5

File tree

2 files changed: +15 −3 lines changed


llama_stack/providers/inline/agents/meta_reference/agent_instance.py

Lines changed: 13 additions & 1 deletion
```diff
@@ -909,7 +909,19 @@ async def _get_tool_defs(
                 if tool_def_map.get(built_in_type, None):
                     raise ValueError(f"Tool {built_in_type} already exists")

-                tool_def_map[built_in_type] = ToolDefinition(tool_name=built_in_type)
+                tool_def_map[built_in_type] = ToolDefinition(
+                    tool_name=built_in_type,
+                    description=tool_def.description,
+                    parameters={
+                        param.name: ToolParamDefinition(
+                            param_type=param.parameter_type,
+                            description=param.description,
+                            required=param.required,
+                            default=param.default,
+                        )
+                        for param in tool_def.parameters
+                    },
+                )
                 tool_to_group[built_in_type] = tool_def.toolgroup_id
                 continue
```
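The `ToolDefinition` construction above can be sketched in isolation. The dataclasses below are hypothetical stand-ins that only mirror the shape of llama_stack's `ToolDefinition`, `ToolParamDefinition`, and tool parameter objects; they are not the real classes:

```python
from dataclasses import dataclass
from typing import Any, Optional

# Stand-in types mirroring the shape of llama_stack's tool models
# (illustrative only; the real classes live in llama_stack).
@dataclass
class ToolParamDefinition:
    param_type: str
    description: str
    required: bool
    default: Optional[Any] = None

@dataclass
class ToolDefinition:
    tool_name: str
    description: str = ""
    parameters: Optional[dict] = None

@dataclass
class ToolParam:
    name: str
    parameter_type: str
    description: str
    required: bool
    default: Optional[Any] = None

def to_tool_definition(name: str, description: str, params: list[ToolParam]) -> ToolDefinition:
    # Mirror the diff above: carry the description and full parameter
    # definitions so a provider like vLLM sees the same schema for builtin
    # tools as it does for custom tools.
    return ToolDefinition(
        tool_name=name,
        description=description,
        parameters={
            p.name: ToolParamDefinition(
                param_type=p.parameter_type,
                description=p.description,
                required=p.required,
                default=p.default,
            )
            for p in params
        },
    )

tool = to_tool_definition(
    "web_search",
    "Search the web for a query",
    [ToolParam(name="query", parameter_type="string", description="The query", required=True)],
)
print(tool.parameters["query"].param_type)  # -> string
```

Before this change, only `tool_name` was set, so server-side chat templates had no parameter schema to render for builtin tools.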

llama_stack/providers/remote/inference/vllm/vllm.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -169,7 +169,7 @@ async def _process_vllm_chat_completion_stream_response(
                 args = {} if not args_str else json.loads(args_str)
             except Exception as e:
                 log.warning(f"Failed to parse tool call buffer arguments: {args_str} \nError: {e}")
-            if args is not None:
+            if args:
                 yield ChatCompletionResponseStreamChunk(
                     event=ChatCompletionResponseEvent(
                         event_type=event_type,
@@ -183,7 +183,7 @@ async def _process_vllm_chat_completion_stream_response(
                     ),
                 )
-            else:
+            elif args_str:
                 yield ChatCompletionResponseStreamChunk(
                     event=ChatCompletionResponseEvent(
                         event_type=ChatCompletionResponseEventType.progress,
```
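The effect of switching `if args is not None` to `if args` / `elif args_str` can be illustrated with a small standalone sketch. This is a hypothetical reimplementation of just the branching logic, not the actual `vllm.py` code:

```python
import json

def classify_tool_call_buffer(args_str: str):
    """Illustrate the fixed branching: emit a tool-call chunk only when the
    buffer parses to non-empty args, a progress chunk when the buffer has
    content but no usable args, and nothing for an empty buffer.  The old
    `if args is not None` branch emitted an empty ToolCall in that last case,
    because an empty args_str parsed to {} which is not None."""
    args = None
    try:
        args = {} if not args_str else json.loads(args_str)
    except Exception:
        pass  # the real code logs a warning here
    if args:          # non-empty parsed arguments: a real tool call
        return "tool_call"
    elif args_str:    # buffer had content but no usable arguments
        return "progress"
    return None       # empty buffer: emit no chunk (the bug fix)

print(classify_tool_call_buffer('{"query": "llama"}'))  # tool_call
print(classify_tool_call_buffer('not json'))            # progress
print(classify_tool_call_buffer(''))                    # None
```

The key case is the last one: after a successful tool invocation the stream could end with an empty argument buffer, and the old truthiness check let that fall through to a chunk containing an empty `ToolCall`.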
