- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 10.8k
[Frontend] Responses API MCP tools for built in tools and to pass through headers #24628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Frontend] Responses API MCP tools for built in tools and to pass through headers #24628
Conversation
4f24d18    to
    fecbeaf      
    Compare
  
    …h headers Signed-off-by: Alec Solder <alecs@fb.com>
…tion issues Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Alec Solder <alecs@fb.com>
cd2c876    to
    f0553be      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this direction. And I think it is possible to setup this test in CI by using the python MCP server and PYTHON_EXECUTION_BACKEND=dangerously_use_uv. You can have a try if you don't want this feature be broken by future PRs.
| async def test_mcp_tool(client: OpenAI, model_name: str): | ||
| response = await client.responses.create( | ||
| model=model_name, | ||
| input="Please multiply 123 and 456 using the available python tool.", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use a complex test like the one in #24315 :
input=("What's the first 4 digits after the decimal point of "
               "cube root of `19910212 * 20250910`? "
               "Show only the digits."),
model can be very clever and skip using python tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave this a try and it did work, however because we don't respect limiting the output tokens when tools are enabled and max_tool_calls is not implemented yet, we can't stop the generation after just one tool call.
Because the dangerously_use_uv implementation is not jupyter, the model gets confused and keeps trying to make tool calls until it seems some output (it does eventually start printing), which causes the tests to time out in CI. Otherwise I think it would work just fine though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alecsolder let's open another issues to improve the tests and possibly bring in swe-rex / docker client in oss as we get time on this.
Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Alec Solder <alecs@fb.com>
| # If there is a MCP tool with a matching server_label, then | ||
| # the tool is enabled. Native types like code_interpreter work | ||
| # as well | ||
| tool_types = [ | ||
| tool.server_label if tool.type == "mcp" else tool.type | ||
| for tool in request.tools | ||
| ] | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use mcp as built-in tool is a bit weird for most developers. do you think we should gate that with similar env like VLLM_GPT_OSS_USE_CONTAINER_TOOL. possibly we can modify the naming and use one env to control
| @pytest.mark.asyncio | ||
| @pytest.mark.parametrize("model_name", [MODEL_NAME]) | ||
| @pytest.mark.skip(reason="Code interpreter tool is not available in CI yet.") | ||
| async def test_mcp_tool(client: OpenAI, model_name: str): | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's also add some unit tests coverage tool server init logic for mcp tool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the change! let's also remember to update the recipe and make this easy to follow. main concern is to make this opt-in and more coverage in unit tests
…cp tools to enable built in tools Signed-off-by: Alec Solder <alecs@fb.com>
| mind also taking this vllm-project/recipes#43 and update it? feel free to open a new one explaining how to use it. | 
| thanks for pushing this, allow mcp tools to override built-in should be generic enough to enable more dev and research usage. | 
Signed-off-by: Alec Solder <alecs@fb.com>
| for val in values: | ||
| if not case_sensitive: | ||
| check_value = val.lower() | ||
| check_choices = [choice.lower() for choice in actual_choices] | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this list doesn't have to be recreated every time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is implemented in a lazy way in order to support valid values being a callable, which has to happen after other python importing logic is done. This means there isn't a great way to "cache" these values without writing a bigger class for this purpose, which I think it out of the scope for this PR
| return _get_validated_env | ||
|  | ||
|  | ||
| def env_list_with_choices( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add some unit tests here. also @youkaichao do you have concerns to init a list from environment variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did something similar to how it is done for plugins right now:
Line 740 in 1d7f95b
| lambda: None if "VLLM_PLUGINS" not in os.environ else os.environ[ | 
| This pull request has merge conflicts that must be resolved before it can be | 
Signed-off-by: Alec Solder <alecs@fb.com>
Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for addressing all the feedback. cc: @heheda12345
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com> Signed-off-by: charlifu <charlifu@amd.com>
…ough headers (#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com> Signed-off-by: gaojc <1055866782@qq.com>
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com>
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <alecs@fb.com> Signed-off-by: Alec S <10566873+alecsolder@users.noreply.github.com> Co-authored-by: Alec Solder <alecs@fb.com> Co-authored-by: Ye (Charlotte) Qi <yeq@meta.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Purpose
This PR is to add the ability to use MCP tools to enable built in tools for gpt-oss. Either the OpenAI tool type (i.e browser) or an MCP tool with
server_labelset to the name of the OpenAI tool type.This enables header passthrough from the request to the MCP server in addition to the session_id
Test Plan
Tested with responses API server with curl. Added one unit test which worked locally but can't run in CI which tests code_interpreter tool functionality.
Test Result
Interacting with the server worked locally and local unit test worked as well.