Fix/mcp tool adapter connects to the mcp server in a stateless mode #6284
Conversation
> Notes on Lifecycle Management and Design Decisions
>
> In summary: there may be edge cases that lead to a hang only if the session is used without an Agent and not closed explicitly. If there's a better idiomatic approach for AutoGen or MCP usage patterns, I'd greatly appreciate your advice.

Makes sense.
Codecov Report

Attention: Patch coverage is …

```
@@            Coverage Diff             @@
##             main    #6284      +/-   ##
==========================================
+ Coverage   77.24%   77.27%   +0.02%
==========================================
  Files         200      200
  Lines       14473    14556      +83
==========================================
+ Hits        11180    11248      +68
- Misses       3293     3308      +15
==========================================
```
I have changed my mind regarding the overall design I approved earlier.

I tried this with some examples and found that when there is an error from the tool, the tools are not properly closed, causing "event loop closed" errors -- as I expected, because I forgot to include a try/finally to close the tools.
Try this simple example:
```python
import asyncio

from autogen_core import CancellationToken
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools


async def main() -> None:
    params = StdioServerParams(
        command="uvx",
        args=["mcp-server-fetch"],
        read_timeout_seconds=60,
    )
    tools = await mcp_server_tools(server_params=params)
    assert tools is not None
    assert tools[0].name == "fetch"
    # Fetching a non-existent page makes the tool raise, triggering the bug.
    result = await tools[0].run_json({"url": "https://github.com/microsoft/not_exist"}, CancellationToken())
    assert result is not None


asyncio.run(main())
```

Another weird issue: when one of the tools is closed, all of the tools are closed -- this can be counter-intuitive.
I think we still need to use a context manager for the session. We can still use the McpSessionActor wrapper; this way the implementation will be much easier and avoid confusion.

One more note: I think McpSession instead of McpSessionActor is more succinct.
Let's update the API doc to use a context manager for the McpSessionActor.
```python
server_params: McpServerParams
```

```python
class McpSessionActor(ComponentBase[BaseModel], Component[McpSessionConfig]):
```
The McpSessionActor should be used as an async context manager. It is very easy for a user to accidentally forget calling close().

I think the new usage should be like this:

```python
async with McpSessionActor(server_params) as session:
    tools = await mcp_server_tools(session)
    # rest of code.
```

Otherwise we will need to call close() on every tool directly, which is fine in some cases, but it is really easy to forget. So we should support the async context manager usage.
Also, let's add some unit tests that test this usage.
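For illustration, a hedged sketch of such a test, using the `session` parameter from this PR's final description; the pytest-asyncio marker and the availability of the `mcp-server-fetch` server are assumptions:

```python
import pytest

from autogen_core import CancellationToken
from autogen_ext.tools.mcp import StdioServerParams, create_mcp_server_session, mcp_server_tools


@pytest.mark.asyncio
async def test_mcp_tools_with_shared_session() -> None:
    params = StdioServerParams(command="uvx", args=["mcp-server-fetch"], read_timeout_seconds=60)
    async with create_mcp_server_session(params) as session:
        await session.initialize()
        # All tools returned here share the one session opened above.
        tools = await mcp_server_tools(server_params=params, session=session)
        assert any(tool.name == "fetch" for tool in tools)
        result = await tools[0].run_json({"url": "https://example.com"}, CancellationToken())
        assert result is not None
```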
Thanks for the detailed suggestions! I think I now better understand your preference for using async with on McpSessionActor directly: it's a clean way to enforce graceful shutdown and avoid forgetting close(). However, I'd like to share one constraint from real-world usage that makes that pattern difficult in practice.

🧩 Tools are often injected into Agents, not used inline

In most AutoGen use cases, we don't call tools immediately after creation. In this pattern, the actor must outlive the entire agent execution, and it's very difficult to scope the McpSessionActor within an async with block before injecting it. The session would be closed the moment the with block ends, even before the agent starts running.

🛠️ Why the current design favors robustness

To solve that, I moved the session lifecycle outside of tool logic and into the McpSessionActor.

And yes, I absolutely agree that we should add __aenter__/__aexit__ support as optional sugar, so advanced users can still use that pattern.

Would love to hear your thoughts, and happy to iterate on this further!
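To make the constraint concrete, a minimal sketch; the async-with support on McpSessionActor is the proposed API from the review comment above, and the agent code is illustrative, not from this PR:

```python
# Hypothetical: McpSessionActor used as an async context manager, as proposed.
async with McpSessionActor(server_params) as session:
    tools = await mcp_server_tools(session)
# The session closes here, as soon as the block exits...

agent = AssistantAgent(name="assistant", model_client=model_client, tools=tools)
# ...but the agent only invokes the tools now, after the session is gone.
await agent.run(task="fetch https://example.com")
```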
Then all of the running of agents and teams should be within the session's context. I don't see why this is an issue: the session context clearly indicates the lifecycle of the session, and anything that depends on it should live within it. If the session is not created as a context, then we can lazily initialize it. But how do you solve the issue that closing one tool closes all the tools?
@ekzhu TL;DR: I'll still try to support ….

1. #6198 root issue

The original issue was the session closing early if tools used …. Later I found a bigger problem: …. For example, ….

2. Why `with` was tricky in practice

Yes, wrapping the whole thing in a `with` block ….

Right now I'm working on making this session sharing work even in deserialization; your comment helped me catch this. Appreciate it.

3. Why I went with Actor

The MCP SDK raises errors when the session is closed from a different task. That's why I put the session inside a dedicated task: the Actor (a sketch follows after this comment).

4. About "closing one tool closes all"

This is expected, because tools share the same session. I thought about tracking tool lifetimes and only closing the session when all are done ….

Let me know if you think anything here's off.
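For illustration, a minimal sketch of the actor idea (hypothetical names, not the PR's exact classes): a single dedicated task owns the session, so the session's context manager is entered and exited in the same task, which is what anyio's cancel scopes require.

```python
import asyncio
from typing import Any, Optional, Tuple

Request = Tuple[str, dict, "asyncio.Future[Any]"]


class SessionActor:
    """Owns the session in one task; other tasks submit requests via a queue."""

    def __init__(self) -> None:
        self._inbox: "asyncio.Queue[Optional[Request]]" = asyncio.Queue()
        self._runner: Optional["asyncio.Task[None]"] = None

    async def start(self) -> None:
        # The owning task is created once here and never changes.
        self._runner = asyncio.create_task(self._run())

    async def call_tool(self, name: str, args: dict) -> Any:
        fut: "asyncio.Future[Any]" = asyncio.get_running_loop().create_future()
        await self._inbox.put((name, args, fut))
        return await fut

    async def close(self) -> None:
        await self._inbox.put(None)  # sentinel: shut down
        if self._runner is not None:
            await self._runner

    async def _run(self) -> None:
        # In the real adapter this would wrap the MCP ClientSession, e.g.:
        #   async with create_mcp_server_session(params) as session:
        #       await session.initialize()
        #       ...serve requests...
        # so the session opens and closes in this one task.
        while True:
            item = await self._inbox.get()
            if item is None:
                return
            name, args, fut = item
            fut.set_result(f"stub result for {name}({args})")  # stand-in for session.call_tool
```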
@ekzhu

✅ Summary of changes: …

🤔 A quick question about tests

From what I saw in the previous ….

I've also documented more details, but figured I'd hold off unless it's helpful now (since it might be too much information all at once). Just let me know if you'd like me to share more!
I think the fundamental issue is that the MCP server is not compatible with how …. It's not something we can solve with this PR -- it will require changes to the ….
@ekzhu But since passing … (that way, the current PR wouldn't need to change much at all). Also, we could add ….

BTW, even with your design, I think we'll still need the …, so this might work either way. Just one question: …
I think we can do that in a different PR. For this PR, I want to avoid the complexity in the current implementation: reference counts, actors ... too many moving parts and potential for bugs. Let's just focus on addressing the original issue with a minimal fix, without worrying about serialization, which will be addressed in a separate PR.

For tools that require sharing a persistent session:

```python
async with create_mcp_server_session(server_params) as session:
    tools = await mcp_server_tools(server_params=server_params, session=session)
    agent = AssistantAgent(..., tools=tools)
```

This won't work with serialization, but that's not the point. It addresses the immediate issue.

For tools that can use a new session on each invocation:

```python
tools = await mcp_server_tools(server_params=server_params)
agent = AssistantAgent(..., tools=tools)
```

Same thing as before. I believe my issue description in #6198 already described this.
I'd rather have a quick fix to this problem so I can make a release today :D I can take on this if it's too late for you.
@ekzhu That said, I've been working on a use case where MCP + serialization is actually super important 😅 So… just wanted to flag that there's at least one user (me 😅) really hoping to see serialization support without introducing breaking changes to the API. I'll hold off for now and follow up in a separate PR as suggested! By the way, it's currently 4AM KST here 😅
Understood. Let's get this issue over with and work on a more complete fix as a separate PR. |
Resolves #6232, #6198

This PR introduces an optional parameter `session` to `mcp_server_tools` to support reuse of the same session.

```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import StdioServerParams, create_mcp_server_session, mcp_server_tools


async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o", parallel_tool_calls=False)  # type: ignore
    params = StdioServerParams(
        command="npx",
        args=["@playwright/mcp@latest"],
        read_timeout_seconds=60,
    )
    async with create_mcp_server_session(params) as session:
        await session.initialize()
        tools = await mcp_server_tools(server_params=params, session=session)
        print(f"Tools: {[tool.name for tool in tools]}")
        agent = AssistantAgent(
            name="Assistant",
            model_client=model_client,
            tools=tools,  # type: ignore
        )
        termination = TextMentionTermination("TERMINATE")
        team = RoundRobinGroupChat([agent], termination_condition=termination)
        await Console(
            team.run_stream(
                task="Go to https://ekzhu.com/, visit the first link in the page, then tell me about the linked page."
            )
        )


asyncio.run(main())
```

Based on discussion in this thread: #6284, we will consider serialization and deserialization of MCP server tools when used in this manner in a separate issue.

This PR also replaces the `json_schema_to_pydantic` dependency with built-in utils.
Why are these changes needed?
This PR introduces a reusable, event-driven McpSessionActor component for AutoGen’s MCP tool adapters.
Previously, each tool instance created its own isolated ClientSession, making it difficult to share a single session context across tools. This change solves that by: …

This addresses hanging issues during test teardown and resolves `RuntimeError: Attempted to exit cancel scope in a different task` errors in anyio.
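For context, a hypothetical minimal reproduction of that anyio error (not code from this PR): a cancel scope entered in one task must be exited in that same task.

```python
import anyio


async def main() -> None:
    scope = anyio.move_on_after(5)  # a cancel scope, used as a sync context manager
    scope.__enter__()  # entered in the parent task

    async def child() -> None:
        # Exiting in a different task raises:
        # RuntimeError: Attempted to exit cancel scope in a different task than it was entered in
        scope.__exit__(None, None, None)

    async with anyio.create_task_group() as tg:
        tg.start_soon(child)


anyio.run(main)
```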
Related issue number
Closes #6198
Checks
Notes on Lifecycle Management and Design Decisions

This PR does not include a routine to reinitialize the session when an Agent or Team stops for various reasons. Instead, the actor shuts down gracefully when the process terminates. It's unclear whether the `on_reset` method is expected to be invoked in such cases, so I took a conservative approach. I'm happy to adjust this if there's guidance from the maintainers.

As seen in the test cases, when the session is used outside of an Agent (i.e., called directly), users must explicitly call the `close()` method. This design assumes lazy initialization and does not cause hangs as long as the session is constructed but not used.

Supporting `with`-statement awareness was not included in this PR to keep things simple, but it can be added in a separate PR if the community sees value in it.

Currently, the session is not reset inside `on_reset()`, since doing so would require another `initialize()` on the next call, adding more complexity. If needed, I'm open to revisiting this, either in this PR or a follow-up, depending on guidance from domain experts or maintainers.

In summary: there may be edge cases that lead to a hang only if the session is used without an Agent and not closed explicitly. This is considered a usage-side responsibility. Adding more complexity to guard against such nonstandard usage might lead to over-engineering.

As for `on_reset`, I tried to follow the "simple is best" philosophy by relying on graceful shutdown rather than adding extra lifecycle management. If there's a better idiomatic approach for AutoGen or MCP usage patterns, I'd greatly appreciate your advice.
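A hedged sketch of the direct (non-Agent) usage with an explicit close described above, assuming the actor-based adapter; the per-tool `close()` method comes from the discussion in this thread, and its exact signature is an assumption:

```python
import asyncio

from autogen_core import CancellationToken
from autogen_ext.tools.mcp import StdioServerParams, mcp_server_tools


async def main() -> None:
    params = StdioServerParams(command="uvx", args=["mcp-server-fetch"], read_timeout_seconds=60)
    tools = await mcp_server_tools(server_params=params)
    try:
        result = await tools[0].run_json({"url": "https://example.com"}, CancellationToken())
        print(result)
    finally:
        # Assumed API per the notes above: the caller must close explicitly,
        # and closing one tool closes the shared session for all tools.
        await tools[0].close()


asyncio.run(main())
```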