@ekzhu
Contributor

@ekzhu ekzhu commented Apr 19, 2025

This PR introduces WorkBench.

A workbench provides a group of tools that share the same resource and state. For example, McpWorkbench provides the underlying tools on the MCP server. A workbench allows tools to be managed together and abstracts away the lifecycle of individual tools under a single entity. This makes it possible to create agents with stateful tools from serializable configuration (component configs), and it also supports dynamic tools, that is, tools that can change after each execution.

Here is how a workbench may be used with AssistantAgent (not included in this PR):

workbench = McpWorkbench(server_params)
agent = AssistantAgent("assistant", tools=workbench)
result = await agent.run(task="do task...")
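
To make the idea concrete, here is a minimal, self-contained sketch of what such an abstraction could look like. The names `SketchWorkbench` and `CounterWorkbench` and the tool names are hypothetical and are not the classes added in this PR; the sketch only illustrates tools sharing one piece of state behind a single lifecycle, and the tool list changing after a call.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchWorkbench(ABC):
    """Hypothetical interface: a group of tools behind one lifecycle."""

    @abstractmethod
    async def start(self) -> None: ...

    @abstractmethod
    async def stop(self) -> None: ...

    @abstractmethod
    async def list_tools(self) -> List[str]: ...

    @abstractmethod
    async def call_tool(self, name: str, arguments: Dict[str, Any]) -> Any: ...


class CounterWorkbench(SketchWorkbench):
    """Two tools sharing one counter; 'reset' is only listed once the counter is non-zero."""

    def __init__(self) -> None:
        self._count = 0
        self._started = False

    async def start(self) -> None:
        self._started = True

    async def stop(self) -> None:
        self._started = False

    async def list_tools(self) -> List[str]:
        # The tool list depends on the shared state, so it can change between calls.
        tools = ["increment"]
        if self._count > 0:
            tools.append("reset")
        return tools

    async def call_tool(self, name: str, arguments: Dict[str, Any]) -> Any:
        if not self._started:
            raise RuntimeError("workbench is not started")
        if name not in await self.list_tools():
            raise ValueError(f"unknown or unavailable tool: {name}")
        if name == "increment":
            self._count += int(arguments.get("by", 1))
        else:  # "reset"
            self._count = 0
        return self._count


async def demo() -> List[Any]:
    wb = CounterWorkbench()
    await wb.start()
    try:
        before = await wb.list_tools()
        value = await wb.call_tool("increment", {"by": 2})
        after = await wb.list_tools()
    finally:
        await wb.stop()
    return [before, value, after]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The caller only deals with `start`, `stop`, `list_tools`, and `call_tool`; the individual tools never need their own lifecycle.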

TODOs:

  1. In a subsequent PR, update AssistantAgent to use workbench as an alternative in the tools parameter. Use StaticWorkbench to manage individual tools.
  2. In another PR, add documentation on workbench.

@codecov

codecov bot commented Apr 19, 2025

Codecov Report

Attention: Patch coverage is 80.66465% with 64 lines in your changes missing coverage. Please review.

Project coverage is 78.37%. Comparing base (a283d26) to head (a753bc2).
Report is 1 commits behind head on main.

Files with missing lines                                  Patch %   Lines
...utogen-ext/src/autogen_ext/tools/mcp/_workbench.py     72.27%    28 Missing ⚠️
...es/autogen-ext/src/autogen_ext/tools/mcp/_actor.py     75.23%    26 Missing ⚠️
.../autogen-core/src/autogen_core/tools/_workbench.py     87.03%    7 Missing ⚠️
...n-core/src/autogen_core/tools/_static_workbench.py     94.91%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6340      +/-   ##
==========================================
+ Coverage   78.32%   78.37%   +0.05%     
==========================================
  Files         217      221       +4     
  Lines       15684    16010     +326     
==========================================
+ Hits        12284    12548     +264     
- Misses       3400     3462      +62     
Flag Coverage Δ
unittests 78.37% <80.66%> (+0.05%) ⬆️

@ekzhu ekzhu marked this pull request as ready for review April 19, 2025 23:10
@SongChiYoung
Contributor

🎉 I really appreciate the direction of this PR — enabling Workbench for AssistantAgent is something I’ve personally been looking forward to. It’s great to see it taking shape here.

Just wanted to share a small concern: in a quick dummy test where I injected McpWorkbench into AssistantAgent and called only start() and close(), I hit this error:

RuntimeError: Attempted to exit cancel scope in a different task than it was entered in

Not sure if my test was solid — it’s likely too naive.
Still, I’ve seen this happen before with nested async lifecycles, so just flagging it in case it resurfaces.

(Of course, this may totally depend on how Workbench is managed internally. Ha ha 😄)

@SongChiYoung
Contributor

Here’s a minimal repro (no AssistantAgent involved):

def test24():
    import asyncio
    from autogen_ext.tools.mcp import StdioServerParams, McpWorkBench

    async def main() -> None:
        server_params = StdioServerParams(
            command="npx",
            args=["@playwright/mcp@latest", "--headless", "--executable-path", "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"],
        )
        server_params.read_timeout_seconds = 120
        # tools = await mcp_server_tools(server_params)
        workbench = McpWorkBench(server_params)

        await workbench.start()
        print(await workbench.list_tools())
        await workbench.stop()

    asyncio.run(main())


if __name__ == "__main__":
    test24()

This hits the same RuntimeError: Attempted to exit cancel scope in a different task than it was entered in, even without involving any agent or nested structure.

If this is just a matter of the lifecycle not being fully wired up yet, feel free to disregard — I might’ve been a bit too eager since I’ve been really looking forward to this one 😄
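
For readers unfamiliar with this failure mode: the message typically comes from anyio, whose cancel scopes must be exited by the same task that entered them. The sketch below does not use anyio or the MCP client. It is a toy stand-in (`TaskBoundScope` is a hypothetical name) that enforces the same rule with plain asyncio, to show why entering a scope in one task and leaving it in another fails. Whether `start()` and `stop()` in this PR actually run in different tasks is an assumption, not something confirmed here.

```python
import asyncio
from typing import Optional


class TaskBoundScope:
    """Toy scope that must be exited by the task that entered it."""

    def __init__(self) -> None:
        self._owner: Optional[asyncio.Task] = None

    async def __aenter__(self) -> "TaskBoundScope":
        # Remember which task opened the scope.
        self._owner = asyncio.current_task()
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        # Refuse to close from any other task.
        if asyncio.current_task() is not self._owner:
            raise RuntimeError("scope exited in a different task than it was entered in")


async def exit_in_same_task() -> str:
    async with TaskBoundScope():
        pass
    return "ok"


async def exit_in_other_task() -> str:
    scope = TaskBoundScope()
    await scope.__aenter__()  # entered in the current task
    try:
        # Exiting from a freshly created task trips the ownership check.
        await asyncio.create_task(scope.__aexit__(None, None, None))
    except RuntimeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(asyncio.run(exit_in_same_task()))
    print(asyncio.run(exit_in_other_task()))
```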

@ekzhu
Contributor Author

ekzhu commented Apr 21, 2025

Thanks @SongChiYoung. The implementation of start and close was AI-generated, and I have not yet had time to properly address it. Would you like to help out here?

@SongChiYoung
Contributor

@ekzhu
That’s actually the reason I created an explicit Actor in PR #6284 😄
At the time, I needed to ensure that each session was strictly operated within a single task context, and to do that, I designed an actor loop that pulls calls from a queue and executes them in its own dedicated task.

Even though it was a 1-on-1 execution model, I deliberately used a queue to avoid message loss or race conditions during async transitions.
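
The actor loop described above can be sketched roughly as follows. This is not the code from PR #6284; `SessionActor` and its methods are hypothetical names, and the real session setup is replaced by a comment. The point is that every queued call, and any scope opened inside `_run`, executes in the one task that owns the loop.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional


class SessionActor:
    """Runs every submitted call inside a single dedicated task."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._task: Optional[asyncio.Task] = None

    async def start(self) -> None:
        self._task = asyncio.create_task(self._run())

    async def call(self, fn: Callable[[], Awaitable[Any]]) -> Any:
        if self._task is None:
            raise RuntimeError("actor is not started")
        fut: asyncio.Future = asyncio.get_running_loop().create_future()
        await self._queue.put((fn, fut))  # hand the call over to the actor task
        return await fut                  # wait for the actor to resolve it

    async def stop(self) -> None:
        if self._task is None:
            return
        await self._queue.put(None)  # sentinel: ask the loop to exit
        await self._task
        self._task = None

    async def _run(self) -> None:
        # A real implementation would open the session here (for example with
        # `async with ...`), so that entering and exiting happen in this task.
        while True:
            item = await self._queue.get()
            if item is None:
                break
            fn, fut = item
            try:
                result = await fn()
            except Exception as exc:
                if not fut.done():
                    fut.set_exception(exc)
            else:
                if not fut.done():
                    fut.set_result(result)


async def demo() -> bool:
    actor = SessionActor()
    await actor.start()
    seen: List[Optional[asyncio.Task]] = []

    async def work() -> int:
        seen.append(asyncio.current_task())  # record which task runs the call
        return len(seen)

    results = [await actor.call(work) for _ in range(3)]
    owner = actor._task
    await actor.stop()
    # Calls ran in order, and all of them ran in the actor's own task.
    return results == [1, 2, 3] and all(t is owner for t in seen)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The queue keeps calls ordered and gives `stop()` a clean way to shut the loop down with a sentinel instead of cancelling the task mid-call.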

If helpful, you’re welcome to reuse that actor pattern here — or I’d be happy to adapt it into this PR’s context.

That said, I don’t believe I currently have access to push commits to this PR.
Let me know how I can best help!

@ekzhu
Contributor Author

ekzhu commented Apr 21, 2025

Can you create a separate PR from your forked repo targeting this branch ekzhu-workbench? We can use the actor as the implementation for McpWorkbench.

@SongChiYoung
Contributor

Can you create a separate PR from your forked repo targeting this branch ekzhu-workbench? We can use the actor as the implementation for McpWorkbench.

Cool I will do that soon.

@SongChiYoung
Contributor

@ekzhu
Take a look
#6360

@ekzhu ekzhu requested a review from lspinheiro April 23, 2025 07:15
@ekzhu
Contributor Author

ekzhu commented Apr 23, 2025

@lspinheiro could you take a look and see if this can be used as a base abstraction for canvas memory?

@lspinheiro
Collaborator

@lspinheiro could you take a look and see if this can be used as a base abstraction for canvas memory?

I'm trying to better understand the proposal. The canvas is designed as a simple form of version control over data generated by the LLM. Right now that is primarily text, either code or documents. I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component. Is that what you meant?

@ekzhu
Contributor Author

ekzhu commented Apr 23, 2025

I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component. Is that what you meant?

I am thinking of the following:

  • Canvas is a shared "whiteboard" memory for multiple agents.
  • You can't currently pass a canvas directly into an AssistantAgent and have the agent manage the lifecycle of the canvas object.
  • The same problem applies to the current MCP tools, which also share state that is not managed by the agent.
  • Introduce workbench, which can be passed into an AssistantAgent (or another class) that can manage its lifecycle.

So, what I meant is that we can have canvas subclass Workbench and implement the Workbench base class. That way it can be used directly as a whole by agents, rather than exporting individual tools and making the agent non-serializable.
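
A rough sketch of that idea, under stated assumptions: `CanvasWorkbench`, `CanvasWorkbenchConfig`, and `ToyAgent` are hypothetical stand-ins, not the real Canvas, Workbench, or AssistantAgent classes. The agent holds the canvas as one object, two agents share it, and the agent can be dumped to JSON because the canvas reduces to plain configuration data.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class CanvasWorkbenchConfig:
    """Plain data: everything needed to rebuild the canvas."""

    files: Dict[str, str]


class CanvasWorkbench:
    """Hypothetical canvas exposed as one unit with 'read' and 'write' tools."""

    def __init__(self, files: Optional[Dict[str, str]] = None) -> None:
        self._files: Dict[str, str] = dict(files or {})

    def list_tools(self) -> List[str]:
        return ["read", "write"]

    def call_tool(self, name: str, arguments: Dict[str, Any]) -> str:
        if name == "write":
            self._files[arguments["path"]] = arguments["text"]
            return "ok"
        if name == "read":
            return self._files.get(arguments["path"], "")
        raise ValueError(f"unknown tool: {name}")

    def to_config(self) -> CanvasWorkbenchConfig:
        return CanvasWorkbenchConfig(files=dict(self._files))

    @classmethod
    def from_config(cls, config: CanvasWorkbenchConfig) -> "CanvasWorkbench":
        return cls(config.files)


class ToyAgent:
    """Holds the workbench as a whole instead of a list of exported tool callables."""

    def __init__(self, name: str, workbench: CanvasWorkbench) -> None:
        self.name = name
        self.workbench = workbench

    def dump(self) -> str:
        payload = {"name": self.name, "workbench": asdict(self.workbench.to_config())}
        return json.dumps(payload, sort_keys=True)


def demo() -> List[str]:
    canvas = CanvasWorkbench()
    writer = ToyAgent("writer", canvas)
    reviewer = ToyAgent("reviewer", canvas)  # same canvas, shared state

    writer.workbench.call_tool("write", {"path": "draft.md", "text": "v1"})
    seen = reviewer.workbench.call_tool("read", {"path": "draft.md"})

    # Round-trip through JSON and rebuild the canvas from its config.
    loaded = json.loads(reviewer.dump())
    restored = CanvasWorkbench.from_config(CanvasWorkbenchConfig(**loaded["workbench"]))
    return [seen, restored.call_tool("read", {"path": "draft.md"})]


if __name__ == "__main__":
    print(demo())
```

Note that the round trip here rebuilds a separate canvas with the same contents; preserving shared identity across deserialization is a different question.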

@SongChiYoung
Contributor

@ekzhu @lspinheiro

I just saw the discussion about integrating Canvas with Workbench in my inbox. 😊

@lspinheiro, I also agree with @ekzhu’s idea. The concept of Workbench supporting serialization and lifecycle management makes a lot of sense, and as ekzhu mentioned, having multiple agents share the same Canvas also sounds reasonable.

I tested it — Canvas works well with agents, even across multiple calls (with memory). However, it’s not serializable yet.


Since Canvas is still marked as experimental, I think it might make sense to continue using it as a lightweight tool for agent use cases for now.

We could then handle the Workbench integration in a separate Issue, after we see how Workbench will actually work with Agents.

At the moment, it’s still unclear how Workbench is intended to interact with Agents, so @lspinheiro might find it difficult to know exactly how to proceed with integrating Canvas into Workbench.
Depending on how Workbench behaves, the way we integrate Canvas could differ - because Canvas is not just a tool, but also serves as memory.

BTW, I fully support @ekzhu’s direction. If expanding this PR won’t make future maintenance harder (and I understand ekzhu will be managing this), I’m totally fine with continuing the discussion here.

If it’s helpful, I’m happy to stay involved in this discussion — feel free to reach out!

## Why are these changes needed?
- Add return_value_as_string for formatting the result from an MCP tool

## Related issue number
- Opened Issue on #6368 

## Checks
- [x] I've included any doc changes needed for
<https://microsoft.github.io/autogen/>. See
<https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to
build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes
introduced in this PR.
- [x] I've made sure all auto checks have passed.

---------

Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>
Collaborator

@lspinheiro lspinheiro left a comment

Seems to be missing the API reference, but otherwise looks fine. My only major comment would be on tool output types, which felt a bit limited.

@lspinheiro
Collaborator

I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component. Is that what you meant?

I am thinking of the following:

  • Canvas is a shared "whiteboard" memory for multiple agents.
  • You can't currently pass a canvas directly into an AssistantAgent and have the agent manage the lifecycle of the canvas object.
  • The same problem applies to the current MCP tools, which also share state that is not managed by the agent.
  • Introduce workbench, which can be passed into an AssistantAgent (or another class) that can manage its lifecycle.

So, what I meant is that we can have canvas subclass Workbench and implement the Workbench base class. That way it can be used directly as a whole by agents, rather than exporting individual tools and making the agent non-serializable.

I think I got the idea, but currently I'm using the canvas more as a space to share objects between agents, so serializing it with an agent was something I didn't think about. But the canvas memory holds a reference to the canvas; if we make it a component config, I think it would be serialized as part of the agent. Wouldn't that solve the problem? Going through the PR, the workbench concept seemed much closer to tools, while the canvas is closer to memory. The tools in the canvas are there primarily because we don't have a clear abstraction for updating memories. I can have a look to see if we can combine the concepts, but I'm not 100% sure they fit together.

@SongChiYoung
Contributor

SongChiYoung commented Apr 24, 2025

@ekzhu @lspinheiro

I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component. Is that what you meant?

I am thinking of the following:

  • Canvas is a shared "whiteboard" memory for multiple agents.
  • You can't currently pass a canvas directly into an AssistantAgent and have the agent manage the lifecycle of the canvas object.
  • The same problem applies to the current MCP tools, which also share state that is not managed by the agent.
  • Introduce workbench, which can be passed into an AssistantAgent (or another class) that can manage its lifecycle.

So, what I meant is that we can have canvas subclass Workbench and implement the Workbench base class. That way it can be used directly as a whole by agents, rather than exporting individual tools and making the agent non-serializable.

I think I got the idea, but currently I'm using the canvas more as a space to share objects between agents, so serializing it with an agent was something I didn't think about. But the canvas memory holds a reference to the canvas; if we make it a component config, I think it would be serialized as part of the agent. Wouldn't that solve the problem? Going through the PR, the workbench concept seemed much closer to tools, while the canvas is closer to memory. The tools in the canvas are there primarily because we don't have a clear abstraction for updating memories. I can have a look to see if we can combine the concepts, but I'm not 100% sure they fit together.

You’re right - that’s the key issue I raised:

“At the moment, it’s still unclear how Workbench is intended to interact with Agents.”

And @ekzhu mentioned Workbench as a tool for sharing across agents - I think this gives us a direction.
What if each Workbench instance is globally registered via a Singleton-like Manager?

For example:

workbench = McpWorkbench( ...)

It looks simple, but internally it would do something like:

class McpWorkbench(...):
    def __new__(cls, id:str|None=None, ...):
        if id is None:
             id = uuid.....
        workbench = WorkbenchManager.get_workbench(id)
        if workbench is None:
            # build workbench
            WorkbenchManager.set_workbench(id, workbench)
        return workbench
        
    def _to_config(self,):
        return McpWorkbenchConfig(
            id = self._id,
            ....
        )
        
    @classmethod
    def _from_config(cls, config: McpWorkbenchConfig):
        return cls(config.id, ...)

Using this approach, agents could share Workbench instances naturally, and serialization/deserialization would follow a consistent global reference pattern.

This pattern would also simplify @lspinheiro’s effort to integrate Canvas with Workbench,
as it removes the need to manage explicit sharing logic between agents in a serialization-aware context.

The global manager pattern is safely scoped using UUIDs.
Each Workbench instance is uniquely identified and only re-used when explicitly serialized/deserialized.
This ensures no accidental collisions and maintains test isolation, while supporting cross-agent sharing as originally intended.
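
A runnable version of that pseudocode could look like the sketch below. `SharedWorkbench`, `SharedWorkbenchConfig`, and `WorkbenchRegistry` are hypothetical names chosen for illustration, the config carries only the id, and nothing here is part of the merged PR. One detail the pseudocode glosses over: Python calls `__init__` again whenever `__new__` returns an instance of the class, so the sketch keeps `__init__` as a no-op to avoid resetting shared state.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class SharedWorkbenchConfig:
    id: str


class WorkbenchRegistry:
    """Hypothetical process-wide registry keyed by workbench id."""

    _instances: Dict[str, "SharedWorkbench"] = {}

    @classmethod
    def get(cls, id: str) -> Optional["SharedWorkbench"]:
        return cls._instances.get(id)

    @classmethod
    def set(cls, id: str, workbench: "SharedWorkbench") -> None:
        cls._instances[id] = workbench

    @classmethod
    def clear(cls) -> None:
        # Useful for test isolation.
        cls._instances.clear()


class SharedWorkbench:
    def __new__(cls, id: Optional[str] = None) -> "SharedWorkbench":
        if id is not None:
            existing = WorkbenchRegistry.get(id)
            if existing is not None:
                return existing  # reuse the registered instance
        instance = super().__new__(cls)
        instance._id = id or uuid.uuid4().hex
        instance.state = {}  # shared, mutable state lives on the instance
        WorkbenchRegistry.set(instance._id, instance)
        return instance

    def __init__(self, id: Optional[str] = None) -> None:
        # Deliberately empty: __init__ runs on every SharedWorkbench(...) call,
        # including calls that return an existing instance from the registry.
        pass

    def _to_config(self) -> SharedWorkbenchConfig:
        return SharedWorkbenchConfig(id=self._id)

    @classmethod
    def _from_config(cls, config: SharedWorkbenchConfig) -> "SharedWorkbench":
        return cls(config.id)


def demo() -> Tuple[bool, Any, bool]:
    WorkbenchRegistry.clear()
    a = SharedWorkbench()
    a.state["notes"] = "shared"
    b = SharedWorkbench._from_config(a._to_config())  # resolves to the same object
    c = SharedWorkbench()                             # fresh id, fresh instance
    return (a is b, b.state.get("notes"), c is a)


if __name__ == "__main__":
    print(demo())
```

A registry like this only shares instances within one process; restoring a config in a different process would create a new, empty workbench under the same id, so that case would need separate handling.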

@ekzhu
Contributor Author

ekzhu commented Apr 24, 2025

Thanks @lspinheiro for the feedback. We can improve those in the future.

@ekzhu ekzhu merged commit 8fcba01 into main Apr 24, 2025
61 checks passed
@ekzhu ekzhu deleted the ekzhu-workbench branch April 24, 2025 17:37
peterj added a commit to kagent-dev/autogen that referenced this pull request Apr 24, 2025
…e0424

* upstream/main:
  Remove `name` field from OpenAI Assistant Message (microsoft#6388)
  Introduce workbench (microsoft#6340)
  TEST/change gpt4, gpt4o serise to gpt4.1nano (microsoft#6375)
  update website version (microsoft#6364)
  Add self-debugging loop to `CodeExecutionAgent` (microsoft#6306)
  Fix: deserialize model_context in AssistantAgent and SocietyOfMindAgent and CodeExecutorAgent (microsoft#6337)
  Add azure ai agent (microsoft#6191)
  Avoid re-registering a message type already registered (microsoft#6354)
  Added support for exposing GPUs to docker code executor (microsoft#6339)
  fix: ollama fails when tools use optional args (microsoft#6343)
  Add an example using autogen-core and FastAPI to create streaming responses (microsoft#6335)
  FEAT: SelectorGroupChat could using stream inner select_prompt (microsoft#6286)
  Add experimental notice to canvas (microsoft#6349)
  DOC: add extentions - autogen-oaiapi and autogen-contextplus (microsoft#6338)
  fix: ensure serialized messages are passed to LLMStreamStartEvent (microsoft#6344)
  Generalize Continuous SystemMessage merging via model_info[“multiple_system_messages”] instead of `startswith("gemini-")` (microsoft#6345)
  Agentchat canvas (microsoft#6215)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>
5 participants