Introduce workbench #6340

ekzhu · 2025-04-19T21:32:38Z

This PR introduces WorkBench.

A workbench provides a group of tools that share the same resource and state. For example, McpWorkbench provides the underlying tools on the MCP server. A workbench allows tools to be managed together and abstracts away the lifecycle of individual tools under a single entity. This makes it possible to create agents with stateful tools from serializable configuration (component configs), and it also supports dynamic tools, i.e. tools that can change after each execution.

Here is how a workbench may be used with AssistantAgent (not included in this PR):

workbench = McpWorkbench(server_params)
agent = AssistantAgent("assistant", tools=workbench)
result = await agent.run(task="do task...")
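
To make "a group of tools that share the same resource and state" concrete, here is a minimal standalone sketch. It is not the autogen API: `CounterWorkbench` and its method names are hypothetical and only mirror the start/list/call/stop shape discussed in this PR.

```python
import asyncio
from typing import Any, Callable, Dict, List


class CounterWorkbench:
    """Toy workbench (hypothetical): two tools that share one counter."""

    def __init__(self) -> None:
        self._count = 0          # state shared by every tool
        self._started = False
        self._tools: Dict[str, Callable[..., Any]] = {
            "increment": self._increment,
            "read": self._read,
        }

    async def start(self) -> None:
        # A real workbench would acquire its shared resource here.
        self._started = True

    async def stop(self) -> None:
        # A real workbench would release its shared resource here.
        self._started = False

    async def list_tools(self) -> List[str]:
        return sorted(self._tools)

    async def call_tool(self, name: str, **kwargs: Any) -> Any:
        if not self._started:
            raise RuntimeError("workbench is not started")
        return self._tools[name](**kwargs)

    def _increment(self, by: int = 1) -> int:
        self._count += by
        return self._count

    def _read(self) -> int:
        return self._count


async def demo() -> List[Any]:
    wb = CounterWorkbench()
    await wb.start()
    try:
        names = await wb.list_tools()
        await wb.call_tool("increment", by=2)
        value = await wb.call_tool("read")
    finally:
        await wb.stop()
    return [names, value]
```

The caller only manages one lifecycle (`start`/`stop`), while both tools observe the same counter.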

TODOs:

In a subsequent PR, update AssistantAgent to use workbench as an alternative in the tools parameter. Use StaticWorkbench to manage individual tools.
In another PR, add documentation on workbench.

codecov · 2025-04-19T21:40:29Z

Codecov Report

Attention: Patch coverage is 80.66465% with 64 lines in your changes missing coverage. Please review.

Project coverage is 78.37%. Comparing base (a283d26) to head (a753bc2).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
...utogen-ext/src/autogen_ext/tools/mcp/_workbench.py	72.27%	28 Missing ⚠️
...es/autogen-ext/src/autogen_ext/tools/mcp/_actor.py	75.23%	26 Missing ⚠️
.../autogen-core/src/autogen_core/tools/_workbench.py	87.03%	7 Missing ⚠️
...n-core/src/autogen_core/tools/_static_workbench.py	94.91%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6340      +/-   ##
==========================================
+ Coverage   78.32%   78.37%   +0.05%     
==========================================
  Files         217      221       +4     
  Lines       15684    16010     +326     
==========================================
+ Hits        12284    12548     +264     
- Misses       3400     3462      +62

Flag	Coverage Δ
unittests	`78.37% <80.66%> (+0.05%)`	⬆️
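
The figures in the report can be cross-checked with a few lines of arithmetic. This is only a sanity check: the patch size of 331 lines is an assumption inferred from "64 lines missing" at 80.66465%, and the report appears to truncate percentages rather than round them.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


def truncate(value: float, digits: int) -> float:
    """Drop everything past `digits` decimals without rounding up."""
    factor = 10 ** digits
    return math.floor(value * factor) / factor


base = truncate(coverage_percent(12284, 15684), 2)  # main
head = truncate(coverage_percent(12548, 16010), 2)  # this PR
missing = 28 + 26 + 7 + 3                            # per-file missing lines
patch_lines = 331  # assumption: inferred, not stated in the report
patch = truncate(coverage_percent(patch_lines - missing, patch_lines), 5)
```

Under these assumptions `base`, `head`, `missing` and `patch` come out as 78.32, 78.37, 64 and 80.66465, matching the report.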

SongChiYoung · 2025-04-20T00:12:30Z

🎉 I really appreciate the direction of this PR — enabling Workbench for AssistantAgent is something I’ve personally been looking forward to. It’s great to see it taking shape here.

Just wanted to share a small concern: in a quick dummy test where I injected McpWorkbench into AssistantAgent and called only start() and close(), I hit this error:

RuntimeError: Attempted to exit cancel scope in a different task than it was entered in

Not sure if my test was solid — it’s likely too naive.
Still, I’ve seen this happen before with nested async lifecycles, so just flagging it in case it resurfaces.

(Of course, this may totally depend on how Workbench is managed internally. Ha ha 😄)

SongChiYoung · 2025-04-20T00:31:33Z

Here’s a minimal repro (no AssistantAgent involved):

def test24():
    import asyncio
    from autogen_ext.tools.mcp import StdioServerParams, McpWorkBench

    async def main() -> None:
        server_params = StdioServerParams(
            command="npx",
            args=["@playwright/mcp@latest", "--headless", "--executable-path", "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"],
        )
        server_params.read_timeout_seconds = 120
        # tools = await mcp_server_tools(server_params)
        workbench = McpWorkBench(server_params)

        await workbench.start()
        print(await workbench.list_tools())
        await workbench.stop()

    asyncio.run(main())


if __name__ == "__main__":
    test24()

This hits the same RuntimeError: Attempted to exit cancel scope in a different task than it was entered in, even without involving any agent or nested structure.

If this is just a matter of the lifecycle not being fully wired up yet, feel free to disregard — I might’ve been a bit too eager since I’ve been really looking forward to this one 😄
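
For readers unfamiliar with this class of error: it generally appears when a task-affine scope is entered in one task and exited in another. The sketch below is a stdlib-only toy that mimics that check; `TaskBoundScope` is hypothetical and is not anyio's implementation, nor a claim about where exactly McpWorkbench switches tasks.

```python
import asyncio


class TaskBoundScope:
    """Toy stand-in for a scope that must be exited by the task that entered it."""

    async def __aenter__(self) -> "TaskBoundScope":
        self._owner = asyncio.current_task()
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        if asyncio.current_task() is not self._owner:
            raise RuntimeError("exited in a different task than it was entered in")


async def enter_and_exit_in_different_tasks() -> str:
    scope = TaskBoundScope()
    # Enter the scope inside one task ...
    await asyncio.create_task(scope.__aenter__())
    # ... and exit it inside another, which trips the ownership check.
    try:
        await asyncio.create_task(scope.__aexit__(None, None, None))
    except RuntimeError as err:
        return str(err)
    return "no error"
```

If `start()` and `stop()` each end up running their half of such a scope in different tasks, the same kind of failure can surface even in a flat script like the repro above.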

ekzhu · 2025-04-21T18:07:22Z

Thanks @SongChiYoung. The implementation of start and close was AI-generated and I have yet to find time to properly address it. Would you like to help out here?

SongChiYoung · 2025-04-21T22:07:38Z

@ekzhu
That’s actually the reason I created an explicit Actor in PR #6284 😄
At the time, I needed to ensure that each session was strictly operated within a single task context, and to do that, I designed an actor loop that pulls calls from a queue and executes them in its own dedicated task.

Even though it was a 1-on-1 execution model, I deliberately used a queue to avoid message loss or race conditions during async transitions.

If helpful, you’re welcome to reuse that actor pattern here — or I’d be happy to adapt it into this PR’s context.

That said, I don’t believe I currently have access to push commits to this PR.
Let me know how I can best help!
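
The actor idea described above can be sketched with plain asyncio. This is not the code from #6284 or #6360; `SessionActor` and its methods are hypothetical names, and the session handling is reduced to a comment.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class SessionActor:
    """Runs every submitted call inside one dedicated task (illustrative only)."""

    def __init__(self) -> None:
        self._queue: Optional[asyncio.Queue] = None
        self._task: Optional[asyncio.Task] = None

    async def start(self) -> None:
        self._queue = asyncio.Queue()
        self._task = asyncio.create_task(self._run())

    async def call(self, fn: Callable[[], Awaitable[Any]]) -> Any:
        assert self._queue is not None, "actor is not started"
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((fn, future))
        return await future  # resolved by the actor task

    async def stop(self) -> None:
        assert self._queue is not None and self._task is not None
        await self._queue.put(None)  # sentinel: ask the loop to exit
        await self._task

    async def _run(self) -> None:
        assert self._queue is not None
        # A real actor would open the session here and close it before
        # returning, so setup and teardown happen in this same task.
        while True:
            item = await self._queue.get()
            if item is None:
                break
            fn, future = item
            try:
                future.set_result(await fn())
            except Exception as err:  # hand failures back to the caller
                future.set_exception(err)


async def demo() -> bool:
    actor = SessionActor()
    await actor.start()

    async def which_task() -> Any:
        return asyncio.current_task()

    first = await actor.call(which_task)
    second = await actor.call(which_task)
    await actor.stop()
    # Both calls ran in the actor's task, not in the caller's task.
    return first is second and first is not asyncio.current_task()
```

The queue serializes calls and the future carries each result or exception back, so callers in any task never touch the session directly.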

ekzhu · 2025-04-21T23:49:28Z

Can you create a separate PR from your forked repo targeting this branch ekzhu-workbench? We can use the actor as the implementation for McpWorkbench.

SongChiYoung · 2025-04-22T00:52:23Z

> Can you create a separate PR from your forked repo targeting this branch ekzhu-workbench? We can use the actor as the implementation for McpWorkbench.

Cool I will do that soon.

SongChiYoung · 2025-04-22T08:35:15Z

@ekzhu
Take a look
#6360

ekzhu · 2025-04-23T07:26:15Z

@lspinheiro could you take a look and see if this can be used as a base abstraction for canvas memory?

lspinheiro · 2025-04-23T07:53:53Z

> @lspinheiro could you take a look and see if this can be used as a base abstraction for canvas memory?

I'm trying to better understand the proposal. The canvas is designed as a simple form of version control over data generated by the LLM. Right now, that is primarily text, either code or documents. I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component, is that what you meant?

ekzhu · 2025-04-23T08:00:25Z

> I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component, is that what you meant?

I am thinking of the following:

- Canvas is a shared "whiteboard" memory for multiple agents
- You can't currently pass a canvas directly into an AssistantAgent and have the agent manage the lifecycle of the canvas object.
- The same problem exists for the current MCP tools, which also share a state that is not managed by the agent.
- Introduce workbench, which can be passed into an AssistantAgent (or another class) that can manage its lifecycle

So, what I meant is that we can have canvas subclass Workbench and implement the Workbench base class. So, it can be used directly as a whole by agents, rather than exporting individual tools, which makes the agent not serializable.

SongChiYoung · 2025-04-23T10:20:10Z

@ekzhu @lspinheiro

I just saw the discussion about integrating Canvas with Workbench in my inbox. 😊

@lspinheiro, I also agree with @ekzhu’s idea. The concept of Workbench supporting serialization and lifecycle management makes a lot of sense, and as ekzhu mentioned, having multiple agents share the same Canvas also sounds reasonable.

I tested it — Canvas works well with agents, even across multiple calls (with memory). However, it’s not serializable yet.

Since Canvas is still marked as experimental, I think it might make sense to continue using it as a lightweight tool for agent use cases for now.

We could then handle the Workbench integration in a separate Issue, after we see how Workbench will actually work with Agents.

At the moment, it’s still unclear how Workbench is intended to interact with Agents, so @lspinheiro might find it difficult to know exactly how to proceed with integrating Canvas into Workbench.
Depending on how Workbench behaves, the way we integrate Canvas could differ - because Canvas is not just a tool, but also serves as memory.

BTW, I fully support @ekzhu’s direction. If expanding this PR won’t make future maintenance harder (and I understand ekzhu will be managing this), I’m totally fine with continuing the discussion here.

If it’s helpful, I’m happy to stay involved in this discussion — feel free to reach out!

## Why are these changes needed?

- Add return_value_as_string for formatting result from MCP tool

## Related issue number

- Opened Issue on #6368

## Checks

- [x] I've included any doc changes needed for <https://microsoft.github.io/autogen/>. See <https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md> to build and test documentation locally.
- [x] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [x] I've made sure all auto checks have passed.

---------

Co-authored-by: Eric Zhu <ekzhu@users.noreply.github.com>

python/packages/autogen-core/src/autogen_core/tools/_workbench.py

lspinheiro

Seems to be missing the API reference but otherwise looks fine. My only major comment would be on tool output types, which felt a bit limited.

lspinheiro · 2025-04-24T10:17:56Z

> > I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component, is that what you meant?
>
> I am thinking of the following:
>
> - Canvas is a shared "whiteboard" memory for multiple agents
> - You can't currently pass a canvas directly into an AssistantAgent and have the agent manage the lifecycle of the canvas object.
> - The same problem exists for the current MCP tools, which also share a state that is not managed by the agent.
> - Introduce workbench, which can be passed into an AssistantAgent (or another class) that can manage its lifecycle
>
> So, what I meant is that we can have canvas subclass Workbench and implement the Workbench base class. So, it can be used directly as a whole by agents, rather than exporting individual tools, which makes the agent not serializable.

I think I got the idea, but currently I'm using the canvas more as a space to share objects between agents, so it being serialized with an agent was something that I didn't think about. But the canvas memory holds a reference to the canvas; if we make it a component config I think it would be serialised as part of the agent. Wouldn't that solve the problem? Going through the PR, the workbench concept seemed much closer to tools while the canvas is closer to memory. The tools in the canvas are there primarily because we don't have a clear abstraction for updating memories. I can have a look to see if we can combine the concepts but I'm not 100% sure they fit together.

SongChiYoung · 2025-04-24T12:25:58Z

@ekzhu @lspinheiro

> > > I think we can easily make the workbench read and write from the canvas if we want tools to change the state of existing data through patching, but I'm not sure it would fit in as a base abstraction and replace an existing component, is that what you meant?
> >
> > I am thinking of the following:
> >
> > - Canvas is a shared "whiteboard" memory for multiple agents
> > - You can't currently pass a canvas directly into an AssistantAgent and have the agent manage the lifecycle of the canvas object.
> > - The same problem exists for the current MCP tools, which also share a state that is not managed by the agent.
> > - Introduce workbench, which can be passed into an AssistantAgent (or another class) that can manage its lifecycle
> >
> > So, what I meant is that we can have canvas subclass Workbench and implement the Workbench base class. So, it can be used directly as a whole by agents, rather than exporting individual tools, which makes the agent not serializable.
>
> I think I got the idea, but currently I'm using the canvas more as a space to share objects between agents, so it being serialized with an agent was something that I didn't think about. But the canvas memory holds a reference to the canvas; if we make it a component config I think it would be serialised as part of the agent. Wouldn't that solve the problem? Going through the PR, the workbench concept seemed much closer to tools while the canvas is closer to memory. The tools in the canvas are there primarily because we don't have a clear abstraction for updating memories. I can have a look to see if we can combine the concepts but I'm not 100% sure they fit together.

You’re right - that’s the key issue I raised:

“At the moment, it’s still unclear how Workbench is intended to interact with Agents.”

And @ekzhu mentioned Workbench as a tool for sharing across agents - I think this gives us a direction.
What if each Workbench instance is globally registered via a Singleton-like Manager?

For example:

workbench = McpWorkbench(...)

It looks simple, but internally it would do something like:

class McpWorkbench(...):
    def __new__(cls, id: str | None = None, ...):
        if id is None:
            id = uuid.....
        # Reuse the instance registered under this id, if there is one.
        workbench = WorkbenchManager.get_workbench(id)
        if workbench is None:
            # build the workbench, then register it under its id
            workbench = super().__new__(cls)
            WorkbenchManager.set_workbench(id, workbench)
        return workbench

    def _to_config(self):
        return McpWorkbenchConfig(
            id=self._id,
            ....
        )

    @classmethod
    def _from_config(cls, config: McpWorkbenchConfig):
        # Return the (possibly shared) instance for this id.
        return cls(config.id, ...)

Using this approach, agents could share Workbench instances naturally, and serialization/deserialization would follow a consistent global reference pattern.

This pattern would also simplify @lspinheiro’s effort to integrate Canvas with Workbench,
as it removes the need to manage explicit sharing logic between agents in a serialization-aware context.

The global manager pattern is safely scoped using UUIDs.
Each Workbench instance is uniquely identified and only re-used when explicitly serialized/deserialized.
This ensures no accidental collisions and maintains test isolation, while supporting cross-agent sharing as originally intended.
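
A runnable version of the registry idea, under stated assumptions: `SharedWorkbench` and `SharedWorkbenchConfig` are hypothetical stand-ins, the registry is a plain class-level dict rather than a separate manager, and none of this is autogen API.

```python
import uuid
from dataclasses import dataclass
from typing import ClassVar, Dict, List, Optional


@dataclass
class SharedWorkbenchConfig:
    id: str


class SharedWorkbench:
    """One instance per id; repeated construction with the same id reuses it."""

    _registry: ClassVar[Dict[str, "SharedWorkbench"]] = {}

    def __new__(cls, id: Optional[str] = None) -> "SharedWorkbench":
        key = id or str(uuid.uuid4())
        existing = cls._registry.get(key)
        if existing is not None:
            return existing
        instance = super().__new__(cls)
        # State is set here, not in __init__, because __init__ would run
        # again on every lookup and reset a shared instance.
        instance._id = key
        instance.notes: List[str] = []
        cls._registry[key] = instance
        return instance

    def to_config(self) -> SharedWorkbenchConfig:
        return SharedWorkbenchConfig(id=self._id)

    @classmethod
    def from_config(cls, config: SharedWorkbenchConfig) -> "SharedWorkbench":
        return cls(config.id)


original = SharedWorkbench()
restored = SharedWorkbench.from_config(original.to_config())
original.notes.append("written through the original handle")
```

After the round trip `restored` is the same object as `original`, so state written through one handle is visible through the other. One trade-off to keep in mind: a class-level registry holds strong references, so instances live until they are explicitly removed.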

ekzhu · 2025-04-24T17:30:42Z

Thanks @lspinheiro for the feedback. We can improve those in the future.

…e0424

* upstream/main:
  Remove `name` field from OpenAI Assistant Message (microsoft#6388)
  Introduce workbench (microsoft#6340)
  TEST/change gpt4, gpt4o serise to gpt4.1nano (microsoft#6375)
  update website version (microsoft#6364)
  Add self-debugging loop to `CodeExecutionAgent` (microsoft#6306)
  Fix: deserialize model_context in AssistantAgent and SocietyOfMindAgent and CodeExecutorAgent (microsoft#6337)
  Add azure ai agent (microsoft#6191)
  Avoid re-registering a message type already registered (microsoft#6354)
  Added support for exposing GPUs to docker code executor (microsoft#6339)
  fix: ollama fails when tools use optional args (microsoft#6343)
  Add an example using autogen-core and FastAPI to create streaming responses (microsoft#6335)
  FEAT: SelectorGroupChat could using stream inner select_prompt (microsoft#6286)
  Add experimental notice to canvas (microsoft#6349)
  DOC: add extentions - autogen-oaiapi and autogen-contextplus (microsoft#6338)
  fix: ensure serialized messages are passed to LLMStreamStartEvent (microsoft#6344)
  Generalize Continuous SystemMessage merging via model_info[“multiple_system_messages”] instead of `startswith("gemini-")` (microsoft#6345)
  Agentchat canvas (microsoft#6215)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>

Introduce workbench

ffc3d66

Add static workbench and mcp workbench

e3f219f

ekzhu marked this pull request as ready for review April 19, 2025 23:10

SongChiYoung mentioned this pull request Apr 22, 2025

FEAT: MCP Actor for WorkBench for #6340 #6360

Merged

3 tasks

SongChiYoung and others added 3 commits April 22, 2025 12:35

FEAT: MCP Actor for WorkBench for #6340 (#6360)

8f5c8a7

Merge branch 'main' into ekzhu-workbench

4077c3d

rename

eb30233

ekzhu requested review from husseinmozannar, jackgerrits and victordibia April 22, 2025 21:30

ekzhu added 4 commits April 22, 2025 23:34

Fix a bug in FunctionTool, add unit test for StaticWorkbench

62e3a87

Update tests for mcp workbench

a1f11b5

update

177f336

Fix type

dd00545

ekzhu requested a review from lspinheiro April 23, 2025 07:15

Fix doc

f1fa042

fix type

5b1a457

ekzhu added 2 commits April 23, 2025 20:50

Merge branch 'main' into ekzhu-workbench

d809394

Revert "Update: override return_value_as_string for McpToolAdapter" (#6378)

a753bc2

Reverts #6371

lspinheiro reviewed Apr 24, 2025

lspinheiro approved these changes Apr 24, 2025

ekzhu merged commit 8fcba01 into main Apr 24, 2025
61 checks passed

ekzhu deleted the ekzhu-workbench branch April 24, 2025 17:37
