@strawgate
Contributor

@strawgate strawgate commented Jul 29, 2025

This PR adds a decorator that can be used to dynamically add toolsets to an Agent run, similar to how the system prompt and instructions can be added via decorated functions.
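
To make the idea concrete, here is a minimal stdlib-only sketch of a decorator that registers toolset-building functions and calls them once the run's deps are known. It is not the implementation in this PR: `MiniAgent`, `tools_for_run`, the dict-based `Toolset` alias, and the `RunContext` stand-in are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class RunContext:
    """Hypothetical stand-in for the real run context; it only carries deps."""
    deps: Any


# In this sketch a "toolset" is just a mapping of tool name -> callable.
Toolset = dict[str, Callable[..., Any]]
ToolsetFunc = Callable[[RunContext], Optional[Toolset]]


@dataclass
class MiniAgent:
    """Toy agent that collects toolset-building functions via a decorator."""
    _toolset_funcs: list[ToolsetFunc] = field(default_factory=list)

    def toolset(self, func: ToolsetFunc) -> ToolsetFunc:
        # Only register the function; it is called later, once deps are known.
        self._toolset_funcs.append(func)
        return func

    def tools_for_run(self, deps: Any) -> Toolset:
        ctx = RunContext(deps=deps)
        tools: Toolset = {}
        for func in self._toolset_funcs:
            built = func(ctx)
            if built is not None:  # a function may opt out for this run
                tools.update(built)
        return tools


agent = MiniAgent()


@agent.toolset
def filesystem_tools(ctx: RunContext) -> Optional[Toolset]:
    # Build tools scoped to the root directory found in this run's deps.
    root = ctx.deps.get("root")
    if root is None:
        return None
    return {"resolve": lambda name: f"{root}/{name}"}


alice_tools = agent.tools_for_run({"root": "/home/alice"})
no_tools = agent.tools_for_run({})
```

The point of the pattern is that the toolset is derived from per-run data instead of being fixed when the agent is constructed.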

@strawgate strawgate changed the title from "[Draft] Example of Dynamic toolset" to "Introduce a Dynamic Toolset" Jul 30, 2025
@DouweM
Collaborator

DouweM commented Jul 30, 2025

@strawgate Thanks! My main concern is that a function taking ctx suggests a new toolset will be built whenever the context changes (on each run step), or at least whenever a new run starts (when ctx.deps would likely change). That would let you safely use run- or user-specific details from ctx.deps, like in your filesystem root directory example, without accidentally using a toolset built for another user or run.

But in reality it is tied to toolset enters/exits and there's no such guarantee... Those could align with agent runs, but wouldn't if you have your own async with agent: wrapping multiple agent runs (which you may want to do to keep MCP servers running during an entire request that could include multiple agent runs) because of the entered_count stuff we do in Agent.__aenter__ and CombinedToolset.__aenter__. Maybe we should just rip that out, so that we know for sure we enter/exit each toolset whenever a new agent run starts? Maybe we should also pass only deps instead of RunContext to make it clear you're not actually getting run-step specific context?

I want to make sure we don't accidentally introduce a footgun where the behavior will change completely when you add an async with agent: around a request or the entire FastAPI lifecycle or something, thinking you're just optimizing e.g. MCP server start/shutdowns.
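
A minimal sketch of the footgun described above, assuming a toolset that is built lazily and only torn down when its enter count drops back to zero. `CountedToolset`, `get`, and `run` are hypothetical names, and this is not the actual `Agent.__aenter__` or `CombinedToolset.__aenter__` logic, only a toy model of reference-counted enters and exits.

```python
import asyncio
from typing import Any, Callable, Optional


class CountedToolset:
    """Toy toolset that is torn down only on the outermost exit."""

    def __init__(self, build: Callable[[str], str]) -> None:
        self._build = build
        self._entered_count = 0
        self._built: Optional[str] = None

    async def __aenter__(self) -> "CountedToolset":
        self._entered_count += 1
        return self

    async def __aexit__(self, *exc_info: Any) -> None:
        self._entered_count -= 1
        if self._entered_count == 0:
            self._built = None  # forget the built toolset on outermost exit

    def get(self, deps: str) -> str:
        # Build from deps only if nothing is cached from an earlier enter.
        if self._built is None:
            self._built = self._build(deps)
        return self._built


async def run(toolset: CountedToolset, deps: str) -> str:
    # Each "agent run" enters and exits the toolset.
    async with toolset:
        return toolset.get(deps)


async def demo() -> tuple[list[str], list[str]]:
    toolset = CountedToolset(lambda deps: f"tools for {deps}")
    # Without an outer wrapper, every run gets its own toolset.
    separate = [await run(toolset, "alice"), await run(toolset, "bob")]
    # With an outer wrapper, the count never reaches zero between runs.
    async with toolset:
        wrapped = [await run(toolset, "alice"), await run(toolset, "bob")]
    return separate, wrapped


separate, wrapped = asyncio.run(demo())
```

In this model the wrapped case hands the second run the toolset that was built for the first run's deps, which is the behavior change the outer `async with` would silently introduce.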

Maybe we need to add a new hook to toolsets that's only called when a new run/step is actually started, so we don't rely on the enters/exits which could happen at any level?

This is all a bit complicated and my analysis here may very well be wrong. Either way I think we'll need tests with different patterns to make sure we get the desired behavior in each case.

We should probably chat about this directly, would you mind joining our public Slack so we can huddle?

@strawgate
Contributor Author

strawgate commented Jul 31, 2025

@DouweM I added the toolset decorator and added ToolsetFunc to the Agent init. The tests will be passing shortly, and I'll play around with this in my PoC environment.

@strawgate strawgate force-pushed the dynamic-toolset branch 2 times, most recently from 71f3c22 to 26b8364 Compare July 31, 2025 14:51
@strawgate
Contributor Author

@DouweM The tests are passing and I've been using this on my fork for a little while with good success. Interested in your thoughts.

@strawgate strawgate changed the title from "Introduce a Dynamic Toolset" to "Introduce a Dynamic Toolsets" Aug 1, 2025
@strawgate strawgate changed the title from "Introduce a Dynamic Toolsets" to "Introduce Dynamic Toolsets" Aug 1, 2025
@DouweM
Collaborator

DouweM commented Aug 1, 2025

@strawgate Thank you, I like the decorator approach a lot. As I mentioned on our call, I want the default behavior to be that it gets rebuilt for every run step, as the presence of the RunContext suggests, so I implemented that and pushed into the PR.

We now have a new per_run_step argument on the decorator that you'll want to set to False, plus a DynamicToolset that tracks the toolset and handles enters/exits, so we're also sort of back to where this PR started :) Let me know what you think!
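
As a rough illustration of what the per_run_step flag controls, here is a stdlib-only sketch. It is not the DynamicToolset added in this PR: `DynamicToolsetSketch`, `StepContext`, `for_step`, `end_run`, and `simulate_run` are hypothetical names, and real enter/exit handling is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class StepContext:
    """Hypothetical stand-in for RunContext: deps plus the current step."""
    deps: Any
    run_step: int


class DynamicToolsetSketch:
    """Tracks the toolset returned by a function and decides when to rebuild."""

    def __init__(
        self, func: Callable[[StepContext], Any], *, per_run_step: bool = True
    ) -> None:
        self._func = func
        self._per_run_step = per_run_step
        self._has_toolset = False
        self._toolset: Any = None
        self.build_count = 0

    def for_step(self, ctx: StepContext) -> Any:
        # Rebuild on every step by default, otherwise only once per run.
        if self._per_run_step or not self._has_toolset:
            self._toolset = self._func(ctx)
            self._has_toolset = True
            self.build_count += 1
        return self._toolset

    def end_run(self) -> None:
        # Drop the tracked toolset so the next run starts fresh.
        self._toolset = None
        self._has_toolset = False


def simulate_run(toolset: DynamicToolsetSketch, deps: Any, steps: int = 3) -> list:
    seen = [
        toolset.for_step(StepContext(deps=deps, run_step=step))
        for step in range(1, steps + 1)
    ]
    toolset.end_run()
    return seen


def build(ctx: StepContext) -> str:
    return f"{ctx.deps} tools @ step {ctx.run_step}"


every_step = DynamicToolsetSketch(build)
once_per_run = DynamicToolsetSketch(build, per_run_step=False)
per_step_seen = simulate_run(every_step, "alice")
per_run_seen = simulate_run(once_per_run, "alice")
```

With the default, the function sees each step's context; with per_run_step=False it is called once and the result is reused until the run ends.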

The only things missing now are a test to make sure the enter/exit behavior is correct, and docs.

@strawgate strawgate force-pushed the dynamic-toolset branch 3 times, most recently from 2822fae to aefbce3 Compare August 2, 2025 14:30
Collaborator

@DouweM DouweM left a comment

@strawgate Thanks William, I take it the new implementation has been working well for you?

@strawgate
Contributor Author

> @strawgate Thanks William, I take it the new implementation has been working well for you?

Yes! I am using it here and here

I was mostly testing with per_run_step=True but just switched to per_run_step=False and am seeing good results.

@strawgate
Contributor Author

Looks like an unrelated test is failing.

@DouweM DouweM changed the title from "Introduce Dynamic Toolsets based on run context" to "Let toolsets be built dynamically based on run context" Aug 6, 2025
@DouweM
Collaborator

DouweM commented Aug 6, 2025

@strawgate I made some tweaks to the docs, having this replace the WrapperToolset workaround for dynamic toolsets that encouraged some bad patterns. If this all looks good to you I think we can merge soon!

@strawgate
Contributor Author

Looks great

@DouweM DouweM enabled auto-merge (squash) August 8, 2025 14:19
@DouweM DouweM merged commit 13ea417 into pydantic:main Aug 8, 2025
17 checks passed
@HamzaFarhan
Contributor

HamzaFarhan commented Aug 9, 2025

@DouweM I feel like after this it makes even more sense for output_type to also accept toolsets.
Right now, if we have 10 output_types and 5 toolsets with, let's say, 5-6 tools per toolset, we can add a toolset dynamically. In other words, we can add 5-6 tools with a single check. But we would need to add the output_types using if/else in a single prepare function.
If they were toolsets, the same trigger could be used to load 5-6 tools and 2-3 corresponding output_types.

@DouweM
Collaborator

DouweM commented Aug 12, 2025

@HamzaFarhan For proper type checking of the agent.result.output, the possible output types need to be statically known at agent definition time, which wouldn't work with toolsets. Type safety is pretty core to Pydantic's (and Pydantic AI's) philosophy, so that makes this a non-starter unfortunately.
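
A small sketch of why the output type has to be known where the agent is defined, assuming a generic agent class. `TypedAgent`, `Invoice`, and `run` are hypothetical names for illustration and not the library's API.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

OutputT = TypeVar("OutputT")


@dataclass
class Invoice:
    total: float


class TypedAgent(Generic[OutputT]):
    """Hypothetical agent whose output type is fixed when it is defined."""

    def __init__(self, output_type: type[OutputT]) -> None:
        self.output_type = output_type

    def run(self, raw: dict[str, float]) -> OutputT:
        # Validate/construct the output using the statically declared type.
        return self.output_type(**raw)


# A type checker infers TypedAgent[Invoice] here, so result is an Invoice
# and attribute access such as result.total can be checked statically.
invoice_agent = TypedAgent(Invoice)
result = invoice_agent.run({"total": 12.5})
```

If the output type were instead chosen by a toolset function at run time, a checker would have nothing at the definition site to infer `OutputT` from.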

ethanabrooks added a commit to reflectionai/pydantic-ai that referenced this pull request Aug 20, 2025
* Add `priority` `service_tier` to `OpenAIModelSettings` and respect it in `OpenAIResponsesModel` (pydantic#2368)

* Add an example of using RunContext to pass data among tools (pydantic#2316)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Rename gemini-2.5-flash-lite-preview-06-17 to gemini-2.5-flash-lite as it's out of preview (pydantic#2387)

* Fix toggleable toolset example so toolset state is not shared across agent runs (pydantic#2396)

* Support custom thinking tags specified on the model profile (pydantic#2364)

Co-authored-by: jescudero <jescudero@itos.es>
Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Add convenience functions to handle AG-UI requests with request-specific deps (pydantic#2397)

* docs: add missing optional packages in `install.md` (pydantic#2412)

* Include default values in tool arguments JSON schema (pydantic#2418)

* Fix "test_download_item_no_content_type test fails on macOS" (pydantic#2404)

* Allow string format, pattern and others in OpenAI strict JSON mode (pydantic#2420)

* Let more `BaseModel`s use OpenAI strict JSON mode by defaulting to `additionalProperties=False` (pydantic#2419)

* BREAKING CHANGE: Change type of 'source' field on EvaluationResult (pydantic#2388)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Fix ImageUrl, VideoUrl, AudioUrl and DocumentUrl not being serializable (pydantic#2422)

* BREAKING CHANGE: Support printing reasons in the console output for pydantic-evals (pydantic#2163)

* Document performance implications of async vs sync tools (pydantic#2298)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Mention that tools become toolset internally (pydantic#2395)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Fix tests for Logfire>=3.22.0 (pydantic#2346)

* tests: speed up the test suite (pydantic#2414)

* google: add more information about schema on union (pydantic#2426)

* typo in output docs (pydantic#2427)

* Deprecate `GeminiModel` in favor of `GoogleModel` (pydantic#2416)

* Use `httpx` on `GoogleProvider` (pydantic#2438)

* Remove older deprecated models and add new model of Anthropic (pydantic#2435)

* Remove `next()` method from `Graph` (pydantic#2440)

* BREAKING CHANGE: Remove `data` from `FinalResult` (pydantic#2443)

* BREAKING CHANGE: Remove `get_data` and `validate_structured_result` from `StreamedRunResult` (pydantic#2445)

* docs: add `griffe_warnings_deprecated` (pydantic#2444)

* BREAKING CHANGE: Remove `format_as_xml` module (pydantic#2446)

* BREAKING CHANGE: Remove `result_type` parameter and similar from `Agent` (pydantic#2441)

* Deprecate `GoogleGLAProvider` and `GoogleVertexProvider` (pydantic#2450)

* BREAKING CHANGE: drop 4 months old deprecation warnings (pydantic#2451)

* Automatically use OpenAI strict mode for strict-compatible native output types (pydantic#2447)

* Make `InlineDefsJsonSchemaTransformer` public (pydantic#2455)

* Send `ThinkingPart`s back to Anthropic used through Bedrock (pydantic#2454)

* Bump boto3 to support `AWS_BEARER_TOKEN_BEDROCK` API key env var (pydantic#2456)

* Add new Heroku models (pydantic#2459)

* Add `builtin_tools` to `Agent` (pydantic#2102)

Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Bump mcp-run-python (pydantic#2470)

* Remove fail_under from top-level coverage config so <100% html-coverage step doesn't end CI run (pydantic#2475)

* Add AbstractAgent, WrapperAgent, Agent.event_stream_handler, Toolset.id, Agent.override(tools=...) in preparation for Temporal (pydantic#2458)

* Let toolsets be built dynamically based on run context (pydantic#2366)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Add ToolsetFunc to API docs (fix CI) (pydantic#2486)

* tests: change time of evals example (pydantic#2501)

* ci: remove html and xml reports (pydantic#2491)

* fix: Add gpt-5 models to reasoning model detection for temperature parameter handling (pydantic#2483)

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Douwe Maan <DouweM@users.noreply.github.com>
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>

* History processor replaces message history (pydantic#2324)

Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>

* ci: split test suite (pydantic#2436)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* ci: use the right install command (pydantic#2506)

* Update config.yaml (pydantic#2514)

* Skip testing flaky evals example (pydantic#2518)

* Fix error when parsing usage details for video without audio track in Google models (pydantic#2507)

* Make OpenAIResponsesModelSettings.openai_builtin_tools work again (pydantic#2520)

* Let Agent be run in a Temporal workflow by moving model requests, tool calls, and MCP to Temporal activities (pydantic#2225)

* Install only dev in CI (pydantic#2523)

* Improve CLAUDE.md (pydantic#2524)

* Add best practices regarding to coverage to CLAUDE.md (pydantic#2527)

* Add support for `"openai-responses"` model inference string (pydantic#2528)

Co-authored-by: Claude <noreply@anthropic.com>

* docs: Confident AI (pydantic#2529)

* chore: mention what to do with the documentation when deprecating a class (pydantic#2530)

* chore: drop hyperlint (pydantic#2531)

* ci: improve matrix readability (pydantic#2532)

* Add pip to dev deps for PyCharm (pydantic#2533)

Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>

* Add genai-prices to dev deps and a basic test (pydantic#2537)

* Add `--durations=100` to all pytest calls in CI (pydantic#2534)

* Cleanup snapshot in test_evaluate_async_logfire (pydantic#2538)

* Make some minor tweaks to the temporal docs (pydantic#2522)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Add new OpenAI GPT-5 models (pydantic#2503)

* Fix `FallbackModel` to respect each model's model settings (pydantic#2540)

* Add support for OpenAI verbosity parameter in Responses API (pydantic#2493)

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* Add `UsageLimits.count_tokens_before_request` using Gemini `count_tokens` API (pydantic#2137)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* chore: Fix uv.lock (pydantic#2546)

* Stop calling MCP server `get_tools` ahead of `agent run` span (pydantic#2545)

* Disable instrumentation by default in tests (pydantic#2535)

Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>

* Only wrap necessary parts of type aliases in forward annotations (pydantic#2548)

* Remove anthropic-beta default header set in `AnthropicModel` (pydantic#2544)

Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>

* docs: Clarify why AG-UI example links are on localhost (pydantic#2549)

* chore: Fix path to agent class in CLAUDE.md (pydantic#2550)

* Ignore leading whitespace when streaming from Qwen or DeepSeek (pydantic#2554)

* Ask model to try again if it produced a response without text or tool calls, only thinking (pydantic#2556)

Co-authored-by: Douwe Maan <douwe@pydantic.dev>

* chore: Improve Temporal test to check trace as tree instead of list (pydantic#2559)

* Fix: Forward max_uses parameter to Anthropic WebSearchTool (pydantic#2561)

* Let message history end on ModelResponse and execute pending tool calls (pydantic#2562)

* Fix type issues

* skip tests requiring API keys

* add `google-genai` dependency

* add other provider deps

* add pragma: no cover for untested logic

---------

Co-authored-by: akenar <52220260+akenarsari@users.noreply.github.com>
Co-authored-by: Tony Woland <16152581+tonyxwz@users.noreply.github.com>
Co-authored-by: Douwe Maan <douwe@pydantic.dev>
Co-authored-by: Yi-Chen Lin <103916325+ethan01x@users.noreply.github.com>
Co-authored-by: José I. Escudero <joseignacioescudero@gmail.com>
Co-authored-by: jescudero <jescudero@itos.es>
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
Co-authored-by: William Easton <bill.easton@elastic.co>
Co-authored-by: David Montague <35119617+dmontagu@users.noreply.github.com>
Co-authored-by: Guillermo <guillermo@mankind.technology>
Co-authored-by: Hamza Farhan <thehamza96@gmail.com>
Co-authored-by: Mohamed Amine Zghal <medaminezghal@outlook.com>
Co-authored-by: Yinon Ehrlich <Tiksagol@users.noreply.github.com>
Co-authored-by: Matthew Brandman <matthb6@gmail.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Douwe Maan <DouweM@users.noreply.github.com>
Co-authored-by: Alex Enrique <41076109+AlexEnrique@users.noreply.github.com>
Co-authored-by: Jerry Yan <jerry@heygen.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Mayank <83648453+spike-spiegel-21@users.noreply.github.com>
Co-authored-by: Alex Hall <alex.mojaki@gmail.com>
Co-authored-by: Jerry Lin <jerry@reevo.ai>
Co-authored-by: Raymond Xu <raymond.y.xu@gmail.com>
Co-authored-by: kauabh <56749351+kauabh@users.noreply.github.com>
Co-authored-by: Victorien <65306057+Viicos@users.noreply.github.com>
Co-authored-by: Ethan Brooks <ethanabrooks@gmail.com>
Co-authored-by: eballesteros <44843469+eballesteros@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Dynamic toolsets based on run context

3 participants