Load balancing #1983

@TimW9

Description

I am running into rate limits in some scenarios. I would love to see a feature in Pydantic AI that lets me give an agent multiple models (a primary model plus fallbacks), so that the agent can switch to another model when a rate limit is reached.

Right now, the only way I see to achieve this is to define multiple agents, give each one its own model, duplicate the system prompts and tools across all of them (which is already really messy), and then catch the rate-limit errors and balance the load elsewhere.
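
A minimal sketch of the requested behaviour, to make the idea concrete. It uses plain Python callables in place of real models and does not use any Pydantic AI API; `RateLimitError`, `FallbackRunner`, `primary_model`, and `backup_model` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


@dataclass
class FallbackRunner:
    """Try each model callable in order, moving on when one is rate limited."""

    models: Sequence[Callable[[str], str]]
    attempts: List[str] = field(default_factory=list)  # names of models tried

    def run(self, prompt: str) -> str:
        last_error = None
        for model in self.models:
            self.attempts.append(model.__name__)
            try:
                return model(prompt)
            except RateLimitError as exc:
                # Only rate limits trigger a fallback; other errors propagate.
                last_error = exc
        raise RuntimeError("no model could serve the request") from last_error


def primary_model(prompt: str) -> str:
    # Hypothetical model that is always rate limited.
    raise RateLimitError("429 from primary")


def backup_model(prompt: str) -> str:
    # Hypothetical model that always succeeds.
    return f"backup answer to: {prompt}"


runner = FallbackRunner([primary_model, backup_model])
result = runner.run("hello")
```

With something like this built into the agent, the system prompts and tools would be defined once, and only the model selection would change when a rate limit is hit.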

Labels: question (Further information is requested)
