Standardize Inference Providers to Use OpenAIMixin #3387

@mattf

Description

🤔 What is the technical debt you think should be addressed?

When the inference providers were originally created, shared mixins like OpenAIMixin and LiteLLMOpenAIMixin did not exist. As a result, many providers implemented their own logic manually and inconsistently.

Now that these mixins exist and some providers have adopted them, implementations are fragmented across the codebase. This results in:

  • Duplicated logic (e.g. for streaming, parameter handling, response formatting)
  • Inconsistent behavior across providers
  • Increased maintenance burden
  • Higher likelihood of subtle bugs and divergent implementations
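For context, standardizing on the mixin typically reduces a provider adapter to a thin subclass that supplies credentials and an endpoint, while streaming, parameter handling, and response formatting come from the shared implementation. A minimal sketch of what adoption looks like (the import path and the get_api_key()/get_base_url() hook names here are illustrative assumptions, not the exact API):

```python
# Hypothetical sketch of adopting OpenAIMixin. The import path and the
# hook names below are assumptions for illustration, not the exact
# llama-stack API surface.
from llama_stack.providers.utils.inference.openai_mixin import OpenAIMixin


class ExampleInferenceAdapter(OpenAIMixin):
    """Thin provider adapter: chat completions, completions, and
    embeddings are inherited from the shared OpenAI-compatible mixin."""

    def __init__(self, config) -> None:
        self.config = config

    def get_api_key(self) -> str:
        # Only the credential lookup is provider-specific.
        return self.config.api_key

    def get_base_url(self) -> str:
        # Point the shared OpenAI client at this provider's endpoint.
        return "https://api.example.com/v1"
```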

💡 What is the benefit of addressing this technical debt?

  • Consistency: All inference providers follow the same behavior.
  • Maintainability: Changes (e.g. API updates, bug fixes) can be made in one place.
  • Reduced Duplication: Shared logic eliminates repeated code across providers.
  • Scalability: Easier to onboard or implement new providers.
  • Better Testing: Shared mixins can be tested centrally, increasing reliability.

Inference providers

| provider | chat completions | embeddings | status | notes |
| --- | --- | --- | --- | --- |
| anthropic | yes | yes | yes | #3366 |
| azure openai | yes | yes | yes | #3396 |
| bedrock | yes | yes | no | #3748 |
| cerebras | yes | yes | no | #3481 |
| databricks | yes | no | no | #3500 |
| fireworks | yes | yes | yes | #3480 |
| gemini | yes | yes | yes | #3351 |
| groq | yes | yes | yes | #3348 |
| llama | yes | yes | yes | #2835 |
| nvidia | yes | yes | yes | #2835 |
| ollama | yes | yes | yes | #3395 |
| openai | yes | yes | yes | #2835 |
| runpod | yes | yes | yes | #3707 |
| sambanova | yes | yes | yes | #3345 |
| tgi | yes | yes | no | #3417 |
| hf::serverless | yes | yes | no | TODO BROKEN: #3415 |
| hf::endpoints | yes | yes | no | TODO |
| together | yes | yes | yes | #3458 |
| vertexai | yes | yes | no | #3377 |
| vllm | yes | yes | no | #3404 |
| watsonx | yes | yes | no | #3674 standardized on LiteLLMOpenAIMixin |
