Standardize Inference Providers to Use OpenAIMixin #3387

@mattf

Description

🤔 What is the technical debt you think should be addressed?

When the inference providers were originally created, shared mixins like OpenAIMixin and LiteLLMOpenAIMixin did not exist. As a result, many providers implemented their own logic manually and inconsistently.

Now that these mixins exist and some providers have adopted them, implementations are fragmented across the codebase. This results in:

  • Duplicated logic (e.g. for streaming, parameter handling, response formatting)
  • Inconsistent behavior across providers
  • Increased maintenance burden
  • Higher likelihood of subtle bugs and divergent implementations
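For context, standardizing on the mixin typically reduces a provider adapter to a thin subclass that supplies credentials and an endpoint, while streaming, parameter handling, and response formatting come from the shared implementation. A minimal sketch of what adoption looks like (the import path and the get_api_key()/get_base_url() hook names here are illustrative assumptions, not the exact API):

```python
# Hypothetical sketch of adopting OpenAIMixin. The import path and the
# hook names below are assumptions for illustration, not the exact
# llama-stack API surface.
from llama_stack.providers.utils.inference.openai_mixin import OpenAIMixin


class ExampleInferenceAdapter(OpenAIMixin):
    """Thin provider adapter: chat completions, completions, and
    embeddings are inherited from the shared OpenAI-compatible mixin."""

    def __init__(self, config) -> None:
        self.config = config

    def get_api_key(self) -> str:
        # Only the credential lookup is provider-specific.
        return self.config.api_key

    def get_base_url(self) -> str:
        # Point the shared OpenAI client at this provider's endpoint.
        return "https://api.example.com/v1"
```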

💡 What is the benefit of addressing this technical debt?

  • Consistency: All inference providers follow the same behavior.
  • Maintainability: Changes (e.g. API updates, bug fixes) can be made in one place.
  • Reduced Duplication: Shared logic eliminates repeated code across providers.
  • Scalability: Easier to onboard or implement new providers.
  • Better Testing: Shared mixins can be tested centrally, increasing reliability.

Inference providers

| provider | chat completions | embeddings | status | notes |
| --- | --- | --- | --- | --- |
| anthropic | yes | yes | yes | #3366 |
| azure openai | yes | yes | yes | #3396 |
| bedrock | yes | yes | no | #3748 |
| cerebras | yes | yes | no | #3481 |
| databricks | yes | no | no | #3500 |
| fireworks | yes | yes | yes | #3480 |
| gemini | yes | yes | yes | #3351 |
| groq | yes | yes | yes | #3348 |
| llama | yes | yes | yes | #2835 |
| nvidia | yes | yes | yes | #2835 |
| ollama | yes | yes | yes | #3395 |
| openai | yes | yes | yes | #2835 |
| runpod | yes | yes | yes | #3707 |
| sambanova | yes | yes | yes | #3345 |
| tgi | yes | yes | no | #3417 |
| hf::serverless | yes | yes | no | TODO BROKEN: #3415 |
| hf::endpoints | yes | yes | no | TODO |
| together | yes | yes | yes | #3458 |
| vertexai | yes | yes | no | #3377 |
| vllm | yes | yes | no | #3404 |
| watsonx | yes | yes | no | #3674 standardized on LiteLLMOpenAIMixin |
