
Standardize URL Configuration Across Remote Inference Providers #3732

@mattf

Description


🤔 What is the technical debt you think should be addressed?

Problem

Remote inference providers have inconsistent URL configuration patterns, which makes the API confusing for users and difficult to maintain. A survey of all providers shows multiple naming conventions and URL-handling approaches.

Current State Analysis

| Provider | Field Name | Type | Default Value | URL Transformation |
|---|---|---|---|---|
| OpenAI | `base_url` | `str` | `https://api.openai.com/v1` | None - direct pass-through |
| Cerebras | `base_url` | `str` | `https://api.cerebras.ai` | Appends `/v1` - `urljoin(base_url, "v1")` |
| Azure | `api_base` | `HttpUrl` | Required field | Appends `/openai/v1` - `urljoin(api_base, "/openai/v1")` |
| Llama OpenAI Compat | `openai_compat_api_base` | `str` | `https://api.llama.com/compat/v1/` | None - direct pass-through |
| NVIDIA | `url` | `str` | `https://integrate.api.nvidia.com` | Conditional - appends `/v1` if `append_api_version=True` |
| Fireworks | `url` | `str` | `https://api.fireworks.ai/inference/v1` | Hardcoded - ignores config, returns `https://api.fireworks.ai/inference/v1` |
| Together | `url` | `str` | `https://api.together.xyz/v1` | Hardcoded - ignores config, uses Together's `BASE_URL` constant |
| Groq | `url` | `str` | `https://api.groq.com` | Appends `/openai/v1` - `{url}/openai/v1` |
| Ollama | `url` | `str` | `http://localhost:11434` | Appends `/v1` - `url.rstrip("/") + "/v1"` |
| vLLM | `url` | `str \| None` | `None` (required) | None - direct pass-through |
| TGI | `url` | `str` | Required field | Appends `/v1` - `{url.rstrip('/')}/v1` (set during initialization) |
| Databricks | `url` | `str \| None` | `None` | Appends `/serving-endpoints` - `{url}/serving-endpoints` |
| SambaNova | `url` | `str` | `https://api.sambanova.ai/v1` | None - direct pass-through |
| Runpod | `url` | `str \| None` | `None` | None - direct pass-through |
| WatsonX | `url` | `str` | `https://us-south.ml.cloud.ibm.com` | None - direct pass-through |
| Passthrough | `url` | `str` | `None` (required) | None - direct pass-through |
| Anthropic | N/A | N/A | N/A | Hardcoded - returns `https://api.anthropic.com/v1` |
| VertexAI | `project` + `location` | `str` | Various | Custom - GCP-specific URL built from project/location |
| Gemini | N/A | N/A | N/A | Hardcoded - returns `https://generativelanguage.googleapis.com/v1beta/openai/` |
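
To make the field-name divergence concrete, here is an illustrative sketch (simplified approximations, not the actual config classes) of a few of these provider configs today, keeping only the URL-related fields from the table:

```python
from pydantic import BaseModel, HttpUrl

# Illustrative only: simplified approximations of today's provider configs,
# reduced to the URL-related fields listed in the table above.

class OpenAIConfig(BaseModel):
    base_url: str = "https://api.openai.com/v1"     # used as-is

class AzureConfig(BaseModel):
    api_base: HttpUrl                                # required; /openai/v1 appended later

class NVIDIAConfig(BaseModel):
    url: str = "https://integrate.api.nvidia.com"    # /v1 appended only if append_api_version=True
    append_api_version: bool = True                  # default assumed here for illustration

class OllamaConfig(BaseModel):
    url: str = "http://localhost:11434"              # /v1 always appended at runtime
```

Three different field names (`base_url`, `api_base`, `url`) and two different types (`str`, `HttpUrl`) for what is conceptually the same setting.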

URL Construction Patterns

Providers handle URL construction with significant inconsistencies (see the sketch after this list):

  1. Direct pass-through: OpenAI, vLLM, SambaNova, Runpod, WatsonX, Passthrough, Llama OpenAI Compat
  2. Automatic `/v1` appending:
    • Ollama (always): `url.rstrip("/") + "/v1"`
    • NVIDIA (conditional): `/v1` if `append_api_version=True`
    • Cerebras (always): `urljoin(base_url, "v1")`
    • TGI (during init): `{url.rstrip('/')}/v1`
  3. Custom path construction:
    • Azure: `urljoin(api_base, "/openai/v1")`
    • Groq: `{url}/openai/v1`
    • Databricks: `{url}/serving-endpoints`
    • VertexAI: complex GCP URL construction
  4. Hardcoded endpoints:
    • Anthropic: `https://api.anthropic.com/v1` (ignores config)
    • Fireworks: `https://api.fireworks.ai/inference/v1` (ignores config)
    • Together: uses Together's `BASE_URL` constant (ignores config)
    • Gemini: `https://generativelanguage.googleapis.com/v1beta/openai/` (ignores config)
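
As a rough sketch (not the actual adapter code), these four patterns reduce to something like:

```python
from urllib.parse import urljoin

def resolved_base_url(provider: str, url: str, append_api_version: bool = True) -> str:
    """Illustrative reduction of the patterns above, not the real adapters."""
    if provider in ("openai", "vllm", "sambanova", "runpod", "watsonx", "passthrough"):
        return url                                          # 1. direct pass-through
    if provider == "ollama":
        return url.rstrip("/") + "/v1"                      # 2. always append /v1
    if provider == "nvidia":
        return f"{url}/v1" if append_api_version else url   # 2. conditional /v1
    if provider == "cerebras":
        return urljoin(url, "v1")                           # 2. urljoin-based append
    if provider == "groq":
        return f"{url}/openai/v1"                           # 3. custom path
    if provider == "databricks":
        return f"{url}/serving-endpoints"                   # 3. custom path
    if provider == "anthropic":
        return "https://api.anthropic.com/v1"               # 4. hardcoded, config ignored
    raise ValueError(f"unhandled provider: {provider}")
```

The same user-supplied value can therefore resolve to different endpoints depending on the provider, which is exactly the inconsistency described above.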

Environment Variable Inconsistencies

  • OPENAI_BASE_URL, NVIDIA_BASE_URL, WATSONX_BASE_URL (some providers)
  • OLLAMA_URL, VLLM_URL, TGI_URL (other providers)
  • Mixed patterns that don't always align with provider documentation

Proposed Solution

1. Standardize Field Naming

Recommendation: Use `base_url` consistently across all providers unless the provider has its own standard, e.g. Databricks documents `host` and `DATABRICKS_HOST`.

2. Consistent Type Annotations

  • Use `HttpUrl` or `HttpUrl | None` (see the sketch below)
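
Taken together, items 1 and 2 might look roughly like this (hypothetical field layout, not an agreed-upon interface):

```python
from pydantic import BaseModel, ConfigDict, Field, HttpUrl

class OpenAIProviderConfig(BaseModel):
    """Hypothetical: every remote provider uses the same field name and type."""
    model_config = ConfigDict(validate_default=True)  # validate the str default into an HttpUrl

    base_url: HttpUrl | None = Field(
        default="https://api.openai.com/v1",
        description="Complete base URL of the endpoint, including any /v1-style suffix.",
    )

class OllamaProviderConfig(BaseModel):
    model_config = ConfigDict(validate_default=True)

    base_url: HttpUrl | None = Field(
        default="http://localhost:11434/v1",
        description="Complete base URL of the local Ollama OpenAI-compatible endpoint.",
    )
```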

3. Environment Variable Alignment

Recommendation: Align with each provider's official documentation and conventions.

Examples of provider-native conventions:

  • OPENAI_BASE_URL (OpenAI standard)
  • OLLAMA_URL (Ollama standard)
  • VLLM_URL (vLLM standard)

Approach: Research each provider's official documentation and use their recommended environment variable names, rather than forcing a unified pattern that conflicts with provider conventions.
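
One hypothetical way to express that alignment is to have each provider's config read its own documented environment variable as the default (names taken from the list above; the exact mechanism here is illustrative):

```python
import os
from pydantic import BaseModel, Field

class OpenAIProviderConfig(BaseModel):
    # OpenAI's own tooling documents OPENAI_BASE_URL, so the default follows that convention.
    base_url: str = Field(
        default_factory=lambda: os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
    )

class OllamaProviderConfig(BaseModel):
    # Ollama deployments are conventionally addressed via OLLAMA_URL.
    base_url: str = Field(
        default_factory=lambda: os.getenv("OLLAMA_URL", "http://localhost:11434"),
    )
```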

4. URL Construction Guidelines

Recommendation: Minimize modifications to user-supplied configuration. For instance, have users input a full URL including /v1 or /openai/v1 rather than appending path segments at runtime.
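
In practice this means the adapter uses the configured value verbatim, for example (hypothetical sketch using the OpenAI client, not the actual adapter code):

```python
from openai import AsyncOpenAI

# Hypothetical: the user configures the complete endpoint themselves, e.g.
#   base_url = "https://api.groq.com/openai/v1"
# and the adapter passes it through unchanged -- no rstrip()/urljoin()/f-string suffixing.
def make_client(base_url: str, api_key: str) -> AsyncOpenAI:
    return AsyncOpenAI(base_url=base_url, api_key=api_key)
```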

Warning: This will be a breaking change for multiple providers.

💡 What is the benefit of addressing this technical debt?

(above)

Other thoughts

No response
