
OpenAI service tier support #2916

@jakvbs

Problem

Codex CLI currently offers no control over the OpenAI API service tier, which limits users' ability to trade cost against latency for different use cases.

Proposed Solution

Add a service_tier configuration option to control API request cost and latency via OpenAI's service_tier parameter (a minimal API sketch follows the list):

  • auto (default): Standard processing with automatic tier selection
  • flex: 50% cheaper processing with increased latency, ideal for non-production workloads
  • priority: Faster processing for enterprise users
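
For context, here is a minimal sketch of how the chosen tier would map onto the underlying API call, using the official OpenAI Python SDK. The service_tier parameter itself exists on the Chat Completions API; the model name and prompt below are placeholders only.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o3",  # placeholder: whatever model the session uses
    messages=[{"role": "user", "content": "Analyze this code"}],
    service_tier="flex",  # "auto" | "flex" | "priority"
)

# The response reports the tier that actually served the request,
# which lets Codex verify that the setting took effect.
print(response.service_tier)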

Implementation Details

  • Configuration via service_tier in config.toml and profiles
  • CLI flag --service-tier for per-session override
  • Model compatibility validation with fallback to auto (sketched after this list)
  • Support for both exec and TUI modes
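
A rough sketch of the proposed fallback behavior is below. The helper name and the set of flex-eligible models are hypothetical, purely for illustration; the real check would consult OpenAI's published model/tier support.

# Hypothetical helper, not existing Codex code; the model set is illustrative.
FLEX_ELIGIBLE_MODELS = {"o3", "o4-mini"}

def resolve_service_tier(model: str, requested: str) -> str:
    """Return the requested tier if the model supports it, else fall back to auto."""
    if requested == "flex" and model not in FLEX_ELIGIBLE_MODELS:
        return "auto"
    return requested

assert resolve_service_tier("o3", "flex") == "flex"
assert resolve_service_tier("gpt-4.1", "flex") == "auto"  # falls back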

Use Cases

  • Development/testing: Use flex tier for 50% cost savings on non-critical workloads
  • Production: Use priority tier for faster response times
  • Batch processing: Use flex for cost-effective background tasks

Configuration Examples

# Global setting
service_tier = "flex"

# Profile-specific
[profiles.cost-optimized]
model = "o3"
service_tier = "flex"

# CLI usage
codex --service-tier flex "Analyze this code"
codex exec --service-tier priority "Generate tests"
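
Assuming profiles keep being selected with the existing --profile flag, the cost-optimized profile above could then be applied per session:

codex --profile cost-optimized "Summarize recent changes"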

Benefits

  • Cost optimization: Up to 50% savings with flex tier
  • Performance control: Priority tier for time-sensitive work
  • Flexibility: Per-session and per-profile configuration
  • Backward compatibility: Auto tier maintains existing behavior

This feature aligns with OpenAI's service tier offerings and provides users with cost/performance control without breaking existing workflows.
