
Codex: move system prompts into user input for Responses API #1683

Merged
xingyaoww merged 6 commits into OpenHands:feat/openai-subscription-auth from kumanday:feat/openai-subscription-auth
Jan 20, 2026

Conversation

@kumanday (Contributor)

Summary

This PR fixes OpenHands Codex subscription model support by making the Responses API payload Codex-compatible. Codex subscription endpoints can reject complex/long top-level instructions (often returning 400 / “Instructions are not valid”), which breaks agent flows that rely on a large system prompt.

What changed

  • Codex-specific message adapter for Responses formatting
    • For models whose id contains "codex":
      • We do not send any system messages as top-level instructions.
      • Instead, we merge system prompt content and prepend it to the first user input message (or inject a synthetic user message if none exists).
    • Non-Codex models keep the current behavior: system messages are concatenated into instructions.
  • Ensure Codex models go through the Responses API path
    • Expanded Responses-model detection from only codex-mini-latest to any model containing codex (e.g. gpt-5.1-codex, gpt-5-codex, provider-prefixed variants).
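The adapter behavior described above can be sketched as follows. This is a minimal standalone sketch under stated assumptions, not the PR's actual code; `adapt_for_codex`, `system_chunks`, and `input_items` are illustrative names:

```python
from typing import Any


def adapt_for_codex(
    system_chunks: list[str], input_items: list[dict[str, Any]]
) -> tuple[None, list[dict[str, Any]]]:
    """Return (instructions, input) with system content moved into user input."""
    merged = "\n\n---\n\n".join(system_chunks).strip()
    if not merged:
        return None, input_items
    prefix = f"Context (system prompt):\n{merged}\n\n"
    for item in input_items:
        if item.get("role") == "user":
            # Prepend the merged system prompt to the first user message.
            content = item.get("content")
            if not isinstance(content, list):
                content = [content] if content else []
            item["content"] = [{"type": "input_text", "text": prefix}] + content
            return None, input_items
    # No user message exists: inject a synthetic one at the front.
    input_items.insert(
        0, {"role": "user", "content": [{"type": "input_text", "text": prefix}]}
    )
    return None, input_items
```

For non-Codex models the existing behavior is kept: the system chunks would instead be joined and sent as the top-level instructions string.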

Why this is needed

Codex subscription endpoints require server-validated instructions and can reject the complex, tool-heavy system prompts typically used by agent frameworks. Moving that content into user input avoids the strict validation while preserving the same guidance/context for the model.

Tests

  • Added a unit test that verifies for a Codex model:
    • instructions is None
    • system prompt content appears at the start of the first user input message.
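As a self-contained illustration of those two assertions (a hypothetical stand-in; the real test exercises the LLM class, and `build_codex_payload` is not an identifier from this PR):

```python
from typing import Any


def build_codex_payload(system_text: str, user_text: str) -> dict[str, Any]:
    # Stand-in for the Codex path: no system prompt in top-level
    # instructions; the content is prepended to the first user message.
    prefix = f"Context (system prompt):\n{system_text}\n\n"
    return {
        "instructions": None,
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": prefix},
                    {"type": "input_text", "text": user_text},
                ],
            }
        ],
    }


payload = build_codex_payload("SYS", "USER")
assert payload["instructions"] is None
assert payload["input"][0]["content"][0]["text"].startswith("Context (system prompt):\nSYS")
```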

@all-hands-bot (Collaborator) left a comment


Found several issues that need attention: overly broad model matching, potential data loss bug, and missing test coverage. Details in inline comments.

@kumanday kumanday marked this pull request as draft January 10, 2026 07:03
@kumanday kumanday marked this pull request as ready for review January 19, 2026 17:24
@all-hands-bot (Collaborator) left a comment


Overall this PR addresses a real need for Codex subscription compatibility, but there are several issues that should be addressed before merging. The main concerns are around detection logic consistency and unsafe content manipulation.

Comment on lines +1030 to +1035
# Subscription transport gate: only apply this workaround when calling
# ChatGPT subscription Codex backend (not the standard OpenAI API).
base = (self.base_url or "").lower()
is_subscription_codex_transport = (
    "chatgpt.com" in base and "backend-api" in base and "codex" in base
)

🔴 Critical: The detection logic only checks base_url but ignores the model name. This creates inconsistency with the PR description which states "Expanded Responses-model detection from only codex-mini-latest to any model containing codex".

Issues:

  1. A non-Codex model (e.g., "gpt-4") with base_url="https://chatgpt.com/backend-api/codex" will get Codex handling (likely incorrect)
  2. A Codex model (e.g., "gpt-5-codex") with standard OpenAI base_url won't get Codex handling (may be intended, but contradicts PR description)

Suggestion: Include model name in the detection logic for consistency:

Suggested change. Before:

# Subscription transport gate: only apply this workaround when calling
# ChatGPT subscription Codex backend (not the standard OpenAI API).
base = (self.base_url or "").lower()
is_subscription_codex_transport = (
    "chatgpt.com" in base and "backend-api" in base and "codex" in base
)

After:

# Subscription transport gate: only apply this workaround when calling
# ChatGPT subscription Codex backend (not the standard OpenAI API).
base = (self.base_url or "").lower()
model_name = (self.model or "").lower()
is_subscription_codex_transport = (
    "chatgpt.com" in base
    and "backend-api" in base
    and "codex" in base
    and "codex" in model_name
)

Or clarify in documentation that only the base_url matters, not the model name.

Comment on lines +1067 to +1072
content = item.get("content")
if not isinstance(content, list):
    content = [content] if content else []
item["content"] = (
    [{"type": "input_text", "text": prefix}] + content
)

🟠 Important: Unsafe content manipulation. The code assumes that if content is not a list, it can be safely wrapped as [content]. However, this doesn't validate what type content actually is.

Issues:

  1. If content is a dict (which is possible from to_responses_dict), wrapping it as [dict] creates [{...}] instead of the expected list of content items
  2. If content is already a string or other type, this could create incorrect structure

Suggestion: Add type validation and proper handling:

Suggested change. Before:

content = item.get("content")
if not isinstance(content, list):
    content = [content] if content else []
item["content"] = (
    [{"type": "input_text", "text": prefix}] + content
)

After:

if item.get("type") == "message" and item.get("role") == "user":
    content = item.get("content")
    if not isinstance(content, list):
        # Content should already be a list from to_responses_dict;
        # if not, this indicates an unexpected state.
        content = [] if not content else [{"type": "input_text", "text": str(content)}]
    item["content"] = (
        [{"type": "input_text", "text": prefix}] + content
    )

normalized.append(
    {
        "role": item.get("role"),
        "content": item.get("content") or [],

🟠 Important: Inconsistent empty content handling. Compare this line with line 1069:

  • Line 1069: content = [content] if content else [] - preserves truthy content, uses empty list for falsy
  • Line 1096: item.get("content") or [] - replaces ANY falsy value (including empty string "") with empty list

This inconsistency could cause bugs if content is legitimately an empty string or 0.

Suggestion: Use consistent handling:

Suggested change. Before:

"content": item.get("content") or [],

After:

"content": item.get("content") if item.get("content") is not None else [],

Comment on lines +1060 to +1083
if is_subscription_codex_transport and system_chunks:
    merged_system = "\n\n---\n\n".join(system_chunks).strip()
    if merged_system:
        prefix = f"Context (system prompt):\n{merged_system}\n\n"
        injected = False
        for item in input_items:
            if item.get("type") == "message" and item.get("role") == "user":
                content = item.get("content")
                if not isinstance(content, list):
                    content = [content] if content else []
                item["content"] = (
                    [{"type": "input_text", "text": prefix}] + content
                )
                injected = True
                break

        if not injected:
            input_items.insert(
                0,
                {
                    "role": "user",
                    "content": [{"type": "input_text", "text": prefix}],
                },
            )

🟡 Suggestion: These two operations (system prompt injection + normalization) iterate through input_items twice. For better performance, consider combining them into a single loop:

if is_subscription_codex_transport and system_chunks:
    merged_system = "\n\n---\n\n".join(system_chunks).strip()
    prefix = f"Context (system prompt):\n{merged_system}\n\n" if merged_system else None
    
    normalized: list[dict[str, Any]] = []
    injected = False
    
    for item in input_items:
        # Normalize first
        if item.get("type") == "message":
            normalized_item = {
                "role": item.get("role"),
                "content": item.get("content") or [],
            }
        else:
            normalized_item = item
        
        # Then inject prefix if applicable
        if prefix and not injected and normalized_item.get("role") == "user":
            content = normalized_item["content"] if isinstance(normalized_item["content"], list) else []
            normalized_item["content"] = [{"type": "input_text", "text": prefix}] + content
            injected = True
        
        normalized.append(normalized_item)
    
    # If we never found a user message, add synthetic one
    if prefix and not injected:
        normalized.insert(0, {
            "role": "user",
            "content": [{"type": "input_text", "text": prefix}],
        })
    
    input_items = normalized

Comment on lines +1037 to +1040
DEFAULT_CODEX_INSTRUCTIONS = (
    "You are OpenHands agent, a helpful AI assistant that can interact "
    "with a computer to solve tasks."
)

🟡 Suggestion: The hardcoded default instructions should be documented. Add a comment explaining:

  1. Why this specific text is used
  2. Why it's required by the endpoint
  3. Whether it could/should be configurable

Alternatively, consider making it a class constant with better visibility:

Suggested change. Before:

DEFAULT_CODEX_INSTRUCTIONS = (
    "You are OpenHands agent, a helpful AI assistant that can interact "
    "with a computer to solve tasks."
)

After:

# Codex subscription endpoints require a simple, stable instructions string.
# Complex system prompts are moved to user content to avoid validation errors.
DEFAULT_CODEX_INSTRUCTIONS = (
    "You are OpenHands agent, a helpful AI assistant that can interact "
    "with a computer to solve tasks."
)

m_sys = Message(role="system", content=[TextContent(text="SYS")])
m_user = Message(role="user", content=[TextContent(text="USER")])

llm = LLM(model="gpt-5.1-codex", base_url="https://chatgpt.com/backend-api/codex")

🟢 Nit: The test uses model="gpt-5.1-codex" but the actual implementation doesn't check the model name at all - only the base_url. This makes the test somewhat misleading.

Either:

  1. Update the implementation to check model name (see comment on llm.py:1033)
  2. Or update this test to make it clear that only the base_url matters:
Suggested change. Before:

llm = LLM(model="gpt-5.1-codex", base_url="https://chatgpt.com/backend-api/codex")

After:

llm = LLM(model="any-model", base_url="https://chatgpt.com/backend-api/codex")

Comment on lines +58 to +59
assert instr is not None
assert "OpenHands agent" in instr

🟢 Nit: The assertion is too weak. It should explicitly verify that the default instructions are used, not the system prompt content:

Suggested change. Before:

assert instr is not None
assert "OpenHands agent" in instr

After:

# Verify the default instructions are used, not the system prompt
assert instr == (
    "You are OpenHands agent, a helpful AI assistant that can interact "
    "with a computer to solve tasks."
)
assert "SYS" not in instr  # System prompt should NOT be in instructions

@xingyaoww (Collaborator) left a comment

Thank you! Gonna merge this into my PR first so we can do clean up later!

@xingyaoww xingyaoww merged commit 57cea16 into OpenHands:feat/openai-subscription-auth Jan 20, 2026
1 check passed