Feature Request: Multimodal Image Support via @ File Search #2589

@salah9003

Description

What feature would you like to see?

Multimodal Image Support via @ File Search

Problem: Codex CLI already has the backend infrastructure for the OpenAI Vision API, but users have no way to send images in interactive mode.

Solution: Extend the existing @ file search system to detect image files and display them as [Image #1] placeholders while sending actual image data to the Vision API.

Usage Example:
user> @screenshot.png explain this UI
[TAB to select image]
user> [Image #1] explain this UI

codex> This interface shows a terminal with...
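The placeholder substitution in the usage example can be sketched roughly as follows. This is a minimal illustration, not the prototype's code: `ComposerSketch` and `attach_image` are hypothetical names standing in for whatever state the real `ChatComposer` keeps.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the composer state; NOT the real ChatComposer.
#[derive(Default)]
struct ComposerSketch {
    /// Text as shown to the user in the input box.
    text: String,
    /// Images attached so far, in placeholder order.
    images: Vec<PathBuf>,
}

impl ComposerSketch {
    /// Replace the first occurrence of `at_token` (e.g. "@screenshot.png")
    /// with a numbered placeholder and remember the image path.
    /// Returns the placeholder, or None if the token is not in the text.
    fn attach_image(&mut self, at_token: &str, path: PathBuf) -> Option<String> {
        let start = self.text.find(at_token)?;
        // Placeholders are numbered from 1 in attachment order.
        let placeholder = format!("[Image #{}]", self.images.len() + 1);
        self.text
            .replace_range(start..start + at_token.len(), &placeholder);
        self.images.push(path);
        Some(placeholder)
    }
}
```

With the text `@screenshot.png explain this UI`, calling `attach_image("@screenshot.png", PathBuf::from("screenshot.png"))` leaves `[Image #1] explain this UI` in the composer and records one attached image.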

Benefits:

  • Unlocks existing unused multimodal capabilities
  • Uses familiar @ syntax with TAB completion
  • Clean visual feedback with numbered placeholders
  • Zero breaking changes to existing functionality
  • Maintains chat history readability

Are you interested in implementing this feature?

Yes, I have a working prototype that extends the existing file search system. I will wait for acknowledgement before opening a PR.

Additional information

The implementation builds on existing infrastructure:

  • InputItem::LocalImage already exists in the protocol
  • The OpenAI Vision API integration already works
  • The @ file search popup system already exists
  • The only missing piece is image detection in the file search UI

The implementation preserves all existing functionality while adding image support.

Technical Details:

  • Supports jpg, jpeg, png, gif, webp, bmp formats
  • Extends ChatComposer::insert_selected_path() to detect images
  • Creates numbered placeholders [Image #1], [Image #2] for display
  • Sends the cleaned text (placeholders removed) together with the actual image data to the model
  • Maintains backward compatibility with existing @ file search

Metadata

Labels: enhancement (New feature or request)
