Epic
Add image/vision support across the Zeph stack: accept images from the TUI, CLI, and Telegram channels, flow them through the content model, and send them to LLM providers that support vision APIs.
Architecture overview
Content model changes (zeph-llm)
- Add `MessagePart::Image { data: Vec<u8>, mime_type: String }` variant
- Add `LlmProvider::supports_vision() -> bool` method
- When a provider doesn't support vision, skip image parts with a warning (see the sketch after this list)
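A minimal sketch of the content-model change, assuming `MessagePart` and `LlmProvider` roughly match zeph-llm's existing shapes; the `filter_parts_for` helper and the `eprintln!` warning stand in for whatever the crate actually uses.

```rust
/// Sketch only: type and trait names are assumptions about zeph-llm, not its actual code.
pub enum MessagePart {
    Text(String),
    /// Raw image bytes plus their MIME type (e.g. "image/png").
    Image { data: Vec<u8>, mime_type: String },
}

pub trait LlmProvider {
    /// Providers opt in to vision; the default is no support.
    fn supports_vision(&self) -> bool {
        false
    }
}

/// Hypothetical helper: drop image parts with a warning before calling a non-vision provider.
pub fn filter_parts_for(provider: &dyn LlmProvider, parts: Vec<MessagePart>) -> Vec<MessagePart> {
    if provider.supports_vision() {
        return parts;
    }
    parts
        .into_iter()
        .filter(|part| {
            if matches!(part, MessagePart::Image { .. }) {
                // Swap for the project's logger (e.g. a tracing/log warn) as appropriate.
                eprintln!("warning: provider does not support vision; skipping image part");
                false
            } else {
                true
            }
        })
        .collect()
}
```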
Provider-specific handling
- Ollama: `ollama-rs` already supports `.with_images()` on `ChatMessage`. Config gets a `vision_model` field for a dedicated image-to-text model (e.g. `llava`, `bakllava`)
- Claude: Add an `AnthropicContentBlock::Image` variant; switch from a plain string to the structured content format when images are present
- OpenAI: Switch content from a string to the array format `[{type: "text"}, {type: "image_url"}]` when images are present (see the sketch after this list)
Channel/input changes
- `ChannelMessage` gets an `images: Vec<ImageData>` field (see the sketch after this list)
- TUI: `/image <path>` command (crossterm lacks drag-drop/clipboard image support)
- CLI: `--image <path>` flag or `/image` command
- Telegram: handle `msg.photo()` via teloxide, download and attach
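A sketch of the channel-side attachment, assuming an `ImageData` struct shaped as above and simple extension-based MIME detection; real code might sniff magic bytes instead and would hook `load_image` into the existing `/image` and `--image` handlers.

```rust
use std::path::Path;

/// Assumed payload shape carried in `ChannelMessage.images`.
pub struct ImageData {
    pub data: Vec<u8>,
    pub mime_type: String,
}

/// Hypothetical loader for `/image <path>` (TUI) and `--image <path>` (CLI).
/// MIME type is guessed from the file extension for this sketch.
pub fn load_image(path: &Path) -> std::io::Result<ImageData> {
    let data = std::fs::read(path)?;
    let mime_type = match path.extension().and_then(|ext| ext.to_str()) {
        Some("png") => "image/png",
        Some("jpg") | Some("jpeg") => "image/jpeg",
        Some("gif") => "image/gif",
        Some("webp") => "image/webp",
        _ => "application/octet-stream",
    }
    .to_string();
    Ok(ImageData { data, mime_type })
}
```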
Config
- `[llm.ollama]` gets an optional `vision_model` field for a dedicated vision model (see the sketch below)
- Cloud providers (Claude, OpenAI) use the same model; vision is implicit in the API
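A hedged sketch of how the `[llm.ollama]` table could deserialize with serde; the struct name and the `model` field are assumptions about the existing config, with only `vision_model` being the new piece.

```rust
use serde::Deserialize;

/// Maps the `[llm.ollama]` TOML table (other existing fields omitted), e.g.:
///
///   [llm.ollama]
///   model = "llama3"
///   vision_model = "llava"   # optional; fall back to `model` when absent
#[derive(Debug, Deserialize)]
pub struct OllamaConfig {
    pub model: String,
    /// Dedicated image-to-text model; `None` means reuse `model` for vision requests.
    #[serde(default)]
    pub vision_model: Option<String>,
}
```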
Sub-issues
Tracked below. Implementation order matches dependency chain.
Acceptance criteria
- User can attach an image in TUI/CLI/Telegram
- Image is sent to vision-capable provider and response displayed
- Ollama can use a separate vision model for image processing
- Non-vision providers gracefully skip images with a log warning