vision: Ollama provider image support with configurable vision model #492

@bug-ops

Parent: #490
Depends on: #491

Scope

Implement vision support in the Ollama provider, including a separate configurable model for image processing.

Changes

  1. crates/zeph-core/src/config/types.rs
    • Add vision_model: Option<String> field to Ollama config section
  2. config/default.toml
    • Add commented # vision_model = "llava" example
  3. crates/zeph-llm/src/ollama.rs
    • Override supports_vision() -> true when vision_model is configured
    • In convert_message(): extract MessagePart::Image data, pass via ChatMessage::with_images()
    • When vision_model is set and message contains images: temporarily switch to vision model for that request, then switch back
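
A minimal sketch of the model-selection rule described in item 3, assuming simplified stand-in types. OllamaConfig, MessagePart, model_for() and the "llama3" model name used in testing are hypothetical and are not taken from the zeph crates. Instead of switching the provider's model and switching back, this sketch picks the model per request, which is one way to get the same effect without mutable state:

```rust
/// Hypothetical stand-in for the Ollama config section; only the fields
/// relevant to this issue are shown.
#[derive(Debug, Clone)]
pub struct OllamaConfig {
    /// Default chat model.
    pub model: String,
    /// New optional field proposed in this issue.
    pub vision_model: Option<String>,
}

/// Hypothetical, simplified message part (the real MessagePart may differ).
#[derive(Debug, Clone)]
pub enum MessagePart {
    Text(String),
    Image { data: Vec<u8> },
}

impl OllamaConfig {
    /// Vision is reported as supported only when a vision model is configured.
    pub fn supports_vision(&self) -> bool {
        self.vision_model.is_some()
    }

    /// Returns the model to use for one request: the vision model when it is
    /// configured and the message contains at least one image, otherwise the
    /// default model.
    pub fn model_for(&self, parts: &[MessagePart]) -> &str {
        let has_images = parts
            .iter()
            .any(|part| matches!(part, MessagePart::Image { .. }));
        match (&self.vision_model, has_images) {
            (Some(vision), true) => vision.as_str(),
            _ => self.model.as_str(),
        }
    }
}
```

Choosing the model per request means text-only messages keep using the default model even when vision_model is set, and nothing has to be restored after the call.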

Notes

  • ollama-rs ChatMessage already has .with_images() — just needs wiring
  • Vision models (llava, bakllava, moondream) are pulled separately in Ollama
  • Test with mock: verify images are passed through to the API request
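
The mock test mentioned in the last note could look roughly like the sketch below. Everything here is a hypothetical stand-in: ChatRequest, ChatBackend, MockBackend and build_request() are illustration-only names, and the real code path would go through ollama-rs and ChatMessage::with_images() rather than these types. The point is only the shape of the check, namely that image bytes extracted from the message end up in the outgoing request:

```rust
use std::cell::RefCell;

/// Hypothetical, simplified message part.
#[derive(Debug, Clone, PartialEq)]
pub enum MessagePart {
    Text(String),
    Image { data: Vec<u8> },
}

/// Hypothetical request shape standing in for the real API request.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatRequest {
    pub model: String,
    pub text: String,
    pub images: Vec<Vec<u8>>,
}

/// Hypothetical seam that lets a test replace the HTTP client.
pub trait ChatBackend {
    fn send(&self, request: ChatRequest);
}

/// Mock backend that records every request instead of sending it.
#[derive(Default)]
pub struct MockBackend {
    pub sent: RefCell<Vec<ChatRequest>>,
}

impl ChatBackend for MockBackend {
    fn send(&self, request: ChatRequest) {
        self.sent.borrow_mut().push(request);
    }
}

/// Splits message parts into concatenated text and a list of image payloads.
pub fn build_request(model: &str, parts: &[MessagePart]) -> ChatRequest {
    let mut text = String::new();
    let mut images = Vec::new();
    for part in parts {
        match part {
            MessagePart::Text(t) => text.push_str(t),
            MessagePart::Image { data } => images.push(data.clone()),
        }
    }
    ChatRequest {
        model: model.to_string(),
        text,
        images,
    }
}
```

A test would build a message with one text part and one image part, send it through MockBackend, and assert that the recorded request carries the same image bytes and the expected model name.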

Metadata

Labels

enhancement (New feature or request), llm (LLM provider related)
