Parent: #740 (P1)
Problem
Inlining full SKILL.md contents into the system prompt wastes context budget on models with small context windows (local Ollama, Candle). A single skill can consume 1-2K tokens.
Solution
- Add compact prompt mode: inject only skill name + description + file path
- Agent reads full instructions on demand via a `read_skill` tool call
- Auto-select mode based on model context window size (threshold configurable)
- Manual override via `skills.prompt_mode = "full" | "compact" | "auto"`
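A minimal sketch of the mode-selection logic described above. The names (`PromptMode`, `select_mode`) and the default threshold are illustrative assumptions, not existing zeph-core APIs:

```rust
// Hypothetical sketch: resolving the effective skill prompt mode.
// A manual override wins outright; "auto" compares the model's context
// window against a configurable threshold (16K tokens assumed here).

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PromptMode {
    Full,
    Compact,
    Auto,
}

fn select_mode(configured: PromptMode, context_window: usize, threshold: usize) -> PromptMode {
    match configured {
        PromptMode::Auto => {
            if context_window < threshold {
                PromptMode::Compact
            } else {
                PromptMode::Full
            }
        }
        // "full" or "compact" set explicitly in config: respect it.
        explicit => explicit,
    }
}

fn main() {
    // Small local model (e.g. an 8K-window Ollama model) -> compact.
    assert_eq!(select_mode(PromptMode::Auto, 8_192, 16_384), PromptMode::Compact);
    // Large window -> full inlining is affordable.
    assert_eq!(select_mode(PromptMode::Auto, 128_000, 16_384), PromptMode::Full);
    // Manual override always wins, even on a tiny window.
    assert_eq!(select_mode(PromptMode::Full, 4_096, 16_384), PromptMode::Full);
    println!("mode selection ok");
}
```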
Affected crates
- `zeph-skills` (prompt generation)
- `zeph-core` (mode selection, optional `read_skill` tool)
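To illustrate the zeph-skills side, a hedged sketch of what a compact entry could look like: only name, description, and file path are rendered into the prompt, with the body left for a later `read_skill` call. The `SkillMeta` struct and the exact line format are assumptions for illustration:

```rust
// Hypothetical compact-mode rendering: a few dozen tokens per skill
// instead of the 1-2K tokens of a fully inlined SKILL.md body.

struct SkillMeta {
    name: String,
    description: String,
    path: String,
}

/// Render one compact prompt entry pointing the agent at read_skill.
fn render_compact(skill: &SkillMeta) -> String {
    format!(
        "- {}: {} (full instructions: read_skill(\"{}\"))",
        skill.name, skill.description, skill.path
    )
}

fn main() {
    let skill = SkillMeta {
        name: "git-review".into(),
        description: "Review a pull request diff".into(),
        path: "skills/git-review/SKILL.md".into(),
    };
    let line = render_compact(&skill);
    assert!(line.contains("read_skill"));
    assert!(line.contains("git-review"));
    println!("{line}");
}
```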
Acceptance criteria