feat(llm): local Whisper backend via candle for offline STT#566

Merged
bug-ops merged 3 commits into main from feat/audio-input-4
Feb 18, 2026
bug-ops commented Feb 18, 2026

Summary

  • CandleWhisperProvider implementing SpeechToText trait, behind existing candle feature flag
  • Audio pipeline: symphonia decode -> rubato resample to 16kHz mono -> mel spectrogram -> candle-transformers whisper inference
  • Model weights from HuggingFace via hf-hub (default: whisper-tiny, 39M parameters)
  • Device auto-detect: metal > cuda > cpu
  • 5-minute audio duration guard, MAX_DECODE_TOKENS limit, and error handling for language token lookup
  • Bootstrap wiring: provider = "candle-whisper" in [llm.stt] config
  • New workspace deps: symphonia 0.5.5, rubato 0.16
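
The device priority and the duration guard listed above can be sketched in plain Rust. This is a minimal illustration, not the code from this PR: `Backend`, `pick_backend`, `check_duration`, and `MAX_AUDIO_SECS` are hypothetical names, the availability flags stand in for whatever runtime probing the provider does, and treating exactly five minutes as acceptable is an assumption.

```rust
/// Compute backends, in the priority order stated in the summary.
/// Hypothetical type for illustration; the PR's own types are not shown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Metal,
    Cuda,
    Cpu,
}

/// Pick the first available backend: metal > cuda > cpu.
/// The two flags are assumed inputs; a real build would probe the runtime.
pub fn pick_backend(metal_available: bool, cuda_available: bool) -> Backend {
    if metal_available {
        Backend::Metal
    } else if cuda_available {
        Backend::Cuda
    } else {
        Backend::Cpu
    }
}

/// Assumed limit: five minutes, matching the guard in the summary.
pub const MAX_AUDIO_SECS: u64 = 5 * 60;

/// Reject audio longer than `MAX_AUDIO_SECS`, given the mono sample count
/// and the sample rate. Audio of exactly five minutes is accepted here,
/// which is an assumption about the boundary.
pub fn check_duration(num_samples: usize, sample_rate: u32) -> Result<(), String> {
    if sample_rate == 0 {
        return Err("sample rate must be non-zero".to_string());
    }
    let max_samples = MAX_AUDIO_SECS * u64::from(sample_rate);
    if num_samples as u64 > max_samples {
        return Err(format!(
            "audio has {} samples at {} Hz, above the {}-second limit",
            num_samples, sample_rate, MAX_AUDIO_SECS
        ));
    }
    Ok(())
}
```

Checking the sample count before inference keeps the guard cheap, since it needs only the decoded length and never touches the model.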

Test plan

  • 6 candle_whisper tests (device detection, decode errors, resample, duration guard)
  • cargo +nightly fmt --check pass
  • cargo clippy --workspace -- -D warnings pass
  • cargo nextest run --workspace --lib --bins — 1783 passed, 9 skipped

Closes #523
Relates to #520

github-actions bot added the documentation, llm, rust, dependencies, enhancement, and size/XL labels Feb 18, 2026

Implement CandleWhisperProvider behind existing `candle` feature flag.
Audio pipeline: symphonia decode -> rubato resample to 16kHz mono ->
mel spectrogram -> candle-transformers whisper inference.
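
The "mono" part of the resample step implies a channel downmix somewhere in the pipeline. The PR does not say how it is done; one common approach, shown here as a sketch with the hypothetical helper `downmix_to_mono`, is to average each interleaved frame across its channels.

```rust
/// Average interleaved multi-channel samples down to a single channel.
/// `interleaved` holds frames laid out as [ch0, ch1, ..., ch0, ch1, ...].
/// A trailing partial frame, if any, is dropped by `chunks_exact`.
pub fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    if channels == 0 {
        // No channels means no audio; avoids a panic in `chunks_exact(0)`.
        return Vec::new();
    }
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}
```

For stereo input, `downmix_to_mono(&[1.0, 0.0, 0.5, 0.5], 2)` yields two mono samples, one per frame.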

Model weights downloaded from HuggingFace via hf-hub on first use
(default: whisper-tiny). Device auto-detect: metal > cuda > cpu.
Includes 5-minute audio duration guard, max decode token limit,
and proper error handling for language token lookup.
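
A decode token limit of this kind typically bounds a greedy decoding loop so that a model that never emits its end-of-text token cannot run forever. The sketch below assumes that shape; `greedy_decode`, the `next_token` callback, and the token values are hypothetical, and the actual value of the PR's limit is not stated here, so it is passed in as a parameter.

```rust
/// Greedy decode loop with a hard cap on emitted tokens.
/// `next_token` is a stand-in for one model step: it sees the tokens
/// produced so far and returns the next token id.
/// Stops at `eot` (not included in the output) or after `max_tokens`.
pub fn greedy_decode<F>(mut next_token: F, eot: u32, max_tokens: usize) -> Vec<u32>
where
    F: FnMut(&[u32]) -> u32,
{
    let mut tokens = Vec::new();
    while tokens.len() < max_tokens {
        let token = next_token(&tokens);
        if token == eot {
            break;
        }
        tokens.push(token);
    }
    tokens
}
```

With a callback that never returns `eot`, the loop still terminates after `max_tokens` steps, which is the property the limit exists to guarantee.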

New workspace deps: symphonia 0.5.5, rubato 0.16.

Closes #523
bug-ops merged commit 913538c into main Feb 18, 2026
26 of 33 checks passed
bug-ops deleted the feat/audio-input-4 branch February 18, 2026 23:44
Development

Successfully merging this pull request may close these issues.

audio: add local Whisper backend via candle (feature-gated)
