Closed as not planned
Labels
enhancement (New feature or request)
Description
Parent: #520
Depends on: #522, #523
Context
For real-time voice interaction, support streaming audio input with incremental transcription. This is a stretch goal for v1.
Options
- OpenAI Realtime API — WebSocket-based, supports audio streaming with function calling
- Local streaming — Whisper with chunked audio (VAD + sliding window)
- Deepgram/AssemblyAI — third-party streaming STT APIs
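To make the "chunked audio (VAD + sliding window)" idea in the local option more concrete, here is a minimal sketch of the windowing half only. It assumes 16-bit PCM samples and uses hypothetical names (`SlidingWindow`, `window`, `hop`) that do not come from this issue; how the resulting windows are fed to Whisper is left out.

```rust
/// Hypothetical helper: buffers incoming PCM samples and emits overlapping
/// fixed-size windows as soon as enough audio has arrived.
pub struct SlidingWindow {
    window: usize, // window length in samples
    hop: usize,    // step between window starts; hop < window gives overlap
    buf: Vec<i16>,
}

impl SlidingWindow {
    pub fn new(window: usize, hop: usize) -> Self {
        assert!(hop > 0 && hop <= window, "need 0 < hop <= window");
        Self { window, hop, buf: Vec::new() }
    }

    /// Appends new samples and returns every full window now available.
    pub fn push(&mut self, samples: &[i16]) -> Vec<Vec<i16>> {
        self.buf.extend_from_slice(samples);
        let mut out = Vec::new();
        while self.buf.len() >= self.window {
            out.push(self.buf[..self.window].to_vec());
            // Drop only `hop` samples so consecutive windows overlap.
            self.buf.drain(..self.hop);
        }
        out
    }
}
```

With `window = 4` and `hop = 2`, each emitted window shares its last two samples with the start of the next one, which gives the recognizer some context across chunk boundaries.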
Design
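`SpeechToText` streaming extension

    // Assumes `Stream` is the `futures` crate's trait; `SpeechToText` and
    // `SttError` are not defined in this issue.
    use futures::Stream;

    /// Streaming extension of `SpeechToText`.
    pub trait StreamingStt: SpeechToText {
        fn transcribe_stream(
            &self,
            audio_stream: impl Stream<Item = Vec<u8>> + Send,
        ) -> impl Stream<Item = Result<PartialTranscript, SttError>> + Send;
    }

    /// One incremental transcription result.
    pub struct PartialTranscript {
        pub text: String,
        /// `true` once the text for this segment will no longer change.
        pub is_final: bool,
    }

Integration points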
- TUI: microphone input via `cpal` crate + VAD (voice activity detection)
- Channels: platform-specific streaming (if supported)
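The VAD step mentioned for the TUI could start out as a simple energy threshold. The sketch below is one possible approach, not a decided design: `EnergyVad`, `threshold`, and `hangover_frames` are hypothetical names, the input is assumed to be 16-bit PCM frames, and the capture side (`cpal`) is deliberately omitted.

```rust
/// Root-mean-square amplitude of a frame of 16-bit PCM samples.
pub fn rms(frame: &[i16]) -> f64 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f64 = frame.iter().map(|&s| (s as f64) * (s as f64)).sum();
    (sum_sq / frame.len() as f64).sqrt()
}

/// Hypothetical energy-threshold VAD with a hangover period, so that short
/// pauses inside an utterance do not end the speech segment.
pub struct EnergyVad {
    threshold: f64,
    hangover_frames: u32,
    silent_run: u32,
    active: bool,
}

impl EnergyVad {
    pub fn new(threshold: f64, hangover_frames: u32) -> Self {
        Self { threshold, hangover_frames, silent_run: 0, active: false }
    }

    /// Feeds one frame and returns whether speech is considered active.
    pub fn process(&mut self, frame: &[i16]) -> bool {
        if rms(frame) >= self.threshold {
            self.active = true;
            self.silent_run = 0;
        } else if self.active {
            self.silent_run += 1;
            // Only switch off after more than `hangover_frames` quiet frames.
            if self.silent_run > self.hangover_frames {
                self.active = false;
            }
        }
        self.active
    }
}
```

A fixed threshold is fragile in noisy rooms; a production VAD would likely adapt to the noise floor or use a trained model, but this is enough to gate when audio is sent to a streaming backend.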
Acceptance criteria
- `StreamingStt` trait defined
- At least one streaming backend implemented
- TUI microphone input works (feature-gated)
- Partial transcripts displayed in real-time
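For the last criterion, a rough sketch of how a UI might fold `PartialTranscript` updates into displayed text. It assumes that each non-final update replaces the previous interim text and that each final update closes a segment; the issue does not specify this behaviour, and `TranscriptView` is a hypothetical name. The struct is repeated so the snippet stands alone.

```rust
/// Mirrors the `PartialTranscript` struct from the design section.
pub struct PartialTranscript {
    pub text: String,
    pub is_final: bool,
}

/// Hypothetical display buffer: committed text plus the current interim guess.
#[derive(Default)]
pub struct TranscriptView {
    committed: String,
    interim: String,
}

impl TranscriptView {
    /// Applies one update from the transcription stream.
    pub fn apply(&mut self, update: PartialTranscript) {
        if update.is_final {
            if !self.committed.is_empty() && !update.text.is_empty() {
                self.committed.push(' ');
            }
            self.committed.push_str(&update.text);
            self.interim.clear();
        } else {
            // Interim results replace each other rather than accumulate.
            self.interim = update.text;
        }
    }

    /// Text to show right now: committed segments followed by the interim one.
    pub fn render(&self) -> String {
        match (self.committed.is_empty(), self.interim.is_empty()) {
            (_, true) => self.committed.clone(),
            (true, false) => self.interim.clone(),
            (false, false) => format!("{} {}", self.committed, self.interim),
        }
    }
}
```

Keeping interim text separate from committed text lets the TUI style the unstable part differently (for example, dimmed) and redraw it without touching earlier segments.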