Extract local whisper inference as tower service#1268
Conversation
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughThis change introduces a new Changes
Sequence Diagram(s)sequenceDiagram
participant Client (WebSocket)
participant WhisperStreamingService (hypr-transcribe-whisper-local)
participant WhisperModel
participant VAD/Chunker
Client->>WhisperStreamingService: Connect & send audio stream
WhisperStreamingService->>VAD/Chunker: Chunk audio (VAD)
VAD/Chunker-->>WhisperStreamingService: Audio chunks
WhisperStreamingService->>WhisperModel: Transcribe audio chunks
WhisperModel-->>WhisperStreamingService: Transcription results
WhisperStreamingService->>Client: Stream JSON transcription results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes Possibly related PRs
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: .coderabbit.yaml 📒 Files selected for processing (1)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
crates/transcribe-interface/Cargo.toml (1)
6-9: Dependencies declared but not used in empty library.The dependencies are reasonable for a transcription interface but are currently unused since
src/lib.rsis empty. This relates to the unused dependencies issue in the library file.
🧹 Nitpick comments (3)
crates/transcribe-interface/src/lib.rs (1)
1-2: Consider removing unused dependencies or adding placeholder content.The library file is empty but the
Cargo.tomldeclares dependencies onserde,serde_json, andthiserrorthat are not used. This violates the coding guideline about unused dependencies.If this is intentionally a placeholder, consider adding a comment explaining the future purpose or adding minimal placeholder types:
+// Placeholder for shared transcription interfaces +// TODO: Define common traits and types for transcription servicesAlternatively, remove the unused dependencies from
Cargo.tomluntil they're needed.crates/transcribe-whisper-local/src/service/recorded.rs (1)
55-56: Remove unnecessary clone and address TODO.The
word.clone()is unnecessary sincewordis not used after this point. Also, the TODO comment should be addressed or removed.- // TODO - words.push(word.clone()); + words.push(word);crates/transcribe-aws/src/lib.rs (1)
1-1: Consider removing or expanding the draft comment.The comment "AWS draft" provides minimal context. Either remove it or expand it to explain what aspects are still in draft state.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (25)
Cargo.toml(1 hunks)crates/transcribe-aws/Cargo.toml(1 hunks)crates/transcribe-aws/src/error.rs(1 hunks)crates/transcribe-aws/src/lib.rs(4 hunks)crates/transcribe-interface/Cargo.toml(1 hunks)crates/transcribe-interface/src/lib.rs(1 hunks)crates/transcribe-whisper-local/Cargo.toml(1 hunks)crates/transcribe-whisper-local/src/error.rs(1 hunks)crates/transcribe-whisper-local/src/lib.rs(1 hunks)crates/transcribe-whisper-local/src/manager.rs(1 hunks)crates/transcribe-whisper-local/src/service/mod.rs(1 hunks)crates/transcribe-whisper-local/src/service/recorded.rs(1 hunks)crates/transcribe-whisper-local/src/service/streaming.rs(1 hunks)plugins/local-stt/Cargo.toml(3 hunks)plugins/local-stt/build.rs(0 hunks)plugins/local-stt/js/bindings.gen.ts(0 hunks)plugins/local-stt/permissions/autogenerated/commands/process_recorded.toml(0 hunks)plugins/local-stt/permissions/autogenerated/reference.md(0 hunks)plugins/local-stt/permissions/default.toml(0 hunks)plugins/local-stt/permissions/schemas/schema.json(1 hunks)plugins/local-stt/src/commands.rs(0 hunks)plugins/local-stt/src/ext.rs(1 hunks)plugins/local-stt/src/lib.rs(1 hunks)plugins/local-stt/src/manager.rs(0 hunks)plugins/local-stt/src/server.rs(4 hunks)
💤 Files with no reviewable changes (7)
- plugins/local-stt/permissions/default.toml
- plugins/local-stt/build.rs
- plugins/local-stt/permissions/autogenerated/reference.md
- plugins/local-stt/src/commands.rs
- plugins/local-stt/permissions/autogenerated/commands/process_recorded.toml
- plugins/local-stt/js/bindings.gen.ts
- plugins/local-stt/src/manager.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{js,ts,tsx,rs}
⚙️ CodeRabbit Configuration File
**/*.{js,ts,tsx,rs}: 1. No error handling.
2. No unused imports, variables, or functions.
3. For comments, keep it minimal. It should be about "Why", not "What".
Files:
crates/transcribe-interface/src/lib.rscrates/transcribe-whisper-local/src/service/mod.rsplugins/local-stt/src/lib.rscrates/transcribe-whisper-local/src/error.rscrates/transcribe-whisper-local/src/manager.rscrates/transcribe-aws/src/lib.rscrates/transcribe-whisper-local/src/lib.rscrates/transcribe-whisper-local/src/service/recorded.rsplugins/local-stt/src/ext.rscrates/transcribe-aws/src/error.rsplugins/local-stt/src/server.rscrates/transcribe-whisper-local/src/service/streaming.rs
🪛 GitHub Actions: .github/workflows/fmt.yaml
crates/transcribe-aws/src/error.rs
[error] 5-5: dprint formatting error: left behind trailing whitespace at line 5. rustfmt failed to format due to this error.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: ci
- GitHub Check: ci (windows, windows-latest)
- GitHub Check: ci (macos, macos-latest)
🔇 Additional comments (17)
crates/transcribe-aws/Cargo.toml (1)
24-25: LGTM! AWS Smithy dependencies added for error handling.The addition of
aws-smithy-runtime-apiandaws-smithy-typesdependencies supports the error handling refactoring mentioned in the summary. The specific versions ensure reproducible builds.Cargo.toml (1)
66-66: LGTM! Workspace dependency correctly added.The new
hypr-transcribe-whisper-localworkspace dependency properly integrates the new crate following the established pattern of other workspace dependencies.crates/transcribe-whisper-local/src/lib.rs (1)
1-6: LGTM! Clean module facade following Rust conventions.The selective re-export of
errorandservicemodules while keepingmanagerinternal demonstrates good API design and encapsulation.crates/transcribe-whisper-local/src/service/mod.rs (1)
1-5: LGTM! Standard service module structure.The use of wildcard re-exports appropriately exposes the streaming and recorded service APIs.
plugins/local-stt/permissions/schemas/schema.json (1)
454-457: Permission schema correctly updated to reflect command removal.The default permission descriptions properly exclude the removed
allow-process-recordedpermission, maintaining consistency with the architectural changes.plugins/local-stt/src/lib.rs (2)
32-45: Command list correctly updated.The removal of
process_recordedcommand from the Specta builder aligns with the architectural refactor to extract transcription functionality.
9-9: Visibility pattern is intentional and correct.The
servermodule is kept private (mod server) but its public items are re-exported viapub use server::*, allowing external code to access only the intended symbols without exposing the module’s internal path. No changes needed.crates/transcribe-whisper-local/src/service/recorded.rs (1)
3-61: Well-structured audio processing pipeline.The function follows a clear workflow: decode → resample → segment → transcribe → format results. The integration of rodio, hypr libraries, and Whisper/pyannote models is appropriate for recorded audio processing.
crates/transcribe-whisper-local/src/manager.rs (1)
19-19: Handle potential mutex poisoning.Using
unwrap()on mutex lock can panic if the mutex is poisoned. Consider handling this error case gracefully.- let mut slot = self.inner.lock().unwrap(); + let mut slot = match self.inner.lock() { + Ok(guard) => guard, + Err(poisoned) => poisoned.into_inner(), + };Likely an incorrect or invalid review comment.
plugins/local-stt/src/ext.rs (1)
117-117: LGTM!The change from
default()tobuilder()follows a clearer builder pattern convention.crates/transcribe-aws/src/error.rs (1)
3-21: Excellent error handling refactor!The change from generic string-based errors to specific SDK error wrappers with transparent error propagation is a significant improvement. This provides better error context and type safety.
crates/transcribe-whisper-local/Cargo.toml (1)
1-38: Well-structured crate manifest!The Cargo.toml is properly configured with:
- Clear feature flags for various hardware acceleration backends
- Consistent use of workspace dependencies
- Appropriate dependency selection for transcription functionality
plugins/local-stt/Cargo.toml (1)
16-24: LGTM! Clean dependency consolidation.The refactoring successfully consolidates multiple crates (
hypr-whisper-local,hypr-pyannote-local) into a singlehypr-transcribe-whisper-localcrate, improving modularity. The removal of the "multipart" feature from axum aligns with the elimination of theprocess_recordedfunctionality.Also applies to: 48-48, 65-65
plugins/local-stt/src/server.rs (2)
98-100: LGTM! Clean server implementation and appropriate test coverage.The simplified health endpoint and the focused test for the health check are well-implemented. The server structure is now more maintainable with the extraction of transcription logic to the dedicated service.
Also applies to: 102-130
80-86: Add error handling for service initialization.The
WhisperStreamingService::builder().build()call could potentially fail if the model path is invalid or inaccessible. Consider handling potential initialization errors.Consider wrapping the service creation in error handling:
-fn make_service_router(state: ServerState) -> Router { +fn make_service_router(state: ServerState) -> Result<Router, crate::Error> { let model_path = state.model_cache_dir.join(state.model_type.file_name()); let whisper_service = hypr_transcribe_whisper_local::WhisperStreamingService::builder() .model_path(model_path) .build(); - Router::new() + Ok(Router::new() .route("/health", get(health)) .route_service("/api/desktop/listen/realtime", whisper_service) .layer( CorsLayer::new() .allow_origin(cors::Any) .allow_methods(cors::Any) .allow_headers(cors::Any), - ) + )) }And update the caller:
- let router = make_service_router(state); + let router = make_service_router(state)?;Likely an incorrect or invalid review comment.
crates/transcribe-whisper-local/src/service/streaming.rs (2)
59-102: Excellent service architecture and implementation.The streaming service is well-designed with:
- Proper use of the tower Service trait
- Clean separation between single and dual channel handling
- Good connection management with cancellation support
- Appropriate error logging in the VAD stream processor
The WebSocket upgrade handling and async stream processing demonstrate good understanding of async Rust patterns.
Also applies to: 104-194, 196-283
49-56: Replace unwrap() calls with proper error handling.The builder's
build()method usesunwrap()which will panic if required fields are not set. Consider returning a Result or using a type-state pattern to ensure compile-time safety.- pub fn build(self) -> WhisperStreamingService { - WhisperStreamingService { - model_path: self.model_path.unwrap(), - connection_manager: self - .connection_manager - .unwrap_or_else(ConnectionManager::default), - } - } + pub fn build(self) -> Result<WhisperStreamingService, &'static str> { + let model_path = self.model_path.ok_or("model_path is required")?; + Ok(WhisperStreamingService { + model_path, + connection_manager: self + .connection_manager + .unwrap_or_else(ConnectionManager::default), + }) + }Likely an incorrect or invalid review comment.
No description provided.