feat: add Whisper Large V3 (by Argmax) support to STT settings #2047
Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
📝 Walkthrough
Adds support for a new STT model "am-whisper-large-v3" (Whisper Large v3) across the application. The integration spans frontend configuration UI, selection flow, shared utility mappings, and backend model registry, establishing model metadata, language support, and type-level registration.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
🧹 Nitpick comments (3)
apps/desktop/src/components/settings/ai/stt/configure.tsx (1)
192-196: Consider arch‑gating and naming consistency for the new local Argmax model
- The new `HyprProviderLocalRow` correctly uses `model="am-whisper-large-v3"` and wires into the existing download flow.
- Right now this row is visible on all platforms, while `useConfiguredMapping` only exposes `"am-whisper-large-v3"` on Apple Silicon. If the model truly only works on Apple Silicon, consider mirroring the `isAppleSilicon` gating here (e.g., hide or disable this row off‑arch) to avoid confusing users who can “download” but never effectively use it.
- Also, `displayName="Whisper Large v3"` here vs `"Whisper Large V3"` in `displayModelId` is a tiny naming mismatch; it may be worth normalizing on one variant for a consistent UI.
apps/desktop/src/components/settings/ai/stt/shared.tsx (2)
51-53: Model ID wiring is consistent; consider unifying display label casing
- `displayModelId` correctly special‑cases `"am-whisper-large-v3"` and the Hyprnote `PROVIDERS` entry includes this model string, keeping the selection and display pipeline coherent.
- As noted in `configure.tsx`, the label here is `"Whisper Large V3"` while the config card uses `"Whisper Large v3"`. Not a blocker, but you may want to standardize on one (e.g., `"Whisper Large v3"`) for a consistent UX.
Also applies to: 69-83
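One way to avoid the casing mismatch is to keep every label in a single lookup that both the config card and the selector read from. The sketch below is only an illustration: `MODEL_DISPLAY_NAMES` and `modelDisplayName` are hypothetical names, not identifiers from this PR, and the real `displayModelId` may handle more cases.

```typescript
// Hypothetical single source of truth for user-facing model labels.
// Only the model ID and label discussed in this review are listed.
const MODEL_DISPLAY_NAMES: Record<string, string> = {
  "am-whisper-large-v3": "Whisper Large v3",
};

// Returns the shared label for a model ID, falling back to the raw ID
// when no label has been registered.
function modelDisplayName(modelId: string): string {
  return MODEL_DISPLAY_NAMES[modelId] ?? modelId;
}
```

With a helper like this, the `displayName` prop and the selector label would both be derived from the same entry, so they cannot diverge in casing.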
145-176: LANGUAGE_SUPPORT for am-whisper-large-v3 mirrors Whisper’s multilingual set; watch for drift
- The new `"am-whisper-large-v3"` entry in `LANGUAGE_SUPPORT.hyprnote` uses the full Whisper multilingual language list, which matches how the Rust backend maps `WhisperLargeV3` to `whisper_multi_languages`.
- This duplication (frontend string codes vs backend `ISO639` list) is reasonable for now but can drift if one side is updated and the other isn’t.
It’d be good to:
- Add a brief comment noting that this list should stay in sync with the backend Whisper multilingual set / Argmax model capabilities, and
- Double‑check that this language list matches the actual support guarantees for the Argmax Whisper Large V3 build you’re shipping.
Also applies to: 177-275
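As a rough illustration of how drift between the two lists could be caught, the sketch below compares two arrays of language codes and reports what each side is missing. The function name `diffLanguageCodes` and the idea of exporting the backend list for comparison are assumptions, not something this PR provides.

```typescript
// Hypothetical drift check between the frontend language codes and a list
// exported from the backend. Both inputs are plain string codes.
function diffLanguageCodes(
  frontend: readonly string[],
  backend: readonly string[],
): { missingInBackend: string[]; missingInFrontend: string[] } {
  const frontendSet = new Set(frontend);
  const backendSet = new Set(backend);
  return {
    // Codes the UI advertises but the backend list does not contain.
    missingInBackend: Array.from(frontendSet)
      .filter((code) => !backendSet.has(code))
      .sort(),
    // Codes the backend supports but the UI does not list.
    missingInFrontend: Array.from(backendSet)
      .filter((code) => !frontendSet.has(code))
      .sort(),
  };
}
```

A unit test could assert that both returned arrays are empty, which would fail as soon as one side is updated without the other.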
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- apps/desktop/src/components/settings/ai/stt/configure.tsx (1 hunks)
- apps/desktop/src/components/settings/ai/stt/select.tsx (2 hunks)
- apps/desktop/src/components/settings/ai/stt/shared.tsx (3 hunks)
- plugins/local-stt/src/model.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Avoid creating a bunch of types/interfaces if they are not shared. Especially for function props, just inline them instead.
Never do manual state management for form/mutation. Use useForm (from tanstack-form) and useQuery/useMutation (from tanstack-query) instead for 99% of cases. Avoid patterns like setError.
If there are many classNames with conditional logic, use `cn` (import from `@hypr/utils`). It is similar to `clsx`. Always pass an array and split by logical grouping.
Use `motion/react` instead of `framer-motion`.
Files:
- apps/desktop/src/components/settings/ai/stt/configure.tsx
- apps/desktop/src/components/settings/ai/stt/shared.tsx
- apps/desktop/src/components/settings/ai/stt/select.tsx
🧬 Code graph analysis (1)
apps/desktop/src/components/settings/ai/stt/select.tsx (1)
apps/desktop/src/components/settings/ai/stt/shared.tsx (1)
sttModelQueries(386-413)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Redirect rules - hyprnote
- GitHub Check: Header rules - hyprnote
- GitHub Check: Pages changed - hyprnote
- GitHub Check: Devin
- GitHub Check: desktop_ci (linux, depot-ubuntu-24.04-8)
- GitHub Check: desktop_ci (linux, depot-ubuntu-22.04-8)
- GitHub Check: fmt
- GitHub Check: desktop_ci (macos, depot-macos-14)
🔇 Additional comments (2)
plugins/local-stt/src/model.rs (1)
4-15: WhisperLargeV3 is correctly registered and aligned with multilingual Whisper behavior
- `SUPPORTED_MODELS` length matches the 10 listed variants, and `AmModel::WhisperLargeV3` is added consistently to the array and `supported_languages`.
- Mapping `WhisperLargeV3` to `whisper_multi_languages` keeps its behavior aligned with existing multilingual Whisper models.
Also applies to: 187-191
apps/desktop/src/components/settings/ai/stt/select.tsx (1)
232-240: New Argmax Whisper Large V3 query and model mapping look consistent
- The extra `useQueries` slot for `"am-whisper-large-v3"` matches the destructuring order, and `whisperLargeV3.data ?? false` is correctly threaded into the Hyprnote `models` array.
- Gating the new model behind `isAppleSilicon` in the configured mapping is consistent with limiting it to Apple Silicon.
Please just confirm that `"am-whisper-large-v3"` is included in the `SupportedSttModel` type from `@hypr/plugin-local-stt` and that the backend expects exactly this identifier string.
Also applies to: 255-263
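The gating and `?? false` fallback described above can be sketched as a small pure function. This is a simplified stand-in, not the code in `select.tsx`: `CANDIDATE_MODELS`, `configuredModels`, and the placeholder cross-platform model ID are hypothetical, and the real implementation reads download status from query results rather than a plain record.

```typescript
type SttModelOption = { id: string; isDownloaded: boolean };

// Hypothetical catalog. Only "am-whisper-large-v3" comes from this PR;
// the second entry is a placeholder to show an ungated model.
const CANDIDATE_MODELS = [
  { id: "am-whisper-large-v3", appleSiliconOnly: true },
  { id: "example-cross-platform-model", appleSiliconOnly: false },
] as const;

// Hides Apple-Silicon-only models on other architectures and treats a
// missing download status as "not downloaded", mirroring `data ?? false`.
function configuredModels(
  isAppleSilicon: boolean,
  downloaded: Record<string, boolean | undefined>,
): SttModelOption[] {
  return CANDIDATE_MODELS
    .filter((model) => isAppleSilicon || !model.appleSiliconOnly)
    .map((model) => ({
      id: model.id,
      isDownloaded: downloaded[model.id] ?? false,
    }));
}
```

The same predicate could drive whether the download row is rendered in the configuration screen, which would keep the two views consistent.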
feat: add Whisper Large V3 (by Argmax) support to STT settings
Summary
Adds Whisper Large V3 (powered by Argmax) as a new on-device STT model option in the settings UI. The backend support for this model already existed in `crates/am`; this PR exposes it in the frontend.
Changes:
- Added `AmModel::WhisperLargeV3` to the `SUPPORTED_MODELS` array in the local-stt plugin
Review & Testing Checklist for Human
Recommended test plan: On an Apple Silicon Mac, open Settings > AI > STT, expand the Hyprnote provider section, download "Whisper Large v3", then start a recording session to verify transcription works.
Notes