feat: add Soniox STT provider to settings #2060
Conversation
📝 Walkthrough

Adds Soniox speech-to-text provider support through desktop UI configuration and updates the `RealtimeSttAdapter` trait signature across all implementations to return `Vec<StreamResponse>` instead of `Option<StreamResponse>`.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning), ✅ Passed checks (2 passed)
✅ Deploy Preview for hyprnote-storybook ready!
✅ Deploy Preview for hyprnote ready!
Actionable comments posted: 0
🧹 Nitpick comments (1)
owhisper/owhisper-client/src/adapter/deepgram/live.rs (1)
77-78: Consider logging JSON parse failures for debugging.

The current implementation silently returns an empty `Vec` on parse errors. For consistency with the Soniox adapter (which logs parse failures), consider adding a warning log:

```diff
 fn parse_response(&self, raw: &str) -> Vec<StreamResponse> {
-    serde_json::from_str(raw).into_iter().collect()
+    match serde_json::from_str(raw) {
+        Ok(response) => vec![response],
+        Err(e) => {
+            tracing::warn!(error = ?e, raw = raw, "deepgram_json_parse_failed");
+            vec![]
+        }
+    }
 }
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`apps/desktop/public/assets/soniox.jpeg` is excluded by `!**/*.jpeg`
📒 Files selected for processing (8)
- `apps/desktop/src/components/settings/ai/stt/configure.tsx` (1 hunks)
- `apps/desktop/src/components/settings/ai/stt/shared.tsx` (2 hunks)
- `owhisper/owhisper-client/src/adapter/argmax/live.rs` (1 hunks)
- `owhisper/owhisper-client/src/adapter/deepgram/live.rs` (1 hunks)
- `owhisper/owhisper-client/src/adapter/mod.rs` (1 hunks)
- `owhisper/owhisper-client/src/adapter/soniox/batch.rs` (1 hunks)
- `owhisper/owhisper-client/src/adapter/soniox/live.rs` (7 hunks)
- `owhisper/owhisper-client/src/live.rs` (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Avoid creating a bunch of types/interfaces if they are not shared. Especially for function props, just inline them instead.
Never do manual state management for form/mutation. Use useForm (from tanstack-form) and useQuery/useMutation (from tanstack-query) instead for 99% of cases. Avoid patterns like setError.
If there are many classNames with conditional logic, use `cn` (import from `@hypr/utils`). It is similar to `clsx`. Always pass an array and split by logical grouping.
Use `motion/react` instead of `framer-motion`.
Files:
- `apps/desktop/src/components/settings/ai/stt/shared.tsx`
- `apps/desktop/src/components/settings/ai/stt/configure.tsx`
🧬 Code graph analysis (4)
owhisper/owhisper-client/src/adapter/argmax/live.rs (3)
- owhisper/owhisper-client/src/adapter/deepgram/live.rs (1): `parse_response` (77-79)
- owhisper/owhisper-client/src/adapter/mod.rs (1): `parse_response` (43-43)
- owhisper/owhisper-client/src/adapter/soniox/live.rs (1): `parse_response` (81-135)
owhisper/owhisper-client/src/adapter/soniox/batch.rs (1)
- owhisper/owhisper-client/src/lib.rs (1): `params` (45-48)
owhisper/owhisper-client/src/adapter/mod.rs (3)
- owhisper/owhisper-client/src/adapter/argmax/live.rs (1): `parse_response` (30-32)
- owhisper/owhisper-client/src/adapter/deepgram/live.rs (1): `parse_response` (77-79)
- owhisper/owhisper-client/src/adapter/soniox/live.rs (1): `parse_response` (81-135)
owhisper/owhisper-client/src/adapter/deepgram/live.rs (2)
- owhisper/owhisper-client/src/adapter/argmax/live.rs (1): `parse_response` (30-32)
- owhisper/owhisper-client/src/adapter/mod.rs (1): `parse_response` (43-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Redirect rules - hyprnote
- GitHub Check: Header rules - hyprnote
- GitHub Check: Pages changed - hyprnote
- GitHub Check: desktop_ci (linux, depot-ubuntu-22.04-8)
- GitHub Check: desktop_ci (macos, depot-macos-14)
- GitHub Check: desktop_ci (linux, depot-ubuntu-24.04-8)
- GitHub Check: fmt
🔇 Additional comments (14)
apps/desktop/src/components/settings/ai/stt/shared.tsx (2)
31-33: LGTM! The `displayModelId` mapping for `"stt-v3"` to `"Soniox v3"` follows the existing pattern and is correctly placed before the generic prefix-based checks.
101-111: LGTM! The Soniox provider entry is well-structured and follows the existing provider pattern. All required fields are properly defined.
apps/desktop/src/components/settings/ai/stt/configure.tsx (1)
464-468: LGTM! The Soniox context message follows the existing pattern and integrates cleanly into the conditional chain. The message is concise and provides a helpful link.
owhisper/owhisper-client/src/adapter/argmax/live.rs (1)
30-32: LGTM! The signature update correctly aligns with the trait change, and the delegation to `inner.parse_response` works seamlessly with the new return type.

owhisper/owhisper-client/src/adapter/mod.rs (1)
43-43: LGTM! The trait signature change from `Option<StreamResponse>` to `Vec<StreamResponse>` is a well-reasoned design decision that enables adapters to emit multiple responses per input message, which is necessary for Soniox's separate final/non-final token handling.

owhisper/owhisper-client/src/live.rs (3)
202-209: LGTM! The `flat_map` + `iter` pattern correctly expands the `Vec<StreamResponse>` into individual stream items while preserving error propagation.
244-251: LGTM! Consistent application of the `flat_map` pattern for the native multichannel path.
284-306: LGTM! The split path correctly applies the same `flat_map` pattern to both mic and speaker streams, maintaining consistency across all stream processing paths.
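For readers less familiar with the refactor, here is a minimal sketch of the pattern these comments describe. The types and the adapter wiring are simplified stand-ins, not the actual `live.rs` code:

```rust
// Simplified stand-in types for illustration only.
struct StreamResponse {
    transcript: String,
}

trait RealtimeSttAdapter {
    // New trait signature: zero, one, or many responses per raw message.
    fn parse_response(&self, raw: &str) -> Vec<StreamResponse>;
}

// The stream layer expands each Vec<StreamResponse> into individual items
// with flat_map, while errors pass through unchanged.
fn expand_responses(
    raw_messages: Vec<Result<String, std::io::Error>>,
    adapter: &impl RealtimeSttAdapter,
) -> Vec<Result<StreamResponse, std::io::Error>> {
    raw_messages
        .into_iter()
        .flat_map(|msg| match msg {
            Ok(raw) => adapter
                .parse_response(&raw)
                .into_iter()
                .map(Ok)
                .collect::<Vec<_>>(),
            Err(e) => vec![Err(e)],
        })
        .collect()
}
```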
owhisper/owhisper-client/src/adapter/soniox/live.rs (5)
81-134: LGTM! The `parse_response` implementation is well-structured:
- Properly handles error messages and parse failures with logging
- Correctly identifies speech finalization via `<fin>`/`<end>` tokens
- Separates final and non-final tokens into distinct responses, enabling the UI to show interim results separately from confirmed transcriptions
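As a rough illustration of that final/non-final split, here is a hedged sketch; the token shape and field names are assumptions for readability, not the actual Soniox wire format:

```rust
// Hypothetical token shape for illustration only.
struct Token {
    text: String,
    is_final: bool,
}

// Split one batch of tokens into confirmed (final) and interim (non-final)
// groups so they can become separate responses for the UI. "<fin>"/"<end>"
// markers signal that the current utterance is finished.
fn split_tokens(tokens: &[Token]) -> (Vec<&Token>, Vec<&Token>, bool) {
    let mut finals = Vec::new();
    let mut interims = Vec::new();
    let mut speech_final = false;

    for token in tokens {
        if token.text == "<fin>" || token.text == "<end>" {
            speech_final = true;
            continue; // marker tokens are not part of the transcript
        }
        if token.is_final {
            finals.push(token);
        } else {
            interims.push(token);
        }
    }
    (finals, interims, speech_final)
}
```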
209-248: LGTM! The `build_response` helper is well-implemented:
- Correctly handles whitespace-only tokens (adds to transcript but skips word creation)
- Properly converts milliseconds to seconds for timing
- Calculates duration from first/last token timestamps
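A simplified sketch of that timing and word-assembly logic, assuming hypothetical token and word shapes (`start_ms`, `end_ms`, and friends are illustrative names, not the real fields):

```rust
// Hypothetical shapes, for illustration only.
struct Token {
    text: String,
    start_ms: u64,
    end_ms: u64,
}

struct Word {
    word: String,
    start: f64, // seconds
    end: f64,   // seconds
}

// Build the transcript and word list from a run of tokens: whitespace-only
// tokens extend the transcript but produce no Word, millisecond timestamps
// are converted to seconds, and the duration spans from the first token's
// start to the last token's end.
fn build_words(tokens: &[Token]) -> (String, Vec<Word>, f64) {
    let mut transcript = String::new();
    let mut words = Vec::new();

    for token in tokens {
        transcript.push_str(&token.text);
        if token.text.trim().is_empty() {
            continue; // keep spacing in the transcript, but no word entry
        }
        words.push(Word {
            word: token.text.trim().to_string(),
            start: token.start_ms as f64 / 1000.0,
            end: token.end_ms as f64 / 1000.0,
        });
    }

    let duration = match (tokens.first(), tokens.last()) {
        (Some(first), Some(last)) => (last.end_ms - first.start_ms) as f64 / 1000.0,
        _ => 0.0,
    };

    (transcript, words, duration)
}
```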
272-325: LGTM! Good addition of integration test coverage. The test demonstrates the expected usage pattern with `build_dual()` and validates the stream processing pipeline.
259-268: The hardcoded `channel_index: [0, 1]` is overwritten downstream by the split path.

Since `supports_native_multichannel()` returns `false` for Soniox, responses flow through the split path in `owhisper/owhisper-client/src/live.rs`, where `set_channel_index()` is called: `set_channel_index(0, 2)` for microphone and `set_channel_index(1, 2)` for speaker. The hardcoded `[0, 1]` value is never actually used.
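A minimal sketch of why that hardcoded value is moot; the method names mirror the ones referenced above, but the surrounding stream code is heavily simplified:

```rust
// Simplified sketch: in the split (non-native-multichannel) path, each
// response is restamped with its stream's channel before being emitted,
// so whatever channel_index the adapter set initially is overwritten.
struct StreamResponse {
    channel_index: Vec<u8>, // [channel, total_channels]
}

impl StreamResponse {
    fn set_channel_index(&mut self, index: u8, total: u8) {
        self.channel_index = vec![index, total];
    }
}

fn tag_split_streams(mic: &mut [StreamResponse], speaker: &mut [StreamResponse]) {
    for response in mic.iter_mut() {
        response.set_channel_index(0, 2); // microphone channel
    }
    for response in speaker.iter_mut() {
        response.set_channel_index(1, 2); // speaker channel
    }
}
```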
44-49: LGTM! The model mapping correctly translates UI model identifiers to Soniox realtime API model names (`stt-rt-v3`), consistent with the batch adapter's approach, which uses `stt-async-v3`. Both adapters normalize the same input identifiers (`"stt-v3"` and preview variants) to their respective endpoint-specific model names.

owhisper/owhisper-client/src/adapter/soniox/batch.rs (1)
90-95: LGTM! The model mapping logic correctly translates UI model identifiers to Soniox API-specific batch model names. `stt-async-v3` is the valid Soniox asynchronous transcription model for batch processing. The backward compatibility for `"stt-async-preview"` is a nice touch.
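To make the mapping concrete, a hedged sketch of the translation both adapters perform; the exact match arms in `live.rs`/`batch.rs` may differ, and the realtime arm here only shows the `stt-v3` case:

```rust
// Illustrative only: the UI exposes a single "stt-v3" identifier and each
// adapter translates it to the endpoint-specific Soniox model name.
// (Per the review, older preview identifiers such as "stt-async-preview"
// are normalized the same way; other arms are elided here.)
fn realtime_model(ui_model: &str) -> &str {
    match ui_model {
        "stt-v3" => "stt-rt-v3", // realtime endpoint
        other => other,          // pass through anything else unchanged
    }
}

fn batch_model(ui_model: &str) -> &str {
    match ui_model {
        "stt-v3" | "stt-async-preview" => "stt-async-v3", // async/batch endpoint
        other => other,
    }
}
```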
Summary
Adds Soniox as a new STT provider option in the desktop app settings. This includes the Soniox logo asset and provider configuration following the existing pattern for Deepgram and other providers.
Changes:
- `soniox.jpeg` logo added to `apps/desktop/public/assets/`
- Soniox provider entry in `shared.tsx` with the `stt-v3` model
- `displayModelId` mapping to show "Soniox v3" in the UI
- Soniox context message in `configure.tsx`
- Mapping of `stt-v3` to the appropriate API model names:
  - `live.rs`: maps `stt-v3` → `stt-rt-v3` for realtime transcription
  - `batch.rs`: maps `stt-v3` → `stt-async-v3` for batch transcription

Review & Testing Checklist for Human
- `stt-rt-v3` and `stt-async-v3` are valid Soniox model names per Soniox docs

Notes
This PR is based on the `refactor-adapters` branch (PR #2059), as requested.

The model mapping approach was chosen because Soniox uses different model names for realtime (`stt-rt-v3`) vs. batch (`stt-async-v3`), unlike Deepgram, which uses the same model for both. The UI exposes a single `stt-v3` option and the adapters handle the translation.

Link to Devin run: https://app.devin.ai/sessions/f857a3f230434654b027a7ab2b183b85
Requested by: yujonglee (@yujonglee)