feat(owhisper-client): add response parsing utilities#2111
Conversation
- Add parsing.rs with utility functions and traits: - parse_speaker_id: Parse speaker ID from various formats - ms_to_secs: Convert milliseconds to seconds - ms_to_secs_opt: Convert optional milliseconds to seconds - HasTimeSpan trait with start_time/end_time methods - calculate_time_span: Calculate start time and duration from word list - WordBuilder: Fluent API for constructing Word structs - Refactor adapters to use new utilities: - soniox/live.rs: Use ms_to_secs_opt and WordBuilder - assemblyai/live.rs: Use ms_to_secs, calculate_time_span, WordBuilder - fireworks/live.rs: Use WordBuilder This eliminates repetitive Word construction and timestamp conversion code. Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
✅ Deploy Preview for hyprnote ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
✅ Deploy Preview for hyprnote-storybook ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
📝 WalkthroughWalkthroughIntroduces a new parsing module with utilities for word timing and construction (WordBuilder, time converters), then refactors multiple live adapters (AssemblyAI, Fireworks, Soniox) to use WordBuilder instead of direct Word struct construction with standardized time conversion. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Areas requiring attention:
Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
owhisper/owhisper-client/src/adapter/soniox/live.rs (1)
2-8: WordBuilder usage is solid; consider clamping negative durationsThe move to
WordBuilderwithms_to_secs_optfor tokenstart/endand speaker IDs preserves prior behavior and centralizes word defaults, which is good.One small improvement: because
ms_to_secs_optreturns0.0when a timestamp isNone,(end_secs - start_secs)can go negative ifend_msis missing or less thanstart_ms. If the downstream code assumes non‑negative durations, consider clamping:- let (start, duration) = if let (Some(first), Some(last)) = (tokens.first(), tokens.last()) { - let start_secs = ms_to_secs_opt(first.start_ms); - let end_secs = ms_to_secs_opt(last.end_ms); - (start_secs, end_secs - start_secs) + let (start, duration) = if let (Some(first), Some(last)) = (tokens.first(), tokens.last()) { + let start_secs = ms_to_secs_opt(first.start_ms); + let end_secs = ms_to_secs_opt(last.end_ms); + let duration = (end_secs - start_secs).max(0.0); + (start_secs, duration)Also applies to: 233-245, 247-250
owhisper/owhisper-client/src/adapter/parsing.rs (2)
22-46:HasTimeSpan+calculate_time_spanabstraction is clean; consider guarding durationThe
HasTimeSpantrait and itsWordimplementation give a nice generic hook for computing(start, duration)from sequences, andcalculate_time_spanremoves duplicated first/last logic across adapters.If there’s any chance callers pass unsorted items or data where
end_time < start_time, you may want to defensively clamp the duration to non‑negative:- (Some(first), Some(last)) => { - let start = first.start_time(); - let end = last.end_time(); - (start, end - start) - } + (Some(first), Some(last)) => { + let start = first.start_time(); + let end = last.end_time(); + let duration = (end - start).max(0.0); + (start, duration) + }
110-187: Tests give good coverage; note potential speaker parsing reuseThe tests exercise numeric/prefixed speaker IDs, time conversion helpers, empty/single/multiple
calculate_time_spancases, and fullWordBuilderconstruction, which gives solid confidence in this module.Given you also have similar string→speaker parsing logic in
SpeakerId::as_i32insoniox/live.rs, you might later consider reusingparse_speaker_idthere (with a cast) to avoid duplicated rules for extracting numeric IDs from strings.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
owhisper/owhisper-client/src/adapter/assemblyai/live.rs(2 hunks)owhisper/owhisper-client/src/adapter/fireworks/live.rs(3 hunks)owhisper/owhisper-client/src/adapter/mod.rs(1 hunks)owhisper/owhisper-client/src/adapter/parsing.rs(1 hunks)owhisper/owhisper-client/src/adapter/soniox/live.rs(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
owhisper/owhisper-client/src/adapter/assemblyai/live.rs (1)
owhisper/owhisper-client/src/adapter/parsing.rs (4)
calculate_time_span(37-46)ms_to_secs(14-16)new(59-70)start(72-75)
owhisper/owhisper-client/src/adapter/fireworks/live.rs (1)
owhisper/owhisper-client/src/adapter/parsing.rs (1)
new(59-70)
owhisper/owhisper-client/src/adapter/soniox/live.rs (1)
owhisper/owhisper-client/src/adapter/parsing.rs (4)
ms_to_secs_opt(18-20)speaker(87-90)new(59-70)start(72-75)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Redirect rules - hyprnote
- GitHub Check: Redirect rules - hyprnote-storybook
- GitHub Check: Header rules - hyprnote-storybook
- GitHub Check: Header rules - hyprnote
- GitHub Check: Pages changed - hyprnote-storybook
- GitHub Check: Pages changed - hyprnote
- GitHub Check: fmt
- GitHub Check: Devin
🔇 Additional comments (5)
owhisper/owhisper-client/src/adapter/mod.rs (1)
7-7: Exposingparsingmodule is consistent with new utilitiesMaking
parsingpublic underadaptermatches how other modules importcrate::adapter::parsingand cleanly exposes the new helpers.owhisper/owhisper-client/src/adapter/fireworks/live.rs (1)
2-8: WordBuilder integration preserves Fireworks semanticsSwitching to
WordBuilderfor per-word construction keeps the previous field mapping (word text, start/end, probability→confidence, language) while centralizing defaults (confidenceandpunctuated_word). Both segment and text branches remain functionally equivalent, with cleaner construction.Also applies to: 79-88, 124-135
owhisper/owhisper-client/src/adapter/assemblyai/live.rs (1)
2-8: WordBuilder +calculate_time_spansimplification looks correctUsing
WordBuilderwithms_to_secsforstart/endand delegatingstart/durationcomputation tocalculate_time_span(&words)keeps the existing timing behavior while reducing duplication and handling the empty-words case gracefully.Also applies to: 217-231
owhisper/owhisper-client/src/adapter/parsing.rs (2)
3-20: Speaker ID parsing and ms→sec helpers are straightforward and well-tested
parse_speaker_idcorrectly handles pure numeric IDs and simple prefixed forms (e.g.SPEAKER_1), and thems_to_secs/ms_to_secs_opthelpers provide a clear, reusable place for time conversion, with unit tests covering the expected cases.
48-108: WordBuilder design aligns withWordand centralizes sensible defaults
WordBuildermirrors theWordfields, sets reasonable defaults (start/end = 0.0,confidence = 1.0,punctuated_word = Some(word.clone())), and provides a concise fluent API used by the adapters. Thebuildmethod cleanly transfers all builder fields into aWord, matching expectations.
feat(owhisper-client): add response parsing utilities
Summary
Adds a new
parsing.rsmodule to the owhisper-client crate with shared utilities for response parsing, then refactors three STT adapters to use these utilities.New utilities in
parsing.rs:parse_speaker_id(value: &str) -> Option<usize>- Parse speaker ID from various formatsms_to_secs(ms: u64) -> f64- Convert milliseconds to secondsms_to_secs_opt(ms: Option<u64>) -> f64- Convert optional milliseconds to secondsHasTimeSpantrait withstart_time()/end_time()methods, implemented forWordcalculate_time_span<T: HasTimeSpan>(words: &[T]) -> (f64, f64)- Calculate start time and durationWordBuilder- Fluent API for constructingWordstructsRefactored adapters:
soniox/live.rs- Usesms_to_secs_optandWordBuilderassemblyai/live.rs- Usesms_to_secs,calculate_time_span, andWordBuilderfireworks/live.rs- UsesWordBuilderReview & Testing Checklist for Human
WordBuilderdefaults match original behavior:punctuated_wordis set toSome(word.clone())andconfidencedefaults to1.0parse_speaker_idis currently unused (generates compiler warning) - soniox adapter has its ownSpeakerIdenum. Consider if this should be removed or if soniox should be refactored to use it.calculate_time_spanis only used in assemblyai - soniox/fireworks still have inline time span logic due to different source types (tokens vs words)Test plan: Run
cargo test -p owhisper-clientto verify all unit tests pass. Integration tests are ignored (require API keys).Notes