save per-channel when DEBUG=1#1766

Merged
yujonglee merged 1 commit into main from yl-branch-35
Nov 21, 2025
Conversation

@yujonglee
Contributor

No description provided.

@netlify
netlify bot commented Nov 21, 2025

Deploy Preview for hyprnote ready!

Name Link
🔨 Latest commit 2e513a3
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote/deploys/69203287d22e780008591af9
😎 Deploy Preview https://deploy-preview-1766--hyprnote.netlify.app

@coderabbitai
Contributor

coderabbitai bot commented Nov 21, 2025

📝 Walkthrough

Walkthrough

This PR extracts audio mixing logic into reusable utilities within the hypr-audio-utils crate and refactors existing implementations to use them. The listener recorder actor is restructured to support separate per-channel WAV recording in debug mode and to use Arc-based audio payload sharing via new AudioSingle and AudioDual message variants.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Audio mixing utilities: crates/audio-utils/src/lib.rs | Added three new public functions: mix_sample_f32() for f32 sample mixing with clamping, mix_audio_f32() for slice-level mixing with zero-padding, and mix_audio_pcm16le() for PCM16LE byte stream mixing with averaging and Little-Endian emission. |
| Dependency declarations: crates/transcribe-aws/Cargo.toml, crates/transcribe-deepgram/Cargo.toml | Added hypr-audio-utils as a workspace dependency to both crates. |
| Transcribe audio refactoring: crates/transcribe-aws/src/lib.rs, crates/transcribe-deepgram/src/service.rs | Replaced local mix_audio() implementations with calls to mix_audio_pcm16le() from the utilities crate; removed now-unused local mixing functions. |
| WebSocket utilities refactoring: crates/ws-utils/src/lib.rs | Replaced local mix_audio_channels() implementation with mix_audio_f32() from utilities crate; updated imports accordingly. |
| Listener recorder actor restructuring: plugins/listener/src/actors/recorder.rs | Changed RecMsg::Audio(Vec<f32>) to RecMsg::AudioSingle(Arc<[f32]>) and RecMsg::AudioDual(Arc<[f32]>, Arc<[f32]>). Added optional per-source WAV writers (writer_mic, writer_spk) for debug-mode separate channel recording. Updated message handling and added cleanup logic via finalize_writer(). Added debug-mode detection via is_debug_mode() and flushing utilities. |
| Listener source actor dispatch: plugins/listener/src/actors/source.rs | Updated message dispatch to use new RecMsg::AudioSingle and RecMsg::AudioDual variants with Arc-wrapped payloads; removed local mix helper function. |

Sequence Diagram

sequenceDiagram
    participant Source as Source<br/>(source.rs)
    participant Recorder as Recorder<br/>(recorder.rs)
    participant Writers as WAV Writers<br/>(main/mic/spk)

    Source->>Source: Capture audio<br/>(mic, speaker)
    
    alt Single Audio Path
        Source->>Recorder: RecMsg::AudioSingle(Arc<[f32]>)
        Recorder->>Writers: Write to main writer
    else Dual Audio Path
        Source->>Recorder: RecMsg::AudioDual(Arc<[f32]>, Arc<[f32]>)
        rect rgb(200, 220, 240)
            Note over Recorder,Writers: Main writer path
            Recorder->>Recorder: Mix mic + speaker
            Recorder->>Writers: Write mixed to main writer
        end
        rect rgb(220, 240, 200)
            Note over Recorder,Writers: Debug mode separate paths
            alt Debug mode enabled
                Recorder->>Writers: Write mic to writer_mic
                Recorder->>Writers: Write speaker to writer_spk
            end
        end
    end
    
    Recorder->>Recorder: Flush if due
    Recorder->>Writers: Periodic flush

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Recorder enum changes: RecMsg variant refactor from single Audio to split AudioSingle/AudioDual requires tracing all message construction sites and ensuring Arc-wrapped payloads are correctly handled throughout the actor pipeline.
  • Multi-writer management: New conditional writer initialization (writer_mic, writer_spk), debug-mode gating, and unified cleanup via finalize_writer() introduce stateful complexity in message handling.
  • Refactor consistency across crates: Three crates drop their local mixing functions (mix_audio() in transcribe-aws and transcribe-deepgram, mix_audio_channels() in ws-utils) in favor of the shared utilities; verify that each replacement preserves the previous behavior in its context.
  • Arc lifetime and borrowing: Verify Arc-based audio sharing does not introduce unintended aliasing or lifetime issues, especially across async actor boundaries.

Possibly related PRs

  • audio-pipeline-fixes #1764: Overlaps at the listener/recorder actor message shape refactor (RecMsg enum variants AudioSingle/AudioDual) and the same source/recorder actor dispatch implementation.
  • Implement binary diarization #1015: Modifies audio handling and dual-audio support flows in ws-utils and related audio conversion paths; shares the same crate-level audio utilities additions.
  • Explicit sample_rate in owhisper client #1651: Updates audio-utils crate for audio chunking and metadata handling; related through shared audio utility layer modifications.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Description check | ❓ Inconclusive | No pull request description was provided by the author, making it impossible to assess whether any description content relates to the changeset. | Add a pull request description explaining the purpose and scope of these changes, even briefly. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'save per-channel when DEBUG=1' directly relates to the main change: adding per-source WAV writers (mic/spk) in debug mode to the recorder actor. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
plugins/listener/src/actors/recorder.rs (1)

123-146: Consider simplifying the per-channel writer access pattern.

The outer check if st.writer_mic.is_some() on Line 131 is redundant since the inner if let Some(...) patterns already handle the None case. Since both per-channel writers are created together (Lines 77-96), you can simplify this:

-                if st.writer_mic.is_some() {
-                    if let Some(ref mut writer_mic) = st.writer_mic {
-                        for s in mic.iter() {
-                            writer_mic.write_sample(*s)?;
-                        }
-                    }
-
-                    if let Some(ref mut writer_spk) = st.writer_spk {
-                        for s in spk.iter() {
-                            writer_spk.write_sample(*s)?;
-                        }
+                if let Some(ref mut writer_mic) = st.writer_mic {
+                    for s in mic.iter() {
+                        writer_mic.write_sample(*s)?;
+                    }
+                }
+
+                if let Some(ref mut writer_spk) = st.writer_spk {
+                    for s in spk.iter() {
+                        writer_spk.write_sample(*s)?;
                     }
                 }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa40d0a and 2e513a3.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • crates/audio-utils/src/lib.rs (1 hunks)
  • crates/transcribe-aws/Cargo.toml (1 hunks)
  • crates/transcribe-aws/src/lib.rs (2 hunks)
  • crates/transcribe-deepgram/Cargo.toml (1 hunks)
  • crates/transcribe-deepgram/src/service.rs (2 hunks)
  • crates/ws-utils/src/lib.rs (2 hunks)
  • plugins/listener/src/actors/recorder.rs (6 hunks)
  • plugins/listener/src/actors/source.rs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
crates/transcribe-aws/src/lib.rs (1)
crates/audio-utils/src/lib.rs (1)
  • mix_audio_pcm16le (102-127)
crates/transcribe-deepgram/src/service.rs (1)
crates/audio-utils/src/lib.rs (1)
  • mix_audio_pcm16le (102-127)
crates/audio-utils/src/lib.rs (1)
crates/aec/benches/aec_bench.rs (1)
  • mic_sample (19-20)
crates/ws-utils/src/lib.rs (1)
crates/audio-utils/src/lib.rs (2)
  • bytes_to_f32_samples (78-85)
  • mix_audio_f32 (91-100)
plugins/listener/src/actors/recorder.rs (2)
crates/audio-utils/src/lib.rs (2)
  • std (37-37)
  • mix_audio_f32 (91-100)
crates/audio-utils/src/vorbis.rs (2)
  • decode_vorbis_to_wav_file (111-150)
  • encode_wav_to_vorbis_file (152-175)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Redirect rules - hyprnote
  • GitHub Check: Header rules - hyprnote
  • GitHub Check: Pages changed - hyprnote
  • GitHub Check: fmt
  • GitHub Check: ci (macos, macos-14)
🔇 Additional comments (15)
crates/transcribe-aws/Cargo.toml (1)

10-10: LGTM!

The workspace dependency addition aligns with the PR objective to use shared audio mixing utilities.

crates/transcribe-deepgram/Cargo.toml (1)

7-7: LGTM!

The dependency is declared in both [dev-dependencies] and [dependencies] sections. If hypr-audio-utils is only used in production code (as suggested by the service.rs changes), the dev-dependencies entry may be redundant.

Also applies to: 15-15

crates/transcribe-aws/src/lib.rs (1)

31-31: LGTM!

The refactor correctly replaces the local mixing implementation with the shared mix_audio_pcm16le utility from hypr-audio-utils, improving code maintainability and reducing duplication.

Also applies to: 92-92

crates/transcribe-deepgram/src/service.rs (1)

11-11: LGTM!

Consistent with the refactor in other modules, this correctly adopts the shared mix_audio_pcm16le utility for dual-audio mixing.

Also applies to: 75-75

crates/ws-utils/src/lib.rs (1)

11-11: LGTM!

The refactor correctly replaces the local mix_audio_channels function with the shared mix_audio_f32 utility, which is appropriate since the data is already in f32 format at this point in the pipeline.

Also applies to: 119-119

plugins/listener/src/actors/source.rs (1)

378-386: LGTM!

The refactor to Arc-based audio payloads is efficient and correct. Using Arc::clone avoids expensive buffer copies while sharing data between actors. The error handling is appropriate.
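
For readers unfamiliar with the pattern, the Arc-based sharing described here can be sketched as follows. The variant names come from the walkthrough; the helper dual_message is hypothetical, and the real RecMsg enum may carry other variants not shown in this PR summary.

```rust
use std::sync::Arc;

/// Recorder messages as named in the walkthrough. Only the two audio
/// variants are sketched; any other variants are omitted (assumption).
pub enum RecMsg {
    AudioSingle(Arc<[f32]>),
    AudioDual(Arc<[f32]>, Arc<[f32]>),
}

/// Hypothetical helper: build a dual-audio message without copying samples.
/// Arc::clone only bumps a reference count, so the recorder and any other
/// consumer share the same underlying buffers.
pub fn dual_message(mic: &Arc<[f32]>, spk: &Arc<[f32]>) -> RecMsg {
    RecMsg::AudioDual(Arc::clone(mic), Arc::clone(spk))
}
```

The caller keeps its own handles to both buffers after sending, which is what makes it cheap to forward the same audio to more than one actor.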

crates/audio-utils/src/lib.rs (3)

87-89: LGTM!

The mix_sample_f32 function correctly clamps the mixed sample to the valid audio range [-1.0, 1.0] to prevent distortion.


91-100: LGTM!

The mix_audio_f32 function correctly handles slices of different lengths by zero-padding shorter inputs and delegates mixing logic to mix_sample_f32.
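
A minimal sketch of the two f32 helpers, for reference. The function names, clamping range, and zero-padding behavior come from the review above; combining samples by plain addition is an assumption, since the summary does not state whether the real code sums or averages.

```rust
/// Mix one mic sample with one speaker sample.
/// Assumption: samples are summed, then clamped to [-1.0, 1.0].
pub fn mix_sample_f32(mic: f32, speaker: f32) -> f32 {
    (mic + speaker).clamp(-1.0, 1.0)
}

/// Mix two slices sample by sample. The shorter slice is treated as if it
/// were padded with zeros, so the output is as long as the longer input.
pub fn mix_audio_f32(mic: &[f32], speaker: &[f32]) -> Vec<f32> {
    let len = mic.len().max(speaker.len());
    (0..len)
        .map(|i| {
            let m = mic.get(i).copied().unwrap_or(0.0);
            let s = speaker.get(i).copied().unwrap_or(0.0);
            mix_sample_f32(m, s)
        })
        .collect()
}
```

With these definitions, mixing a two-sample mic buffer with a one-sample speaker buffer yields two samples, the second being the mic sample unchanged.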


102-127: Callers do not validate input buffer lengths; odd-length buffers from clients will silently lose data.

Verification reveals that both callers (crates/transcribe-aws/src/lib.rs:92 and crates/transcribe-deepgram/src/service.rs:75) receive mic and speaker buffers from untrusted JSON input without any length validation. If a client sends odd-length buffers in the DualAudio message, mix_audio_pcm16le will silently discard the final incomplete sample. Consider adding buffer length validation at the deserialization layer or documenting this behavior in the function's contract.
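
To illustrate the concern, here is a rough sketch of a PCM16LE mixer that drops a trailing odd byte, plus a hypothetical guard (has_whole_samples) that a caller could run at the deserialization layer. The averaging and little-endian output follow the walkthrough; the zero-padding of the shorter input and the exact structure are assumptions, not the merged code.

```rust
/// Decode little-endian 16-bit samples. chunks_exact(2) ignores a trailing
/// odd byte, which is the silent data loss described above.
fn decode_pcm16le(bytes: &[u8]) -> Vec<i16> {
    bytes
        .chunks_exact(2)
        .map(|c| i16::from_le_bytes([c[0], c[1]]))
        .collect()
}

/// Mix two PCM16LE byte streams by averaging each sample pair.
/// Assumption: the shorter stream is padded with silence (zeros).
pub fn mix_audio_pcm16le(mic: &[u8], speaker: &[u8]) -> Vec<u8> {
    let mic = decode_pcm16le(mic);
    let spk = decode_pcm16le(speaker);
    let len = mic.len().max(spk.len());
    let mut out = Vec::with_capacity(len * 2);
    for i in 0..len {
        // Widen to i32 so the sum cannot overflow before halving.
        let m = mic.get(i).copied().unwrap_or(0) as i32;
        let s = spk.get(i).copied().unwrap_or(0) as i32;
        let mixed = ((m + s) / 2) as i16;
        out.extend_from_slice(&mixed.to_le_bytes());
    }
    out
}

/// Hypothetical validation helper: true when the buffer holds only
/// complete 16-bit samples.
pub fn has_whole_samples(buf: &[u8]) -> bool {
    buf.len() % 2 == 0
}
```

Rejecting or logging buffers for which has_whole_samples returns false would make the truncation visible instead of silent.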

plugins/listener/src/actors/recorder.rs (6)

4-4: LGTM!

The switch to Arc<[f32]> for audio payloads is efficient and avoids unnecessary copying of large buffers between actors.

Also applies to: 15-16


26-27: LGTM!

The per-channel writer initialization correctly creates separate WAV files for mic and speaker in debug mode. The files are created in the session-specific directory with appropriate naming conventions.

Also applies to: 77-96


115-122: LGTM!

The AudioSingle handling correctly writes samples directly to the main writer and triggers periodic flushing.


157-159: LGTM!

The cleanup logic correctly finalizes all writers (main, mic, and speaker) using the finalize_writer helper, ensuring all data is properly flushed and written to disk.

Also applies to: 219-227


191-196: LGTM!

The is_debug_mode function correctly checks both compile-time debug assertions and the HYPRNOTE_DEBUG environment variable for enabling debug features.
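
One way such a check might look is sketched below. The function name and the HYPRNOTE_DEBUG variable come from this comment; combining the two conditions with a logical OR, the accepted values ("1" or "true"), and the helper env_flag_set are all assumptions.

```rust
/// Hypothetical helper: interpret an environment variable value as a flag.
/// Assumption: only "1" and "true" enable the flag.
fn env_flag_set(value: Option<&str>) -> bool {
    matches!(value, Some("1") | Some("true"))
}

/// Debug mode is on for debug builds, or when HYPRNOTE_DEBUG is set
/// to an enabling value at runtime (assumed semantics).
pub fn is_debug_mode() -> bool {
    cfg!(debug_assertions)
        || env_flag_set(std::env::var("HYPRNOTE_DEBUG").ok().as_deref())
}
```

Keeping the value parsing in a separate pure function makes it testable without mutating the process environment.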


198-217: LGTM!

The flushing helpers (flush_if_due and flush_all) correctly implement periodic flushing with a 1-second interval, ensuring data durability without excessive I/O overhead. All three writers are properly coordinated.
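
The periodic flushing described here could be sketched along these lines. The helper names and the 1-second interval come from this comment; the signatures, the use of std::io::Write in place of the actual WAV writer type, and passing the current time as a parameter are assumptions made to keep the example self-contained.

```rust
use std::io::{self, Write};
use std::time::{Duration, Instant};

/// Flush interval named in the review comment.
const FLUSH_INTERVAL: Duration = Duration::from_secs(1);

/// Flush every writer that is present (main, mic, speaker); absent
/// per-channel writers are skipped.
pub fn flush_all<W: Write>(writers: &mut [Option<W>]) -> io::Result<()> {
    for w in writers.iter_mut().flatten() {
        w.flush()?;
    }
    Ok(())
}

/// Flush only when at least FLUSH_INTERVAL has elapsed since the last
/// flush. Returns whether a flush actually happened.
pub fn flush_if_due<W: Write>(
    writers: &mut [Option<W>],
    last_flush: &mut Instant,
    now: Instant,
) -> io::Result<bool> {
    if now.duration_since(*last_flush) < FLUSH_INTERVAL {
        return Ok(false);
    }
    flush_all(writers)?;
    *last_flush = now;
    Ok(true)
}
```

Taking now as an argument rather than calling Instant::now() inside keeps the timing logic deterministic under test.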

@yujonglee yujonglee merged commit ef26afe into main Nov 21, 2025
11 checks passed
@yujonglee yujonglee deleted the yl-branch-35 branch November 21, 2025 10:00