audio-pipeline-fixes#1764

Merged
yujonglee merged 1 commit into main from yl-branch-34
Nov 21, 2025
Conversation

@yujonglee

No description provided.

@netlify

netlify bot commented Nov 21, 2025

Deploy Preview for hyprnote ready!

Name Link
🔨 Latest commit 496560d
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote/deploys/692007706fcaa70008430520
😎 Deploy Preview https://deploy-preview-1764--hyprnote.netlify.app

@coderabbitai

coderabbitai bot commented Nov 21, 2025

📝 Walkthrough

The listener plugin is refactored to support both single-channel (microphone-only) and dual-channel (microphone and speaker) audio modes. The unified Audio message is replaced with AudioSingle and AudioDual variants. A new ChannelSender enum manages mode-specific message channels, and RX task spawning is split into separate spawn_rx_task_single and spawn_rx_task_dual functions. Message routing is updated in the source module to dispatch appropriate payloads based on the active mode.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core listener refactoring: plugins/listener/src/actors/listener.rs | Replaced Audio(mic, spk) with AudioSingle(Bytes) and AudioDual(Bytes, Bytes) variants. Introduced ChannelSender enum (Single / Dual) to manage mode-specific channels. Updated ListenerState.tx to store ChannelSender. Split spawn_rx_task into spawn_rx_task_single and spawn_rx_task_dual, each returning ChannelSender plus worker handles. Added build_listen_params and build_extra helper functions. Reworked message handling in handle to route AudioSingle and AudioDual to appropriate channel variants. |
| Message routing: plugins/listener/src/actors/source.rs | Updated Pipeline::dispatch to send raw mic data in Single mode and mixed mic/speaker data in Dual mode to Recorder. Changed Listener path to send AudioSingle (mic-only) in Single mode and AudioDual (mic and speaker separately) in Dual mode, replacing the previous unified Audio payload. Consolidated Listener casting flow into a single result path. |
| Configuration constant: plugins/listener/src/actors/mod.rs | Changed macOS-specific SAMPLE_RATE constant from 24_000 to 16_000. |
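
The message and sender types summarized above might look roughly like the following. This is a minimal sketch, not the PR's code: `Vec<u8>` stands in for `bytes::Bytes`, `std::sync::mpsc` stands in for whatever channel type the plugin actually uses, the `MixedMessage` wrapper is omitted, and `route` is a hypothetical helper that mirrors the described behavior of dropping a message whose shape does not match the active sender.

```rust
use std::sync::mpsc::Sender;

// Assumption: the real code uses `bytes::Bytes`; a plain Vec keeps this std-only.
type Bytes = Vec<u8>;

/// Audio messages accepted by the listener, one variant per channel mode.
#[derive(Debug, Clone, PartialEq)]
pub enum ListenerMsg {
    AudioSingle(Bytes),
    AudioDual(Bytes, Bytes),
}

/// Mode-specific sending half, as it might be stored in the listener state.
pub enum ChannelSender {
    Single(Sender<Bytes>),
    Dual(Sender<(Bytes, Bytes)>),
}

/// Hypothetical helper: forwards a message only when its shape matches the
/// active sender. Returns `true` if the payload was handed to the channel.
pub fn route(tx: &ChannelSender, msg: ListenerMsg) -> bool {
    match (tx, msg) {
        (ChannelSender::Single(tx), ListenerMsg::AudioSingle(mic)) => tx.send(mic).is_ok(),
        (ChannelSender::Dual(tx), ListenerMsg::AudioDual(mic, spk)) => {
            tx.send((mic, spk)).is_ok()
        }
        // Mismatched shape: the message is silently dropped.
        _ => false,
    }
}
```

Encoding the mode in the sender type means a single-channel stream can only ever receive single-channel payloads, which is the property the review comments below rely on.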

Sequence Diagram

sequenceDiagram
    participant Pipeline
    participant Listener as ListenerActor
    participant ChannelSender
    participant Recorder
    
    rect rgb(200, 220, 240)
    Note over Pipeline,ChannelSender: Single Mode (Mic Only)
    Pipeline->>Recorder: Audio (mic data)
    Pipeline->>Listener: AudioSingle(mic_bytes)
    Listener->>ChannelSender: Route via Single variant
    ChannelSender->>Listener: Send MixedMessage<Bytes>
    end
    
    rect rgb(240, 200, 220)
    Note over Pipeline,ChannelSender: Dual Mode (Mic + Speaker)
    Pipeline->>Recorder: Audio (mixed mic/spk data)
    Pipeline->>Listener: AudioDual(mic_bytes, spk_bytes)
    Listener->>ChannelSender: Route via Dual variant
    ChannelSender->>Listener: Send MixedMessage<(Bytes, Bytes)>
    end
Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas requiring extra attention:

  • ChannelSender enum routing logic — Verify that Single and Dual variants are correctly dispatched based on mode and that message payloads match the expected types at the streaming layer.
  • spawn_rx_task_single vs. spawn_rx_task_dual — Ensure both functions properly initialize the Whisper client and manage worker handles consistently; check for any divergent error handling or resource cleanup paths.
  • Message type consistency — Confirm that AudioSingle and AudioDual routing through Pipeline::dispatch aligns with how both Recorder and Listener consume these messages, especially for mode transitions.
  • SAMPLE_RATE constant change — Validate that changing macOS sample rate from 24_000 to 16_000 does not conflict with existing mode-specific streaming configurations or introduce regressions in audio quality or timing.
  • Helper functions integration — Verify that build_listen_params and build_extra correctly construct required configuration and metadata for the streaming process across both modes.

Possibly related PRs

  • 24k sample-rate in macOS to minimize resampling #1741 — Modifies the same macOS SAMPLE_RATE constant (24_000 → 16_000), representing either a direct conflict or coordinated change to audio configuration.
  • Mixed audio #1471 — Directly overlaps with the Audio → AudioSingle/AudioDual split and the updates to listener message routing in Pipeline::dispatch and the listener actor.
  • Binary diarization #1102 — Introduces complementary dual-channel vs. single-channel messaging patterns with mode-driven client/server handling and ChannelSender-like abstractions for audio I/O.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | No pull request description was provided by the author, making it impossible to assess whether the description relates to the changeset. | Add a pull request description that explains the motivation, changes made, and impact of this refactoring on the audio pipeline functionality. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Title check | ❓ Inconclusive | The title 'audio-pipeline-fixes' is vague and generic, using non-descriptive language that doesn't clearly convey the specific changes made to the audio pipeline. | Use a more specific title that describes the main change, such as 'Refactor audio pipeline to support single and dual channel modes' or 'Split Audio message handling into AudioSingle and AudioDual variants'. |
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
plugins/listener/src/actors/source.rs (1)

375-405: Single/Dual dispatch semantics are correct; consider de-duping mic cloning in Single mode

The new behavior looks right:

  • Recorder: mic-only in ChannelMode::Single, mixed mic+spk in ChannelMode::Dual.
  • Listener: AudioSingle (mic only) for single-channel, AudioDual (separate mic/spk) for dual-channel.

One small optimization in the Single branch: you currently do mic.to_vec() twice (once indirectly via audio_for_recording and once again when building the Bytes). You can allocate/copy once and reuse:

-        if let Some(cell) = registry::where_is(RecorderActor::name()) {
-            let actor: ActorRef<RecMsg> = cell.into();
-            let audio_for_recording = if mode == ChannelMode::Single {
-                mic.to_vec()
-            } else {
-                Self::mix(mic.as_ref(), spk.as_ref())
-            };
+        if let Some(cell) = registry::where_is(RecorderActor::name()) {
+            let actor: ActorRef<RecMsg> = cell.into();
+            let audio_for_recording = if mode == ChannelMode::Single {
+                // Single mode: record mic only, and reuse this Vec later when encoding.
+                mic.to_vec()
+            } else {
+                Self::mix(mic.as_ref(), spk.as_ref())
+            };
             if let Err(e) = actor.cast(RecMsg::Audio(audio_for_recording)) {
                 tracing::error!(error = ?e, "failed_to_send_audio_to_recorder");
             }
         }
@@
-        let result = if mode == ChannelMode::Single {
-            let audio_bytes = f32_to_i16_bytes(mic.to_vec().iter().copied());
-            actor.cast(ListenerMsg::AudioSingle(audio_bytes))
+        let result = if mode == ChannelMode::Single {
+            // If you keep `audio_for_recording` as the mic Vec in the branch above,
+            // you can pass an iterator over it here instead of creating a second Vec.
+            let audio_bytes = f32_to_i16_bytes(mic.iter().copied());
+            actor.cast(ListenerMsg::AudioSingle(audio_bytes))

Or equivalently, bind let mic_vec = mic.to_vec(); once in the Single arm, pass mic_vec to the recorder, and then use mic_vec.iter().copied() for byte conversion.

This keeps the new logic but avoids an extra allocation and copy on the hot path.
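
The "bind once, reuse" variant can be sketched as follows. It is an illustration under stated assumptions, not the code in `source.rs`: the `f32_to_i16_bytes` here is a std-only stand-in for the helper in `crates/audio-utils` (the real one returns `Bytes` and may scale or clamp differently), and `single_mode_payloads` is a hypothetical function that only shows the allocation pattern.

```rust
/// Stand-in for `f32_to_i16_bytes` from `crates/audio-utils` (assumption:
/// clamps to [-1.0, 1.0], scales to i16, and emits little-endian bytes).
pub fn f32_to_i16_bytes(samples: impl Iterator<Item = f32>) -> Vec<u8> {
    let mut out = Vec::new();
    for s in samples {
        let v = (s.clamp(-1.0, 1.0) * i16::MAX as f32) as i16;
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Hypothetical helper for Single mode: returns the samples for the recorder
/// and the encoded bytes for the listener, copying the mic buffer only once.
pub fn single_mode_payloads(mic: &[f32]) -> (Vec<f32>, Vec<u8>) {
    // The only copy of the mic samples on this path.
    let mic_vec = mic.to_vec();
    // Encode by iterating over the existing Vec instead of cloning again.
    let audio_bytes = f32_to_i16_bytes(mic_vec.iter().copied());
    // `mic_vec` is moved to the caller, which can hand it to the recorder.
    (mic_vec, audio_bytes)
}
```

Encoding before moving `mic_vec` out is what lets the same buffer serve both consumers; reversing the order would require a second copy.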

plugins/listener/src/actors/listener.rs (1)

18-27: Mode‑aware ListenerMsg + ChannelSender wiring looks correct; consider logging on mismatched mode

The restructuring around ListenerMsg::{AudioSingle, AudioDual} and ChannelSender::{Single, Dual} looks solid:

  • In Single mode you only ever push MixedMessage<Bytes, _> into a single‑channel stream.
  • In Dual mode you only ever push MixedMessage<(Bytes, Bytes), _> into a dual‑channel stream.
  • The if let ChannelSender::Single/Dual(..) guards ensure you can’t accidentally send the wrong payload shape into a given channel.

One minor improvement: if AudioSingle is received while state.tx is ChannelSender::Dual (or vice versa), the message is just dropped. That’s probably “impossible” under normal flow, but adding a tracing::warn! in the non‑matching case would make any future wiring/Mode‑Change bugs much easier to diagnose without changing runtime behavior for the happy path.

Also applies to: 44-54, 106-117
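
One way to surface the mismatched case is sketched below. The `Shape` and `Routed` types and the `check_route` function are hypothetical names introduced for illustration, and `eprintln!` stands in for `tracing::warn!` so the example needs no external crates.

```rust
/// Hypothetical tag for the payload shape of a sender or a message.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Shape {
    Single,
    Dual,
}

/// Outcome of attempting to route one audio message.
#[derive(Debug, PartialEq)]
pub enum Routed {
    Sent,
    Mismatch { sender: Shape, message: Shape },
}

/// Reports a mismatch instead of dropping the message silently.
pub fn check_route(sender: Shape, message: Shape) -> Routed {
    if sender == message {
        // Happy path: behavior is unchanged.
        Routed::Sent
    } else {
        // Assumption: the real actor would call `tracing::warn!` here.
        eprintln!(
            "listener_mode_mismatch sender={:?} message={:?}",
            sender, message
        );
        Routed::Mismatch { sender, message }
    }
}
```

The message is still not delivered in the mismatched case; the only difference is that the drop becomes visible in logs.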

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33ad4bc and 496560d.

📒 Files selected for processing (3)
  • plugins/listener/src/actors/listener.rs (5 hunks)
  • plugins/listener/src/actors/mod.rs (1 hunks)
  • plugins/listener/src/actors/source.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
plugins/listener/src/actors/source.rs (1)
crates/audio-utils/src/lib.rs (1)
  • f32_to_i16_bytes (66-76)
plugins/listener/src/actors/listener.rs (3)
owhisper/owhisper-interface/src/lib.rs (2)
  • default (130-132)
  • default (152-161)
owhisper/owhisper-interface/src/stream.rs (2)
  • default (66-76)
  • default (91-102)
owhisper/owhisper-client/src/live.rs (1)
  • builder (106-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: ci (macos, macos-14)
  • GitHub Check: fmt
🔇 Additional comments (2)
plugins/listener/src/actors/mod.rs (1)

13-16: SAMPLE_RATE alignment on macOS looks good

Switching macOS SAMPLE_RATE to 16 kHz to match the non-macOS path and the listen-side configuration is consistent with the rest of this PR and with typical STT expectations. I don’t see any functional issues with this change; just be aware this slightly changes the effective frontend behavior for existing macOS users.

plugins/listener/src/actors/listener.rs (1)

195-206: Review comment verification complete: concern about ListenParams.channels is invalid

The review identified a potential issue with ListenParams.channels not being set to 2 in dual-channel mode. However, verification shows this concern is unfounded.

Key finding: The build_dual() method calls self.build_request(2) with a hardcoded channel count of 2, and the build_request() function accepts a channels parameter and passes it to build_uri(channels) to construct the request URI. This means build_dual() internally controls the channel count via the client builder—it does not depend on or use the channels field in ListenParams.

The code is correct as-is. The build_listen_params() function can safely leave channels at its default value (1) because build_dual() overrides this internally when constructing the dual-channel client.
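
The override described above can be illustrated with a small stand-alone model. Everything here is an assumption made for the sketch: the struct layouts, the `new` and `params` methods, the `channels` query-string format, and the placeholder URL are invented, and only the names `ListenParams`, `build_dual`, `build_request`, and `build_uri` come from the review text.

```rust
/// Hypothetical mirror of the request parameters; only `channels` matters here.
#[derive(Debug, Clone)]
pub struct ListenParams {
    pub channels: u8,
}

impl Default for ListenParams {
    fn default() -> Self {
        // The review states that the default channel count is 1.
        Self { channels: 1 }
    }
}

/// Hypothetical client builder holding a base URL and the parameters.
pub struct ClientBuilder {
    base: String,
    params: ListenParams,
}

impl ClientBuilder {
    pub fn new(base: &str, params: ListenParams) -> Self {
        Self { base: base.to_string(), params }
    }

    pub fn params(&self) -> &ListenParams {
        &self.params
    }

    /// Builds the request URI from an explicit channel count.
    fn build_uri(&self, channels: u8) -> String {
        format!("{}?channels={}", self.base, channels)
    }

    /// Takes the channel count as an argument rather than reading `params`.
    fn build_request(&self, channels: u8) -> String {
        self.build_uri(channels)
    }

    /// Dual-channel build: the count is hardcoded to 2.
    pub fn build_dual(&self) -> String {
        self.build_request(2)
    }
}
```

In this model, a builder created with `ListenParams::default()` still produces a two-channel URI from `build_dual()`, which is the behavior the verification attributes to the real client.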

@yujonglee yujonglee merged commit 8e52260 into main Nov 21, 2025
11 checks passed
@yujonglee yujonglee deleted the yl-branch-34 branch November 21, 2025 07:00