pennycoders
Contributor

@pennycoders pennycoders commented Aug 2, 2025

Summary

This PR introduces bidirectional audio support to JetKVM, enabling both audio output (listening to the managed device) and audio input (microphone from browser to device). Audio is implemented using an in-process CGO architecture that directly calls C code for ALSA audio capture/playback and Opus encoding/decoding. JetKVM presents itself to the managed device as a USB Audio Class 1 (UAC1) gadget that provides both a stereo speaker interface and a stereo microphone interface over USB.

Key Features:

  • Bidirectional stereo audio (48kHz, 16-bit, 2 channels)
  • In-process CGO implementation for low latency and simplicity
  • USB Audio Gadget (UAC1) integration
  • WebRTC-based real-time streaming with Opus codec
  • Frontend controls for enabling/disabling audio output and input
  • HDMI or USB audio capture source selection
  • SDP munging for proper stereo audio support in browsers

Credits

Thanks!
Alex

@CLAassistant

CLAassistant commented Aug 2, 2025

CLA assistant check
All committers have signed the CLA.

@pennycoders pennycoders changed the title JetKVM Advanced, CGO Audio Support JetKVM Advanced, CGO-based Audio Support Aug 2, 2025
@adamshiervani adamshiervani added this to the 0.5.0 milestone Aug 4, 2025
@adamshiervani adamshiervani moved this to Backlog in JetKVM Aug 4, 2025
@adamshiervani adamshiervani moved this from Backlog to In progress in JetKVM Aug 4, 2025
@adamshiervani adamshiervani moved this from In progress to In review in JetKVM Aug 4, 2025
@adamshiervani adamshiervani moved this from In review to In progress in JetKVM Aug 4, 2025
@adamshiervani adamshiervani moved this from In progress to In Review in JetKVM Aug 4, 2025
@adamshiervani adamshiervani mentioned this pull request Aug 4, 2025
3 tasks
@pennycoders
Contributor Author

Great news! I'll soon update this PR with audio input pass-through functionality too.

@adamshiervani adamshiervani linked an issue Aug 4, 2025 that may be closed by this pull request
@pennycoders pennycoders changed the title JetKVM Advanced, CGO-based Audio Support JetKVM Advanced, CGO-based 2-way Audio Support Aug 4, 2025
@IDisposable
Contributor

This is amazing!

Would it be possible to forward the audio channel on the device's HDMI input to the browser?

By this I mean that if I set my host/controlled device's audio output to the JetKVM virtual monitor, then the sound is going to be coming in on the HDMI stream, which might be possible to extract (I know nothing about that hardware), so we could have the host audio come through without an additional (virtual) audio device.

@pennycoders
Contributor Author

pennycoders commented Aug 8, 2025

Hi @IDisposable

Glad you like this functionality, mainly to free up as much of that USB bandwidth as possible. I'm actually looking at this; however, it is a little trickier, as it most likely requires changes in the rv1106-system repo containing the OS too.

In case I do manage to pull that off before the v0.5.0 release for which this functionality has been scheduled, I'll update this PR.

Thanks,
Alex

@vvns

vvns commented Aug 10, 2025

JetKVM Audio PR Review & Test Feedback

Hi @pennycoders ,

First, thanks for the work on bringing audio and mic support into JetKVM.

I’ve tested the new functionality in a local LAN environment with both playback and microphone streaming active, including during real-world scenarios like a Teams call.

Test conditions:

  • Setup: Wired LAN, low network latency, tested with a headset mic.
  • Modes tested: Low, Medium, High, Ultra for both playback and mic.

Main observations:

  1. Mic quality constant across modes

    • Microphone stream sounds the same in all modes.
    • Quality is acceptable but not “HD” — there is a constant background noise floor, even with a headset mic on a clean LAN.
  2. Ultra playback distortion

    • In Ultra mode only, playback sometimes has a warped/buzzy/distorted effect.
    • Low/Medium/High playback modes sound good and consistent.
  3. Latency when mic is active

    • Mouse and keyboard control become noticeably less responsive whenever the mic stream is active, even on a low-latency LAN connection.
    • Likely due to video/control WebSocket traffic competing with audio packets on the same channel.
  4. Packet loss

    • Playback drop rate: ~22%
    • Mic drop rate: ~13%
    • Loss observed despite no network congestion, pointing to buffering or scheduling bottlenecks.

Potential Improvements (Technical):

  1. Separate transport channels

    • Move audio to a dedicated WebSocket endpoint (e.g. /ws/audio) or use WebRTC for audio transport.
    • Prevents video/control from being delayed by audio bursts.
  2. Opus tuning exposure

    • Make parameters adjustable via UI or JSON config:
      • bitrate, frame size, complexity
      • FEC, DTX, VBR/CBR
    • Lets users balance latency, quality, and bandwidth.
  3. ALSA parameter control

    • Expose period_time and buffer_time for fine-tuning latency vs underrun protection.
  4. Queue management

    • Use a bounded audio frame queue with drop-oldest to prevent latency spikes when encoding falls behind.
  5. Noise reduction & echo cancellation

    • Integrate RNNoise or WebRTC AEC/NS for mic clarity.
    • Even simple high-pass filtering can reduce constant hum.
  6. Thread/process separation

    • Run audio encode/decode in its own goroutine/process to isolate timing from video/control.
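The bounded drop-oldest queue suggested in item 4 could be sketched roughly as follows. This is only an illustration of the technique under stated assumptions: the type `frameQueue` and its methods are hypothetical names, not code from this PR, and the sketch assumes a single producer goroutine.

```go
package main

import "fmt"

// frameQueue is a bounded FIFO of encoded audio frames. When it is full,
// Push discards the oldest frame so that queued latency stays bounded.
// Hypothetical type for illustration only.
type frameQueue struct {
	ch      chan []byte
	dropped int // frames discarded because the consumer fell behind
}

func newFrameQueue(capacity int) *frameQueue {
	return &frameQueue{ch: make(chan []byte, capacity)}
}

// Push enqueues frame, evicting the oldest entry if the queue is full.
// It assumes a single producer goroutine; Pop may run concurrently.
func (q *frameQueue) Push(frame []byte) {
	for {
		select {
		case q.ch <- frame:
			return
		default:
			// Queue full: drop the oldest frame, then retry the send.
			select {
			case <-q.ch:
				q.dropped++
			default:
				// The consumer drained the queue in the meantime; retry.
			}
		}
	}
}

// Pop returns the next frame, or false if the queue is empty.
func (q *frameQueue) Pop() ([]byte, bool) {
	select {
	case f := <-q.ch:
		return f, true
	default:
		return nil, false
	}
}

func main() {
	q := newFrameQueue(3)
	for i := byte(1); i <= 5; i++ {
		q.Push([]byte{i})
	}
	for f, ok := q.Pop(); ok; f, ok = q.Pop() {
		fmt.Println(f[0]) // prints 3, 4, 5
	}
	fmt.Println("dropped:", q.dropped) // prints "dropped: 2"
}
```

Dropping the oldest frame trades a short audible glitch for bounded latency, which is usually the better choice for live audio than letting the backlog grow.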

Happy to re-run these tests and provide before/after metrics once adjustments are implemented.
This PR is already a big step forward, and with these improvements, we could get low-latency, clean mic audio without impacting remote control responsiveness.

@pennycoders
Contributor Author

Hi @vvns

Thank you very much for putting this through its paces! This is great feedback that I can definitely work with. I initially encountered the interference with the keyboard and mouse that you are mentioning and made some optimizations. Do you happen to know the commit hash you tested at? Is it the latest version of my branch? I am asking because I've tested actual calls with the latest implementation and it was definitely usable.

I will break down your feedback and see what I can do about each of the items.

If you want we can discuss more in-depth on other channels too.

Thanks,

Alex

@vvns

vvns commented Aug 10, 2025

Hi @pennycoders ,

Glad the feedback is useful! 👍
I’ve confirmed that my tests were run on the latest commit at the time — 5f905e7 from your feat/audio-support branch — so the results I reported already include your most recent optimizations.

The version is indeed usable, but in slightly more demanding conditions (e.g., during calls or with sustained mic usage) the remote control latency — which was very low before the audio feature — increases significantly, to the point where slow mouse movement becomes noticeably delayed.

I can still retest to be sure nothing was missed, but the latency impact with mic active, packet loss, and Ultra mode distortion were all observed on that commit.

I’m happy to continue sharing feedback as you push further updates, so we can iterate quickly toward the best possible audio and control experience. Let me know which channel you’d prefer for more direct discussion, so you can share any details privately if needed.

Thanks again for the great work — we’re close to a fully smooth audio + control experience.

@pennycoders
Contributor Author

Hi @vvns,

Are you on the JetKVM Discord?

If so, we can discuss there. What's your username?

Thanks,
Alex

Contributor

@IDisposable IDisposable left a comment

This really looks nice. All my comments are questions or nits, so feel free to ignore them... I wonder if we need to be more explicit in the priority assignment of the other RTC channels (media/serial/rpc), as we really want to ensure the control signals get through at very high fidelity... It might even be worthwhile splitting up the RPC messages into control vs. advisory messaging, but that's not this PR :)

@adamshiervani adamshiervani added this to the 0.5.1 milestone Oct 3, 2025
- Check SetReadDeadline error in IPC client
- Explicitly ignore Kill() error (process may be dead)
- Remove init() function and rely on explicit ExtractEmbeddedBinaries() call

Replace 'as any' type assertions with proper React.ChangeEvent<HTMLInputElement>
type for synthetic events passed to audio toggle handlers.

Add noise gate threshold at peak > 256 (-42dB) to prevent dynamic gain
from amplifying quantization noise and hardware noise floor. This fixes
crackling, buzzing, and static-like noise when HDMI audio is at very
low volume or during silence.

Without the gate, signals below -42dB (peak < 256) would get 8x gain
applied, amplifying noise floor to audible levels. Now these signals
pass through unmodified, eliminating the artifacts.
@DoubleDensity

thank you all, this feature is essential for my primary use case of JetKVM (working remotely on a product whose interface renders video and audio out only over HDMI) -- much appreciated @pennycoders !

@pennycoders
Contributor Author

Hi @DoubleDensity ! You will also have to compile and update your system image, since the HDMI input is not captured by v0.2.5, so beware of that. If you need help let me know

Thanks

Remove dynamic gain code and rely on Opus encoder quality improvements:
- Increase Opus complexity from 2 to 5 for better quality
- Change bandwidth from FULLBAND (20kHz) to SUPERWIDEBAND (12kHz) for better quality at 128kbps
- Disable FEC to allocate all bits to audio quality
- Increase ALSA buffer from 40ms to 80ms for stability

The dynamic gain code was adding complexity without solving the underlying
issue: TC358743 HDMI chip captures digital audio at whatever volume the
source outputs. Users should adjust volume at the source or in their browser.
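For a sense of scale, the buffer sizes named above can be worked out from the stream format stated in the PR summary (48 kHz, 16-bit, 2 channels). The sketch below is plain arithmetic; the 20 ms packet duration used for the Opus estimate is an assumption, since the conversation does not state the frame size.

```go
package main

import "fmt"

const (
	sampleRate     = 48000 // Hz, from the PR summary
	channels       = 2     // stereo
	bytesPerSample = 2     // 16-bit PCM
)

// framesFor returns the number of ALSA frames (one sample per channel)
// contained in ms milliseconds of audio.
func framesFor(ms int) int { return sampleRate * ms / 1000 }

// pcmBytesFor returns the raw PCM size of ms milliseconds of audio.
func pcmBytesFor(ms int) int { return framesFor(ms) * channels * bytesPerSample }

// opusBytesPerPacket returns the average payload size at bitrate (bits/s)
// for packets that each carry ms milliseconds of audio.
func opusBytesPerPacket(bitrate, ms int) int { return bitrate / 8 * ms / 1000 }

func main() {
	for _, ms := range []int{40, 80} {
		fmt.Printf("%d ms buffer: %d frames, %d bytes\n", ms, framesFor(ms), pcmBytesFor(ms))
	}
	// 20 ms per packet is an assumed value, not taken from the PR.
	fmt.Println("bytes per 20 ms packet at 128 kbps:", opusBytesPerPacket(128000, 20))
}
```

Doubling the buffer from 40 ms to 80 ms therefore raises it from 1920 to 3840 frames (7680 to 15360 bytes of PCM), which adds headroom against underruns at the cost of extra buffering delay.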
Audio quality improvements:
- Enable constrained VBR to prevent bitrate starvation at low volumes
- Increase Opus complexity from 2 to 5 for better quality
- Enable DTX for bandwidth optimization
- Enable FEC (Forward Error Correction)
- Add DTX and FEC signaling in SDP (usedtx=1;useinbandfec=1)

Default configuration changes:
- Change default audio output source from HDMI to USB
- Enable USB Audio device by default
- USB audio works on current stable image (HDMI requires newer device tree)

These changes fix crackling issues at low volumes and provide better
overall audio quality for both USB and HDMI audio paths.
pennycoders and others added 4 commits October 7, 2025 11:26
Remove all subprocess-based audio code to simplify the audio system and
reduce complexity. Audio now uses CGO in-process mode exclusively.

Changes:
- Remove subprocess mode: Deleted Supervisor, IPCSource, embed.go
- Remove audio mode selection from UI (Settings → Audio)
- Remove audio mode from backend config (AudioMode field)
- Remove JSON-RPC handlers: getAudioMode/setAudioMode
- Remove Makefile targets: build_audio_output/input/binaries
- Remove standalone C binaries: jetkvm_audio_{input,output}.c
- Remove IPC protocol implementation: ipc_protocol.{c,h}
- Remove unused IPC functions from audio_common.{c,h}
- Simplify audio.go: startAudio() instead of startAudioSubprocesses()
- Update all function calls and comments to remove subprocess references
- Add constants to cgo_source.go (ipcMaxFrameSize, ipcMsgTypeOpus)
- Keep update_opus_encoder_params() for potential future runtime config

Benefits:
- Simpler codebase: -1,734 lines of code
- Better performance: No IPC overhead on embedded hardware
- Easier maintenance: Single audio implementation
- Smaller binary: No embedded audio subprocess binaries

The audio system now works exclusively via CGO direct C function calls,
with ALSA device selection (HDMI vs USB) still configurable via settings.
@DoubleDensity

I can't seem to get the feature to show up, maybe I am missing a step? Or not looking in the right place?

here is what I am doing:

RUN git clone --branch feat/audio-support https://github.com/pennycoders/kvm.git /home/builder/rv1106-system/app/jetkvm

CMD ["/bin/bash", "-c", "./build.sh lunch BoardConfig_IPC/BoardConfig-EMMC-NONE-RV1106_JETKVM_V2.mk && sudo ./build.sh clean && ./build.sh uboot && ./build.sh app && ./build.sh kernel && ./build.sh media && ./build.sh rootfs && ./build.sh firmware"]

then I flash the update:

 sudo ./upgrade_tool uf ~/jetkvm/output/image/update.img
Using upgrade_tool_v2.17_for_linux/config.ini
Loading firmware...
Support Type:1106       FW Ver:0.0.00   FW Time:2025-10-07 13:59:13
Loader ver:1.01 Loader Time:2025-10-07 13:50:21
Start to upgrade firmware...
Download Boot Start
Download Boot Success
Wait For Maskrom Start
Wait For Maskrom Success
Test Device Start
Test Device Success
Check Chip Start
Check Chip Success
Get FlashInfo Start
Get FlashInfo Success
Prepare IDB Start
Prepare IDB Success
Download IDB Start
Download IDB Success
Download Firmware Start
Download Image... (100%)
Download Firmware Success
Upgrade firmware ok.

@pennycoders
Contributor Author

You need to read the DEVELOPMENT.md file

Basically that's all you should do.

If you are on Discord, please reach out there

@IDisposable
Contributor

Nice work getting it in-proc!

Successfully merging this pull request may close these issues.

Add sound support