@chenghao-mou chenghao-mou commented Jan 20, 2026

Fixes a bug introduced by #4131 and should close #4568, which affects version 1.3.11.

  • Avatar services use DataStreamAudioOutput and QueueAudioOutput, where the first-frame future was never resolved. As a result, agent messages were missing from the chat context and session hooks.
  • Tested with liveavatar locally.
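
To illustrate the failure mode, here is a minimal sketch, not the actual livekit-agents code. It assumes a simplified session step that records the agent message only after a first-frame future resolves; `commit_after_first_frame`, `demo`, and the timeout are hypothetical names introduced for illustration.

```python
import asyncio
from typing import List, Tuple


async def commit_after_first_frame(
    first_frame: "asyncio.Future[None]", history: List[str], text: str, timeout: float
) -> bool:
    """Append text to history only once the first-frame future resolves."""
    try:
        # shield() keeps the caller's future alive if the wait times out
        await asyncio.wait_for(asyncio.shield(first_frame), timeout)
    except asyncio.TimeoutError:
        return False  # playback start was never reported, so nothing is recorded
    history.append(text)
    return True


async def demo() -> Tuple[bool, List[str], bool, List[str]]:
    loop = asyncio.get_running_loop()

    # Output that never signals playback start (the behavior before this fix)
    stuck: "asyncio.Future[None]" = loop.create_future()
    missed_history: List[str] = []
    missed = await commit_after_first_frame(stuck, missed_history, "hello", 0.05)

    # Output that signals playback start on the first frame (the behavior after)
    started: "asyncio.Future[None]" = loop.create_future()
    started.set_result(None)
    ok_history: List[str] = []
    ok = await commit_after_first_frame(started, ok_history, "hello", 0.05)

    return missed, missed_history, ok, ok_history
```

Running `asyncio.run(demo())` shows the message recorded only in the second case. The real session logic is more involved, but the dependency on the future is the same idea.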

More details:

We have 7 AudioOutput subclasses:

  • DataStreamAudioOutput
  • ConsoleAudioOutput
  • _ParticipantAudioOutput
  • FakeAudioOutput
  • RecorderAudioOutput
  • _SyncedAudioOutput
  • QueueAudioOutput

Internal classes and the recorder do not fire on_playback_started events, since another output on the chain should take care of that. We don't put this in the base class because we want the event to be fired only as close to the device as possible.
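
A rough sketch of that design choice, using hypothetical stand-in classes rather than the real AudioOutput hierarchy: the wrapper only forwards frames, the output closest to the device is the one that reports playback start, and the signal then travels back up the chain. `Output`, `WrapperOutput`, `DeviceOutput`, `listener`, and `started_at` are all assumptions made for this example.

```python
import time
from typing import Optional


class Output:
    """Hypothetical stand-in for one audio output in a chain."""

    def __init__(self, next_in_chain: Optional["Output"] = None) -> None:
        self.next_in_chain = next_in_chain
        self.listener: Optional["Output"] = None
        self.started_at: Optional[float] = None
        if next_in_chain is not None:
            # The upstream output listens to the one closer to the device
            next_in_chain.listener = self

    def on_playback_started(self, created_at: float) -> None:
        self.started_at = created_at
        if self.listener is not None:
            # Propagate the signal back up the chain
            self.listener.on_playback_started(created_at)

    def capture_frame(self, frame: bytes) -> None:
        raise NotImplementedError


class WrapperOutput(Output):
    """Intermediate output (e.g. a recorder): never fires the event itself."""

    def capture_frame(self, frame: bytes) -> None:
        assert self.next_in_chain is not None
        self.next_in_chain.capture_frame(frame)


class DeviceOutput(Output):
    """Output closest to the device: the only place the event is fired."""

    def __init__(self) -> None:
        super().__init__()
        self._started = False

    def capture_frame(self, frame: bytes) -> None:
        if not self._started:
            self._started = True
            self.on_playback_started(created_at=time.time())
```

With `wrapper = WrapperOutput(DeviceOutput())`, a single `wrapper.capture_frame(...)` call sets `started_at` on both objects exactly once, which is why firing from the base class as well would risk duplicate events.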

Summary by CodeRabbit

  • New Features

    • Voice avatar playback now emits a start event with timestamp information when playback begins.
  • Changes

    • Modified observable audio events in queue playback system; certain events are no longer publicly available.


@chenghao-mou chenghao-mou requested a review from a team January 20, 2026 13:04
coderabbitai bot commented Jan 20, 2026

📝 Walkthrough

Two avatar audio output classes now emit on_playback_started signals with timestamps upon first frame capture. The QueueAudioOutput class narrows its public event interface to emit only "clear_buffer" events, removing playback_finished from observable events.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Avatar Audio Output Event Emissions: livekit/agents/voice/avatar/_datastream_io.py, livekit/agents/voice/avatar/_queue_io.py | Added on_playback_started event emission in DataStreamAudioOutput and QueueAudioOutput when capturing the first frame, passing current timestamp via time.time(). Updated QueueAudioOutput event type to emit only "clear_buffer" events publicly, removing "playback_finished" from the interface. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Hops of joy, a playback start,
Timestamps mark the avatar's heart,
Clear buffers, events refined,
Synchronization by design! 🎬✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding playback_started event calls to two audio output classes. |
| Linked Issues check | ✅ Passed | The PR directly addresses issue #4568 by emitting on_playback_started for DataStreamAudioOutput and QueueAudioOutput, ensuring the 'first frame' future resolves and assistant messages appear in ChatContext. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to addressing the missing on_playback_started calls in the two specified audio output classes; no unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0722371 and 261b4fa.

📒 Files selected for processing (2)
  • livekit-agents/livekit/agents/voice/avatar/_datastream_io.py
  • livekit-agents/livekit/agents/voice/avatar/_queue_io.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-agents/livekit/agents/voice/avatar/_datastream_io.py
  • livekit-agents/livekit/agents/voice/avatar/_queue_io.py
🧬 Code graph analysis (2)
livekit-agents/livekit/agents/voice/avatar/_datastream_io.py (1)
livekit-agents/livekit/agents/voice/io.py (1)
  • on_playback_started (188-189)
livekit-agents/livekit/agents/voice/avatar/_queue_io.py (1)
livekit-agents/livekit/agents/voice/io.py (1)
  • on_playback_started (188-189)
🔇 Additional comments (3)
livekit-agents/livekit/agents/voice/avatar/_datastream_io.py (1)

138-141: LGTM! Correctly fires playback started signal on first frame.

The fix appropriately addresses the missing "first frame" future resolution. The acknowledgment in the comment about timing imperfection is appreciated. The placement after stream_writer initialization ensures this fires once per audio segment (since flush() resets _stream_writer to None).

livekit-agents/livekit/agents/voice/avatar/_queue_io.py (2)

39-43: LGTM! Consistent pattern with DataStreamAudioOutput.

The fix correctly fires on_playback_started on first frame capture and resets via _capturing = False in flush(), allowing each segment to trigger the event. This mirrors the approach in DataStreamAudioOutput and properly resolves the "first frame" future.
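
The once-per-segment pattern described above can be sketched as follows. This is an illustrative stand-in, not the code from the diff: `SegmentedOutput`, `started`, and `frames` are hypothetical, and only the `_capturing` flag, `flush()`, and the `time.time()` timestamp are taken from the review text.

```python
import time
from typing import List


class SegmentedOutput:
    """Hypothetical output that reports playback start once per segment."""

    def __init__(self) -> None:
        self._capturing = False
        self.started: List[float] = []  # timestamps of reported segment starts
        self.frames: List[bytes] = []

    def on_playback_started(self, created_at: float) -> None:
        self.started.append(created_at)

    def capture_frame(self, frame: bytes) -> None:
        if not self._capturing:
            # First frame of a new segment: report playback start once
            self._capturing = True
            self.on_playback_started(created_at=time.time())
        self.frames.append(frame)

    def flush(self) -> None:
        # End of segment: the next captured frame counts as a first frame again
        self._capturing = False
```

Capturing two frames, flushing, and capturing a third frame yields two entries in `started`, one per segment.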


17-21: Type narrowing on EventEmitter doesn't impact actual consumers—consider removing if intentional, or clarify intent.

The explicit EventEmitter[Literal["clear_buffer"]] inheritance narrows the type signature and removes "playback_finished" from static type hints. While notify_playback_finished() (line 67) still emits events through the inherited AudioOutput mechanism at runtime, no downstream consumers directly call QueueAudioOutput.on("playback_finished", ...). All test and example code accesses the event through the AudioOutput interface, not QueueAudioOutput directly. If this narrowing is intentional (to restrict the public interface of this internal implementation class), consider documenting it; otherwise, remove the explicit EventEmitter inheritance to preserve type clarity.
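
As a hedged illustration of the point about static versus runtime behavior, the toy emitter below is parameterized by a `Literal` type. The `EventEmitter` and `QueueOutput` classes here are simplified assumptions written for this example, not the real rtc.EventEmitter or QueueAudioOutput.

```python
from typing import Any, Callable, Dict, Generic, List, Literal, TypeVar

T = TypeVar("T", bound=str)


class EventEmitter(Generic[T]):
    """Toy emitter; the type parameter is only visible to type checkers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[..., Any]]] = {}

    def on(self, event: T, handler: Callable[..., Any]) -> None:
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event: T, *args: Any) -> None:
        for handler in self._handlers.get(event, []):
            handler(*args)


class QueueOutput(EventEmitter[Literal["clear_buffer"]]):
    """Statically advertises only "clear_buffer"."""

    def notify_playback_finished(self, position: float) -> None:
        # A type checker would reject this event name, but at runtime
        # the emitter dispatches on any string
        self.emit("playback_finished", position)  # type: ignore[arg-type]
```

A handler registered for "playback_finished" still runs when `notify_playback_finished()` is called; the narrowing only changes what a type checker accepts at call sites that see the narrowed type.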




@chenghao-mou chenghao-mou merged commit 3d9c8fb into main Jan 21, 2026
20 checks passed
@chenghao-mou chenghao-mou deleted the fix/missing-agent-transcripts branch January 21, 2026 09:33
Development

Successfully merging this pull request may close these issues.

ChatContext missing assistant messages when using liveavatar

3 participants