Ignore Filler Words During Interruption Detection #4450

@darshankparmar

Feature Type

Nice to have

Feature Description

Voice users often produce filler or backchannel speech (e.g. “um”, “uh”, “hmm”, “yeah”) while listening.
Today, such utterances can unintentionally trigger interruption logic, causing the agent to stop speaking even when the user does not intend to interrupt.

This feature allows agents to ignore configurable filler words during interruption detection, preventing unnecessary interruptions when users speak only filler content.

Solution Overview

Introduce an optional configuration:

interruption_ignore_words=["um", "uh", "like", "hmm"]

If a transcript contains only ignored words, the interruption is suppressed.
If at least one non-ignored word is present, interruption proceeds normally.

This acts as a content-based filter layered on top of existing interruption logic.
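A minimal sketch of the filter described above, not an implementation from the library. The function name `is_filler_only` is hypothetical, and treating an empty transcript as "not filler-only" is an assumption the proposal does not spell out.

```python
import string


def is_filler_only(transcript: str, ignore_words: set[str]) -> bool:
    """Return True when every word in the transcript is an ignored word.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Lowercase each token and strip leading/trailing ASCII punctuation.
    words = [w.strip(string.punctuation).lower() for w in transcript.split()]
    # Drop tokens that consisted only of punctuation.
    words = [w for w in words if w]
    # Assumption: an empty transcript is not treated as filler-only.
    return bool(words) and all(w in ignore_words for w in words)


ignore = {"um", "uh", "like", "hmm"}
print(is_filler_only("Um, uh... hmm", ignore))      # True: interruption suppressed
print(is_filler_only("um, wait a second", ignore))  # False: interruption proceeds
```

A single non-ignored word is enough to let the interruption through, which matches the rule stated above.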

Behavior Summary

  • Filler-only speech does not interrupt
  • Meaningful speech interrupts as usual
  • Existing timing and word-count thresholds remain unchanged
  • Feature is fully opt-in

Integration Across Turn Detection Modes

The ignore-word filter applies consistently across:

  • VAD / STT-based interruption detection
  • Realtime LLM turn detection
  • Preemptive generation
  • False interruption resume logic

No changes are made to interruption timing or scheduling semantics.

Backward Compatibility

  • Default value is None
  • Existing behavior is preserved
  • No API-breaking changes
  • Ignore words act only as an additional filter layer
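The opt-in behavior could look roughly like the sketch below. `InterruptionConfig` and `suppresses_interruption` are hypothetical names used only for illustration; the one detail taken from the proposal is that `interruption_ignore_words` defaults to `None`, in which case the filter never suppresses anything.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InterruptionConfig:
    """Hypothetical container for the proposed option."""

    # None means the feature is off and existing behavior is untouched.
    interruption_ignore_words: Optional[Sequence[str]] = None


def suppresses_interruption(config: InterruptionConfig, words: Sequence[str]) -> bool:
    """Return True only when the filter is enabled and all words are ignored."""
    if config.interruption_ignore_words is None:
        return False  # feature disabled: never suppress
    ignored = {w.lower() for w in config.interruption_ignore_words}
    return bool(words) and all(w.lower() in ignored for w in words)


default_config = InterruptionConfig()
opted_in = InterruptionConfig(interruption_ignore_words=["um", "uh"])

print(suppresses_interruption(default_config, ["um"]))   # False
print(suppresses_interruption(opted_in, ["um", "uh"]))   # True
print(suppresses_interruption(opted_in, ["um", "stop"])) # False
```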

Performance Impact

  • Latency: <1ms per interruption decision
  • CPU: One constant-time set lookup per transcript word
  • Memory: Minimal (small word lists)

Notes

  • Requires STT to be enabled for transcript-based filtering
  • Case-insensitive matching with punctuation stripping
  • Intended to improve conversational naturalness, not replace timing-based controls
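The proposal does not say which punctuation is stripped or how case is folded, so the following is one possible reading: strip any leading or trailing Unicode punctuation from each whitespace-separated token and compare with `str.casefold()`. The helper name `normalize_tokens` is hypothetical.

```python
import unicodedata


def _is_punct(ch: str) -> bool:
    # Unicode general categories starting with "P" are punctuation.
    return unicodedata.category(ch).startswith("P")


def normalize_tokens(transcript: str) -> list[str]:
    """Split on whitespace, trim edge punctuation, and casefold each token."""
    tokens = []
    for raw in transcript.split():
        start, end = 0, len(raw)
        while start < end and _is_punct(raw[start]):
            start += 1
        while end > start and _is_punct(raw[end - 1]):
            end -= 1
        if start < end:  # skip tokens that were punctuation only
            tokens.append(raw[start:end].casefold())
    return tokens


print(normalize_tokens("“Um,” … HMM!"))  # ['um', 'hmm']
print(normalize_tokens("Don't stop."))   # ["don't", 'stop']
```

Only edge punctuation is trimmed, so contractions such as "don't" stay intact and are compared as whole words.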

Workarounds / Alternatives

Currently, users can partially mitigate this by increasing min_interruption_words, but this also delays legitimate interruptions and does not distinguish filler speech from meaningful input.
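To make the trade-off concrete, the sketch below compares a word-count threshold alone with the same threshold plus an ignore-word filter. Both functions are hypothetical, and the `>=` comparison against `min_interruption_words` is an assumption about how the threshold is applied.

```python
def interrupts_by_word_count(words: list[str], min_interruption_words: int) -> bool:
    """Workaround: interrupt once enough words have been heard (assumed >=)."""
    return len(words) >= min_interruption_words


def interrupts_with_ignore_filter(
    words: list[str], min_interruption_words: int, ignore_words: set[str]
) -> bool:
    """Proposed layering: threshold first, then require a non-ignored word."""
    if not interrupts_by_word_count(words, min_interruption_words):
        return False
    return any(w not in ignore_words for w in words)


ignore = {"um", "uh", "hmm"}
filler = ["um", "uh", "hmm"]
command = ["stop"]

# Raising the threshold to 2 still lets three filler words interrupt,
# and it blocks the one-word command.
print(interrupts_by_word_count(filler, 2))   # True
print(interrupts_by_word_count(command, 2))  # False

# With the filter, the threshold can stay low without filler interrupting.
print(interrupts_with_ignore_filter(filler, 1, ignore))   # False
print(interrupts_with_ignore_filter(command, 1, ignore))  # True
```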

Labels: enhancement (New feature or request)
