Description
Bug Description
There's a model name discrepancy between LiveKit Inference (gateway) and the Direct AssemblyAI Plugin that creates uncertainty about correctness and best practices.
My main concern is that, behind the scenes, the server may be calling a deprecated version of AssemblyAI STT: I noticed worse performance when using "assemblyai/universal-streaming:en" through the gateway than when using "universal-streaming-english" through the direct plugin.
LiveKit Inference (string format):
stt="assemblyai/universal-streaming:en"
Model: universal-streaming (no -english suffix)
Direct Plugin:
from livekit.plugins import assemblyai
stt = assemblyai.STT(model="universal-streaming-english")
Model: universal-streaming-english (with -english suffix)
AssemblyAI's actual API expects:
- Parameter: speech_model
- Values: "universal-streaming-english" or "universal-streaming-multilingual"
The Problem: Server-Side Black Box
The LiveKit Inference gateway (https://agent-gateway.livekit.cloud/v1) is a closed-source cloud service. When developers use:
stt="assemblyai/universal-streaming:en"
The SDK sends to the gateway (stt.py:554):
{
"model": "assemblyai/universal-streaming",
"settings": {"language": "en"}
}
The gateway must then:
- Translate "universal-streaming" → "universal-streaming-english"
- Map to AssemblyAI's API format: speech_model="universal-streaming-english"
We cannot verify this translation from the SDK code because it happens server-side.
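To make the two steps concrete, here is a minimal sketch of the client-side split and of what the server-side translation would have to look like. The function names and the mapping table are hypothetical: the gateway is closed source, so the table only encodes the single translation this issue assumes, not confirmed behavior.

```python
from typing import Optional

# ASSUMPTION: the real gateway mapping is not public. This table only records
# the one translation this issue expects the gateway to perform.
HYPOTHETICAL_GATEWAY_MODEL_MAP = {
    ("universal-streaming", "en"): "universal-streaming-english",
}


def split_model_string(descriptor: str) -> dict:
    """Split 'provider/model:lang' into the payload shape shown above."""
    model, sep, language = descriptor.partition(":")
    payload = {"model": model, "settings": {}}
    if sep and language:
        payload["settings"]["language"] = language
    return payload


def hypothetical_speech_model(payload: dict) -> Optional[str]:
    """Return the speech_model value the gateway would need to send, if known."""
    provider, _, model = payload["model"].partition("/")
    if provider != "assemblyai":
        return None
    language = payload["settings"].get("language")
    # None means this sketch has no opinion about that combination.
    return HYPOTHETICAL_GATEWAY_MODEL_MAP.get((model, language))


payload = split_model_string("assemblyai/universal-streaming:en")
speech_model = hypothetical_speech_model(payload)
```

If the gateway forwards "universal-streaming" unchanged instead of the value in the table, that would be consistent with the quality difference described above, but the sketch cannot confirm either case.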
Questions
- Does the gateway correctly translate "universal-streaming" → "universal-streaming-english"?
- Is using the gateway recommended over the direct plugin for production?
- Should documentation clarify this translation layer exists?
Issues Found
1. Missing Model in Type Literal
AssemblyAIModels = Literal[
"assemblyai",
"assemblyai/universal-streaming",
# Missing: "assemblyai/universal-streaming-multilingual"
]
2. Hardcoded Language in Direct Plugin
stt.SpeechData(
language="en", # Hardcoded, ignores language parameter
text=interim_text,
)
The stream(language=...) parameter is accepted but never used in the response data.
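A minimal sketch of the intended behavior, assuming the configured language is available where the result is built. The SpeechData class here is a stand-in dataclass with only the two fields shown above, and build_interim_speech_data is a hypothetical helper, not a function from the plugin.

```python
from dataclasses import dataclass


@dataclass
class SpeechData:
    """Stand-in for stt.SpeechData, reduced to the two fields shown above."""
    language: str
    text: str


def build_interim_speech_data(interim_text: str, language: str = "en") -> SpeechData:
    # Report the language the stream was configured with instead of a literal "en".
    return SpeechData(language=language, text=interim_text)


spanish = build_interim_speech_data("hola", language="es")
default = build_interim_speech_data("hello")
```

Keeping "en" as the default preserves the current output for callers that never pass a language.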
Proposed Solutions
- Documentation: Explain how the gateway translates model names and when to use gateway vs direct plugin
- Type Hints: Add "assemblyai/universal-streaming-multilingual" to AssemblyAIModels
- Direct Plugin: Use the language parameter instead of hardcoding "en"
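The type-hint change could look like the sketch below. It assumes the gateway would accept the multilingual identifier, which this issue has not verified; is_known_model is a hypothetical helper added only to show the effect of the extra entry.

```python
from typing import Literal, get_args

AssemblyAIModels = Literal[
    "assemblyai",
    "assemblyai/universal-streaming",
    # Proposed addition; gateway support for this name is an assumption.
    "assemblyai/universal-streaming-multilingual",
]


def is_known_model(name: str) -> bool:
    """Check a model string against the values listed in the Literal."""
    return name in get_args(AssemblyAIModels)
```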
References
Environment: LiveKit Agents latest (commit 0cc77447), Python 3.11+
Thanks for building this excellent framework! Would appreciate clarification on the gateway translation behavior and recommended production approach.
Expected Behavior
Clear documentation on how LiveKit Inference gateway translates "assemblyai/universal-streaming" to AssemblyAI's API format "universal-streaming-english", and guidance on when to use gateway vs direct plugin.
Reproduction Steps
1. Use `stt="assemblyai/universal-streaming:en"` in AgentSession
2. Observe it works, but cannot verify server-side translation
3. Compare with direct plugin using `assemblyai.STT(model="universal-streaming-english")`
4. Note the model name mismatch between approaches
Operating System
Linux (WSL2)
Models Used
No response
Package Versions
- livekit-agents: latest (commit 0cc77447)
- livekit-plugins-assemblyai: latest
- Python: 3.11+
Session/Room/Call IDs
No response
Proposed Solution
Additional Context
No response
Screenshots and Recordings
No response