Bug Description
Gemini's API provides thought summaries when `include_thoughts=true` is enabled. These appear in `content.parts` with the `part.thought` flag set and must be handled separately from normal answer text: https://ai.google.dev/gemini-api/docs/thinking#summaries

Right now, the Google LLM plugin in `agents/livekit-plugins/livekit-plugins-google/livekit/plugins/google/llm.py` ignores `part.thought` entirely. The parsing code only looks at `part.function_call` and `part.text`:
```python
def _parse_part(self, id: str, part: types.Part) -> llm.ChatChunk | None:
    if part.function_call:
        chat_chunk = llm.ChatChunk(
            id=id,
            delta=llm.ChoiceDelta(
                role="assistant",
                tool_calls=[
                    llm.FunctionToolCall(
                        arguments=json.dumps(part.function_call.args),
                        name=part.function_call.name,
                        call_id=part.function_call.id or utils.shortuuid("function_call_"),
                    )
                ],
                content=part.text,
            ),
        )
        return chat_chunk

    return llm.ChatChunk(
        id=id,
        delta=llm.ChoiceDelta(content=part.text, role="assistant"),
    )
```

There is no check for `part.thought`, so thought summaries are treated as regular assistant output.
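To make the failure mode concrete, here is a minimal, dependency-free reproduction of that parsing logic. `FakePart` and `parse_part_current` are stand-ins for the real `types.Part` and `_parse_part`, not the plugin's actual code:

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class FakePart:
    """Stand-in for google.genai types.Part (only the fields that matter here)."""
    text: str
    thought: bool = False
    function_call: object | None = None


def parse_part_current(part: FakePart) -> str | None:
    """Mirrors the current plugin logic: only function_call and text are checked."""
    if part.function_call:
        return None  # tool-call path, elided for brevity
    return part.text  # thought parts fall through to user-visible content


answer = FakePart(text="The capital of France is Paris.")
thought = FakePart(text="The user asked about France; recall its capital.", thought=True)

print(parse_part_current(answer))   # forwarded to TTS (correct)
print(parse_part_current(thought))  # ALSO forwarded to TTS (the bug)
```

Because nothing inspects `thought`, the second call returns the internal reasoning text exactly as if it were answer content.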
Problems

- Bug: Thoughts are spoken by TTS

  When `include_thoughts=True`, thought summaries are merged into the same content used for user-facing responses. The TTS layer receives them and reads the agent's internal reasoning aloud, which is not what Gemini's "thinking" feature is meant for.

- Missing observability of thoughts in OTEL

  LiveKit already uses OpenTelemetry, but the Gemini thought summaries are not surfaced there at all. There is no way to inspect the model's internal reasoning in traces/logs while keeping it hidden from the end user and TTS.
Expected Behavior
Separation of thoughts vs. answer

- Parts with `part.thought == True` should not be included in the assistant's user-visible content.
- Thought parts should never be forwarded to TTS or any channel that is meant for end-user output.

Observability via existing OpenTelemetry

- Thought summaries should be attached to the existing OTEL spans/traces for Gemini calls.
Reproduction Steps
e.g.:

```python
session = AgentSession(
    stt="assemblyai/universal-streaming-multilingual",
    llm=google.LLM(
        model="gemini-2.5-flash-preview-09-2025",
        temperature=0.8,
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=1500,
        ),
    ),
    tts=elevenlabs.TTS(
        voice_id=elevenlabs_voice_id,
        model=elevenlabs_model,
        language=tts_language,
    ),
)
```

Operating System

macOS Tahoe
Models Used
AssemblyAI (STT), Google plugin (LLM), ElevenLabs (TTS)
Package Versions
```
# Core LiveKit dependencies
livekit>=1.0.13
livekit-agents[images,elevenlabs]>=1.3.6
livekit-api>=1.0.5
livekit-protocol>=1.0.6

# LiveKit plugins
livekit-plugins-google>=1.3.6

# Google Gemini API
google-generativeai==0.8.3

# Telemetry (Langfuse/OpenTelemetry/Judgment Labs)
opentelemetry-api>=1.39.0
opentelemetry-sdk>=1.39.0
opentelemetry-exporter-otlp-proto-http>=1.39.0
judgeval>=0.1.0  # Judgment Labs tracing (OpenTelemetry compatible)
```

Session/Room/Call IDs

No response
Proposed Solution
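A possible direction, shown with simplified stand-ins for the plugin's `ChatChunk`/`ChoiceDelta` types: check `part.thought` before anything else in `_parse_part` and divert those parts to a telemetry hook instead of the delta content. The `on_thought` callback is hypothetical; in the real plugin it could add an event to the active OTEL span.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Callable


@dataclass
class Part:  # stand-in for google.genai types.Part
    text: str
    thought: bool = False
    function_call: object | None = None


@dataclass
class ChoiceDelta:  # simplified stand-in for llm.ChoiceDelta
    content: str | None
    role: str = "assistant"


@dataclass
class ChatChunk:  # simplified stand-in for llm.ChatChunk
    id: str
    delta: ChoiceDelta


def parse_part_fixed(
    id: str, part: Part, on_thought: Callable[[str], None]
) -> ChatChunk | None:
    # New: divert thought summaries before they become user-visible content.
    if getattr(part, "thought", False):
        on_thought(part.text)  # hypothetical hook, e.g. span.add_event(...)
        return None            # never reaches TTS / the chat stream
    if part.function_call:
        ...  # tool-call handling unchanged, elided for brevity
    return ChatChunk(id=id, delta=ChoiceDelta(content=part.text))


thoughts: list[str] = []
chunk = parse_part_fixed("c1", Part("Paris is the capital."), thoughts.append)
skipped = parse_part_fixed("c2", Part("Recall French geography.", thought=True), thoughts.append)
print(chunk.delta.content)  # Paris is the capital.
print(skipped)              # None
print(thoughts)             # ['Recall French geography.']
```

Using `getattr(part, "thought", False)` keeps the check backward compatible with responses that omit the field when thinking is disabled.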
Additional Context
No response
Screenshots and Recordings
No response