Description
Bug Description
When a conversation is resumed after being paused/finished, a duplicate ObservationEvent can be created with the same tool_call_id as an existing observation. This causes the Anthropic API to reject the request with:
litellm.BadRequestError: Error code: 400 - {
"error": {
"message": "litellm.BadRequestError: AnthropicException - {
\"type\": \"error\",
\"error\": {
\"type\": \"invalid_request_error\",
\"message\": \"messages.59: `tool_use` ids were found without `tool_result` blocks immediately after: toolu_01CGASf7KnafqkuQMuLstHi4. Each `tool_use` block must have a corresponding `tool_result` block in the next message.\"
},
\"request_id\": \"req_011CXMLWpMmp7TBFbCVyLXY3\"
}
No fallback model group found for original model_group=prod/claude-opus-4-5-20251101.
Fallbacks=[{"qwen3-coder-480b": ["qwen3-coder-480b-or"]}, {"glm-4.5": ["glm-4.5-or"]}].
Received Model Group=prod/claude-opus-4-5-20251101
Available Model Group Fallbacks=None",
"type": null,
"param": null,
"code": "400"
}
}
Since the duplicate observation is persisted in the event stream, there is no way to recover - every LLM call will include the malformed message history and fail.
Steps to Reproduce
Observed in conversation af22d964cb9f464e957320d647c7471e:
- Start a conversation and let the agent run actions
- Let the conversation finish normally (agent calls the finish tool)
- Wait some time (in this case ~1.5 hours)
- Resume the conversation by sending a new user message
- The conversation fails with the tool_use/tool_result mismatch error
- All subsequent attempts to continue the conversation fail with the same error
Root Cause Analysis
Analyzing the event stream via /api/v1/conversation/{id}/events/search:
| Event | Timestamp | Type | Details |
|---|---|---|---|
| 94 | 21:04:03 | ActionEvent | tool_call_id=toolu_01CGASf7KnafqkuQMuLstHi4 |
| 95 | 21:04:24 | ObservationEvent | First observation for action 94 |
| 110 | 21:06:10 | ObservationEvent | Finish action - conversation completed |
| 111 | 21:06:10 | StateUpdate | execution_status=finished |
| (1.5 hour gap) | | | |
| 114 | 22:45:53 | MessageEvent | User sent new message |
| 116 | 22:45:54 | ObservationEvent | DUPLICATE - same tool_call_id=toolu_01CGASf7KnafqkuQMuLstHi4 |
The result is:
- 1 tool_use (ActionEvent) with ID toolu_01CGASf7KnafqkuQMuLstHi4
- 2 tool_result blocks (ObservationEvents) with the same ID
Claude's API requires exactly one tool_result per tool_use.
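The one-to-one invariant can be checked directly on an exported event list. The sketch below is only an illustration: it assumes each event is a plain dict with hypothetical `kind` and `tool_call_id` keys, which may not match the actual payload returned by the events search endpoint.

```python
from collections import Counter


def find_duplicate_tool_results(events):
    """Return tool_call_ids that appear on more than one observation.

    `events` is a list of dicts using the hypothetical keys "kind" and
    "tool_call_id"; the real event schema may differ.
    """
    counts = Counter(
        e["tool_call_id"]
        for e in events
        if e["kind"] == "ObservationEvent" and e.get("tool_call_id") is not None
    )
    return sorted(tid for tid, n in counts.items() if n > 1)


# Trimmed-down version of the event stream from the table above.
events = [
    {"id": 94, "kind": "ActionEvent", "tool_call_id": "toolu_01CGASf7KnafqkuQMuLstHi4"},
    {"id": 95, "kind": "ObservationEvent", "tool_call_id": "toolu_01CGASf7KnafqkuQMuLstHi4"},
    {"id": 114, "kind": "MessageEvent", "tool_call_id": None},
    {"id": 116, "kind": "ObservationEvent", "tool_call_id": "toolu_01CGASf7KnafqkuQMuLstHi4"},
]
duplicates = find_duplicate_tool_results(events)
```

Running a check like this against a stuck conversation would confirm whether it is affected by this bug before any fix is applied.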
Likely Causes
- Event sync issue during resume: When the conversation was restarted, events may not have been fully loaded from persistence, causing get_unmatched_actions() to incorrectly identify the old action as unmatched and re-execute it.
- Missing deduplication in filter_unmatched_tool_calls(): The current implementation in View.filter_unmatched_tool_calls() uses sets to track tool_call_ids. If there are multiple observations with the same tool_call_id, they're all kept because the ID exists in the action set.
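To make the first hypothesis concrete, here is a simplified stand-in for the matching logic. It is not the SDK's `get_unmatched_actions()`; the `Event` dataclass and `unmatched_actions()` helper are hypothetical and exist only to show how a partially loaded event list could make an already-answered action look unmatched.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical, minimal event record (not the SDK's event classes)."""
    id: int
    kind: str  # "action" or "observation"
    tool_call_id: str


def unmatched_actions(events):
    """Return actions that have no observation with the same tool_call_id."""
    observed = {e.tool_call_id for e in events if e.kind == "observation"}
    return [e for e in events if e.kind == "action" and e.tool_call_id not in observed]


full = [Event(94, "action", "toolu_A"), Event(95, "observation", "toolu_A")]
partial = full[:1]  # observation 95 has not been loaded from persistence yet

fully_loaded = unmatched_actions(full)         # nothing left to execute
partially_loaded = unmatched_actions(partial)  # action 94 looks unmatched
```

If the resume path ran something like this before the full history was available, action 94 would be re-executed and a second observation appended, which matches the duplicate seen at event 116.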
Suggested Fixes
Defensive Fix (view.py)
Modify filter_unmatched_tool_calls() to track which tool_call_ids already have an observation, and keep only the first observation per tool_call_id:
```python
@staticmethod
def filter_unmatched_tool_calls(
    events: list[LLMConvertibleEvent],
) -> list[LLMConvertibleEvent]:
    # ... existing code ...

    # Track which tool_call_ids have already been seen for observations
    seen_observation_tool_call_ids: set[ToolCallID] = set()
    result = []
    for event in events:
        if event.id in removed_event_ids:
            continue
        if isinstance(event, ObservationBaseEvent):
            if event.tool_call_id in tool_call_ids_to_remove:
                continue
            # NEW: Skip duplicate observations
            if event.tool_call_id in seen_observation_tool_call_ids:
                logger.warning(
                    f"Skipping duplicate observation for tool_call_id: {event.tool_call_id}"
                )
                continue
            if event.tool_call_id is not None:
                seen_observation_tool_call_ids.add(event.tool_call_id)
        result.append(event)
    return result
```

This defensive fix would also recover existing stuck conversations by filtering out the duplicate observations at LLM message construction time.
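The effect of the extra check can be seen in a self-contained toy version. Everything here (`Ev`, `keep_matched`, `keep_matched_deduped`) is a hypothetical simplification rather than the actual code in view.py; it only contrasts set-membership filtering with first-observation-wins deduplication.

```python
from dataclasses import dataclass


@dataclass
class Ev:
    """Hypothetical, minimal event record (not the SDK's event classes)."""
    id: int
    kind: str  # "action" or "observation"
    tool_call_id: str


def keep_matched(events):
    """Set-based filter: keep events whose ID has both an action and an observation."""
    action_ids = {e.tool_call_id for e in events if e.kind == "action"}
    observation_ids = {e.tool_call_id for e in events if e.kind == "observation"}
    matched = action_ids & observation_ids
    return [e for e in events if e.tool_call_id in matched]


def keep_matched_deduped(events):
    """Same filter, but only the first observation per tool_call_id survives."""
    seen = set()
    result = []
    for e in keep_matched(events):
        if e.kind == "observation":
            if e.tool_call_id in seen:
                continue  # later duplicate: drop it
            seen.add(e.tool_call_id)
        result.append(e)
    return result


events = [
    Ev(94, "action", "toolu_A"),
    Ev(95, "observation", "toolu_A"),
    Ev(116, "observation", "toolu_A"),  # duplicate created on resume
]
before = [e.id for e in keep_matched(events)]
after = [e.id for e in keep_matched_deduped(events)]
```

Set membership alone cannot distinguish the first observation from the second, so both pass; keeping the first one preserves the result the model originally saw.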
Root Cause Fix
Investigate why duplicate observations are created during conversation resume. Key locations:
- openhands-agent-server/openhands/agent_server/event_service.py - the start() method
- openhands-sdk/openhands/sdk/conversation/state.py - get_unmatched_actions()
- Event persistence/loading logic during remote sandbox resume
Environment
- Platform: OpenHands Cloud (app.all-hands.dev)
- Model: prod/claude-opus-4-5-20251101
- Conversation ID: af22d964cb9f464e957320d647c7471e
Related Files
- openhands-sdk/openhands/sdk/context/view.py - filter_unmatched_tool_calls()
- openhands-sdk/openhands/sdk/conversation/state.py - get_unmatched_actions()
- openhands-agent-server/openhands/agent_server/event_service.py - start() method