Bug: Duplicate ObservationEvent with same tool_call_id causes LLM API error on conversation resume #1782

## Bug Description

**⚠️ This bug puts the conversation in an unrecoverable state. Once triggered, the agent cannot proceed and the conversation is permanently stuck. Every subsequent attempt to run the agent fails with the same error.**

When a conversation is resumed after being paused or finished, a duplicate `ObservationEvent` can be created with the same `tool_call_id` as an existing observation. The Anthropic API then rejects the request with:

```
litellm.BadRequestError: Error code: 400 - {
  "error": {
    "message": "litellm.BadRequestError: AnthropicException - {
      \"type\": \"error\",
      \"error\": {
        \"type\": \"invalid_request_error\",
        \"message\": \"messages.59: `tool_use` ids were found without `tool_result` blocks immediately after: toolu_01CGASf7KnafqkuQMuLstHi4. Each `tool_use` block must have a corresponding `tool_result` block in the next message.\"
      },
      \"request_id\": \"req_011CXMLWpMmp7TBFbCVyLXY3\"
    }
    No fallback model group found for original model_group=prod/claude-opus-4-5-20251101.
    Fallbacks=[{"qwen3-coder-480b": ["qwen3-coder-480b-or"]}, {"glm-4.5": ["glm-4.5-or"]}].
    Received Model Group=prod/claude-opus-4-5-20251101
    Available Model Group Fallbacks=None",
    "type": null,
    "param": null,
    "code": "400"
  }
}
```

Because the duplicate observation is persisted in the event stream, the conversation cannot recover on its own: every LLM call includes the malformed message history and fails.

## Steps to Reproduce

Observed in conversation `af22d964cb9f464e957320d647c7471e`:

1. Start a conversation and let the agent run actions
2. Let the conversation finish normally (agent calls `finish` tool)
3. Wait some time (in this case ~1.5 hours)
4. Resume the conversation by sending a new user message
5. The conversation fails with the tool_use/tool_result mismatch error
6. **All subsequent attempts to continue the conversation fail with the same error**

## Root Cause Analysis

The event stream retrieved via `/api/v1/conversation/{id}/events/search` shows:

| Event | Timestamp | Type | Details |
|-------|-----------|------|------|
| 94 | 21:04:03 | ActionEvent | `tool_call_id=toolu_01CGASf7KnafqkuQMuLstHi4` |
| 95 | 21:04:24 | ObservationEvent | First observation for action 94 |
| 110 | 21:06:10 | ObservationEvent | Finish action - conversation completed |
| 111 | 21:06:10 | StateUpdate | `execution_status=finished` |
| *(1.5 hour gap)* | | | |
| 114 | 22:45:53 | MessageEvent | User sent new message |
| 116 | 22:45:54 | ObservationEvent | **DUPLICATE** - same `tool_call_id=toolu_01CGASf7KnafqkuQMuLstHi4` |

The result is:
- 1 `tool_use` (ActionEvent) with ID `toolu_01CGASf7KnafqkuQMuLstHi4`
- 2 `tool_result` (ObservationEvents) with the same ID

Claude's API requires exactly one `tool_result` per `tool_use`.
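
A quick way to confirm this condition in an exported event stream is to group observations by `tool_call_id` and report any ID that appears more than once. The sketch below is a minimal illustration only: the `Event` dataclass and `find_duplicate_observations` helper are hypothetical stand-ins, not the SDK's event classes, and the sample data mirrors the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    # Hypothetical, simplified stand-in for the SDK's event classes.
    id: int
    kind: str  # e.g. "ActionEvent", "ObservationEvent", "MessageEvent"
    tool_call_id: Optional[str] = None


def find_duplicate_observations(events: list[Event]) -> dict[str, list[int]]:
    """Map each tool_call_id that has more than one observation to its event ids."""
    by_call: dict[str, list[int]] = {}
    for event in events:
        if event.kind == "ObservationEvent" and event.tool_call_id is not None:
            by_call.setdefault(event.tool_call_id, []).append(event.id)
    return {call_id: ids for call_id, ids in by_call.items() if len(ids) > 1}


# Sample data mirroring the events listed in the table above.
events = [
    Event(94, "ActionEvent", "toolu_01CGASf7KnafqkuQMuLstHi4"),
    Event(95, "ObservationEvent", "toolu_01CGASf7KnafqkuQMuLstHi4"),
    Event(114, "MessageEvent"),
    Event(116, "ObservationEvent", "toolu_01CGASf7KnafqkuQMuLstHi4"),
]
duplicates = find_duplicate_observations(events)
```

For the sample data, `duplicates` contains a single entry for `toolu_01CGASf7KnafqkuQMuLstHi4` pointing at events 95 and 116.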

## Likely Causes

1. **Event sync issue during resume**: When the conversation was restarted, events may not have been fully loaded from persistence, causing `get_unmatched_actions()` to incorrectly identify the old action as unmatched and re-execute it.

2. **Missing deduplication in `filter_unmatched_tool_calls()`**: The current implementation in `View.filter_unmatched_tool_calls()` uses sets to track `tool_call_id` values. If there are multiple observations with the same `tool_call_id`, they are all kept because the ID exists in the action set.
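
The second cause can be illustrated with a simplified model of set-based matching. This is not the actual `View.filter_unmatched_tool_calls()` code; the variable names and the tuple layout are assumptions made for the example. It only shows why a set of IDs cannot tell one observation from two.

```python
# Simplified model: each observation is (event_id, tool_call_id).
action_tool_call_ids = {"toolu_A"}
observations = [("obs-95", "toolu_A"), ("obs-116", "toolu_A")]

# Building a set collapses the two observations into a single ID,
# so the duplicate is invisible at this point.
observation_tool_call_ids = {tool_call_id for _, tool_call_id in observations}

# IDs that appear on only one side are considered unmatched and removed.
unmatched = action_tool_call_ids ^ observation_tool_call_ids

# Every observation whose ID is matched survives, including the duplicate.
kept = [obs for obs in observations if obs[1] not in unmatched]
```

Here `unmatched` is empty and `kept` still holds both observations, which is the shape of message history the API rejects.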

## Suggested Fixes

### Defensive Fix (view.py)

Modify `filter_unmatched_tool_calls()` to remember which `tool_call_id` values have already produced an observation, and keep only the first observation for each:

```python
@staticmethod
def filter_unmatched_tool_calls(
    events: list[LLMConvertibleEvent],
) -> list[LLMConvertibleEvent]:
    # ... existing code ...
    
    # Track which tool_call_ids have already been seen for observations
    seen_observation_tool_call_ids: set[ToolCallID] = set()
    
    result = []
    for event in events:
        if event.id in removed_event_ids:
            continue
        if isinstance(event, ObservationBaseEvent):
            if event.tool_call_id in tool_call_ids_to_remove:
                continue
            # NEW: Skip duplicate observations
            if event.tool_call_id in seen_observation_tool_call_ids:
                logger.warning(
                    f"Skipping duplicate observation for tool_call_id: {event.tool_call_id}"
                )
                continue
            if event.tool_call_id is not None:
                seen_observation_tool_call_ids.add(event.tool_call_id)
        result.append(event)
    return result
```

This defensive fix would also **recover existing stuck conversations** by filtering out the duplicate observations at LLM message construction time.
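
A regression test for this behavior could look roughly like the following. It is a sketch under stated assumptions: `FakeEvent` and `keep_first_observation` are hypothetical, simplified stand-ins that reproduce only the deduplication step, so a real test would need to build actual SDK events and call `View.filter_unmatched_tool_calls()` instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeEvent:
    # Hypothetical stand-in; the real events carry much more state.
    id: str
    is_observation: bool
    tool_call_id: Optional[str] = None


def keep_first_observation(events: list[FakeEvent]) -> list[FakeEvent]:
    """Keep only the first observation for each tool_call_id."""
    seen: set[str] = set()
    result: list[FakeEvent] = []
    for event in events:
        if event.is_observation and event.tool_call_id is not None:
            if event.tool_call_id in seen:
                continue  # later observation for an already answered call
            seen.add(event.tool_call_id)
        result.append(event)
    return result


def test_duplicate_observation_is_dropped() -> None:
    events = [
        FakeEvent("94", is_observation=False, tool_call_id="toolu_X"),
        FakeEvent("95", is_observation=True, tool_call_id="toolu_X"),
        FakeEvent("114", is_observation=False),
        FakeEvent("116", is_observation=True, tool_call_id="toolu_X"),
    ]
    kept_ids = [event.id for event in keep_first_observation(events)]
    assert kept_ids == ["94", "95", "114"]
```

Keeping the first observation rather than the last preserves the result the agent originally acted on, which seems the safer default for a stuck conversation.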

### Root Cause Fix

Investigate why duplicate observations are created during conversation resume. Key locations:
- `openhands-agent-server/openhands/agent_server/event_service.py` - the `start()` method
- `openhands-sdk/openhands/sdk/conversation/state.py` - `get_unmatched_actions()`
- Event persistence/loading logic during remote sandbox resume
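
While investigating, it may help to keep the suspected failure mode in mind. The sketch below is a simplified model of "find actions without an observation", not the SDK's `get_unmatched_actions()`; the function name and tuple layout are assumptions. It shows how a partially loaded event list would make an already answered action look unmatched.

```python
def unmatched_actions(events: list[tuple[int, str, str]]) -> list[int]:
    """Return ids of actions with no observation; events are (id, kind, tool_call_id)."""
    observed = {call_id for _, kind, call_id in events if kind == "observation"}
    return [
        event_id
        for event_id, kind, call_id in events
        if kind == "action" and call_id not in observed
    ]


full_history = [(94, "action", "toolu_X"), (95, "observation", "toolu_X")]

# Assumed failure mode: the observation has not been loaded yet on resume.
partial_history = [event for event in full_history if event[0] != 95]

complete_result = unmatched_actions(full_history)
partial_result = unmatched_actions(partial_history)
```

With the full history nothing is unmatched, but with the partial history action 94 is reported again, which would explain a second observation being written for the same `tool_call_id`.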

## Environment

- Platform: OpenHands Cloud (app.all-hands.dev)
- Model: `prod/claude-opus-4-5-20251101`
- Conversation ID: `af22d964cb9f464e957320d647c7471e`

## Related Files

- `openhands-sdk/openhands/sdk/context/view.py` - `filter_unmatched_tool_calls()`
- `openhands-sdk/openhands/sdk/conversation/state.py` - `get_unmatched_actions()`
- `openhands-agent-server/openhands/agent_server/event_service.py` - `start()` method
