Tool calls hang indefinitely when confirmations arrive out of order #5558

@wpfleger96

Problem

Tool calls in goosed hang permanently when multiple concurrent requests receive their confirmations out of order. The only way to recover is to restart the pod manually.

Reproduction

Trigger three or more concurrent tool calls in quick succession within the same session (e.g., rapid Slack mentions tagging a Goose-powered Slackbot). Because of network timing, confirmations may arrive in a different order than the requests were issued, which causes the hang.

Root Cause

Code location: crates/goose/src/agents/tool_execution.rs lines 81-126

let mut rx = self.confirmation_rx.lock().await;
while let Some((req_id, confirmation)) = rx.recv().await {
    if req_id == request.id {
        break; // Found matching confirmation
    }
    // Bug: Non-matching confirmation is silently discarded
}

When confirmations arrive out of order, non-matching confirmations are discarded instead of being queued. Tool requests waiting for those discarded confirmations hang forever.
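One possible direction, sketched here rather than proposed as the actual patch, is to park non-matching confirmations in a map keyed by request ID so that a later waiter can pick them up. The sketch uses the standard library's blocking channel to stay dependency-free, whereas the real code awaits an async receiver behind a lock. `ConfirmationInbox`, `parked`, `wait_for`, and the `Confirmation` enum are hypothetical names for illustration, not identifiers from the goose codebase.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver};

/// Hypothetical stand-in for the real confirmation payload.
#[derive(Debug, Clone, PartialEq)]
pub enum Confirmation {
    Approved,
    Denied,
}

/// Pairs the receiver with a buffer for confirmations that arrived
/// while a different request was waiting (assumed design, not goose's).
pub struct ConfirmationInbox {
    rx: Receiver<(String, Confirmation)>,
    parked: HashMap<String, Confirmation>,
}

impl ConfirmationInbox {
    pub fn new(rx: Receiver<(String, Confirmation)>) -> Self {
        Self { rx, parked: HashMap::new() }
    }

    /// Blocks until the confirmation for `request_id` is available.
    /// Returns `None` only if every sender is dropped before it arrives.
    pub fn wait_for(&mut self, request_id: &str) -> Option<Confirmation> {
        // A previous waiter may already have received our confirmation.
        if let Some(confirmation) = self.parked.remove(request_id) {
            return Some(confirmation);
        }
        while let Ok((id, confirmation)) = self.rx.recv() {
            if id == request_id {
                return Some(confirmation);
            }
            // Keep the non-matching confirmation instead of dropping it.
            self.parked.insert(id, confirmation);
        }
        None
    }
}

fn main() {
    let (tx, rx) = channel();
    let mut inbox = ConfirmationInbox::new(rx);

    // Confirmations arrive in the opposite order to the requests.
    tx.send(("req-2".to_string(), Confirmation::Denied)).unwrap();
    tx.send(("req-1".to_string(), Confirmation::Approved)).unwrap();

    assert_eq!(inbox.wait_for("req-1"), Some(Confirmation::Approved));
    assert_eq!(inbox.wait_for("req-2"), Some(Confirmation::Denied));
    println!("both requests resolved despite out-of-order delivery");
}
```

In an async setting, an alternative with the same effect would be to give each request its own one-shot channel and have a single dispatcher route confirmations by ID, which also removes the need for waiters to hold a shared lock while waiting.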

Race Condition

1. Request #1 locks confirmation channel, starts waiting
2. Request #2 queued for lock
3. Confirmation #2 arrives first (network timing)
4. Request #1 receives Confirmation #2, discards it (ID mismatch)
5. Request #1 gets its confirmation and completes
6. Request #2 acquires lock but its confirmation was already discarded
7. Request #2 hangs forever
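The sequence above can be replayed deterministically, which may be useful as the basis for a regression test. The following is a simplified, single-threaded sketch using the standard library's channel; running the two waits one after the other stands in for the lock that serializes them. `wait_discarding` is a hypothetical helper that mirrors the discarding loop, and the receive timeout is an assumption added so that the "hangs forever" case shows up as `None` instead of blocking the program.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::time::Duration;

/// Mirrors the loop from the issue: any confirmation that does not match
/// `request_id` is dropped. The timeout is an illustrative substitute for
/// an indefinite hang.
fn wait_discarding(
    rx: &Receiver<(u32, &'static str)>,
    request_id: u32,
    timeout: Duration,
) -> Option<&'static str> {
    while let Ok((id, confirmation)) = rx.recv_timeout(timeout) {
        if id == request_id {
            return Some(confirmation);
        }
        // Non-matching confirmation is silently discarded here.
    }
    None
}

fn main() {
    let (tx, rx) = channel();
    let timeout = Duration::from_millis(50);

    // Step 3: confirmation #2 arrives before confirmation #1.
    tx.send((2, "approve")).unwrap();
    tx.send((1, "approve")).unwrap();

    // Steps 4-5: request #1 drops confirmation #2, then finds its own.
    let first = wait_discarding(&rx, 1, timeout);
    // Steps 6-7: request #2 waits for a confirmation that no longer exists.
    let second = wait_discarding(&rx, 2, timeout);

    assert_eq!(first, Some("approve"));
    assert_eq!(second, None);
    println!("request 1: {:?}, request 2: {:?}", first, second);
}
```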

Metadata

Labels

p2 (Priority 2 - Medium)
