Bug: Bash command polling stops after 2 attempts, causing agent loop to hang #1633

@simonrosenberg

Summary

Polling for bash command results stops after 2 attempts, causing the agent loop to hang indefinitely. In SWE-bench evaluations this leads to 20-minute idle timeouts and 404 errors.

Environment

  • Model: litellm_proxy/gpt-5-mini-2025-08-07
  • SDK: Remote workspace (eval-runtime cluster)
  • Job: eval-eval-20848009420-gpt-5-mini

Root Cause

After executing POST /api/bash/start_bash_command, the SDK polls GET /api/bash/bash_events/search only twice (~100 ms in total). If the command has not completed by then, bash_events polling stops and the SDK switches to polling the conversation endpoint only. The bash result is never retrieved, so the agent loop hangs.

Evidence: 4 Failed Runtimes Show Identical Pattern

Runtime 1: byqchmjqpdgxhdkl (django__django-11095)

10:29:23.xxx | POST /api/bash/start_bash_command HTTP/1.1" 200
10:29:23.xxx | GET /api/bash/bash_events/search?command_id__eq=9039afd2... 200  ← 1st check
10:29:23.xxx | GET /api/bash/bash_events/search?command_id__eq=9039afd2... 200  ← 2nd check
10:29:23.xxx | GET /api/conversations/1860b2a8-... 200  ← STOPS checking bash_events
10:29:24.xxx | GET /api/conversations/1860b2a8-... 200
10:29:25.xxx | GET /api/conversations/1860b2a8-... 200
... (continues for 20 minutes) ...
10:50:08 | [KILLED - idle for 1244 seconds]

Runtime 2: hkbjctmbjbbgrycx (django__django-13670)

10:27:37.xxx | POST /api/bash/start_bash_command HTTP/1.1" 200
10:27:37.xxx | GET /api/bash/bash_events/search?command_id__eq=17d36de8... 200  ← 1st check
10:27:37.xxx | GET /api/bash/bash_events/search?command_id__eq=17d36de8... 200  ← 2nd check
10:27:38.xxx | GET /api/conversations/3a6cefbd-... 200  ← STOPS checking bash_events
... (continues for 22 minutes) ...
10:50:08 | [KILLED - idle for 1351 seconds]

Runtime 3: shgiheepkuhjmnjp (pydata__xarray-6992)

10:34:03.xxx | POST /api/bash/start_bash_command HTTP/1.1" 200
10:34:03.xxx | GET /api/bash/bash_events/search?command_id__eq=9c7a0443... 200  ← 1st check
10:34:03.xxx | GET /api/bash/bash_events/search?command_id__eq=9c7a0443... 200  ← 2nd check
10:34:03.xxx | GET /api/conversations/18ab24cc-... 200  ← STOPS checking bash_events
... (continues for 21 minutes) ...
10:55:08 | [KILLED - idle for 1264 seconds]

Runtime 4: jtvryxnvddglunzx (django__django-15957)

10:31:38.xxx | POST /api/bash/start_bash_command HTTP/1.1" 200
10:31:38.xxx | GET /api/bash/bash_events/search?command_id__eq=df0a5810... 200  ← 1st check
10:31:39.xxx | GET /api/bash/bash_events/search?command_id__eq=df0a5810... 200  ← 2nd check
10:31:39.xxx | GET /api/conversations/3527bac8-... 200  ← STOPS checking bash_events
... (continues for 23 minutes) ...
10:55:08 | [KILLED - idle for 1408 seconds]
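
The pattern in the four excerpts above can be checked mechanically across more runtimes. The following is a minimal sketch, assuming access-log lines shaped like the excerpts (a `METHOD /path` pair somewhere on each line). The function name `count_bash_polls` and the sample lines are illustrative and not part of the SDK.

```python
import re
from collections import Counter

# Matches the request part of an access-log line, e.g. "GET /api/bash/...".
REQUEST_RE = re.compile(r"\b(GET|POST)\s+(/\S*)")
# Extracts the (possibly truncated) hex command id from the query string.
COMMAND_ID_RE = re.compile(r"command_id__eq=([0-9a-f]+)")


def count_bash_polls(lines):
    """Count bash_events/search requests per command id prefix."""
    polls = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        method, path = match.groups()
        if method == "GET" and path.startswith("/api/bash/bash_events/search"):
            id_match = COMMAND_ID_RE.search(path)
            if id_match:
                polls[id_match.group(1)] += 1
    return polls


# Sample lines copied from the Runtime 1 excerpt.
sample = [
    '10:29:23.xxx | POST /api/bash/start_bash_command HTTP/1.1" 200',
    "10:29:23.xxx | GET /api/bash/bash_events/search?command_id__eq=9039afd2... 200",
    "10:29:23.xxx | GET /api/bash/bash_events/search?command_id__eq=9039afd2... 200",
    "10:29:23.xxx | GET /api/conversations/1860b2a8-... 200",
]
print(count_bash_polls(sample))
```

A command id whose count stays at 2 while conversation polling continues for minutes would match the failure described here.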

All 8 Affected Instances

| Instance                     | Runtime ID       | Last POST | Killed At | Idle Time |
|------------------------------|------------------|-----------|-----------|-----------|
| django__django-11095         | byqchmjqpdgxhdkl | 10:29:23  | 10:50:08  | 1244s     |
| django__django-13670         | hkbjctmbjbbgrycx | 10:27:37  | 10:50:08  | 1351s     |
| sympy__sympy-23534           | mrjilxopivitvlvb | N/A       | 10:50:09  | 1494s     |
| pydata__xarray-6992          | shgiheepkuhjmnjp | 10:34:03  | 10:55:08  | 1264s     |
| django__django-15957         | jtvryxnvddglunzx | 10:31:38  | 10:55:08  | 1408s     |
| django__django-10097         | qqttditbjzxtwbcq | N/A       | 11:00:08  | 1277s     |
| django__django-15499         | aafaqbawqxlhuocb | N/A       | 11:05:08  | 1317s     |
| matplotlib__matplotlib-22871 | mmgirxwhisdulrrn | N/A       | 11:05:08  | 1493s     |
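
As a rough cross-check of the table, the gap between the last POST and the kill time should be close to the reported idle time for the four rows that have both timestamps. The sketch below assumes both timestamps fall on the same day. The helper name `gap_seconds` is illustrative.

```python
from datetime import datetime

# (instance, last POST, killed at, reported idle seconds), copied from the table.
ROWS = [
    ("django__django-11095", "10:29:23", "10:50:08", 1244),
    ("django__django-13670", "10:27:37", "10:50:08", 1351),
    ("pydata__xarray-6992", "10:34:03", "10:55:08", 1264),
    ("django__django-15957", "10:31:38", "10:55:08", 1408),
]


def gap_seconds(start, end):
    """Seconds between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


for instance, last_post, killed_at, idle in ROWS:
    gap = gap_seconds(last_post, killed_at)
    print(f"{instance}: gap={gap}s reported_idle={idle}s diff={gap - idle}s")
```

For these four rows the gap exceeds the reported idle time by at most 2 seconds, which is consistent with the runtime going idle right after the last bash-related request.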

Failure Sequence

1. LLM generates ToolCallAction (bash command)           ✓ Works
2. SDK sends POST /api/bash/start_bash_command           ✓ Works (200 OK)
3. SDK polls GET /api/bash/bash_events/search            ✓ Works (1st check)
4. SDK polls GET /api/bash/bash_events/search            ✓ Works (2nd check)
5. SDK stops polling bash_events                         ✗ BUG - should continue
6. SDK only polls GET /api/conversations/...             ✗ Wrong - waiting for nothing
7. No ObservationEvent recorded                          ✗ Agent loop stuck
8. 20 minutes pass with no tool executions
9. Runtime killed for idle (1200s threshold)
10. Evaluator gets 404 → retry → resource pressure

Expected Behavior

The SDK should continue polling bash_events/search until:

  • The command completes (exit event received), OR
  • A configurable timeout is reached (then emit ErrorObservation)

Suggested Fix

In the bash command execution code, replace the current polling logic:

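# Current (broken): only 2 attempts, then the loop falls through without a result
for _ in range(2):
    result = poll_bash_events(command_id)
    if result.completed:
        return result

# Fixed: poll until completion or timeout
import asyncio
import time

# wait_for_bash_result is an illustrative name; the await requires a coroutine
async def wait_for_bash_result(command_id):
    deadline = time.monotonic() + BASH_TIMEOUT  # monotonic clock is immune to wall-clock jumps
    while time.monotonic() < deadline:
        result = poll_bash_events(command_id)
        if result.completed:
            return result
        await asyncio.sleep(0.1)  # yield to the event loop between polls
    # The caller should turn this into an ErrorObservation
    raise TimeoutError(f"Bash command {command_id} did not complete within {BASH_TIMEOUT}s")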
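
To guard against a regression, the polling loop can be exercised with a fake poller that reports completion only after several attempts. The sketch below is self-contained and hypothetical: `poll_until_complete`, `PollResult`, and the injected `poll` callable are stand-ins for whatever the SDK actually uses. On timeout it returns an error-style result rather than raising, to mirror the ErrorObservation proposed under Expected Behavior.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PollResult:
    """Hypothetical shape of one bash_events/search response."""
    completed: bool
    output: str = ""
    error: Optional[str] = None


async def poll_until_complete(
    poll: Callable[[str], PollResult],
    command_id: str,
    timeout: float = 5.0,
    interval: float = 0.01,
) -> PollResult:
    """Poll until the command completes; return an error result on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = poll(command_id)
        if result.completed:
            return result
        await asyncio.sleep(interval)
    message = f"Bash command {command_id} did not complete within {timeout}s"
    return PollResult(completed=False, error=message)


def make_fake_poller(complete_after: int):
    """Return a poller that reports completion on the Nth call, plus its call log."""
    calls = []

    def poll(command_id: str) -> PollResult:
        calls.append(command_id)
        done = len(calls) >= complete_after
        return PollResult(completed=done, output="ok" if done else "")

    return poll, calls


# A command that needs 5 polls would be lost by a 2-attempt loop.
poll, calls = make_fake_poller(complete_after=5)
result = asyncio.run(poll_until_complete(poll, "cmd-1"))
assert result.completed and len(calls) == 5

# A command that never completes yields an error result instead of hanging.
never, _ = make_fake_poller(complete_after=10**9)
timed_out = asyncio.run(poll_until_complete(never, "cmd-2", timeout=0.05))
assert not timed_out.completed and timed_out.error is not None
```

Injecting the poller keeps the test independent of the HTTP layer, so the same check applies whether the real implementation polls synchronously or through an async client.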
