Enforce batch atomicity for condenser#775

Merged
ryanhoangt merged 8 commits into main from ht/fix-extended-thinking-failed-with-condenser
Oct 20, 2025
Conversation

ryanhoangt (Collaborator) commented Oct 17, 2025:

Fix #738


Integration tests: 7/7 (100%)

[10/17/25 13:48:27] INFO     Success rate: 100.00% (7/7)             run_infer.py:278
[10/17/25 13:48:27] INFO     Evaluation Results:                     run_infer.py:279
[10/17/25 13:48:27] INFO     t04_git_staging: ✓ - Successfully       run_infer.py:283
                             committed changes with message: 'Add                    
                             hello world Python script'                              
[10/17/25 13:48:27] INFO     t07_interactive_commands: ✓ -           run_infer.py:283
                             Interactive Python script setup                         
                             completed. Agent should execute the                     
                             script with inputs 'John' and '25' and                  
                             find the secret number: 707                             
[10/17/25 13:48:27] INFO     t05_simple_browsing: ✓ - Agent          run_infer.py:283
                             successfully found the answer! Matched                  
                             pattern: (?i)openhands is all you need.                 
                             Response contained the expected content                 
                             about OpenHands.                                        
[10/17/25 13:48:27] INFO     t03_jupyter_write_file: ✓ -             run_infer.py:283
                             Successfully created file with content:                 
                             hello world                                             
[10/17/25 13:48:27] INFO     t01_fix_simple_typo: ✓ - Successfully   run_infer.py:283
                             fixed all typos                                         
[10/17/25 13:48:27] INFO     t02_add_bash_hello: ✓ - Successfully    run_infer.py:283
                             created and executed script: hello                      
[10/17/25 13:48:27] INFO     t06_github_pr_browsing: ✓ - Agent's     run_infer.py:283
                             final answer contains information about                 
                             the PR content                                          
[10/17/25 13:48:27] INFO     Total cost: $0.27                       run_infer.py:284

Agent Server images for this PR

GHCR package: https://github.com/All-Hands-AI/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant  Base Image
golang   golang:1.21-bookworm
java     eclipse-temurin:17-jdk
python   nikolaik/python-nodejs:python3.12-nodejs22

Pull (multi-arch manifest)

docker pull ghcr.io/all-hands-ai/agent-server:14e6b9a-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-14e6b9a-python \
  ghcr.io/all-hands-ai/agent-server:14e6b9a-python

All tags pushed for this build

ghcr.io/all-hands-ai/agent-server:14e6b9a-golang
ghcr.io/all-hands-ai/agent-server:v1.0.0a2_golang_tag_1.21-bookworm_binary
ghcr.io/all-hands-ai/agent-server:14e6b9a-java
ghcr.io/all-hands-ai/agent-server:v1.0.0a2_eclipse-temurin_tag_17-jdk_binary
ghcr.io/all-hands-ai/agent-server:14e6b9a-python
ghcr.io/all-hands-ai/agent-server:v1.0.0a2_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_binary

The 14e6b9a tag is a multi-arch manifest (amd64/arm64); your client pulls the right arch automatically.

github-actions bot commented Oct 17, 2025:

Coverage

Coverage Report

File                                          Stmts  Miss  Cover  Missing
openhands-sdk/openhands/sdk/context/view.py     114    79    30%  41, 46, 51–52, 57–58, 63–67, 83–87, 89, 102–108, 110, 112, 114, 116–117, 123, 134–135, 137, 148–152, 159–161, 165–166, 175–178, 180, 187–192, 194–196, 201, 203, 211–212, 216–221, 223–224, 226–227, 231–237, 239
TOTAL                                          7987  3423    57%

ryanhoangt requested a review from xingyaoww October 17, 2025 13:48
enyst (Collaborator) previously requested changes Oct 17, 2025:
I have a concern here, I think that both Claude and GPT models may crash if we repeat the same reasoning block... but I could be wrong. Do we have a log with parallel tool calls on the PR branch?

ryanhoangt (Author), quoting enyst:
Do we have a log with parallel tool calls on the PR branch?

Yep, it's in the associated issue.

@enyst enyst dismissed their stale review October 17, 2025 14:19

Dismissed, I misunderstood the behavior

ryanhoangt requested a review from enyst October 20, 2025 10:24
ryanhoangt requested a review from csmith49 October 20, 2025 10:43
OpenHands deleted a comment from openhands-ai bot Oct 20, 2025
Review thread on a diff hunk (docstring excerpt, truncated):

…batch are forgotten.

This prevents partial batches from being sent to the LLM, which can cause
API errors when thinking blocks are separated from their tool calls.
enyst (Collaborator) commented Oct 20, 2025:

Just a thought: I think the action events may be in the order we received the tool calls from the LLM? If so, maybe we could check in a simpler way: whether the event(s) just before forgotten_event_ids have the same response_id as the first of those forgotten.

Really just a thought, it's all good with checking this way too
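The adjacency check suggested above could be sketched roughly as follows. This is a hypothetical illustration, not the SDK's implementation: the Event type, field names, and function signature are assumptions inferred from the discussion.

```python
from collections import namedtuple

# Hypothetical minimal event type; the real SDK event classes differ.
Event = namedtuple("Event", ["id", "response_id"])


def extend_forgotten_backwards(events, forgotten_event_ids):
    """Walk backwards from the first forgotten event and also forget any
    immediately preceding events that share its response_id, i.e. that
    came from the same LLM tool-call batch."""
    forgotten = set(forgotten_event_ids)
    # Index of the first forgotten event, in received order.
    first_idx = next(
        (i for i, ev in enumerate(events) if ev.id in forgotten), None
    )
    if first_idx is None:
        return forgotten
    batch_id = events[first_idx].response_id
    i = first_idx - 1
    # Preceding events with the same response_id belong to the same batch.
    while i >= 0 and events[i].response_id == batch_id:
        forgotten.add(events[i].id)
        i -= 1
    return forgotten
```

Note this sketch only extends the forgotten set backwards from the first forgotten event, matching the scope of the suggestion; the grouping approach actually merged is more general.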

# Enforce batch atomicity: if any event in a multi-action batch is forgotten,
# forget all events in that batch to prevent partial batches with thinking
# blocks separated from their tool calls
forgotten_event_ids = View._enforce_batch_atomicity(events, forgotten_event_ids)
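A minimal sketch of what a helper like _enforce_batch_atomicity might do, assuming actions from one LLM response can be grouped by a shared response_id; the Event type and signature here are illustrative assumptions, not the SDK's actual code.

```python
from collections import namedtuple

# Hypothetical minimal event type; the real SDK event classes differ.
Event = namedtuple("Event", ["id", "response_id"])


def enforce_batch_atomicity(events, forgotten_event_ids):
    """If any event in a multi-action batch is forgotten, forget the whole
    batch, so thinking blocks stay attached to their sibling tool calls."""
    forgotten = set(forgotten_event_ids)
    # Group action events by the LLM response that produced them.
    batches = {}
    for ev in events:
        batches.setdefault(ev.response_id, []).append(ev)
    for batch in batches.values():
        ids = {ev.id for ev in batch}
        if ids & forgotten:  # partial overlap: forget the entire batch
            forgotten |= ids
    return forgotten
```

The key property is all-or-nothing: a batch is either fully kept or fully forgotten, so a condensed history never sends the LLM a thinking block whose tool calls were dropped.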
A collaborator commented:

I think it might make sense to call this after we call filter_unmatched_tool_calls? Not sure if it's even possible for that function to break up a batch but maybe better safe than sorry.

ryanhoangt (Author) replied:

I think it makes more sense to do this event removal before filter_unmatched_tool_calls, since _enforce_batch_atomicity only removes actions, not observations. Putting filter_unmatched_tool_calls at the end can then help remove any left-over observations that don't have a corresponding action.

ryanhoangt (Author):

Maybe I'll go with this for now, we can reconsider this later if we run into issues!

A collaborator replied:

Yeah, makes sense to me.

csmith49 (Collaborator) left a review:

Modulo a few minor suggestions, I think this is good to go!

ryanhoangt changed the title from "Duplicate thinking_blocks when splitting from Message into actions" to "Enforce batch atomicity for condenser" Oct 20, 2025
ryanhoangt merged commit 3c4ce52 into main Oct 20, 2025
16 checks passed
ryanhoangt deleted the ht/fix-extended-thinking-failed-with-condenser branch October 20, 2025 21:08
vivekvjnk pushed a commit to vivekvjnk/agent-sdk that referenced this pull request Nov 17, 2025
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>


Development

Successfully merging this pull request may close these issues.

If an assistant message contains any thinking blocks, the first block must be thinking or redacted_thinking

4 participants