Conversation
- Created comprehensive test suite for execute_bash functionality: * test_bash_session.py: Tests for BashSession class with 12 test cases * test_bash_parsing.py: Tests for bash command parsing utilities with 30 test cases * test_bash_ps1_metadata.py: Tests for PS1 metadata parsing with 12 test cases * conftest.py: Shared test utilities to reduce code duplication - Fixed all import statements to use correct module paths: * Updated imports from old openhands_logger to openhands.core.logger.get_logger * Fixed imports to use openhands.core.runtime.tools.execute_bash modules * Added proper __init__.py files for test directory structure - Enhanced BashSession class with bug fixes and improvements: * Added no_change_timeout_seconds parameter to constructor for configurable timeouts * Fixed critical timeout logic bug: changed 'if is_blocking' to 'if not is_blocking' * Fixed initialization order: set _initialized=True before calling _clear_screen() * Added proper session cleanup handling in close() method * Fixed destructor to initialize _closed=False in constructor - Enhanced ExecuteBashObservation class: * Added command_id property to get PID from metadata * Added message property for formatted command execution description * Fixed field name from 'content' to 'output' throughout codebase - Updated all ExecuteBashAction usage: * Added required security_risk parameter to all instances * Fixed constructor calls to use proper keyword arguments * Replaced deprecated CmdRunAction references with ExecuteBashAction - All 54 tests now pass successfully - Code passes pre-commit checks (formatting, linting, type checking) Co-authored-by: openhands <openhands@all-hands.dev>
2 tasks
9 tasks
xingyaoww
pushed a commit
that referenced
this pull request
Feb 9, 2026
This commit addresses the critical issues raised in the code review: ## Critical Issue #1: Data Structure Design Flaw - Refactored AgentState to AgentStateRegistry using dict[str, Any] pattern - Provides loosely-coupled storage for agent-specific state - Agents store state using string keys instead of typed fields - Maintains backward compatibility with AgentState alias ## Critical Issue #2: Hidden State Mutation - Fixed _check_iterative_refinement to only increment iteration counter when refinement will actually continue - State mutation now happens AFTER validation checks pass - Function name now accurately reflects its behavior ## Critical Issue #3: Missing Tests - Added comprehensive tests for AgentStateRegistry (14 tests) - Added tests for IterativeRefinementConfig validation (4 tests) - Added tests for _check_iterative_refinement logic (12 tests) - Tests cover edge cases: max iterations, threshold boundaries, etc. ## Suggestion #5: Unnecessary Forward Reference - Removed quotes from LLM type hint in example ## Suggestion #7: Unused Method - Removed reset_iterative_refinement() as part of registry refactor ## Note on Issue #4 (FinishTool Special Case) The suggestion to move iterative refinement to the conversation layer is a valid architectural improvement but requires more significant changes. This is deferred to a future PR to keep this change focused. Co-authored-by: openhands <openhands@all-hands.dev>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.