@Ethan0456
Contributor

Why are these changes needed?

This PR introduces a baseline self-debugging loop to the `CodeExecutorAgent`.

The loop automatically retries code generation and execution up to a configurable number of attempts (n) until the execution succeeds or the retry limit is reached.

This enables the agent to recover from failures in generated code (e.g., syntax errors, runtime errors) by using its own reasoning to iteratively improve that code, which lays the foundation for more robust autonomous behavior.
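
For readers who want the shape of such a loop, here is a minimal, self-contained sketch. It is not the code from this PR: `RunResult`, `run_python`, and `self_debug` are hypothetical names, the snippet runs code in-process with `exec` purely for illustration (a real agent would go through a sandboxed executor), and it assumes a budget of `max_retries` means `max_retries + 1` total attempts.

```python
import contextlib
import io
import traceback
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RunResult:
    """Hypothetical result record: 0 means the snippet ran without raising."""
    exit_code: int
    output: str


def run_python(code: str) -> RunResult:
    """Run a snippet in-process, capturing stdout or the traceback (illustration only)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # compile() surfaces syntax errors; exec() surfaces runtime errors.
            exec(compile(code, "<generated>", "exec"), {})
    except Exception:
        return RunResult(exit_code=1, output=traceback.format_exc())
    return RunResult(exit_code=0, output=buffer.getvalue())


def self_debug(generate: Callable[[List[str]], str], max_retries: int) -> RunResult:
    """Generate and run code, retrying with accumulated error output on failure."""
    feedback: List[str] = []
    result = RunResult(exit_code=1, output="")
    for _attempt in range(max_retries + 1):
        code = generate(feedback)  # the generator sees all earlier error output
        result = run_python(code)
        if result.exit_code == 0:
            break
        feedback.append(result.output)
    return result


def fake_model(feedback: List[str]) -> str:
    """Stand-in for a model: broken code first, fixed code once it has seen an error."""
    return "print('ok'" if not feedback else "print('ok')"
```

With `self_debug(fake_model, max_retries=1)` the first attempt fails to compile and the second succeeds; with `max_retries=0` the single attempt fails and the error output is returned.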

Related issue number

Closes #6207

Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>
@Ethan0456 Ethan0456 marked this pull request as ready for review April 17, 2025 17:22
- Introduced `max_retries_on_error` (default: 0) to control the number of code generation and execution retries on failure.
- Added `retry_attempt` field to `CodeGenerationEvent` and `CodeExecutionEvent` to track retry attempts.
- Refactored `execute_code_block` to return `CodeResult`.

Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>
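
As a rough illustration of how a `retry_attempt` counter can tag both kinds of event, and why a default of 0 keeps the original single-shot behaviour, consider the sketch below. Everything in it is an assumption made for the example: `SketchCodeResult`, `GenerationEvent`, `ExecutionEvent`, `run_with_events`, and `syntax_only_execute` are hypothetical stand-ins, not the agent's real classes, and "execution" is reduced to a syntax check.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Union


@dataclass
class SketchCodeResult:
    """Hypothetical stand-in for an execution result."""
    exit_code: int
    output: str


@dataclass
class GenerationEvent:
    retry_attempt: int  # 0 for the first attempt, 1 for the first retry, ...
    code: str


@dataclass
class ExecutionEvent:
    retry_attempt: int
    result: SketchCodeResult


def run_with_events(
    candidates: List[str],
    execute: Callable[[str], SketchCodeResult],
    max_retries_on_error: int = 0,
) -> Iterator[Union[GenerationEvent, ExecutionEvent]]:
    """Emit one generation and one execution event per attempt, stopping on success."""
    for retry_attempt, code in enumerate(candidates[: max_retries_on_error + 1]):
        yield GenerationEvent(retry_attempt, code)
        result = execute(code)
        yield ExecutionEvent(retry_attempt, result)
        if result.exit_code == 0:
            return


def syntax_only_execute(code: str) -> SketchCodeResult:
    """Toy executor: only checks that the snippet compiles."""
    try:
        compile(code, "<candidate>", "exec")
    except SyntaxError as exc:
        return SketchCodeResult(1, f"SyntaxError: {exc.msg}")
    return SketchCodeResult(0, "")
```

Calling `list(run_with_events(["x = (", "x = 1"], syntax_only_execute, 1))` produces four events, the last two carrying `retry_attempt == 1`; leaving `max_retries_on_error` at its default produces only the first attempt's two events.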
Contributor

@ekzhu ekzhu left a comment

We need to add unit tests for this using ReplayChatCompletionClient: first generate code with a syntax error, then generate code with correct syntax, to simulate the debugging loop.

@Ethan0456
Contributor Author

Hi @ekzhu, I’ve added a unit test for the self-debugging loop in CodeExecutorAgent. The test first generates code with a syntax error, then retries with the correct code, and verifies that the agent handles the failure and retry logic as expected.
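
The pattern described here (replay a broken response, then a corrected one, and assert on the retry) can be sketched without any framework dependency. This is not the test added in this PR: `ScriptedModel` is a hypothetical stand-in for a replay-style client, `generate_with_retries` is a toy loop written for the example, and "failure" here only means the code does not parse.

```python
import ast
from typing import List, Tuple


class ScriptedModel:
    """Hypothetical replay-style client: returns canned responses in order."""

    def __init__(self, responses: List[str]) -> None:
        self._responses = list(responses)
        self.calls = 0

    def create(self, prompt: str) -> str:
        response = self._responses[self.calls]
        self.calls += 1
        return response


def has_valid_syntax(code: str) -> bool:
    """Return True if the snippet parses as Python."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


def generate_with_retries(
    model: ScriptedModel, task: str, max_retries_on_error: int
) -> Tuple[str, int, bool]:
    """Return (last code, last retry_attempt, succeeded) for a toy retry loop."""
    prompt = task
    code = ""
    retry_attempt = 0
    for retry_attempt in range(max_retries_on_error + 1):
        code = model.create(prompt)
        if has_valid_syntax(code):
            return code, retry_attempt, True
        # Feed the failing snippet back so the next response can correct it.
        prompt = f"{task}\n\nThe previous attempt failed to parse:\n{code}"
    return code, retry_attempt, False


def test_recovers_after_one_retry() -> None:
    model = ScriptedModel(["print('hello'", "print('hello')"])
    code, retry_attempt, ok = generate_with_retries(model, "Print hello.", 1)
    assert ok and code == "print('hello')"
    assert retry_attempt == 1
    assert model.calls == 2


def test_zero_budget_makes_a_single_attempt() -> None:
    model = ScriptedModel(["print('hello'", "print('hello')"])
    _code, retry_attempt, ok = generate_with_retries(model, "Print hello.", 0)
    assert not ok and retry_attempt == 0 and model.calls == 1
```

Scripting the responses keeps the test deterministic: the assertions on `model.calls` confirm that exactly one retry happened, and that none happens when the budget is zero.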

@codecov

codecov bot commented Apr 22, 2025

Codecov Report

Attention: Patch coverage is 87.71930% with 7 lines in your changes missing coverage. Please review.

Project coverage is 78.31%. Comparing base (b3f3731) to head (ea6bf65).
Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...c/autogen_agentchat/agents/_code_executor_agent.py | 85.71% | 7 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6306      +/-   ##
==========================================
+ Coverage   78.26%   78.31%   +0.05%     
==========================================
  Files         217      217              
  Lines       15669    15684      +15     
==========================================
+ Hits        12263    12283      +20     
+ Misses       3406     3401       -5     

| Flag | Coverage Δ |
|---|---|
| unittests | 78.31% <87.71%> (+0.05%) ⬆️ |

Flags with carried forward coverage won't be shown.

@ekzhu ekzhu enabled auto-merge (squash) April 22, 2025 06:09
@ekzhu ekzhu merged commit aad6caa into microsoft:main Apr 22, 2025
60 checks passed
peterj added a commit to kagent-dev/autogen that referenced this pull request Apr 24, 2025
…e0424

* upstream/main:
  Remove `name` field from OpenAI Assistant Message (microsoft#6388)
  Introduce workbench (microsoft#6340)
  TEST/change gpt4, gpt4o serise to gpt4.1nano (microsoft#6375)
  update website version (microsoft#6364)
  Add self-debugging loop to `CodeExecutionAgent` (microsoft#6306)
  Fix: deserialize model_context in AssistantAgent and SocietyOfMindAgent and CodeExecutorAgent (microsoft#6337)
  Add azure ai agent (microsoft#6191)
  Avoid re-registering a message type already registered (microsoft#6354)
  Added support for exposing GPUs to docker code executor (microsoft#6339)
  fix: ollama fails when tools use optional args (microsoft#6343)
  Add an example using autogen-core and FastAPI to create streaming responses (microsoft#6335)
  FEAT: SelectorGroupChat could using stream inner select_prompt (microsoft#6286)
  Add experimental notice to canvas (microsoft#6349)
  DOC: add extentions - autogen-oaiapi and autogen-contextplus (microsoft#6338)
  fix: ensure serialized messages are passed to LLMStreamStartEvent (microsoft#6344)
  Generalize Continuous SystemMessage merging via model_info[“multiple_system_messages”] instead of `startswith("gemini-")` (microsoft#6345)
  Agentchat canvas (microsoft#6215)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>
Development

Successfully merging this pull request may close these issues.

Self-debugging in CodeExecutionAgent
