Add self-debugging loop to `CodeExecutionAgent` #6306

Ethan0456 · 2025-04-15T17:15:37Z

Why are these changes needed?

This PR introduces a baseline self-debugging loop to the `CodeExecutorAgent`.

The loop automatically retries code generation and execution up to a configurable number of attempts (n), stopping when execution succeeds or the retry limit is reached.

This lets the agent recover from failures in generated code (e.g., syntax errors, runtime errors) by using its own reasoning to iteratively improve the code, laying the foundation for more robust autonomous behavior.
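
The retry behavior described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `run_code`, `self_debug`, and the `generate_code` callable are hypothetical names, and `max_retries` stands in for n. The in-process `exec` is used only to keep the sketch self-contained; a real agent would hand the code to a sandboxed executor.

```python
import contextlib
import io
import traceback
from typing import Callable, List, Tuple


def run_code(code: str) -> Tuple[int, str]:
    """Hypothetical executor: run code in-process, return (exit_code, output)."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            # compile() raises SyntaxError; exec() raises runtime errors.
            exec(compile(code, "<generated>", "exec"), {})
        return 0, buf.getvalue()
    except Exception:
        return 1, buf.getvalue() + traceback.format_exc()


def self_debug(
    generate_code: Callable[[List[str]], str],
    max_retries: int = 0,
) -> Tuple[bool, str, int]:
    """Generate and run code, retrying up to max_retries times after a failure.

    generate_code is a hypothetical callable that receives the error output of
    every previous failed attempt and returns a new candidate program.
    Returns (succeeded, last_output, attempts_made).
    """
    errors: List[str] = []
    output = ""
    for attempt in range(max_retries + 1):  # one initial attempt plus retries
        code = generate_code(errors)
        exit_code, output = run_code(code)
        if exit_code == 0:
            return True, output, attempt + 1
        errors.append(output)  # feed the failure back into the next generation
    return False, output, max_retries + 1
```

With `max_retries=0` the sketch makes a single attempt and never retries, which mirrors a default of no retries.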

Related issue number

Closes #6207

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

python/packages/autogen-agentchat/src/autogen_agentchat/agents/_code_executor_agent.py

- Introduced `max_retries_on_error` (default: 0) to control the number of code generation and execution retries on failure.
- Added `retry_attempt` field to `CodeGenerationEvent` and `CodeExecutionEvent` to track retry attempts.
- Refactored `execute_code_block` to return `CodeResult`.

Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>

ekzhu

We need to add unit tests for this using ReplayChatCompletionClient: first generate code with a syntax error, then generate code with correct syntax, to simulate the debugging loop.
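
The shape of such a test can be sketched without the real classes. Everything below is an assumption made for illustration: `ScriptedClient` is a hypothetical stand-in that simply replays canned replies (it does not reproduce the ReplayChatCompletionClient API), and `generate_until_valid` is a toy loop that only checks syntax rather than executing code.

```python
from typing import List


class ScriptedClient:
    """Hypothetical stand-in for a replay client: returns canned replies in order."""

    def __init__(self, replies: List[str]) -> None:
        self._replies = list(replies)
        self.calls = 0

    def create(self, prompt: str) -> str:
        reply = self._replies[self.calls]
        self.calls += 1
        return reply


def generate_until_valid(
    client: ScriptedClient, task: str, max_retries_on_error: int
) -> str:
    """Toy debugging loop: ask again, with the error appended, until the code compiles."""
    prompt = task
    for _ in range(max_retries_on_error + 1):
        code = client.create(prompt)
        try:
            compile(code, "<generated>", "exec")  # syntax check only, nothing is run
            return code
        except SyntaxError as exc:
            prompt = f"{task}\nPrevious attempt failed: {exc}"
    raise RuntimeError("retry limit reached")


def test_recovers_from_syntax_error() -> None:
    # First reply is missing a closing parenthesis; the second is valid.
    client = ScriptedClient(["print('hello'", "print('hello')"])
    code = generate_until_valid(client, "print hello", max_retries_on_error=1)
    assert code == "print('hello')"
    assert client.calls == 2  # one failed attempt, one successful retry


def test_no_retry_when_limit_is_zero() -> None:
    client = ScriptedClient(["print('hello'", "print('hello')"])
    try:
        generate_until_valid(client, "print hello", max_retries_on_error=0)
    except RuntimeError:
        assert client.calls == 1  # the second reply is never requested
    else:
        raise AssertionError("expected the loop to give up")
```

Scripting the replies keeps the test deterministic: the failure and the recovery both happen on known attempts, so the call count can be asserted exactly.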

Ethan0456 · 2025-04-19T16:59:14Z

Hi @ekzhu, I’ve added a unit test for the self-debugging loop in CodeExecutorAgent. The test first generates code with a syntax error, then retries with the correct code, and verifies that the agent handles the failure and retry logic as expected.

codecov · 2025-04-22T05:57:58Z

Codecov Report

Attention: Patch coverage is 87.71930% with 7 lines in your changes missing coverage. Please review.

Project coverage is 78.31%. Comparing base (b3f3731) to head (ea6bf65).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...c/autogen_agentchat/agents/_code_executor_agent.py | 85.71% | 7 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6306      +/-   ##
==========================================
+ Coverage   78.26%   78.31%   +0.05%     
==========================================
  Files         217      217              
  Lines       15669    15684      +15     
==========================================
+ Hits        12263    12283      +20     
+ Misses       3406     3401       -5

| Flag | Coverage Δ |
|---|---|
| unittests | `78.31% <87.71%> (+0.05%)` ⬆️ |

…e0424

* upstream/main:

- Remove `name` field from OpenAI Assistant Message (microsoft#6388)
- Introduce workbench (microsoft#6340)
- TEST/change gpt4, gpt4o serise to gpt4.1nano (microsoft#6375)
- update website version (microsoft#6364)
- Add self-debugging loop to `CodeExecutionAgent` (microsoft#6306)
- Fix: deserialize model_context in AssistantAgent and SocietyOfMindAgent and CodeExecutorAgent (microsoft#6337)
- Add azure ai agent (microsoft#6191)
- Avoid re-registering a message type already registered (microsoft#6354)
- Added support for exposing GPUs to docker code executor (microsoft#6339)
- fix: ollama fails when tools use optional args (microsoft#6343)
- Add an example using autogen-core and FastAPI to create streaming responses (microsoft#6335)
- FEAT: SelectorGroupChat could using stream inner select_prompt (microsoft#6286)
- Add experimental notice to canvas (microsoft#6349)
- DOC: add extentions - autogen-oaiapi and autogen-contextplus (microsoft#6338)
- fix: ensure serialized messages are passed to LLMStreamStartEvent (microsoft#6344)
- Generalize Continuous SystemMessage merging via model_info["multiple_system_messages"] instead of `startswith("gemini-")` (microsoft#6345)
- Agentchat canvas (microsoft#6215)

Signed-off-by: Peter Jausovec <peter.jausovec@solo.io>

Add baseline self-debugging loop to CodeExecutionAgent (microsoft#6207)

2b2e362

Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>

Ethan0456 marked this pull request as ready for review April 17, 2025 17:22

ekzhu reviewed Apr 17, 2025

Ethan0456 added 3 commits April 18, 2025 19:59

Remove TODO comment asking no-code retry logic

230435a

Rename retry variable to max_retries_on_error

e6cb7d4

ekzhu suggested changes Apr 19, 2025

Ethan0456 added 5 commits April 19, 2025 10:14

Remove memory from CodeExecutorAgent

a2fefb7

Validate model_client.model_info for structured output support when `max_retries_on_error` > 0

e7afbb1

Add type assertion for model_result.content and remove redundant cast

872e477

Update comments

ef86164

Add unit test for self-debugging loop in CodeExecutorAgent

6477e0f

Signed-off-by: Abhijeetsingh Meena <abhijeet040403@gmail.com>

ekzhu approved these changes Apr 22, 2025

ekzhu added 2 commits April 21, 2025 22:49

lint

eb3ae8a

Merge branch 'main' into feature/code-executor-agent-self-debug-loop

9e8a1ae

ekzhu suggested changes Apr 22, 2025

update api doc

ea6bf65

ekzhu enabled auto-merge (squash) April 22, 2025 06:09

ekzhu approved these changes Apr 22, 2025

ekzhu merged commit aad6caa into microsoft:main Apr 22, 2025
60 checks passed
