[code_review] Misc improvements part 3 #5560

suhaibmujahid · 2025-12-18T21:39:44Z

These improvements could be reviewed commit by commit.

Among other things, this PR resolves #4410 and resolves #4812

Here are the results based on the latest evaluation run:

--------------------
Variant Name: Claude Opus 4.5 - with extended thinking
--------------------
New Comments: 107
New Valid Comments: 22
New Invalid Comments: 7
New Unevaluated Comments: 78
--------------------
Old Comments: 252
Old Valid Comments: 82
Old Invalid Comments: 170
--------------------
Recalled comments: 11.904761904761903
Recalled valid comments: 26.82926829268293
Recalled invalid comments: 4.705882352941177
--------------------
Missed valid comments: 73.17073170731707
Missed invalid comments: 95.29411764705881

Working on more improvements that aim to improve the results further.

Copilot

Pull request overview

This PR refactors the code review tool to improve its architecture by removing global state, restructuring prompts, and adopting structured output with Pydantic models. The changes aim to improve the quality and consistency of generated code review comments.

Removes the global TARGET_SOFTWARE variable and replaces it with a constructor parameter
Restructures prompts by separating system prompts from user messages and adding support for patch descriptions
Migrates from string-based JSON parsing to structured output using Pydantic models

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
scripts/code_review_tool_evaluator.py	Removes unused assignment to global `TARGET_SOFTWARE` variable
mcp/src/bugbug_mcp/server.py	Adds `output_format` parameter to CodeReviewTool initialization and removes `output_format` from `generate_initial_prompt` call
bugbug/tools/core/platforms/swarm.py	Adds abstract `patch_description` property to Swarm platform implementation
bugbug/tools/core/platforms/phabricator.py	Implements `patch_description` property to retrieve patch summary from Phabricator metadata
bugbug/tools/core/platforms/base.py	Adds abstract `patch_description` property to base Patch class
bugbug/tools/code_review/prompts.py	Restructures prompts by separating system prompt template, updating summarization to include patch description, and removing old output format instructions
bugbug/tools/code_review/agent.py	Refactors to use Pydantic models for structured output, hardcodes Claude model configuration, removes `parse_model_output` dependency, and updates to work with structured responses
bugbug/tools/code_review/init.py	Removes `TARGET_SOFTWARE` from exports

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

bugbug/tools/code_review/agent.py

mcp/src/bugbug_mcp/server.py

Deleted scripts/code_review_tool_runner.py as it is no longer needed

bugbug/tools/code_review/agent.py

mcp/src/bugbug_mcp/server.py

bugbug/tools/code_review/prompts.py

marco-c · 2025-12-19T10:14:31Z

Do you think the "Improve summarization prompt" commit is addressing #4812?

And #4410 is fixed by the "Refactor code review agent to use structured output" commit, right?

bugbug/tools/code_review/agent.py

suhaibmujahid · 2025-12-19T20:07:13Z

Do you think the "Improve summarization prompt" commit is addressing #4812?

Yes, and I will keep an eye on that anyway to make sure we do not encounter any other unwanted behaviours.

And #4410 is fixed by the "Refactor code review agent to use structured output" commit, right?

Yes, I forgot to link it. Thank you for the reminder.

Set default target_software to 'Mozilla Firefox'

842aaa0

suhaibmujahid requested review from Copilot and marco-c December 18, 2025 21:39

suhaibmujahid enabled auto-merge (rebase) December 18, 2025 21:39

Copilot started reviewing on behalf of suhaibmujahid December 18, 2025 21:40 View session

Copilot AI reviewed Dec 18, 2025

View reviewed changes

bugbug/tools/code_review/agent.py Outdated Show resolved Hide resolved

bugbug/tools/code_review/agent.py Outdated Show resolved Hide resolved

bugbug/tools/code_review/agent.py Outdated Show resolved Hide resolved

mcp/src/bugbug_mcp/server.py Outdated Show resolved Hide resolved

suhaibmujahid added 4 commits December 18, 2025 17:46

Improve code review prompt

f2aae90

Improve summarization prompt

1151caf

Refactor code review agent to use structured output

70b2f7c

Remove code_review_tool_runner script

7612a63

Deleted scripts/code_review_tool_runner.py as it is no longer needed

suhaibmujahid force-pushed the improve-revew-helper-3 branch from 23f8f72 to 1695f61 Compare December 18, 2025 23:06

Support LLM extended thinking

07c1768

suhaibmujahid force-pushed the improve-revew-helper-3 branch from 1695f61 to 07c1768 Compare December 19, 2025 01:13