@suhaibmujahid
Member

@suhaibmujahid suhaibmujahid commented Dec 18, 2025

These improvements can be reviewed commit by commit.

Among other things, this PR resolves #4410 and resolves #4812

Here are the results based on the latest evaluation run:

--------------------
Variant Name: Claude Opus 4.5 - with extended thinking
--------------------
New Comments: 107
New Valid Comments: 22
New Invalid Comments: 7
New Unevaluated Comments: 78
--------------------
Old Comments: 252
Old Valid Comments: 82
Old Invalid Comments: 170
--------------------
Recalled comments: 11.904761904761903
Recalled valid comments: 26.82926829268293
Recalled invalid comments: 4.705882352941177
--------------------
Missed valid comments: 73.17073170731707
Missed invalid comments: 95.29411764705881
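
The report does not state how the recall figures are derived. They read as percentages: 22 new valid comments against 82 old valid comments gives about 26.83, which matches "Recalled valid comments", and each "Missed" figure is 100 minus the corresponding "Recalled" figure. A minimal sketch of that arithmetic follows. The function names are hypothetical, and treating the new valid count as the recalled count is an inference from the numbers, not something the evaluator script is confirmed to do.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage; 0.0 when whole is zero."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


def missed(recalled_pct: float) -> float:
    """The missed share is the complement of the recalled share."""
    return 100.0 - recalled_pct


old_valid = 82  # "Old Valid Comments" from the report
new_valid = 22  # "New Valid Comments" from the report (assumed to be the recalled count)

recalled_valid = percentage(new_valid, old_valid)
print(round(recalled_valid, 2))          # 26.83
print(round(missed(recalled_valid), 2))  # 73.17
```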

I am working on further changes that aim to improve these results.

Copilot AI left a comment

Pull request overview

This PR refactors the code review tool to improve its architecture by removing global state, restructuring prompts, and adopting structured output with Pydantic models. The changes aim to improve the quality and consistency of generated code review comments.

  • Removes the global TARGET_SOFTWARE variable and replaces it with a constructor parameter
  • Restructures prompts by separating system prompts from user messages and adding support for patch descriptions
  • Migrates from string-based JSON parsing to structured output using Pydantic models
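
As a rough illustration of the first and third points, the sketch below passes the target software through the constructor instead of a module-level global, and validates model output against a typed schema instead of picking fields out of a raw JSON string. It uses standard-library dataclasses as a stand-in for Pydantic so that it runs without third-party packages. `ReviewTool`, `ReviewComment`, `parse_comments`, and the field names are hypothetical and are not taken from the bugbug code.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewComment:
    """Hypothetical schema for one generated review comment."""

    file: str
    line: int
    comment: str

    def __post_init__(self):
        # Reject wrongly typed fields early instead of failing downstream.
        if not isinstance(self.file, str) or not isinstance(self.comment, str):
            raise TypeError("file and comment must be strings")
        if isinstance(self.line, bool) or not isinstance(self.line, int):
            raise TypeError("line must be an integer")


def parse_comments(raw: str) -> list[ReviewComment]:
    """Validate a JSON array of comment objects against the schema.

    Missing or unexpected keys raise TypeError from the dataclass constructor.
    """
    return [ReviewComment(**item) for item in json.loads(raw)]


class ReviewTool:
    """Hypothetical tool that receives its target as a constructor parameter."""

    def __init__(self, target_software: str):
        # Stored per instance, so no module-level global is needed.
        self.target_software = target_software

    def system_prompt(self) -> str:
        return f"You are reviewing a patch for {self.target_software}."
```

With this shape, two tools for different targets can coexist in one process, and a malformed model response fails at the parsing boundary rather than later in the pipeline.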

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| scripts/code_review_tool_evaluator.py | Removes unused assignment to global TARGET_SOFTWARE variable |
| mcp/src/bugbug_mcp/server.py | Adds output_format parameter to CodeReviewTool initialization and removes output_format from generate_initial_prompt call |
| bugbug/tools/core/platforms/swarm.py | Adds abstract patch_description property to Swarm platform implementation |
| bugbug/tools/core/platforms/phabricator.py | Implements patch_description property to retrieve patch summary from Phabricator metadata |
| bugbug/tools/core/platforms/base.py | Adds abstract patch_description property to base Patch class |
| bugbug/tools/code_review/prompts.py | Restructures prompts by separating system prompt template, updating summarization to include patch description, and removing old output format instructions |
| bugbug/tools/code_review/agent.py | Refactors to use Pydantic models for structured output, hardcodes Claude model configuration, removes parse_model_output dependency, and updates to work with structured responses |
| bugbug/tools/code_review/__init__.py | Removes TARGET_SOFTWARE from exports |
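
The summary above describes an abstract patch_description property on the base Patch class, with platform classes supplying the value. A minimal sketch of that pattern, assuming nothing about the real classes beyond the property name: `DictBackedPatch` and the "summary" metadata key are hypothetical stand-ins for a platform implementation.

```python
from abc import ABC, abstractmethod


class Patch(ABC):
    """Simplified base class; the real one has more members."""

    @property
    @abstractmethod
    def patch_description(self) -> str:
        """Human-written description of the patch."""


class DictBackedPatch(Patch):
    """Hypothetical platform patch that reads from a metadata dict."""

    def __init__(self, metadata: dict):
        self._metadata = metadata

    @property
    def patch_description(self) -> str:
        # Fall back to an empty string when no summary is present.
        return self._metadata.get("summary", "")
```

Because the property is abstract, a platform class that forgets to implement it cannot be instantiated, so the omission surfaces as soon as the class is used.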

@marco-c
Collaborator

marco-c commented Dec 19, 2025

Do you think the "Improve summarization prompt" commit is addressing #4812?

And #4410 is fixed by the "Refactor code review agent to use structured output" commit, right?

@suhaibmujahid
Member Author

> Do you think the "Improve summarization prompt" commit is addressing #4812?

Yes, and I will keep an eye on that anyway to make sure we do not encounter any other unwanted behaviours.

> And #4410 is fixed by the "Refactor code review agent to use structured output" commit, right?

Yes, I forgot to link it. Thank you for the reminder.

@suhaibmujahid suhaibmujahid merged commit a89e2d4 into mozilla:master Dec 19, 2025
6 checks passed
@suhaibmujahid suhaibmujahid deleted the improve-revew-helper-3 branch December 19, 2025 20:07