-
Notifications
You must be signed in to change notification settings - Fork 620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding numbered hunks and code suggestions feature #50
Conversation
Preparing review... |
3 similar comments
Preparing review... |
Preparing review... |
Preparing review... |
def get_pr_diff(git_provider: Union[GithubProvider, Any], token_handler: TokenHandler, | ||
add_line_numbers_to_hunks: bool = False, disable_extra_lines: bool =False) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Consider using a class or a data structure to encapsulate the parameters of the get_pr_diff
function. This will make the function signature cleaner and easier to manage as the number of parameters grows.
def get_pr_diff(git_provider: Union[GithubProvider, Any], token_handler: TokenHandler, | |
add_line_numbers_to_hunks: bool = False, disable_extra_lines: bool =False) -> str: | |
class PrDiffParams: | |
def __init__(self, git_provider, token_handler, add_line_numbers_to_hunks=False, disable_extra_lines=False): | |
self.git_provider = git_provider | |
self.token_handler = token_handler | |
self.add_line_numbers_to_hunks = add_line_numbers_to_hunks | |
self.disable_extra_lines = disable_extra_lines | |
def get_pr_diff(params: PrDiffParams) -> str: |
self.vars = { | ||
"title": self.git_provider.pr.title, | ||
"branch": self.git_provider.get_pr_branch(), | ||
"description": self.git_provider.get_pr_description(), | ||
"language": self.main_language, | ||
"diff": "", # empty diff for initial calculation | ||
'num_code_suggestions': settings.pr_code_suggestions.num_code_suggestions, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Consider using a data class or a named tuple for the vars
attribute in the PRCodeSuggestions
class. This will make the code more readable and easier to maintain.
self.vars = { | |
"title": self.git_provider.pr.title, | |
"branch": self.git_provider.get_pr_branch(), | |
"description": self.git_provider.get_pr_description(), | |
"language": self.main_language, | |
"diff": "", # empty diff for initial calculation | |
'num_code_suggestions': settings.pr_code_suggestions.num_code_suggestions, | |
} | |
from dataclasses import dataclass | |
@dataclass | |
class PrVars: | |
title: str | |
branch: str | |
description: str | |
language: str | |
diff: str | |
num_code_suggestions: int | |
self.vars = PrVars( | |
title=self.git_provider.pr.title, | |
branch=self.git_provider.get_pr_branch(), | |
description=self.git_provider.get_pr_description(), | |
language=self.main_language, | |
diff="", # empty diff for initial calculation | |
num_code_suggestions=settings.pr_code_suggestions.num_code_suggestions | |
) |
def convert_to_hunks_with_lines_numbers(patch: str, file) -> str: | ||
# toDO: (maybe remove '-' and '+' from the beginning of the line) | ||
""" | ||
## src/file.ts | ||
--new hunk-- | ||
881 line1 | ||
882 line2 | ||
883 line3 | ||
884 line4 | ||
885 line6 | ||
886 line7 | ||
887 + line8 | ||
888 + line9 | ||
889 line10 | ||
890 line11 | ||
... | ||
--old hunk-- | ||
line1 | ||
line2 | ||
- line3 | ||
- line4 | ||
line5 | ||
line6 | ||
... | ||
|
||
""" | ||
patch_with_lines_str = f"## {file.filename}\n" | ||
import re | ||
patch_lines = patch.splitlines() | ||
RE_HUNK_HEADER = re.compile( | ||
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)") | ||
new_content_lines = [] | ||
old_content_lines = [] | ||
match = None | ||
start1, size1, start2, size2 = -1, -1, -1, -1 | ||
for line in patch_lines: | ||
if 'no newline at end of file' in line.lower(): | ||
continue | ||
|
||
if line.startswith('@@'): | ||
match = RE_HUNK_HEADER.match(line) | ||
if match and new_content_lines: # found a new hunk, split the previous lines | ||
if new_content_lines: | ||
patch_with_lines_str += '\n--new hunk--\n' | ||
for i, line_new in enumerate(new_content_lines): | ||
patch_with_lines_str += f"{start2 + i} {line_new}\n" | ||
if old_content_lines: | ||
patch_with_lines_str += '--old hunk--\n' | ||
for i, line_old in enumerate(old_content_lines): | ||
patch_with_lines_str += f"{line_old}\n" | ||
new_content_lines = [] | ||
old_content_lines = [] | ||
start1, size1, start2, size2 = map(int, match.groups()[:4]) | ||
elif line.startswith('+'): | ||
new_content_lines.append(line) | ||
elif line.startswith('-'): | ||
old_content_lines.append(line) | ||
else: | ||
new_content_lines.append(line) | ||
old_content_lines.append(line) | ||
|
||
# finishing last hunk | ||
if match and new_content_lines: | ||
if new_content_lines: | ||
patch_with_lines_str += '\n--new hunk--\n' | ||
for i, line_new in enumerate(new_content_lines): | ||
patch_with_lines_str += f"{start2 + i} {line_new}\n" | ||
if old_content_lines: | ||
patch_with_lines_str += '\n--old hunk--\n' | ||
for i, line_old in enumerate(old_content_lines): | ||
patch_with_lines_str += f"{line_old}\n" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: The convert_to_hunks_with_lines_numbers
function is quite long and does multiple things. Consider breaking it down into smaller, more manageable functions. This will make the code easier to read and maintain.
def convert_to_hunks_with_lines_numbers(patch: str, file) -> str: | |
# toDO: (maybe remove '-' and '+' from the beginning of the line) | |
""" | |
## src/file.ts | |
--new hunk-- | |
881 line1 | |
882 line2 | |
883 line3 | |
884 line4 | |
885 line6 | |
886 line7 | |
887 + line8 | |
888 + line9 | |
889 line10 | |
890 line11 | |
... | |
--old hunk-- | |
line1 | |
line2 | |
- line3 | |
- line4 | |
line5 | |
line6 | |
... | |
""" | |
patch_with_lines_str = f"## {file.filename}\n" | |
import re | |
patch_lines = patch.splitlines() | |
RE_HUNK_HEADER = re.compile( | |
r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@[ ]?(.*)") | |
new_content_lines = [] | |
old_content_lines = [] | |
match = None | |
start1, size1, start2, size2 = -1, -1, -1, -1 | |
for line in patch_lines: | |
if 'no newline at end of file' in line.lower(): | |
continue | |
if line.startswith('@@'): | |
match = RE_HUNK_HEADER.match(line) | |
if match and new_content_lines: # found a new hunk, split the previous lines | |
if new_content_lines: | |
patch_with_lines_str += '\n--new hunk--\n' | |
for i, line_new in enumerate(new_content_lines): | |
patch_with_lines_str += f"{start2 + i} {line_new}\n" | |
if old_content_lines: | |
patch_with_lines_str += '--old hunk--\n' | |
for i, line_old in enumerate(old_content_lines): | |
patch_with_lines_str += f"{line_old}\n" | |
new_content_lines = [] | |
old_content_lines = [] | |
start1, size1, start2, size2 = map(int, match.groups()[:4]) | |
elif line.startswith('+'): | |
new_content_lines.append(line) | |
elif line.startswith('-'): | |
old_content_lines.append(line) | |
else: | |
new_content_lines.append(line) | |
old_content_lines.append(line) | |
# finishing last hunk | |
if match and new_content_lines: | |
if new_content_lines: | |
patch_with_lines_str += '\n--new hunk--\n' | |
for i, line_new in enumerate(new_content_lines): | |
patch_with_lines_str += f"{start2 + i} {line_new}\n" | |
if old_content_lines: | |
patch_with_lines_str += '\n--old hunk--\n' | |
for i, line_old in enumerate(old_content_lines): | |
patch_with_lines_str += f"{line_old}\n" | |
def convert_to_hunks_with_lines_numbers(patch: str, file) -> str: | |
# toDO: (maybe remove '-' and '+' from the beginning of the line) | |
... | |
patch_with_lines_str = process_patch_lines(patch_lines) | |
return patch_with_lines_str.strip() | |
def process_patch_lines(patch_lines): | |
... | |
return patch_with_lines_str |
parser.add_argument('--pr_description', action='store_true', required=False) | ||
parser.add_argument('--pr_code_suggestions', action='store_true', required=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Consider using subcommands instead of flags for the different modes of operation (pr_description, pr_code_suggestions). This will make the command line interface more intuitive and easier to use.
parser.add_argument('--pr_description', action='store_true', required=False) | |
parser.add_argument('--pr_code_suggestions', action='store_true', required=False) | |
subparsers = parser.add_subparsers(dest='command') | |
pr_description_parser = subparsers.add_parser('pr_description') | |
pr_code_suggestions_parser = subparsers.add_parser('pr_code_suggestions') |
full code suggestions
ToDo:
|
Preparing review... |
11 similar comments
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Preparing review... |
Adding numbered hunks and code suggestions feature
Preparing review... |
Preparing PR description... |
Question: Why PR Review is failing for local run through source code Answer:
The code uses several techniques to reduce the number of tokens in the diff, including:
The code also provides additional information about the files that were not processed due to insufficient token budget, and the modified files that were not included in the diff. ✨ Ask tool usage guide:Overview:
Note that the tool does not have "memory" of previous questions, and answers each question independently. See the ask usage page for a comprehensive guide on using this tool. |
Preparing review... |
Preparing PR description... |
12 similar comments
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Preparing PR description... |
Type of PR:
Enhancement
PR Description:
This PR introduces a new feature that allows the conversion of hunks in a patch to numbered hunks, and the generation of code suggestions for a PR. It also includes the necessary changes to support this feature in the Github and Gitlab providers, and updates the CLI and configuration settings accordingly.
PR Main Files Walkthrough:
-
pr_agent/algo/git_patch_processing.py
: Added a new functionconvert_to_hunks_with_lines_numbers
that converts a patch to numbered hunks. This function is used to generate more detailed diffs for code suggestions.-
pr_agent/algo/pr_processing.py
: Updated theget_pr_diff
function to support the addition of line numbers to hunks. This is used when generating the PR diff for code suggestions.-
pr_agent/cli.py
: Added a new command line argument--pr_code_suggestions
to trigger the code suggestions feature.-
pr_agent/git_providers/git_provider.py
: Added a new abstract methodpublish_code_suggestion
to theGitProvider
base class, which is implemented by the Github and Gitlab providers.-
pr_agent/git_providers/github_provider.py
: Implemented thepublish_code_suggestion
method for the Github provider.-
pr_agent/git_providers/gitlab_provider.py
: Implemented thepublish_code_suggestion
method for the Gitlab provider, but it currently raises a 'not implemented yet' exception.-
pr_agent/tools/pr_code_suggestions.py
: Added a new classPRCodeSuggestions
that handles the generation and publishing of code suggestions for a PR.-
pr_agent/settings/configuration.toml
: Updated the configuration settings to include a new setting for the number of code suggestions to generate.-
pr_agent/settings/pr_code_suggestions_prompts.toml
: Added a new file that contains the prompts for the code suggestions feature.