-
Notifications
You must be signed in to change notification settings - Fork 739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatically enable /improve --extended
mode for large PRs
#564
Conversation
PR Description updated to latest commit (2f9fbbf) |
PR Analysis
PR Feedback💡 General suggestions: The PR is well-structured and the changes are clearly explained. It would be beneficial to add tests to verify the new functionality and ensure it works as expected. Also, consider handling exceptions more specifically rather than using a general exception clause. 🤖 Code feedback:
✨ Usage tips:
|
max_len = max( | ||
len(data_sorted), | ||
get_settings().pr_code_suggestions.num_code_suggestions, | ||
get_settings().pr_code_suggestions.num_code_suggestions_per_chunk, | ||
) | ||
new_len = int(0.5 + max_len * get_settings().pr_code_suggestions.final_clip_factor) | ||
if new_len < len(data_sorted): | ||
data_sorted = data_sorted[:new_len] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is intended to handle an edge-case where the total size of data_sorted
before clipping is already small enough in comparison to num_code_suggestions
or num_code_suggestions_per_chunk
, and so clipping it will result it potentially much fewer suggestions.
For example:
Assume num_code_suggestions_per_chunk=10
, final_clip_factor=0.5
and len(data_sorted)==2
before clipping. Based on these configurations, it would be reasonable to expect that the PR-Agent will give us all of the (2) suggestions it generated, but in reality it would have applied the final_clip_factor
and cut the data_sorted
list in half, giving us only the first suggestion.
With this "fix" the final_clip_factor
won't reduce the size of the data_sorted
list if it's already small enough relative to the max number of suggestions we set in the configuration.
@zmeir i am not sure about the need of this specific PR. I recommend always using '/improve --extended'. So the command itself is already "automatic", according to the correct input - the number of tokens, that will lead to the minimal number of calls Can you elaborate why do we need extra logic beyond the original one ? |
I think it comes down to usability and cost. From manager's perspective, you might not want to control the cost of using the However, from a user's perspective, I don't really want to care about the difference, I just want to best results and the simplest usage. So ideally I only want to run So to your question why the extended mode is not the default, I think this what I was aiming for with this PR - if this is what you want you can set the thresholds to 0 and it will always run with extended mode. But I wanted to still leave some wiggle room in case you're concerned about the extra calls. I hope that answers your questions. I should also note that in the extended mode you have some extra logic for clipping some percentage of the suggestions that will result in fewer suggestions than the normal mode, if there was only one call and it produced fewer than the max number of suggestions. This is another part of this PR - I can separate it if you'd like. |
To simplify my previous response: I'm a lazy user that wants to type fewer characters, so I want to type |
@zmeir It contains a lot of magic numbers, that are hard to tune or understand their meaning, and how they will impact the results. Also notice you can always lower the number of possible tokens per model call, and it will give you a similar effect what you talked about, yet in a more controlled, manner and still using a single parameter. Maybe the default option should just be to enable the 'extended' mode. I am not sure if 500 lines is a lot or not. maybe they have no impact at all, and in practice its just the same as doing 'extended' all the time. or maybe the other way around - extended won't happen, even if needed, because 500 lines are too much. I think this is correct for most users. |
Type
Enhancement
Description
PR changes walkthrough
1 files
pr_code_suggestions.py
pr_agent/tools/pr_code_suggestions.py
The PR introduces a new feature that automatically enables
extended mode for large PRs. This is determined by the
number of files, additions, and deletions in the PR. The PR
also modifies the logic for reducing the number of
suggestions, ensuring it doesn't go below a certain
threshold.
1 files
IMPROVE.md
docs/IMPROVE.md
The PR updates the documentation to include the new
parameters for automatically enabling extended mode for
large PRs. It also provides default values for these
parameters.
1 files
configuration.toml
pr_agent/settings/configuration.toml
The PR updates the configuration file to include the new
parameters for automatically enabling extended mode for
large PRs and their default values.
User description
Thought it could be nice to add an option to allow the
/improve
command to automatically run with--extended
mode based on the size of the PR (number of files/additions/deletions)