GPT-3.5-turbo compatibility #65

Closed
1 of 3 tasks
irgolic opened this issue Apr 30, 2023 · 1 comment

irgolic commented Apr 30, 2023

I'm confident AutoPR can be optimized to work well with gpt-3.5-turbo.

Background

As it stands, the rail-v1 pull request agent works fine with gpt-3.5-turbo.

However, the codegen agent does not. rail-v1 codegen may work, but it suffers from the same problem it has with gpt-4: it produces unreliable diffs with ambiguous errors.

My hope is to use auto-v1 codegen, which creates unambiguous changes.

Action selection

  • Following @ShreyaR's guidance, we've augmented the guardrails calls with better instructions. gpt-3.5-turbo is now reliable at selecting actions.

How auto-v1 codegen works

Here is an example of a successful action selection call:

Prompt
```
You are about to make a decision on what to do next.

This is the issue that was opened:
```#20 Create a dice rolling bot

irgolic: In the `dice_rolling_bot` folder, develop a Discord bot that allows users to roll virtual dice by typing a command (e.g., "!roll 1d6"). The bot should support various dice types (e.g., d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. Include a README.md file with information on the bot's features, setup instructions, and usage guidelines.



```

This is the pull request we're creating:
```Title: Feature/dice-rolling-bot: Add Dice Rolling Bot Functionality

This PR resolves issue #20, adding a dice rolling bot to the `dice_rolling_bot` folder. This Discord bot allows users to roll virtual dice by typing a command, such as "!roll 1d6". The bot supports various dice types (d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. A README.md file has been included with information on the bot's features, setup instructions, and usage guidelines.

Commits:
1. Create dice_rolling_bot.py
2. Add README.md for dice rolling bot

Please review the changes and let me know if any modifications are required.

1. Commit: Create dice_rolling_bot.py
    Files: dice_rolling_bot/dice_rolling_bot.py:L1-L23
    Changes:
      - Created a new file `dice_rolling_bot.py`
      - Imported required libraries (random and discord)
      - Implemented the main functionality for the bot to roll dice based on user input
2. Commit: Add README.md for dice rolling bot
    Files: dice_rolling_bot/README.md:L1-L32
    Changes:
      - Created a new file `README.md`
      - Added setup instructions for the bot
      - Provided usage guidelines and examples for the dice rolling bot
```

This is the commit we're writing:
```Create dice_rolling_bot.py

- Created a new file `dice_rolling_bot.py`
- Imported required libraries (random and discord)
- Implemented the main functionality for the bot to roll dice based on user input```

This is the codebase subset we decided to look at:
``````

These are the actions we've taken so far:
``````

In our pursuit of implementing this commit, please make a decision on what to do next.
If a new file should be created, respond with `new_file` and the file path and description.
If one of the files in the codebase subset should be edited, respond with `edit_file` and the file path, description, and line numbers of the hunk to edit. When picking the line numbers, keep the hunk as short as possible while fulfilling the description. If you don't know which line numbers to pick, leave them blank.
If you're done, respond with `finished` and the commit message describing the past actions.
```


Given below is XML that describes the information to extract from this document and the tags to extract it into.

```xml
<output>
    <string name="action" choices="new_file,edit_file,finished"/>
    <object name="new_file" description="new_file" if="action==new_file">
        <string name="filepath" description="Path to the newly created file." required="true"/>
        <string name="description" description="Description of the contents of the new file." required="true"/>
    </object>
    <object name="edit_file" description="edit_file" if="action==edit_file">
        <string name="filepath" description="Path to the file to be edited." required="true"/>
        <string name="description" description="Description of the changes to be made to the file." required="true"/>
        <integer name="start_line" description="The line number of the first line of the hunk to be edited." format="positive" required="false" on-fail="noop"/>
        <integer name="end_line" description="The line number of the last line of the hunk to be edited. Keep the hunk as short as possible while fulfilling the description." format="positive" required="false" on-fail="noop"/>
    </object>
    <string name="finished" description="commit_message: A more appropriate commit message based on the actions taken." required="false" if="action==finished"/>
</output>
```


ONLY return a valid JSON object (no other text is necessary), where the key of the field in JSON is the `name` attribute of the corresponding XML, and the value is of the type specified by the corresponding XML's tag. The JSON MUST conform to the XML format, including any types and format requests e.g. requests for lists, objects and specific types. Be correct and concise.

Here are examples of simple (XML, JSON) pairs that show the expected behavior:
- `<string name='foo' format='two-words lower-case' />` => `{'foo': 'example one'}`
- `<list name='bar'><string format='upper-case' /></list>` => `{"bar": ['STRING ONE', 'STRING TWO', etc.]}`
- `<object name='baz'><string name="foo" format="capitalize two-words" /><integer name="index" format="1-indexed" /></object>` => `{'baz': {'foo': 'Some String', 'index': 1}}`
Raw Response
{
  "action": "new_file",
  "new_file": {
    "filepath": "dice_rolling_bot/dice_rolling_bot.py",
    "description": "Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input"
  }
}
Parsed Response
{
  "action": "new_file",
  "new_file": {
    "filepath": "dice_rolling_bot/dice_rolling_bot.py",
    "description": "Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input"
  },
  "edit_file": null,
  "commit_message": null
}

What's preventing gpt-3.5-turbo from working properly?

In the action selection stage, gpt-3.5-turbo commonly returns the action selection JSON in an incorrect format. Sometimes it wraps it in triple-backtick groups (now handled by guardrails-ai/guardrails#118). Commonly it flattens the dict, returning something akin to this:

Raw Response
{
  "action": "new_file",
  "filepath": "dice_rolling_bot/dice_rolling_bot.py",
  "description": "Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input"
}

In theory, this is an unambiguous error and can be corrected for. See some discussion on this topic in guardrails-ai/guardrails#123. Ideally this fix is submitted through guardrails, but if that turns out to be too complicated, it's fine if AutoPR works around it with a custom guardrails validator or by parsing the raw string as JSON on failure.
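
As a rough illustration of the "parse the raw string as JSON on failure" fallback, here is a minimal sketch that re-nests a flattened response. The function name `unflatten_action` and the `ACTION_FIELDS` table are hypothetical (the field names are copied from the XML spec in the prompt above); this is not guardrails' or AutoPR's actual code.

```python
import json

# Field names copied from the XML spec in the prompt; the table itself is an assumption.
ACTION_FIELDS = {
    "new_file": {"filepath", "description"},
    "edit_file": {"filepath", "description", "start_line", "end_line"},
}


def unflatten_action(raw: str) -> dict:
    """Move stray top-level fields under the key named by "action"."""
    data = json.loads(raw)
    action = data.get("action")
    fields = ACTION_FIELDS.get(action) if isinstance(action, str) else None
    if fields is None or isinstance(data.get(action), dict):
        # Either the action has no nested object, or the response is already nested.
        return data
    nested = {key: data.pop(key) for key in list(data) if key in fields}
    if nested:
        data[action] = nested
    return data
```

Because the `action` value says which object the loose fields belong to, the correction is unambiguous for the flattened shape shown above; other malformed shapes would still need to be rejected by validation.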

Following a conversation in a Discord thread, perhaps a better solution is to abstract out an "action selection engine" and provide different implementations for different models:

  • JSON with guardrails (as auto-v1 does now)
  • Discord-style ! command syntax (e.g., !newfile <filename> <description>)
  • pythonic syntax (e.g., def new_file(filename: str, description: str))
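
To make the "action selection engine" idea a little more concrete, here is one possible sketch of the abstraction, with an engine for the `!` command syntax. Everything here is hypothetical: the `Action` dataclass, the `ActionSelectionEngine` protocol, and the `!editfile` and `!finished` commands are assumptions (only `!newfile` appears in the list above), and filenames containing spaces are not handled.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Action:
    name: str
    filepath: Optional[str] = None
    description: Optional[str] = None


class ActionSelectionEngine(Protocol):
    """Anything that can turn a raw model response into an Action."""

    def parse(self, raw_response: str) -> Action: ...


class CommandSyntaxEngine:
    """Parses responses such as: !newfile <filename> <description>"""

    # Only !newfile comes from the issue text; the other two are assumed.
    COMMANDS = {"!newfile": "new_file", "!editfile": "edit_file", "!finished": "finished"}

    def parse(self, raw_response: str) -> Action:
        for line in raw_response.splitlines():
            parts = line.strip().split(maxsplit=2)
            if not parts or parts[0] not in self.COMMANDS:
                continue  # Skip chatter around the command line.
            name = self.COMMANDS[parts[0]]
            if name == "finished":
                # Everything after the command is the commit message.
                message = line.strip()[len(parts[0]):].strip()
                return Action(name, description=message or None)
            if len(parts) < 3:
                raise ValueError(f"expected '{parts[0]} <filename> <description>'")
            return Action(name, filepath=parts[1], description=parts[2])
        raise ValueError("no recognised command in response")
```

A JSON-with-guardrails engine or a pythonic-syntax engine could expose the same `parse` method, so the rest of the agent would not need to know which format a given model was prompted with.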

File creation and editing

How auto-v1 codegen works

Here's an example of a successful new file generation:

Prompt
Human: Hey, we've got a new file to create.

This is the issue that was opened:
```
#20 Create a dice rolling bot

irgolic: In the `dice_rolling_bot` folder, develop a Discord bot that allows users to roll virtual dice by typing a command (e.g., "!roll 1d6"). The bot should support various dice types (e.g., d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. Include a README.md file with information on the bot's features, setup instructions, and usage guidelines.




```

This is the pull request we're creating:
```
Title: Feature/dice-rolling-bot: Add Dice Rolling Bot Functionality

This PR resolves issue #20, adding a dice rolling bot to the `dice_rolling_bot` folder. This Discord bot allows users to roll virtual dice by typing a command, such as "!roll 1d6". The bot supports various dice types (d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. A README.md file has been included with information on the bot's features, setup instructions, and usage guidelines.

Commits:
1. Create dice_rolling_bot.py
2. Add README.md for dice rolling bot

Please review the changes and let me know if any modifications are required.

1. Commit: Create dice_rolling_bot.py
    Files: dice_rolling_bot/dice_rolling_bot.py:L1-L23
    Changes:
      - Created a new file `dice_rolling_bot.py`
      - Imported required libraries (random and discord)
      - Implemented the main functionality for the bot to roll dice based on user input
2. Commit: Add README.md for dice rolling bot
    Files: dice_rolling_bot/README.md:L1-L32
    Changes:
      - Created a new file `README.md`
      - Added setup instructions for the bot
      - Provided usage guidelines and examples for the dice rolling bot

```

This is the commit we're writing:
```
Create dice_rolling_bot.py

- Created a new file `dice_rolling_bot.py`
- Imported required libraries (random and discord)
- Implemented the main functionality for the bot to roll dice based on user input
```

This is the codebase subset we decided to look at:
```

```

This is the plan for the file we're creating:
```
Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input
```

Please send me the contents of the file.

RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me, please use the following format. Make sure you return both the code enclosed in backticks and the JSON immediately after.

```
<string>
```
{
    "outcome": string  # A description of the outcome of the attempt to rewrite the file hunk according to the problem statement.
}
Raw Response
```python
import random
import discord
from discord.ext import commands

intents = discord.Intents.default()
intents.typing = False
intents.presences = False

bot = commands.Bot(command_prefix="!", intents=intents)

@bot.event
async def on_ready():
    print(f"{bot.user} has connected to Discord!")

@bot.command(name="roll", help="Rolls a die. Format: !roll <num_of_dice>d<sides_of_die>")
async def roll_dice(ctx, dice: str):
    num_of_dice, sides_of_die = map(int, dice.split("d"))
    rolls = [random.randint(1, sides_of_die) for _ in range(num_of_dice)]
    await ctx.send(', '.join(map(str, rolls)))

bot.run("your_bot_token_here")
```
{
    "outcome": "Created dice_rolling_bot.py with the necessary imports and implemented the main functionality for the bot to roll dice based on user input."
}
Result
{
  "contents": "import random\nimport discord\nfrom discord.ext import commands\n\nintents = discord.Intents.default()\nintents.typing = False\nintents.presences = False\n\nbot = commands.Bot(command_prefix=\"!\", intents=intents)\n\n@bot.event\nasync def on_ready():\n    print(f\"{bot.user} has connected to Discord!\")\n\n@bot.command(name=\"roll\", help=\"Rolls a die. Format: !roll <num_of_dice>d<sides_of_die>\")\nasync def roll_dice(ctx, dice: str):\n    num_of_dice, sides_of_die = map(int, dice.split(\"d\"))\n    rolls = [random.randint(1, sides_of_die) for _ in range(num_of_dice)]\n    await ctx.send(', '.join(map(str, rolls)))\n\nbot.run(\"your_bot_token_here\")",
  "outcome": "Created dice_rolling_bot.py with the necessary imports and implemented the main functionality for the bot to roll dice based on user input."
}

Notice that the LLM reflects on its change in the "outcome" JSON key. This is fed back into subsequent actions to inform what's been done so far.
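
For reference, a response in this format could be split into the "contents" and "outcome" fields roughly as follows. This is a sketch under assumptions, not AutoPR's actual parser: the function name `parse_codegen_response` and the regular expression are made up for illustration.

```python
import json
import re

# A fenced block (with optional language tag) followed by a JSON object at the end.
RESPONSE_RE = re.compile(
    r"```[^\n]*\n(?P<code>.*?)\n?```\s*(?P<meta>\{.*\})\s*$",
    re.DOTALL,
)


def parse_codegen_response(raw: str) -> dict:
    """Return {"contents": <code>, "outcome": <description>} or raise ValueError."""
    match = RESPONSE_RE.search(raw)
    if match is None:
        raise ValueError("expected a fenced code block followed by a JSON object")
    meta = json.loads(match.group("meta"))
    return {"contents": match.group("code"), "outcome": meta.get("outcome")}
```

A strict parser like this rejects any response that is missing a fence or the trailing JSON, so it only succeeds when the model follows the format exactly.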

Here's an example of a successful file edit:

Prompt
Human: Hey, we've got a new code hunk to diff.
    
This is the issue that was opened:
```
#29 Create a movie recommendation app

irgolic: In the `movie_recommendation` folder, develop a web app that provides personalized movie recommendations based on user preferences, such as genre, actors, and directors. The app should display movie posters, summaries, and trailers. Include a README.md file with an explanation of the app's features and installation instructions.


```
    
This is the pull request we're creating:
```
Title: #42 Movie Recommendation Web App

Closes #29

This pull request implements a movie recommendation web app that provides personalized movie recommendations based on user preferences, such as genre, actors, and directors. The app displays movie posters, summaries, and trailers. A README.md file with the app's features and installation instructions is also included.

Please review the changes and let me know if any modifications are needed.

1. Commit: Added initial app files - movie_recommendation/app.py, movie_recommendation/templates/base.html
    Files: movie_recommendation/app.py, movie_recommendation/templates/base.html
    Changes:
      * movie_recommendation/app.py - Add the initial flask app file and import required libraries.
      * movie_recommendation/templates/base.html - Create the base html file for the web app.
2. Commit: Implemented movie recommendation logic - movie_recommendation/movie_recommender.py
    Files: movie_recommendation/movie_recommender.py
    Changes:
      * movie_recommendation/movie_recommender.py - Add the logic for generating personalized movie recommendations based on user preferences.
3. Commit: Created movie recommendation form - movie_recommendation/templates/recommendation_form.html
    Files: movie_recommendation/templates/recommendation_form.html
    Changes:
      * movie_recommendation/templates/recommendation_form.html - Add a form to collect user preferences for movie recommendations.
4. Commit: Display recommended movies - movie_recommendation/templates/recommended_movies.html
    Files: movie_recommendation/templates/recommended_movies.html
    Changes:
      * movie_recommendation/templates/recommended_movies.html - Add a template to display the recommended movies with posters, summaries, and trailers.
5. Commit: Add routes for recommendation form and results - movie_recommendation/app.py
    Files: movie_recommendation/app.py
    Changes:
      * movie_recommendation/app.py - Add routes for the recommendation form and displaying the results.
6. Commit: Include README.md with app features and installation instructions - README.md
    Files: README.md
    Changes:
      * README.md - Add a detailed explanation of the app's features and installation instructions.
7. Commit: Refactor movie recommendation logic - movie_recommendation/movie_recommender.py
    Files: movie_recommendation/movie_recommender.py
    Changes:
      * movie_recommendation/movie_recommender.py - Refactor the movie recommendation logic to improve performance and readability.

```
    
This is the commit we're writing:
```
Add routes for recommendation form and results - movie_recommendation/app.py

* movie_recommendation/app.py - Add routes for the recommendation form and displaying the results.
```
    
This is the codebase subset we decided to look at:
```
>>> File: movie_recommendation/app.py

 1 | from flask import Flask, render_template, request, redirect, url_for
 2 | import movie_recommendation.movie_recommender as mr
 3 | 
 4 | app = Flask(__name__)
 5 | 
 6 | @app.route('/')
 7 | def home():
 8 |     return render_template('base.html')
 9 | 
10 | if __name__ == "__main__":
11 |     app.run(debug=True)
```
    
This is the hunk we're rewriting:
```
 3 | 
 4 | app = Flask(__name__)
 5 | 
 6 * @app.route('/')
 7 * def home():
 8 *     return render_template('base.html')
 9 | 
10 | if __name__ == "__main__":
11 |     app.run(debug=True)
```
    
This is the plan for how we want to rewrite the hunk:
```
Add routes for the recommendation form and displaying the results.
```
    
Please rewrite the hunk to match the plan, but do not include any lines prefixed with | in the result.

RULES:
- ONLY rewrite the lines prefixed with *, 
- submit only the lines without the * prefix,
- do not preserve the relative leading indentation of the lines (start the hunk's indentation at 0).
    
RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me, please use the following format. Make sure you return both the code enclosed in backticks and the JSON immediately after.

```
<string>
```
{
    "outcome": string  # A description of the outcome of the attempt to rewrite the file hunk according to the problem statement.
}
Raw Response
```
@app.route('/')
def home():
    return render_template('recommendation_form.html')

@app.route('/recommend', methods=['POST'])
def recommend():
    user_preferences = request.form
    recommended_movies = mr.get_recommendations(user_preferences)
    return render_template('recommended_movies.html', movies=recommended_movies)
```
{
    "outcome": "Hunk rewritten with the addition of routes for the recommendation form and displaying the results."
}
Result
{
  "contents": "@app.route('/')\ndef home():\n    return render_template('recommendation_form.html')\n\n@app.route('/recommend', methods=['POST'])\ndef recommend():\n    user_preferences = request.form\n    recommended_movies = mr.get_recommendations(user_preferences)\n    return render_template('recommended_movies.html', movies=recommended_movies)",
  "outcome": "Hunk rewritten with the addition of routes for the recommendation form and displaying the results."
}

Notice how the lines to be rewritten are prefixed with *, while the surrounding lines of context are prefixed with |.
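
The marked-up hunk in the prompt could be produced by something like the sketch below. The function name `render_hunk` and the default of three context lines are assumptions inferred from the examples here, not taken from AutoPR's source.

```python
from typing import List


def render_hunk(lines: List[str], start: int, end: int, context: int = 3) -> str:
    """Mark lines start..end (1-indexed, inclusive) with '*' and context lines with '|'."""
    first = max(1, start - context)
    last = min(len(lines), end + context)
    width = len(str(len(lines)))  # Right-align line numbers to the widest one in the file.
    rendered = []
    for number in range(first, last + 1):
        marker = "*" if start <= number <= end else "|"
        rendered.append(f"{number:>{width}} {marker} {lines[number - 1]}")
    return "\n".join(rendered)
```

Calling `render_hunk(file_lines, 6, 8)` on the 11-line `app.py` above would yield lines 3 to 11, with lines 6 to 8 carrying the `*` marker.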

What's preventing gpt-3.5-turbo from working properly?

gpt-3.5-turbo is really bad at generating something in the format of:

```
<code>
```
{
  <metadata>
}

It commonly drops the closing backticks of the code block, leaves the <code> empty, or repeats the description of the content instead of writing actual code. Omitting the trailing metadata would probably improve reliability considerably, as would adding some relevant few-shot examples.

Also, it's really bad at following the instruction to rewrite only the code prefixed by *. It tends to return the full file or include the context lines.
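
One possible mitigation for the echoed context lines, sketched under assumptions: since the context lines are known, leading and trailing response lines that merely repeat them could be trimmed. The helper name `strip_echoed_context` is hypothetical, and the comparison ignores indentation because the prompt asks the model to start the hunk's indentation at 0.

```python
from typing import List


def strip_echoed_context(response: List[str], before: List[str], after: List[str]) -> List[str]:
    """Drop leading/trailing response lines that repeat the known context lines."""

    def norm(lines: List[str]) -> List[str]:
        return [line.strip() for line in lines]

    result = list(response)
    # Longest tail of `before` that the response starts with.
    for k in range(len(before), 0, -1):
        if norm(result[:k]) == norm(before[-k:]):
            result = result[k:]
            break
    # Longest head of `after` that the response ends with.
    for k in range(len(after), 0, -1):
        if norm(result[-k:]) == norm(after[:k]):
            result = result[:-k]
            break
    return result
```

This heuristic is ambiguous when the hunk itself begins or ends with lines identical to its context, so it would need a sanity check before being trusted.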

An example exhibiting both of these problems:

Prompt
Human: Hey, we've got a new code hunk to diff.
    
This is the issue that was opened:
```
#31 Design a virtual art gallery

irgolic: In the `virtual_art_gallery` folder, create a web-based virtual art gallery that showcases a variety of artwork, such as paintings, sculptures, and photography. Allow users to browse through the gallery and view detailed information about each piece. Add a README.md file explaining the gallery's purpose, features, and how to set it up.

[irgolic/AutoPR#65](https://github.com/irgolic/AutoPR/issues/65)


```
    
This is the pull request we're creating:
```
Title: Design a virtual art gallery - Issue #31

- Added a new virtual art gallery that showcases various artwork
- Created an index.html file in the `virtual_art_gallery` folder to display the gallery and implemented CSS styles
- Added `artwork.json` file in the `virtual_art_gallery` folder to store artwork data
- Implemented artwork data parsing and rendering with `app.js` in the `virtual_art_gallery` folder
- Added a README.md file to explain the gallery's purpose, features, and how to set it up

1. Commit: Create virtual_art_gallery folder and index.html file
    Files: virtual_art_gallery/index.html
    Changes:
      - Added a new virtual art gallery that showcases various artwork
2. Commit: Add CSS styles to index.html
    Files: virtual_art_gallery/index.html, virtual_art_gallery/style.css
    Changes:
      - Created an index.html file in the `virtual_art_gallery` folder to display the gallery and implemented CSS styles
3. Commit: Create artwork data file
    Files: virtual_art_gallery/artwork.json
    Changes:
      - Added `artwork.json` file in the `virtual_art_gallery` folder to store artwork data
4. Commit: Implement artwork data parsing and rendering
    Files: virtual_art_gallery/index.html, virtual_art_gallery/app.js
    Changes:
      - Implemented artwork data parsing and rendering with `app.js` in the `virtual_art_gallery` folder
5. Commit: Add README.md
    Files: README.md
    Changes:
      - Added a README.md file to explain the gallery's purpose, features, and how to set it up

```
    
This is the commit we're writing:
```
Create virtual_art_gallery folder and index.html file

- Added a new virtual art gallery that showcases various artwork
```
    
This is the codebase subset we decided to look at:
```
>>> File: virtual_art_gallery/index.html

1 | <section id="virtual-gallery">
2 |   <h2>New Artwork</h2>
3 |   <section id="virtual-gallery">
4 |     <h2>New Artwork</h2>
5 |     <img src="path/to/new/artwork">
6 |   </section>
7 | </section>
```
    
This is the hunk we're rewriting:
```
1 | <section id="virtual-gallery">
2 |   <h2>New Artwork</h2>
3 *   <section id="virtual-gallery">
4 *     <h2>New Artwork</h2>
5 *     <img src="path/to/new/artwork">
6 *   </section>
7 | </section>
```
    
This is the plan for how we want to rewrite the hunk:
```
Add new artwork to the virtual gallery
```
    
Please rewrite the hunk to match the plan, but do not include any lines prefixed with | in the result.

RULES:
- ONLY rewrite the lines prefixed with *, 
- submit only the lines without the * prefix,
- do not preserve the relative leading indentation of the lines (start the hunk's indentation at 0).
    
RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me, please use the following format. Make sure you return both the code enclosed in backticks and the JSON immediately after.

```
<string>
```
{
    "outcome": string  # A description of the outcome of the attempt to rewrite the file hunk according to the problem statement.
}
Raw Response
<section id="virtual-gallery">
  <h2>New Artwork</h2>
  <section id="virtual-gallery">
    <h2>New Artwork</h2>
    <img src="path/to/new/artwork">
    <p>Description of new artwork</p>
  </section>
</section>
```
{
    "outcome": "Success"
}

What can we do about it?

Here are some ideas:

  • Only ask gpt-3.5-turbo to generate a code block enclosed by backticks, without the trailing metadata JSON. Alternatively, try using specific few-shot examples.
  • Restrict gpt-3.5-turbo's file edit action to only rewrite whole files
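
For the first idea, a lenient extractor might look like the sketch below: it expects only a fenced code block but tolerates a missing opening or closing fence. The function name `extract_code` and the single-fence heuristic are assumptions for illustration.

```python
def extract_code(raw: str) -> str:
    """Return the code from a response that should be one fenced block."""
    lines = raw.strip("\n").splitlines()
    fences = [i for i, line in enumerate(lines) if line.strip().startswith("```")]
    if not fences:
        # No fences at all: treat the whole response as code.
        body = lines
    elif len(fences) == 1:
        i = fences[0]
        if i == len(lines) - 1:
            # Lone fence on the last line: the opening fence was dropped.
            body = lines[:i]
        else:
            # Lone fence elsewhere: assume the closing fence was dropped.
            body = lines[i + 1:]
    else:
        # Normal case: take everything between the first and last fence.
        body = lines[fences[0] + 1:fences[-1]]
    return "\n".join(body)
```

With no metadata JSON to parse, a dropped fence no longer has to be a hard failure, although an empty block or a repeated description would still need to be caught separately.
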

cballou commented May 4, 2023

Just wanted to add on my fail state as I don't yet have access to 4.0. I switched to gpt-3.5-turbo and received an error about my monorepo:

[Screenshot of the error message, 2023-05-04; image not included]

I think in my case, given the complexity of the repository I'm trying to analyze, I'll just need to wait for 4.0... I'm asking it to specifically drill down into a sub-directory of the repo, target a specific version of a framework, and try to uncover missing/incomplete code.

My guess is the overarching request was too verbose (targeting an entire project and not, say, one specific controller or action) which in turn correlates to too many tokens.

irgolic closed this as completed Nov 5, 2023