GPT-3.5-turbo compatibility #65

irgolic · 2023-04-30T19:38:54Z

I'm confident AutoPR can be optimized to work well with gpt-3.5-turbo.

Background

As it stands, the rail-v1 pull request agent works fine with gpt-3.5-turbo.

However, the codegen agent does not. rail-v1 codegen may work, but it suffers from the same problem it has with gpt-4: it produces unreliable diffs with ambiguous errors.

My hope is to use auto-v1 codegen, which creates unambiguous changes.

Action selection

Following @ShreyaR's guidance, we've augmented the guardrails calls with better instructions. gpt-3.5-turbo is now reliable at selecting actions.


How `auto-v1` codegen works

Here is an example of a successful action selection call:

Prompt

```
You are about to make a decision on what to do next.

This is the issue that was opened:
```#20 Create a dice rolling bot

irgolic: In the `dice_rolling_bot` folder, develop a Discord bot that allows users to roll virtual dice by typing a command (e.g., "!roll 1d6"). The bot should support various dice types (e.g., d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. Include a README.md file with information on the bot's features, setup instructions, and usage guidelines.



```

This is the pull request we're creating:
```Title: Feature/dice-rolling-bot: Add Dice Rolling Bot Functionality

This PR resolves issue #20, adding a dice rolling bot to the `dice_rolling_bot` folder. This Discord bot allows users to roll virtual dice by typing a command, such as "!roll 1d6". The bot supports various dice types (d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. A README.md file has been included with information on the bot's features, setup instructions, and usage guidelines.

Commits:
1. Create dice_rolling_bot.py
2. Add README.md for dice rolling bot

Please review the changes and let me know if any modifications are required.

1. Commit: Create dice_rolling_bot.py
    Files: dice_rolling_bot/dice_rolling_bot.py:L1-L23
    Changes:
      - Created a new file `dice_rolling_bot.py`
      - Imported required libraries (random and discord)
      - Implemented the main functionality for the bot to roll dice based on user input
2. Commit: Add README.md for dice rolling bot
    Files: dice_rolling_bot/README.md:L1-L32
    Changes:
      - Created a new file `README.md`
      - Added setup instructions for the bot
      - Provided usage guidelines and examples for the dice rolling bot
```

This is the commit we're writing:
```Create dice_rolling_bot.py

- Created a new file `dice_rolling_bot.py`
- Imported required libraries (random and discord)
- Implemented the main functionality for the bot to roll dice based on user input```

This is the codebase subset we decided to look at:
``````

These are the actions we've taken so far:
``````

In our pursuit of implementing this commit, please make a decision on what to do next.
If a new file should be created, respond with `new_file` and the file path and description.
If one of the files in the codebase subset should be edited, respond with `edit_file` and the file path, description, and line numbers of the hunk to edit. When picking the line numbers, keep the hunk as short as possible while fulfilling the description. If you don't know which line numbers to pick, leave them blank.
If you're done, respond with `finished` and the commit message describing the past actions.
```


Given below is XML that describes the information to extract from this document and the tags to extract it into.

```xml
<output>
    <string name="action" choices="new_file,edit_file,finished"/>
    <object name="new_file" description="new_file" if="action==new_file">
        <string name="filepath" description="Path to the newly created file." required="true"/>
        <string name="description" description="Description of the contents of the new file." required="true"/>
    </object>
    <object name="edit_file" description="edit_file" if="action==edit_file">
        <string name="filepath" description="Path to the file to be edited." required="true"/>
        <string name="description" description="Description of the changes to be made to the file." required="true"/>
        <integer name="start_line" description="The line number of the first line of the hunk to be edited." format="positive" required="false" on-fail="noop"/>
        <integer name="end_line" description="The line number of the last line of the hunk to be edited. Keep the hunk as short as possible while fulfilling the description." format="positive" required="false" on-fail="noop"/>
    </object>
    <string name="finished" description="commit_message: A more appropriate commit message based on the actions taken." required="false" if="action==finished"/>
</output>
```


ONLY return a valid JSON object (no other text is necessary), where the key of the field in JSON is the `name` attribute of the corresponding XML, and the value is of the type specified by the corresponding XML's tag. The JSON MUST conform to the XML format, including any types and format requests e.g. requests for lists, objects and specific types. Be correct and concise.

Here are examples of simple (XML, JSON) pairs that show the expected behavior:
- `<string name='foo' format='two-words lower-case' />` => `{'foo': 'example one'}`
- `<list name='bar'><string format='upper-case' /></list>` => `{"bar": ['STRING ONE', 'STRING TWO', etc.]}`
- `<object name='baz'><string name="foo" format="capitalize two-words" /><integer name="index" format="1-indexed" /></object>` => `{'baz': {'foo': 'Some String', 'index': 1}}`

Raw Response

{
  "action": "new_file",
  "new_file": {
    "filepath": "dice_rolling_bot/dice_rolling_bot.py",
    "description": "Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input"
  }
}

Parsed Response

{
  "action": "new_file",
  "new_file": {
    "filepath": "dice_rolling_bot/dice_rolling_bot.py",
    "description": "Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input"
  },
  "edit_file": null,
  "commit_message": null
}

What's preventing gpt-3.5-turbo from working properly?

In the action selection stage, gpt-3.5-turbo commonly returns the action selection JSON in an incorrect format. Sometimes it wraps it in triple-backtick groups (now handled by guardrails-ai/guardrails#118). Commonly it flattens the dict, returning something akin to this:

Raw Response

{
  "action": "new_file",
  "filepath": "dice_rolling_bot/dice_rolling_bot.py",
  "description": "Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input"
}

In theory, this is an unambiguous error and can be corrected. See the discussion on this topic in guardrails-ai/guardrails#123. Ideally the fix lands in guardrails, but if that turns out to be too complicated, AutoPR can work around it with a custom guardrails validator or by parsing the raw string as JSON on failure.
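A minimal sketch of the "parse the raw string as JSON on failure" fallback. It assumes the field names from the XML spec above; `unflatten_action` and `ACTION_FIELDS` are hypothetical names for illustration, not part of AutoPR or guardrails.

```python
import json

# Fields each action's nested object is expected to carry,
# copied from the XML spec shown earlier in this issue.
ACTION_FIELDS = {
    "new_file": {"filepath", "description"},
    "edit_file": {"filepath", "description", "start_line", "end_line"},
}


def unflatten_action(raw: str) -> dict:
    """Parse a raw action-selection response and re-nest flattened fields.

    Hypothetical helper: if the model returned the action's fields at the
    top level, move them under a key named after the action.
    """
    data = json.loads(raw)
    action = data.get("action")
    fields = ACTION_FIELDS.get(action)
    if fields is None or isinstance(data.get(action), dict):
        # `finished`, an unknown action, or already correctly nested.
        return data
    nested = {k: v for k, v in data.items() if k in fields}
    result = {k: v for k, v in data.items() if k not in fields}
    result[action] = nested
    return result
```

Because the flattened form carries the same keys as the nested one, the repair is lossless; a response that is already nested passes through unchanged.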

Following a conversation in a Discord thread, perhaps a better solution is to abstract out an "action selection engine" and provide different implementations for different models:

- JSON with guardrails (as auto-v1 does now)
- `!` Discord command syntax (e.g., `!newfile <filename> <description>`)
- Pythonic syntax (e.g., `def new_file(filename: str, description: str)`)
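To make the idea concrete, here is a rough sketch of what such an abstraction could look like, with one implementation for the `!` command syntax. The `ActionSelector` interface, the `BangCommandSelector` class, and the `!finished` command are all assumptions made up for this example. The output dict mirrors the JSON shape used by the guardrails version. File paths containing spaces are not handled.

```python
from typing import Optional, Protocol


class ActionSelector(Protocol):
    """Hypothetical interface: turn a raw model response into an action dict."""

    def parse(self, raw: str) -> Optional[dict]: ...


class BangCommandSelector:
    """Parses `!newfile <filename> <description>`-style responses."""

    def parse(self, raw: str) -> Optional[dict]:
        for line in raw.splitlines():
            line = line.strip()
            if not line.startswith("!"):
                continue  # skip any chatter around the command
            command, _, rest = line[1:].partition(" ")
            if command == "newfile":
                filepath, _, description = rest.strip().partition(" ")
                if filepath and description.strip():
                    return {
                        "action": "new_file",
                        "new_file": {
                            "filepath": filepath,
                            "description": description.strip(),
                        },
                    }
            elif command == "finished":
                return {"action": "finished", "finished": rest.strip()}
        return None  # nothing parseable: the caller can retry or fall back
```

Each model could then be paired with whichever selector it follows most reliably, while the rest of the pipeline only ever sees the action dict.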

File creation and editing

How `auto-v1` codegen works

Here's an example of a successful new file generation:

Prompt

Human: Hey, we've got a new file to create.

This is the issue that was opened:
```
#20 Create a dice rolling bot

irgolic: In the `dice_rolling_bot` folder, develop a Discord bot that allows users to roll virtual dice by typing a command (e.g., "!roll 1d6"). The bot should support various dice types (e.g., d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. Include a README.md file with information on the bot's features, setup instructions, and usage guidelines.




```

This is the pull request we're creating:
```
Title: Feature/dice-rolling-bot: Add Dice Rolling Bot Functionality

This PR resolves issue #20, adding a dice rolling bot to the `dice_rolling_bot` folder. This Discord bot allows users to roll virtual dice by typing a command, such as "!roll 1d6". The bot supports various dice types (d4, d6, d8, d10, d12, d20) and multiple dice rolls at once. A README.md file has been included with information on the bot's features, setup instructions, and usage guidelines.

Commits:
1. Create dice_rolling_bot.py
2. Add README.md for dice rolling bot

Please review the changes and let me know if any modifications are required.

1. Commit: Create dice_rolling_bot.py
    Files: dice_rolling_bot/dice_rolling_bot.py:L1-L23
    Changes:
      - Created a new file `dice_rolling_bot.py`
      - Imported required libraries (random and discord)
      - Implemented the main functionality for the bot to roll dice based on user input
2. Commit: Add README.md for dice rolling bot
    Files: dice_rolling_bot/README.md:L1-L32
    Changes:
      - Created a new file `README.md`
      - Added setup instructions for the bot
      - Provided usage guidelines and examples for the dice rolling bot

```

This is the commit we're writing:
```
Create dice_rolling_bot.py

- Created a new file `dice_rolling_bot.py`
- Imported required libraries (random and discord)
- Implemented the main functionality for the bot to roll dice based on user input
```

This is the codebase subset we decided to look at:
```

```

This is the plan for the file we're creating:
```
Created a new file `dice_rolling_bot.py` with required libraries (random and discord) and implemented the main functionality for the bot to roll dice based on user input
```

Please send me the contents of the file.

RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me, please use the following format. Make sure you return both the code enclosed in backticks and the JSON immediately after.

```
<string>
```
{
    "outcome": string  # A description of the outcome of the attempt to rewrite the file hunk according to the problem statement.
}

Raw Response

```python
import random
import discord
from discord.ext import commands

intents = discord.Intents.default()
intents.typing = False
intents.presences = False

bot = commands.Bot(command_prefix="!", intents=intents)

@bot.event
async def on_ready():
    print(f"{bot.user} has connected to Discord!")

@bot.command(name="roll", help="Rolls a die. Format: !roll <num_of_dice>d<sides_of_die>")
async def roll_dice(ctx, dice: str):
    num_of_dice, sides_of_die = map(int, dice.split("d"))
    rolls = [random.randint(1, sides_of_die) for _ in range(num_of_dice)]
    await ctx.send(', '.join(map(str, rolls)))

bot.run("your_bot_token_here")
```
{
    "outcome": "Created dice_rolling_bot.py with the necessary imports and implemented the main functionality for the bot to roll dice based on user input."
}

Result

{
  "contents": "import random\nimport discord\nfrom discord.ext import commands\n\nintents = discord.Intents.default()\nintents.typing = False\nintents.presences = False\n\nbot = commands.Bot(command_prefix=\"!\", intents=intents)\n\n@bot.event\nasync def on_ready():\n    print(f\"{bot.user} has connected to Discord!\")\n\n@bot.command(name=\"roll\", help=\"Rolls a die. Format: !roll <num_of_dice>d<sides_of_die>\")\nasync def roll_dice(ctx, dice: str):\n    num_of_dice, sides_of_die = map(int, dice.split(\"d\"))\n    rolls = [random.randint(1, sides_of_die) for _ in range(num_of_dice)]\n    await ctx.send(', '.join(map(str, rolls)))\n\nbot.run(\"your_bot_token_here\")",
  "outcome": "Created dice_rolling_bot.py with the necessary imports and implemented the main functionality for the bot to roll dice based on user input."
}

Notice that the LLM reflects on its change in the "outcome" JSON key. This is fed back into subsequent actions to inform what's been done so far.
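For reference, turning the raw response into the result above can be sketched roughly like this. This is an illustration of the format, not AutoPR's actual parser; `parse_codegen_response` and `RESPONSE_RE` are hypothetical names.

```python
import json
import re

# A fenced block (optional language tag) followed by a JSON object.
RESPONSE_RE = re.compile(
    r"```[^\n]*\n(?P<code>.*?)\n?```\s*(?P<meta>\{.*\})\s*$",
    re.DOTALL,
)


def parse_codegen_response(raw: str) -> dict:
    """Split a code-then-JSON response into `contents` and `outcome`."""
    match = RESPONSE_RE.search(raw)
    if match is None:
        raise ValueError("response does not match the code-then-JSON format")
    meta = json.loads(match.group("meta"))
    return {"contents": match.group("code"), "outcome": meta["outcome"]}
```

A strict parser like this is exactly what breaks when the model drops a fence or omits the JSON, which is where the gpt-3.5-turbo problems described later come from.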

Here's an example of a successful file edit:

Prompt

Human: Hey, we've got a new code hunk to diff.
    
This is the issue that was opened:
```
#29 Create a movie recommendation app

irgolic: In the `movie_recommendation` folder, develop a web app that provides personalized movie recommendations based on user preferences, such as genre, actors, and directors. The app should display movie posters, summaries, and trailers. Include a README.md file with an explanation of the app's features and installation instructions.


```
    
This is the pull request we're creating:
```
Title: #42 Movie Recommendation Web App

Closes #29

This pull request implements a movie recommendation web app that provides personalized movie recommendations based on user preferences, such as genre, actors, and directors. The app displays movie posters, summaries, and trailers. A README.md file with the app's features and installation instructions is also included.

Please review the changes and let me know if any modifications are needed.

1. Commit: Added initial app files - movie_recommendation/app.py, movie_recommendation/templates/base.html
    Files: movie_recommendation/app.py, movie_recommendation/templates/base.html
    Changes:
      * movie_recommendation/app.py - Add the initial flask app file and import required libraries.
      * movie_recommendation/templates/base.html - Create the base html file for the web app.
2. Commit: Implemented movie recommendation logic - movie_recommendation/movie_recommender.py
    Files: movie_recommendation/movie_recommender.py
    Changes:
      * movie_recommendation/movie_recommender.py - Add the logic for generating personalized movie recommendations based on user preferences.
3. Commit: Created movie recommendation form - movie_recommendation/templates/recommendation_form.html
    Files: movie_recommendation/templates/recommendation_form.html
    Changes:
      * movie_recommendation/templates/recommendation_form.html - Add a form to collect user preferences for movie recommendations.
4. Commit: Display recommended movies - movie_recommendation/templates/recommended_movies.html
    Files: movie_recommendation/templates/recommended_movies.html
    Changes:
      * movie_recommendation/templates/recommended_movies.html - Add a template to display the recommended movies with posters, summaries, and trailers.
5. Commit: Add routes for recommendation form and results - movie_recommendation/app.py
    Files: movie_recommendation/app.py
    Changes:
      * movie_recommendation/app.py - Add routes for the recommendation form and displaying the results.
6. Commit: Include README.md with app features and installation instructions - README.md
    Files: README.md
    Changes:
      * README.md - Add a detailed explanation of the app's features and installation instructions.
7. Commit: Refactor movie recommendation logic - movie_recommendation/movie_recommender.py
    Files: movie_recommendation/movie_recommender.py
    Changes:
      * movie_recommendation/movie_recommender.py - Refactor the movie recommendation logic to improve performance and readability.

```
    
This is the commit we're writing:
```
Add routes for recommendation form and results - movie_recommendation/app.py

* movie_recommendation/app.py - Add routes for the recommendation form and displaying the results.
```
    
This is the codebase subset we decided to look at:
```
>>> File: movie_recommendation/app.py

 1 | from flask import Flask, render_template, request, redirect, url_for
 2 | import movie_recommendation.movie_recommender as mr
 3 | 
 4 | app = Flask(__name__)
 5 | 
 6 | @app.route('/')
 7 | def home():
 8 |     return render_template('base.html')
 9 | 
10 | if __name__ == "__main__":
11 |     app.run(debug=True)
```
    
This is the hunk we're rewriting:
```
 3 | 
 4 | app = Flask(__name__)
 5 | 
 6 * @app.route('/')
 7 * def home():
 8 *     return render_template('base.html')
 9 | 
10 | if __name__ == "__main__":
11 |     app.run(debug=True)
```
    
This is the plan for how we want to rewrite the hunk:
```
Add routes for the recommendation form and displaying the results.
```
    
Please rewrite the hunk to match the plan, but do not include any lines prefixed with | in the result.

RULES:
- ONLY rewrite the lines prefixed with *, 
- submit only the lines without the * prefix,
- do not preserve the relative leading indentation of the lines (start the hunk's indentation at 0).
    
RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me, please use the following format. Make sure you return both the code enclosed in backticks and the JSON immediately after.

```
<string>
```
{
    "outcome": string  # A description of the outcome of the attempt to rewrite the file hunk according to the problem statement.
}

Raw Response

```
@app.route('/')
def home():
    return render_template('recommendation_form.html')

@app.route('/recommend', methods=['POST'])
def recommend():
    user_preferences = request.form
    recommended_movies = mr.get_recommendations(user_preferences)
    return render_template('recommended_movies.html', movies=recommended_movies)
```
{
    "outcome": "Hunk rewritten with the addition of routes for the recommendation form and displaying the results."
}

Result

{
  "contents": "@app.route('/')\ndef home():\n    return render_template('recommendation_form.html')\n\n@app.route('/recommend', methods=['POST'])\ndef recommend():\n    user_preferences = request.form\n    recommended_movies = mr.get_recommendations(user_preferences)\n    return render_template('recommended_movies.html', movies=recommended_movies)",
  "outcome": "Hunk rewritten with the addition of routes for the recommendation form and displaying the results."
}

Notice how the hunk to be rewritten is prefixed with * with some surrounding lines of context.
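The hunk rendering can be sketched as follows. This is a minimal reconstruction from the prompt above, assuming three lines of context on each side; `render_hunk` is a hypothetical name and AutoPR's real implementation may differ.

```python
def render_hunk(lines: list[str], start: int, end: int, context: int = 3) -> str:
    """Render lines start..end (1-indexed, inclusive) marked with `*`,
    surrounded by up to `context` lines marked with `|`."""
    first = max(1, start - context)
    last = min(len(lines), end + context)
    width = len(str(last))  # right-align line numbers
    rendered = []
    for number in range(first, last + 1):
        marker = "*" if start <= number <= end else "|"
        rendered.append(f"{number:>{width}} {marker} {lines[number - 1]}")
    return "\n".join(rendered)
```

Calling `render_hunk(app_py_lines, 6, 8)` on the eleven lines of `movie_recommendation/app.py` shown above produces the hunk in the prompt, covering lines 3 to 11 with lines 6 to 8 starred.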

What's preventing gpt-3.5-turbo from working properly?

It's really bad at generating something in the format of:

```
<code>
```
{
  <metadata>
}

It commonly drops the closing backtick of the code, leaves the <code> empty, or repeats the description of the content instead of writing actual code. Omitting the trailing metadata will probably improve performance by a lot, as well as adding some relevant few-shot examples.

Also, it's really bad at following the instruction to rewrite only the code prefixed by *. It tends to return the full file or include the context lines.

An example exhibiting both of these failure modes:

Prompt

Human: Hey, we've got a new code hunk to diff.
    
This is the issue that was opened:
```
#31 Design a virtual art gallery

irgolic: In the `virtual_art_gallery` folder, create a web-based virtual art gallery that showcases a variety of artwork, such as paintings, sculptures, and photography. Allow users to browse through the gallery and view detailed information about each piece. Add a README.md file explaining the gallery's purpose, features, and how to set it up.

[irgolic/AutoPR#65](https://github.com/irgolic/AutoPR/issues/65)


```
    
This is the pull request we're creating:
```
Title: Design a virtual art gallery - Issue #31

- Added a new virtual art gallery that showcases various artwork
- Created an index.html file in the `virtual_art_gallery` folder to display the gallery and implemented CSS styles
- Added `artwork.json` file in the `virtual_art_gallery` folder to store artwork data
- Implemented artwork data parsing and rendering with `app.js` in the `virtual_art_gallery` folder
- Added a README.md file to explain the gallery's purpose, features, and how to set it up

1. Commit: Create virtual_art_gallery folder and index.html file
    Files: virtual_art_gallery/index.html
    Changes:
      - Added a new virtual art gallery that showcases various artwork
2. Commit: Add CSS styles to index.html
    Files: virtual_art_gallery/index.html, virtual_art_gallery/style.css
    Changes:
      - Created an index.html file in the `virtual_art_gallery` folder to display the gallery and implemented CSS styles
3. Commit: Create artwork data file
    Files: virtual_art_gallery/artwork.json
    Changes:
      - Added `artwork.json` file in the `virtual_art_gallery` folder to store artwork data
4. Commit: Implement artwork data parsing and rendering
    Files: virtual_art_gallery/index.html, virtual_art_gallery/app.js
    Changes:
      - Implemented artwork data parsing and rendering with `app.js` in the `virtual_art_gallery` folder
5. Commit: Add README.md
    Files: README.md
    Changes:
      - Added a README.md file to explain the gallery's purpose, features, and how to set it up

```
    
This is the commit we're writing:
```
Create virtual_art_gallery folder and index.html file

- Added a new virtual art gallery that showcases various artwork
```
    
This is the codebase subset we decided to look at:
```
>>> File: virtual_art_gallery/index.html

1 | <section id="virtual-gallery">
2 |   <h2>New Artwork</h2>
3 |   <section id="virtual-gallery">
4 |     <h2>New Artwork</h2>
5 |     <img src="path/to/new/artwork">
6 |   </section>
7 | </section>
```
    
This is the hunk we're rewriting:
```
1 | <section id="virtual-gallery">
2 |   <h2>New Artwork</h2>
3 *   <section id="virtual-gallery">
4 *     <h2>New Artwork</h2>
5 *     <img src="path/to/new/artwork">
6 *   </section>
7 | </section>
```
    
This is the plan for how we want to rewrite the hunk:
```
Add new artwork to the virtual gallery
```
    
Please rewrite the hunk to match the plan, but do not include any lines prefixed with | in the result.

RULES:
- ONLY rewrite the lines prefixed with *, 
- submit only the lines without the * prefix,
- do not preserve the relative leading indentation of the lines (start the hunk's indentation at 0).
    
RESPONSE FORMAT INSTRUCTIONS
----------------------------

When responding to me, please use the following format. Make sure you return both the code enclosed in backticks and the JSON immediately after.

```
<string>
```
{
    "outcome": string  # A description of the outcome of the attempt to rewrite the file hunk according to the problem statement.
}

Raw Response

<section id="virtual-gallery">
  <h2>New Artwork</h2>
  <section id="virtual-gallery">
    <h2>New Artwork</h2>
    <img src="path/to/new/artwork">
    <p>Description of new artwork</p>
  </section>
</section>
```
{
    "outcome": "Success"
}

What can we do about it?

Here are some ideas:

- Only ask gpt-3.5-turbo to generate a code block enclosed by backticks, without the trailing metadata JSON. Alternatively, try using specific few-shot examples.
- Restrict gpt-3.5-turbo's file edit action to only rewrite whole files
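For the first idea, the extraction step could also be made tolerant of a missing fence. A rough sketch, assuming the response should contain a single fenced block; `extract_code` is a hypothetical helper, not existing AutoPR code.

```python
def extract_code(raw: str) -> str:
    """Return the code from a response that should be one fenced block.

    Tolerates a missing opening or closing fence, and ignores anything
    after the closing fence (e.g. leftover metadata JSON).
    """
    lines = raw.strip("\n").splitlines()
    fences = [i for i, line in enumerate(lines) if line.strip().startswith("```")]
    if not fences:
        # No fences at all: treat the whole response as code.
        return "\n".join(lines)
    if len(fences) == 1:
        i = fences[0]
        # A lone fence on the first line is an opening fence with the
        # closing one dropped; anywhere else it is taken as the closer.
        body = lines[i + 1:] if i == 0 else lines[:i]
        return "\n".join(body)
    return "\n".join(lines[fences[0] + 1:fences[1]])
```

On the failing virtual art gallery response above, this returns the eight HTML lines before the lone fence. It does not remove the echoed context lines: in that example the context and the hunk start with the same tags, so that part cannot be detected reliably, which is an argument for the whole-file rewrite idea.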


cballou · 2023-05-04T12:28:59Z

Just wanted to add on my fail state as I don't yet have access to 4.0. I switched to gpt-3.5-turbo and received an error about my monorepo:

I think in my case, given the complexity of the repository I'm trying to analyze, I'll just need to wait for 4.0... I'm asking it to specifically drill down into a sub-directory of the repo, target a specific version of a framework, and try to uncover missing/incomplete code.

My guess is the overarching request was too verbose (targeting an entire project and not, say, one specific controller or action) which in turn correlates to too many tokens.

This was referenced Apr 30, 2023

KeyError: 'edit_file' #66

Closed

Add warning about gpt-3.5-turbo performance #67

Merged

irgolic mentioned this issue May 1, 2023

tenacity.RetryError: RetryError[<Future at 0x7f9bb4a6f8e0 state=finished raised InvalidRequestError>] #68

Closed


irgolic closed this as completed Nov 5, 2023
