feat: SWE Agent Implementation #846

JayQuimby · 2024-04-07T06:49:49Z

Implementing an agent that works similarly to SWE-Agent.

Agent Features:

- Agent-Computer Interface (ACI): simplifies the commands required by the LLM.
  - Implement: 'read', 'write', 'browse', 'exit', 'edit', 'goto', 'scroll'
- Think-Act prompting structure: guides the model to think about what its next step should be and then write code to do it.
- Short-term memory: allows the model to see its last n steps before taking the next one.
- Tells the model its current working directory, file, and line.
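
As a rough illustration of the short-term memory and working-location features listed above, here is a minimal sketch of how the last n steps and a status line might be rendered into a prompt. The names `Step`, `ShortTermMemory`, and `build_prompt` are hypothetical and are not taken from this PR's code; the prompt layout is likewise an assumption.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Step:
    thought: str      # what the model said it wanted to do
    action: str       # the command it issued, e.g. "read foo.py"
    observation: str  # what the environment returned


class ShortTermMemory:
    """Keeps only the most recent n steps (hypothetical helper)."""

    def __init__(self, n: int = 3) -> None:
        # A bounded deque drops the oldest step automatically when full.
        self._steps = deque(maxlen=n)

    def add(self, step: Step) -> None:
        self._steps.append(step)

    def render(self) -> str:
        return "\n".join(
            f"Thought: {s.thought}\nAction: {s.action}\nObservation: {s.observation}"
            for s in self._steps
        )


def build_prompt(memory: ShortTermMemory, cwd: str, file: str, line: int) -> str:
    """Combine recent steps with the current working dir, file, and line."""
    status = f"(cwd: {cwd}, file: {file}, line: {line})"
    # Ending on "Thought:" nudges the model to think before it acts.
    return f"{memory.render()}\n{status}\nThought:"
```

Ending the prompt on `Thought:` is one simple way to get the think-then-act ordering; the real agent may structure this differently.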

Help / Suggestions:

Open to other ideas people might have on how to improve this
This agent should be similar to SWE agent, so I would appreciate ideas staying related to that
Any help or additional code would be much appreciated

Links/context:

SWE-Agent Demo
Repo
Issue Add Agent Comparable to SWE-agent #570

| Agent/model | SWE-bench % resolved |
| --- | --- |
| SWE-Agent (GPT-4) | 12.29% |
| Devin (25% of eval set) | 13.84% |
| Claude 3 Opus (RAG) | 3.79% |
| GPT-4 (RAG) | 1.44% |

Since SWE-Agent is so close to Devin in performance, I figure emulating it and then iterating on it is the best route.

PierrunoYT · 2024-04-07T10:53:28Z

Can you fix GitHub issues with this and create a PR?

JayQuimby · 2024-04-07T17:40:14Z

@PierrunoYT

> Can you fix GitHub issues with this and create a PR?

Which GitHub issues are you talking about? Can you provide a screenshot or something for me to go off of?

PierrunoYT · 2024-04-07T18:44:35Z

> @PierrunoYT
>
> > Can you fix GitHub issues with this and create a PR?
>
> Which GitHub issues are you talking about? Can you provide a screenshot or something for me to go off of?

I mean like SWE-Agent, which can fix GitHub issues.

JayQuimby · 2024-04-07T18:48:26Z

Ohhh, lol completely misread that the first time.

I do not think it is currently capable of doing that, but I'm not sure. I am just implementing the agent framework; there will definitely need to be more work done on fetching web pages, formatting, submitting pull requests, etc. for this to do what you were talking about.

PierrunoYT · 2024-04-07T18:50:58Z

> Ohhh, lol completely misread that the first time.
>
> I do not think it is currently capable of doing that, but I'm not sure. I am just implementing the agent framework; there will definitely need to be more work done on fetching web pages, formatting, submitting pull requests, etc. for this to do what you were talking about.

Oh okay. Waiting for it :)

foragerr · 2024-04-07T19:21:24Z

@PierrunoYT There's some discussion and work here about using OpenDevin to work on OpenDevin GitHub issues. The quality of the currently implemented agents is pretty poor, though.

I'm excited to see a variant of SWE-agent implemented.

rbren · 2024-04-07T20:01:26Z

Rad! @JayQuimby I'm curious how well it seems to work in practice. Have you thrown any examples at it?

Edit: looks like it's not fully implemented yet

agenthub/thinkact_agent/plans.md

JayQuimby · 2024-04-07T20:06:11Z

> Rad! @JayQuimby I'm curious how well it seems to work in practice. Have you thrown any examples at it?

@rbren, I have not tested it with GPT-4 / Claude yet because it still needs some work. I will keep working on it and hope to have something usable in the next few days.

rbren · 2024-04-07T20:06:33Z

agenthub/thinkact_agent/plans.md

+'search_for': {
+    'params': '<keywords>',
+    'description': 'Will allow you to search your working directory for files and folders that match your <keyword>.',
+}


Definitely want to add a search_for command. IMO it should be called search_files so the LLM doesn't confuse it with searching on Google
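
For passers-by, here is a minimal sketch of what a Python-side `search_files` command could look like, following the description in the diff above (searching the working directory for files and folders whose names match a keyword). The name follows the rename suggested here; the signature and the case-insensitive substring matching are assumptions, not this PR's implementation.

```python
import os


def search_files(keyword: str, root: str = ".") -> list[str]:
    """Return paths under `root` whose file or folder name contains `keyword`.

    Matching is a case-insensitive substring test (an assumption for this sketch).
    """
    needle = keyword.lower()
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check both folders and files, as the command description asks.
        for name in dirnames + filenames:
            if needle in name.lower():
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```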

foragerr · 2024-04-10T15:14:29Z

Just want to drop this for passers-by, totally worth a watch if you have 20 mins: https://www.youtube.com/watch?v=RJ6NN8Y-xok

JayQuimby · 2024-04-10T19:47:25Z

@rbren can you look at this again? I think we are getting close to something usable. There are still some bugs we are working on fixing, but the overall agent is there.

JayQuimby · 2024-04-16T05:52:25Z

@rbren

- I have tested this with GPT-4 a few times and it seems to work well. Local models do not appear to work well no matter what I try.
- The custom Bash commands (search_file, search_dir, find_file...) do not appear to be working. I think I might just implement them in Python instead of Bash; curious about your thoughts there.
- I renamed everything like you said, and the goto and scroll commands are implemented. I think the prompt ordering/phrasing might need some work, but it is quite functional as is.
- All the unnecessary autogenerated linter changes are gone.
- I have some improvement ideas that I want to implement, but I wanted to get this finished as a starting point. Let me know if you think anything needs changing.
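
One possible shape for the Python route mentioned above: a sketch of `search_dir` that scans files under a directory for a term and reports matching line numbers, roughly what a recursive grep would do. The signature, the return format, and the choice to skip unreadable files are assumptions for illustration; this is not the code in this PR.

```python
from pathlib import Path


def search_dir(term: str, directory: str = ".") -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every line containing `term`."""
    hits = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        # Line numbers are 1-based, matching what editors display.
        for lineno, line in enumerate(text.splitlines(), start=1):
            if term in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Keeping the search in Python avoids depending on which shell utilities exist inside the sandbox, at the cost of being slower than grep on large trees.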

opendevin/action/fileop.py

rbren · 2024-04-16T14:36:08Z

@JayQuimby this is looking great!

Only concern is the changing behavior of FileReadAction

foragerr · 2024-04-16T22:11:36Z

@JayQuimby Are you testing via the UI, or some other way?

JayQuimby · 2024-04-16T22:11:37Z

@rbren

> Only concern is the changing behavior of FileReadAction

I changed this so the default behavior is to read the whole file. I also added negative indices to the command detection so the model can specify a range from any line to the end of the file.
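
A minimal sketch of the behavior described above, assuming a hypothetical `read_lines` helper with `start` and `end` parameters: by default the whole file is returned, and a negative `end` means "through the end of the file". The actual `FileReadAction` fields and semantics in opendevin/action/fileop.py may differ.

```python
def read_lines(path: str, start: int = 0, end: int = -1) -> str:
    """Return lines [start, end) of a file; a negative end reads to the last line."""
    with open(path, "r", encoding="utf-8") as file:
        all_lines = file.readlines()
    num_lines = len(all_lines)
    # Clamp start into the file so out-of-range requests return nothing
    # instead of raising.
    begin = max(0, min(start, num_lines))
    # Assumption for this sketch: any negative end means "end of file".
    stop = num_lines if end < 0 else max(begin, min(end, num_lines))
    return "".join(all_lines[begin:stop])
```

With these defaults, `read_lines(path)` reads everything and `read_lines(path, 40)` reads from line index 40 to the end.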

JayQuimby · 2024-04-16T22:13:36Z

> @JayQuimby Are you testing via the UI, or some other way?

@foragerr I am testing with both GPT-4 and CodeLlama using the sandbox (localhost:3001).

gpt-4-turbo is best IMO. I have not run any tests on SWE-bench.

rbren · 2024-04-17T16:17:13Z

opendevin/action/fileop.py

+            else:
+                all_lines = file.readlines()
+                num_lines = len(all_lines)
+                begin = max(0, min(self.start, num_lines - 2))


why is this -2?
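
To make the question concrete, here is the quoted expression evaluated for a few inputs. The helper name `clamp_begin` is only for illustration; the arithmetic is the line from the diff, with `self.start` passed in as `start`.

```python
def clamp_begin(start: int, num_lines: int) -> int:
    """The clamping expression from the diff, isolated for inspection."""
    return max(0, min(start, num_lines - 2))


# For a 10-line file (valid 0-based indices 0..9):
print(clamp_begin(3, 10))   # 3  (in range, unchanged)
print(clamp_begin(50, 10))  # 8  (capped at num_lines - 2)
print(clamp_begin(9, 10))   # 8  (the last line cannot be chosen as the start)
print(clamp_begin(0, 1))    # 0  (min gives -1, max clamps it back to 0)
```

With `num_lines - 1` in place of `num_lines - 2`, a start of 9 would stay 9, so the difference only shows up when the requested start is at or past the final line.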

assertion · 2024-04-22T06:30:45Z

Not sure whether I missed something: has OpenDevin been run on SWE-bench with the new SWE agent? What is OpenDevin's resolved rate compared with other Devin-like open-source projects? @JayQuimby @rbren

JayQuimby added 2 commits April 7, 2024 02:31

Merge branch 'main' of https://github.com/JayQuimby/OpenDevin

ed608ee

Merge branch 'main' of github.com:JayQuimby/OpenDevin

cca6ba7

JayQuimby mentioned this pull request Apr 7, 2024

Add Agent Comparable to SWE-agent #570

Closed

Using commands.sh for ACI

823a631

rbren reviewed Apr 7, 2024

agenthub/thinkact_agent/plans.md (outdated, resolved)

rbren reviewed Apr 7, 2024

agenthub/thinkact_agent/plans.md (outdated, resolved)

rbren reviewed Apr 7, 2024

JayQuimby and others added 8 commits April 8, 2024 02:00

parsing, prompting, and actions modifications

1a882ac

Merge branch 'OpenDevin:main' into main

db608ab

added start and end index to read and write

24b21de

Merge branch 'main' of github.com:JayQuimby/OpenDevin

30f0847

bug fixes and test updates

e830627

Lint code changes to ensure code is proper

0f76f70

Merge branch 'OpenDevin:main' into main

ce9bd95

State management, bugs, prompts

87e3657

All-Hands-AI deleted a comment from cosark Apr 9, 2024

JayQuimby and others added 2 commits April 9, 2024 16:58

Prompt Engineering

26a1ce2

Merge branch 'OpenDevin:main' into main

04e9caa

foragerr mentioned this pull request Apr 10, 2024

Wishlist. #993

Closed

exception handling

2633ae1

JayQuimby and others added 9 commits April 16, 2024 00:34

prompt changes, bug fixes

085a431

merge conflicts

439e9ba

merge conflict

148f63f

Merge branch 'OpenDevin:main' into main

ac473e7

start and end line for read and write

41710b5

Merge branch 'main' of github.com:JayQuimby/OpenDevin

203ecde

more merge conflicts

5575c16

env error

9402fb4

linter error

68d77e8

JayQuimby changed the title ~~feat: New Agent ThinkActAgent - (based on SWE-Agent)~~ feat: SWE Agent Implementation Apr 16, 2024

JayQuimby marked this pull request as ready for review April 16, 2024 05:53

rbren reviewed Apr 16, 2024

opendevin/action/fileop.py (outdated, resolved)

rbren reviewed Apr 16, 2024

opendevin/action/fileop.py (outdated, resolved)

rbren and others added 5 commits April 16, 2024 16:36

Merge branch 'main' into main

4e1ed84

read fixed, prompt change, example added

6e92dcc

Merge branch 'main' of github.com:JayQuimby/OpenDevin

5b3884a

added x_line:end read operation

141ecce

Merge branch 'OpenDevin:main' into main

9baa679

JayQuimby added 2 commits April 17, 2024 01:45

Merge branch 'OpenDevin:main' into main

8c7fbaa

Merge branch 'main' into main

a8eac5d

rbren reviewed Apr 17, 2024

rbren approved these changes Apr 17, 2024

rbren merged commit 16fc728 into All-Hands-AI:main Apr 17, 2024

rbren mentioned this pull request Apr 23, 2024

Add SWE-agent tools as sandbox plugins #1305

Merged
