feat: SWE Agent Implementation #846

Merged
merged 40 commits into from
Apr 17, 2024


JayQuimby
Contributor

@JayQuimby JayQuimby commented Apr 7, 2024

This PR implements an agent that works similarly to SWE-Agent.

Agent Features:

  • Agent-Computer Interface (ACI) - simplifies the commands the LLM has to issue.
    • Implement: 'read', 'write', 'browse', 'exit', 'edit', 'goto', 'scroll'
  • Think-Act prompting structure - guides the model to think about what its next step should be and then write code to carry it out
  • Short-Term Memory - lets the model see its last n steps before taking the next one
  • Tells the model its current working directory, file, and line
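
To make the memory and prompting features above concrete, here is a minimal sketch of how they might fit together. It is not the code in this PR: `Step`, `ShortTermMemory`, and `build_prompt` are hypothetical names, and the prompt wording is only an assumption about what a think-act prompt could look like.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque


@dataclass
class Step:
    """One think-act cycle: the model's thought, its command, and the result."""
    thought: str
    action: str
    observation: str


class ShortTermMemory:
    """Hypothetical helper that remembers only the last `max_steps` steps."""

    def __init__(self, max_steps: int = 5) -> None:
        # deque(maxlen=...) silently drops the oldest entry once it is full.
        self._steps: Deque[Step] = deque(maxlen=max_steps)

    def add(self, step: Step) -> None:
        self._steps.append(step)

    def render(self) -> str:
        blocks = [
            f"Thought: {s.thought}\nAction: {s.action}\nObservation: {s.observation}"
            for s in self._steps
        ]
        return "\n\n".join(blocks)


def build_prompt(task: str, memory: ShortTermMemory,
                 cwd: str, current_file: str, current_line: int) -> str:
    """Assemble a think-act prompt with recent history and the agent's location."""
    return (
        f"Task: {task}\n\n"
        f"Recent steps:\n{memory.render() or '(none yet)'}\n\n"
        f"Working directory: {cwd}\n"
        f"Current file: {current_file} (line {current_line})\n\n"
        "First write your reasoning after 'Thought:', "
        "then a single command after 'Action:'."
    )
```

Using a bounded deque keeps the prompt size roughly constant however long the session runs, which is the point of showing only the last n steps.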

Help / Suggestions:

  • Open to other ideas people might have on how to improve this
  • This agent should be similar to SWE agent, so I would appreciate ideas staying related to that
  • Any help or additional code would be much appreciated

Links/context:

Agent/model | SWE-bench % resolved
SWE-Agent (GPT-4) | 12.29%
Devin (25% of eval set) | 13.84%
Claude 3 Opus (RAG) | 3.79%
GPT-4 (RAG) | 1.44%

Since SWE-Agent is so close to Devin in performance, I figure trying to emulate it and then iterate on it is the best route.

@PierrunoYT
Contributor

Can you fix GitHub issues with this and create a PR?

@JayQuimby
Contributor Author

JayQuimby commented Apr 7, 2024

@PierrunoYT

Can you fix GitHub issues with this and create a PR?

What GitHub issues are you talking about? Can you provide a screenshot or something for me to go off of?

@PierrunoYT
Contributor

@PierrunoYT

Can you fix GitHub issues with this and create a PR?

What GitHub issues are you talking about? Can you provide a screenshot or something for me to go off of?

I mean like SWE-Agent, which can fix GitHub issues.

@JayQuimby
Contributor Author

JayQuimby commented Apr 7, 2024

Ohhh, lol completely misread that the first time.

I do not think it is currently capable of doing that, not sure tho. I am just implementing the agent framework, but there will definitely need to be more work done on getting web pages, formatting, submitting pull requests etc. for this to do what you were talking about before...

@PierrunoYT
Contributor

Ohhh, lol completely misread that the first time.

I do not think it is currently capable of doing that, not sure tho. I am just implementing the agent framework, but there will definitely need to be more work done on getting web pages, formatting, submitting pull requests etc. for this to do what you were talking about before...

Oh okay. Waiting for it :)

@foragerr
Collaborator

foragerr commented Apr 7, 2024

@PierrunoYT There's some discussion and work here about using OpenDevin to work on OpenDevin GitHub issues. The agent quality with the currently implemented agents is pretty poor though.

I'm excited to see a variant of SWE-agent implemented.

@rbren
Collaborator

rbren commented Apr 7, 2024

Rad! @JayQuimby I'm curious how well it seems to work in practice. Have you thrown any examples at it?

Edit: looks like it's not fully implemented yet

@JayQuimby
Contributor Author

Rad! @JayQuimby I'm curious how well it seems to work in practice. Have you thrown any examples at it?

@rbren, I have not tested it with GPT-4 / Claude yet because it still needs some work. I will keep working on it and hope to have something usable in the next few days.

Comment on lines 5 to 8
'search_for': {
    'params': '<keywords>',
    'description': 'Will allow you to search your working directory for files and folders that match your <keywords>.',
}
Collaborator

@rbren rbren Apr 7, 2024

Definitely want to add a search_for command. IMO it should be called search_files so the LLM doesn't confuse it with searching on Google
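
For illustration, a command like this could be backed by a small Python helper. The sketch below assumes the command matches the keyword against file and folder names (case-insensitively) under the working directory; the function name `search_files` follows the suggestion above, but the signature and matching rule are assumptions rather than what the PR implements.

```python
import os
from typing import List


def search_files(keyword: str, root: str = ".") -> List[str]:
    """Return sorted relative paths of files and folders whose name contains `keyword`."""
    needle = keyword.lower()
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check folder names and file names alike.
        for name in dirnames + filenames:
            if needle in name.lower():
                full_path = os.path.join(dirpath, name)
                matches.append(os.path.relpath(full_path, root))
    return sorted(matches)
```

Returning paths relative to `root` keeps the output short, which matters when the result is pasted back into an LLM prompt.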

@All-Hands-AI All-Hands-AI deleted a comment from cosark Apr 9, 2024
@foragerr
Collaborator

Just want to drop this for passers-by, totally worth a watch if you have 20 mins: https://www.youtube.com/watch?v=RJ6NN8Y-xok

@foragerr foragerr mentioned this pull request Apr 10, 2024
@JayQuimby
Contributor Author

@rbren can you look at this again, I think we are getting close to something that is usable. There are still some bugs we are working on fixing but the overall agent is there.

@JayQuimby JayQuimby changed the title from "feat: New Agent ThinkActAgent - (based on SWE-Agent)" to "feat: SWE Agent Implementation" Apr 16, 2024
@JayQuimby
Contributor Author

@rbren

  1. I have tested this with GPT-4 a few times and it seems to be working well; local models do not appear to work well no matter what I try.
  2. The custom bash commands do not appear to be working (search_file, search_dir, find_file...). I think I might just implement them in Python instead of bash; curious about your thoughts there.
  3. I renamed everything like you said, and the goto and scroll commands are implemented. I think the prompt ordering/phrasing might need some work, but it is quite functional as is.
  4. All the unnecessary autogenerated linter changes are gone.
  5. I have some improvement ideas that I want to implement, but I wanted to get this finished as a starting point. Let me know if you think there are any things that need changing.
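
As a rough idea of what item 2 could look like in Python rather than bash, here is a sketch of a `search_file` helper. The behavior shown (a plain substring search that reports 1-based line numbers) is an assumption for illustration; the PR does not specify the exact semantics of these commands.

```python
from typing import List, Tuple


def search_file(term: str, path: str) -> List[Tuple[int, str]]:
    """Return (line_number, line_text) for every line of `path` containing `term`."""
    hits = []
    # errors="replace" avoids crashing on files that are not valid UTF-8.
    with open(path, encoding="utf-8", errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            if term in line:
                hits.append((lineno, line.rstrip("\n")))
    return hits
```

A pure-Python version like this sidesteps shell quoting issues and behaves the same regardless of which tools are installed in the sandbox.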

@JayQuimby JayQuimby marked this pull request as ready for review April 16, 2024 05:53
opendevin/action/fileop.py (two outdated, resolved review threads)
@rbren
Collaborator

rbren commented Apr 16, 2024

@JayQuimby this is looking great!

Only concern is the changing behavior of FileReadAction

@foragerr
Collaborator

@JayQuimby Are you testing via the UI, or some other way?

@JayQuimby
Contributor Author

@rbren

Only concern is the changing behavior of FileReadAction

Changed this so the default behavior is to read the whole file. Also added negative indices to the command detection so the model can specify a range from any line to the end of the file.
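
A minimal sketch of that line-selection behavior, under stated assumptions: `select_lines` is a hypothetical helper (not the actual `FileReadAction` code), `start` is 0-based, `end` is exclusive, and any negative `end` means "through the end of the file".

```python
from typing import List


def select_lines(all_lines: List[str], start: int = 0, end: int = -1) -> List[str]:
    """Return all_lines[start:end], treating a negative `end` as end-of-file."""
    num_lines = len(all_lines)
    # Clamp start into [0, num_lines] so out-of-range values yield an empty result.
    begin = max(0, min(start, num_lines))
    # A negative end selects everything up to the last line.
    stop = num_lines if end < 0 else max(begin, min(end, num_lines))
    return all_lines[begin:stop]
```

With the defaults (`start=0`, `end=-1`) the whole file is returned, so callers that pass no range keep the original read-everything behavior.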

@JayQuimby
Contributor Author

JayQuimby commented Apr 16, 2024

@JayQuimby Are you testing via the UI, or some other way?

@foragerr I am testing with both gpt-4 and codellama using the sandbox. localhost:3001

gpt-4-turbo is best IMO. Have not run any tests on swe-bench.

else:
all_lines = file.readlines()
num_lines = len(all_lines)
begin = max(0, min(self.start, num_lines - 2))
Collaborator

why is this -2?

@rbren rbren merged commit 16fc728 into All-Hands-AI:main Apr 17, 2024
@assertion
Contributor

assertion commented Apr 22, 2024

Not sure whether I missed something. Has OpenDevin been run on SWE-bench with the new SWE agent? What is OpenDevin's resolved rate compared with the other Devin-like open-source projects? @JayQuimby @rbren

6 participants