Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agent improvement #1032

Open
1 task done
qingyun-wu opened this issue May 10, 2023 · 3 comments
Open
1 task done

Agent improvement #1032

qingyun-wu opened this issue May 10, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@qingyun-wu
Copy link
Contributor

qingyun-wu commented May 10, 2023

Comments from @gagb:
Some observations:

  1. the agent many times starts to suggest shell commands which makes the code fail. Especially as the conversation gets longer
  2. Sometimes the user responds with empty strings and the code agent never returns terminal and the code gets stuck in a loop. Also happens when lang=unknown eg cuz the agent didn't wrap the python code in codeblockss
  3. The code fails if the context size > 8k
    Original comment: 3b3dd60#diff-9ac9829642f8aa5ad3ed717f7f60eabedf33210195465c1f6473cd2cfd4cd2af

PR #1025

Tasks

  1. 0 of 1
    enhancement
@sonichi sonichi added the enhancement New feature or request label May 12, 2023
@qingyun-wu
Copy link
Contributor Author

@gagb The second problem should have been addressed in the latest PR. Let me know if you still have this observation.

@gagb
Copy link
Collaborator

gagb commented May 25, 2023

More feedback based on integration with tinyRA and using gpt-3.5-turbo:

  1. Drift: The conversation may drift and start to execute code that unrelated to the goal and possibly very unsafe. We need more safety checks on the code it suggests.
  2. Memory refreshing: Others have found that occasionally refreshing agent memory with goal can help.
  3. Guaranteed structured output: Currently there are no guarantees that the coding agent will output a python code block (or even use code blocks). This can cause the conversation to fail.
  4. Shell agent: Currently agent can't execute shell commands to succeed (e.g., pip commands to install python packages).

@gagb
Copy link
Collaborator

gagb commented May 25, 2023

@gagb The second problem should have been addressed in the latest PR. Let me know if you still have this observation.

I think I still happens with gpt-3.5. I haven't been able to test with gpt-4 because I don't have access to it. I am working on a feature to share failure cases from tinyRA easily.

@sonichi sonichi added this to the Upgrade of autogen milestone Jun 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants