Self-learning from open-interpreter logs to create new entries #2

Open
talkingtoaj opened this issue Sep 16, 2023 · 0 comments

This project is such a unique idea, Killian! Is it based on an existing paper or a named concept? I think of it as something like GitHub gists: "shortcuts that patch the deficiencies of base LLMs on specific programming tasks common to open-interpreter", though I understand it could have broader uses.

The problem

Using open-interpreter with GPT-4 on Windows, I notice it makes several particular mistakes repeatedly. Often it is able to debug itself and reach a result, but that costs time and tokens, and the cost recurs because a base LLM has no way to "learn from past mistakes".

We might be able to manually log some of these common problems and add procedures to this project by hand, but that's SO 2022! Wouldn't it be better to automate the process?

How might open-interpreter log and learn from these mistakes?

I've had a little involvement with the gpt-engineer project, and it has an option that asks users to contribute their logs for the sake of learning.

If open-interpreter added logging, this would open the door to future analysis by the open-procedures project, which might follow a pipeline like this:

  1. Collection of logs
  2. Self-organizing of logs and identifying repeated themes
  3. Invoking self-analysis and summarization
  4. Storing refined procedure

1. Collection of logs

Two things worth borrowing from gpt-engineer:

  1. its logging
  2. its consent prompt, which invites the user to contribute their logs and write a review
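To make step 1 concrete, here is a minimal sketch of opt-in log collection in the gpt-engineer style. The function names, the JSONL format, and the `~/.open_interpreter/` path are all illustrative assumptions, not open-interpreter's actual API.

```python
# Hypothetical opt-in log collection (names and paths are illustrative,
# not open-interpreter's real interface).
import json
import time
from pathlib import Path

LOG_PATH = Path("~/.open_interpreter/contributed_logs.jsonl").expanduser()


def ask_consent() -> bool:
    """Mirror gpt-engineer's approach: ask before collecting anything."""
    answer = input("Share this session's logs to improve open-procedures? [y/N] ")
    return answer.strip().lower() == "y"


def record_session(messages: list[dict], review: str = "", path: Path = LOG_PATH) -> None:
    """Append one session (plus an optional user review) as a JSON line."""
    path.parent.mkdir(parents=True, exist_ok=True)
    entry = {"timestamp": time.time(), "messages": messages, "review": review}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```

One JSON object per line keeps the file trivially appendable and easy to batch-process later in the pipeline.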

2. Self-organizing of logs and identifying repeated themes

Here's one potential solution; there may well be much better alternatives.

What if we stored logs in a vector database, computing each log's embedding from its contents with OpenAI's ada model?
Con: the standard OpenAI ada embedding model isn't optimized for code snippets.
Pro: using ada is simple; the alternative would likely require training a custom embedding model on code samples.

A cron job on open-procedures could then look for emerging clusters of vectors; a dense cluster indicates a commonly repeated situation.
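As a toy stand-in for what that cron job might do, here is a greedy cosine-similarity grouping over precomputed embeddings. It assumes the embeddings already exist (e.g., from ada); a production version would more likely use a vector DB's built-in search or a proper algorithm like DBSCAN/HDBSCAN.

```python
# Toy clustering over log embeddings (a stand-in for the vector-DB cron job;
# real systems would use DBSCAN/HDBSCAN rather than this greedy pass).
import numpy as np


def cluster_embeddings(vectors: np.ndarray, threshold: float = 0.9) -> list[list[int]]:
    """Greedily group rows whose cosine similarity to a cluster's centroid
    is at least `threshold`; returns lists of row indices."""
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    clusters: list[list[int]] = []
    for i, v in enumerate(normed):
        for members in clusters:
            centroid = normed[members].mean(axis=0)
            centroid /= np.linalg.norm(centroid)
            if float(v @ centroid) >= threshold:
                members.append(i)
                break
        else:  # no existing cluster was close enough
            clusters.append([i])
    return clusters


def emerging_clusters(vectors: np.ndarray, min_size: int = 3) -> list[list[int]]:
    """A cluster of at least `min_size` logs suggests a repeated situation."""
    return [c for c in cluster_embeddings(vectors) if len(c) >= min_size]
```

The `min_size` knob is what turns "some logs are similar" into "this failure keeps happening and deserves a procedure".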

3. Invoking self-analysis and summarization

We analyse a representative log from the cluster (the member closest to its median point) via an LLM query, asking whether mistakes were made and what corrections were required.
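A sketch of this step, under the assumption that "median point" means the medoid (the cluster member nearest the centroid). The analysis question is phrased per the issue; the actual LLM call is left out, since the API wiring is an implementation detail.

```python
# Pick the cluster's representative log and build the self-analysis prompt.
# The LLM call itself is omitted; only the selection logic is shown.
import numpy as np


def medoid_index(vectors: np.ndarray, members: list[int]) -> int:
    """Index of the member whose embedding is nearest the cluster centroid."""
    centroid = vectors[members].mean(axis=0)
    dists = np.linalg.norm(vectors[members] - centroid, axis=1)
    return members[int(np.argmin(dists))]


def analysis_prompt(log_text: str) -> str:
    """The question we would put to the LLM about the representative log."""
    return (
        "Below is a log of an open-interpreter session.\n"
        "1. Were any mistakes made, and what corrections were required?\n"
        "2. Summarize the underlying problem in one sentence.\n\n"
        f"--- LOG ---\n{log_text}"
    )
```

Analysing only the medoid keeps the cost at one LLM query per cluster rather than one per log.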

4. Storing refined procedure

We ask the LLM to write an open-procedures-formatted prompt that states the problem and presents an elegant, foolproof solution (prompt engineering!).
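The final step might look something like this. I don't know open-procedures' actual storage schema, so the title/problem/solution dict appended to a JSON file is purely a guess at a shape the refined procedure could take.

```python
# Sketch of step 4: persist the LLM's refined output as a procedure entry.
# The schema (title/problem/solution appended to a JSON list) is a guess,
# not open-procedures' actual format.
import json
from pathlib import Path


def save_procedure(title: str, problem: str, solution: str, path: Path) -> None:
    """Append one refined procedure to a JSON list on disk."""
    procedures = json.loads(path.read_text()) if path.exists() else []
    procedures.append({"title": title, "problem": problem, "solution": solution})
    path.write_text(json.dumps(procedures, indent=2))
```

From there, the stored entries would be retrievable by the same embedding search that found the clusters in the first place, closing the loop.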
