
codelion/optillm - Automatic prompt strategy proxy #922

Open
1 task
ShellLM opened this issue Sep 7, 2024 · 1 comment
Labels

AI-Chatbots: Topics related to advanced chatbot platforms integrating multiple AI models
Algorithms: Sorting, Learning or Classifying. All algorithms go here.
code-generation: code generation models and tools like copilot and aider
Git-Repo: Source code repository like gitlab or gh
human-verified: <INST>NEVER PICK THIS LABEL</INST>
in-context-learning: Examples of few-shot prompts for in-context learning.
llm: Large Language Models
llm-applications: Topics related to practical applications of Large Language Models in various fields
MachineLearning: ML Models, Training and Inference
openai: OpenAI APIs, LLMs, Recipes and Evals
prompt: Collection of llm prompts and notes
prompt-engineering: Developing and optimizing prompts to efficiently use language models for various applications and re
software-engineering: Best practice for software engineering
source-code: Code snippets
System-prompt: System prompts guide an LLMs response to every user message.

Comments

@ShellLM
Collaborator

ShellLM commented Sep 7, 2024

optillm

optillm is an OpenAI API-compatible optimizing inference proxy that implements several state-of-the-art techniques to improve the accuracy and performance of LLMs. The current focus is on techniques that improve reasoning over coding, logical, and mathematical queries. By spending additional compute at inference time, these techniques can beat the frontier models across diverse tasks.

SOTA results with moa-gpt-4o-mini on Arena-Hard-Auto

[Figure: Results for the Mixture of Agents approach using gpt-4o-mini on the Arena-Hard-Auto benchmark]

Installation

Clone the repository with git and use pip to install the dependencies.

git clone https://github.com/codelion/optillm.git
cd optillm
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

You can then run the optillm proxy as follows.

python optillm.py                           
2024-09-06 07:57:14,191 - INFO - Starting server with approach: auto
2024-09-06 07:57:14,191 - INFO - Server configuration: {'approach': 'auto', 'mcts_simulations': 2, 'mcts_exploration': 0.2, 'mcts_depth': 1, 'best_of_n': 3, 'model': 'gpt-4o-mini', 'rstar_max_depth': 3, 'rstar_num_rollouts': 5, 'rstar_c': 1.4, 'base_url': ''}
 * Serving Flask app 'optillm'
 * Debug mode: off
2024-09-06 07:57:14,212 - INFO - WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:8000
 * Running on http://192.168.10.48:8000
2024-09-06 07:57:14,212 - INFO - Press CTRL+C to quit

Usage

Once the proxy is running, you can use it as a drop-in replacement for the OpenAI API by setting base_url to http://localhost:8000/v1 in your OpenAI client.

import os
from openai import OpenAI

# Point the standard OpenAI client at the local optillm proxy
OPENAI_KEY = os.environ.get("OPENAI_API_KEY")
OPENAI_BASE_URL = "http://localhost:8000/v1"
client = OpenAI(api_key=OPENAI_KEY, base_url=OPENAI_BASE_URL)

# The "moa-" prefix selects the Mixture of Agents technique; the rest is the base model name
response = client.chat.completions.create(
  model="moa-gpt-4o",
  messages=[
    {
      "role": "user",
      "content": "Write a Python program to build an RL model to recite text from any position that the user provides, using only numpy."
    }
  ],
  temperature=0.2
)

print(response)

You can control which technique is used for optimization by prepending its slug to the model name as {slug}-model-name. For example, the code above uses moa (mixture of agents) as the optimization approach. In the proxy logs you will see the following, showing that moa is being used with gpt-4o-mini as the base model.

2024-09-06 08:35:32,597 - INFO - Using approach moa, with gpt-4o-mini
2024-09-06 08:35:35,358 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-09-06 08:35:39,553 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-09-06 08:35:44,795 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-09-06 08:35:44,797 - INFO - 127.0.0.1 - - [06/Sep/2024 08:35:44] "POST /v1/chat/completions HTTP/1.1" 200 -
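
Switching techniques only requires changing that prefix. As a minimal sketch (assuming the proxy is running locally as configured above; the prompt here is just an illustrative query), selecting Best of N Sampling is a matter of using the bon slug instead:

import os
from openai import OpenAI

# Reuse the same client setup; only the model prefix differs.
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url="http://localhost:8000/v1",
)

# "bon-" selects Best of N Sampling, with gpt-4o-mini as the base model.
response = client.chat.completions.create(
    model="bon-gpt-4o-mini",
    messages=[
        {"role": "user", "content": "List three edge cases a binary search implementation should handle."}
    ],
    temperature=0.2,
)

print(response.choices[0].message.content)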

Implemented techniques

Technique | Slug | Description
Monte Carlo Tree Search | mcts | Uses MCTS for decision-making in chat responses
Best of N Sampling | bon | Generates multiple responses and selects the best one
Mixture of Agents | moa | Combines responses from multiple critiques
Round Trip Optimization | rto | Optimizes responses through a round-trip process
Z3 Solver | z3 | Utilizes the Z3 theorem prover for logical reasoning
Self-Consistency | self_consistency | Implements an advanced self-consistency method
PV Game | pvg | Applies a prover-verifier game approach at inference time
R* Algorithm | rstar | Implements the R* algorithm for problem-solving
CoT with Reflection | cot_reflection | Implements chain-of-thought reasoning with <thinking>, <reflection> and <output> sections
PlanSearch | plansearch | Implements a search algorithm over candidate plans for solving a problem in natural language
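
Any of these slugs can be combined with a base model name in the same {slug}-model-name form. As a rough sketch (assuming the proxy is running locally as above; the prompt and the particular slugs chosen are arbitrary examples), you can compare several techniques on the same query:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url="http://localhost:8000/v1",
)

prompt = "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"

# Run the same prompt through a few of the techniques listed above.
for slug in ["mcts", "bon", "moa", "cot_reflection"]:
    response = client.chat.completions.create(
        model=f"{slug}-gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
    )
    print(f"--- {slug} ---")
    print(response.choices[0].message.content)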

Suggested labels

None

@ShellLM added the AI-Chatbots, Algorithms, code-generation, Git-Repo, openai, software-engineering, and source-code labels on Sep 7, 2024
@ShellLM
Collaborator Author

ShellLM commented Sep 7, 2024

Related content

#884 similarity score: 0.91
#418 similarity score: 0.91
#774 similarity score: 0.91
#396 similarity score: 0.9
#390 similarity score: 0.9
#683 similarity score: 0.9

@irthomasthomas changed the title from "optillm/README.md at main · codelion/optillm" to "codelion/optillm - Automatic prompt strategy proxy" on Sep 7, 2024
@irthomasthomas added the prompt, prompt-engineering, System-prompt, llm, MachineLearning, in-context-learning, llm-applications, and human-verified labels on Sep 7, 2024