dspy/README.md at main · stanfordnlp/dspy #734
DSPy: Programming—not prompting—Foundation Models
[Oct'23] DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
[Jan'24] In-Context Learning for Extreme Multi-Label Classification
[Dec'23] DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines
[Dec'22] Demonstrate-Search-Predict: Composing Retrieval & Language Models for Knowledge-Intensive NLP
Getting Started:
Documentation: DSPy Docs
DSPy is a framework for algorithmically optimizing LM prompts and weights, especially when LMs are used one or more times within a pipeline. To use LMs to build a complex system without DSPy, you generally have to: (1) break the problem down into steps, (2) prompt your LM well until each step works well in isolation, (3) tweak the steps to work well together, (4) generate synthetic examples to tune each step, and (5) use these examples to finetune smaller LMs to cut costs. Currently, this is hard and messy: every time you change your pipeline, your LM, or your data, all prompts (or finetuning steps) may need to change.
To make this more systematic and much more powerful, DSPy does two things. First, it separates the flow of your program (`modules`) from the parameters (LM prompts and weights) of each step. Second, DSPy introduces new `optimizers`, which are LM-driven algorithms that can tune the prompts and/or the weights of your LM calls, given a `metric` you want to maximize.

DSPy can routinely teach powerful models like `GPT-3.5` or `GPT-4` and local models like `T5-base` or `Llama2-13b` to be much more reliable at tasks, i.e. having higher quality and/or avoiding specific failure patterns. DSPy optimizers will "compile" the same program into different instructions, few-shot prompts, and/or weight updates (finetunes) for each LM. This is a new paradigm in which LMs and their prompts fade into the background as optimizable pieces of a larger system that can learn from data. tldr; less prompting, higher scores, and a more systematic approach to solving hard tasks with LMs.
If you need help thinking about your task, we recently created a Discord server for the community.
Analogy to Neural Networks
When we build neural networks, we don't write manual for-loops over lists of hand-tuned floats. Instead, we use a framework like PyTorch to compose declarative layers (e.g., `Convolution` or `Dropout`) and then use optimizers (e.g., SGD or Adam) to learn the parameters of the network.

Ditto! DSPy gives you the right general-purpose modules (e.g., `ChainOfThought`, `ReAct`, etc.), which replace string-based prompting tricks. To replace prompt hacking and one-off synthetic data generators, DSPy also gives you general optimizers (`BootstrapFewShotWithRandomSearch` or `BayesianSignatureOptimizer`), which are algorithms that update parameters in your program. Whenever you modify your code, your data, your assertions, or your metric, you can compile your program again and DSPy will create new effective prompts that fit your changes.

Mini-FAQs
**What do DSPy optimizers tune?** Each optimizer is different, but they all seek to maximize a metric on your program by updating prompts or LM weights. Current DSPy `optimizers` can inspect your data, simulate traces through your program to generate good/bad examples of each step, propose or refine instructions for each step based on past results, finetune the weights of your LM on self-generated examples, or combine several of these to improve quality or cut cost. We'd love to merge new optimizers that explore a richer space: most manual steps you currently go through for prompt engineering, "synthetic data" generation, or self-improvement can probably be generalized into a DSPy optimizer that acts on arbitrary LM programs.

**How should I use DSPy for my task?** Using DSPy is an iterative process. You first define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers (`modules`) to use, giving each layer a `signature` (input/output spec), and then calling your modules freely in your Python code. Lastly, you use a DSPy `optimizer` to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM.

**What if I have a better idea for prompting or synthetic data generation?** Perfect. We encourage you to think about whether it's best expressed as a module or an optimizer, and we'd love to merge it into DSPy so everyone can use it. DSPy is not a complete project; it's an ongoing effort to create structure (modules and optimizers) in place of hacky prompt and pipeline engineering tricks.

**What does DSPy stand for?** It's a long story, but the backronym now is Declarative Self-improving Language Programs, pythonically.
1) Installation
All you need is:
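The install command that normally appears here is the pip one-liner (the package name is the one DSPy published under at the time of this README):

```shell
pip install dspy-ai
```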
Or open our intro notebook in Google Colab:
By default, DSPy installs the latest `openai` from pip. However, if you have the older `openai~=0.28.1` installed (from before OpenAI changed their API), the library will use that just fine. Both are supported.

For the optional (alphabetically sorted) Chromadb, Marqo, Pinecone, Qdrant, or Weaviate retrieval integration(s), include the extra(s) below:
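For example (the extra names follow the integration names listed above; treat the exact spelling as an assumption and check the docs):

```shell
pip install "dspy-ai[chromadb]"   # or [marqo], [pinecone], [qdrant], [weaviate]
```

Quoting the `pkg[extra]` form avoids shell glob expansion of the brackets.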
2) Documentation
The DSPy documentation is divided into tutorials (step-by-step illustration of solving a task in DSPy), guides (how to use specific parts of the API), and examples (self-contained programs that illustrate usage).
A) Tutorials
Other resources people find useful:
B) Guides
If you're new to DSPy, it's probably best to go in sequential order. You will probably refer to these guides frequently after that, e.g. to copy/paste snippets that you can edit for your own DSPy programs.
C) Examples
The DSPy team believes complexity has to be justified. We take this seriously: we never release a complex tutorial (above) or example (below) unless we can demonstrate empirically that this complexity has generally led to improved quality or reduced cost. This kind of rule is rarely enforced by other frameworks or docs, but you can count on it in DSPy examples.
There's a bunch of examples in the `examples/` directory and in the top-level directory. We welcome contributions!

You can find other examples tweeted by @lateinteraction on Twitter/X.
Some other examples (not exhaustive, feel free to add more via PR):
There are also recent cool examples at Weaviate's DSPy cookbook by Connor Shorten. See tutorial on YouTube.
3) Syntax: You're in charge of the workflow—it's free-form Python code!
DSPy hides tedious prompt engineering, but it cleanly exposes the important decisions you need to make: [1] what's your