diff-sampler is an open-source toolbox for fast sampling of diffusion models. It provides various model implementations, numerical solvers, time schedules, and other features.
- This codebase is mainly built on the EDM codebase. To install the required packages, please follow the instructions in the EDM codebase.
- This codebase supports pre-trained diffusion models from EDM, ADM, Consistency Models, LDM, and Stable Diffusion. When loading a pre-trained model from one of these codebases, please refer to the corresponding codebase for package installation.
Run the commands in launch.sh to sample with the specified ODE solvers and pre-trained diffusion models. All the commands can be parallelized across multiple GPUs by adjusting --nproc_per_node. Descriptions of all the parameters can be found in the next section. The required models will be downloaded to "./src/dataset_name" by default. Some basic commands are listed below.
Note: num_steps is the number of timestamps, so num_steps=7 corresponds to 6 sampling steps. For Stable Diffusion, 1 step = 2 NFE due to the classifier-free guidance.
# Generate a grid of images with DDIM (Euler's method) on CIFAR10
SOLVER_FLAGS="--solver=euler --num_steps=7 --afs=False"
SCHEDULE_FLAGS="--schedule_type=polynomial --schedule_rho=7"
python sample.py --dataset_name="cifar10" --batch=64 --seeds="0-63" --grid=True $SOLVER_FLAGS $SCHEDULE_FLAGS
# Generate 50k images with iPNDM solver on CIFAR10 for FID evaluation
SOLVER_FLAGS="--solver=ipndm --num_steps=6 --afs=False"
SCHEDULE_FLAGS="--schedule_type=polynomial --schedule_rho=7"
ADDITIONAL_FLAGS="--max_order=4"
torchrun --standalone --nproc_per_node=1 sample.py \
--dataset_name="cifar10" --batch=128 --seeds="0-49999" $SOLVER_FLAGS $SCHEDULE_FLAGS $ADDITIONAL_FLAGS
# Use your own prompt for text-to-image generation with Stable Diffusion v1.5
SOLVER_FLAGS="--solver=dpmpp --num_steps=6 --afs=False"
SCHEDULE_FLAGS="--schedule_type=discrete --schedule_rho=1"
ADDITIONAL_FLAGS="--max_order=2 --predict_x0=False --lower_order_final=True"
GUIDANCE_FLAGS="--guidance_type=cfg --guidance_rate=7.5"
torchrun --standalone --nproc_per_node=1 sample.py --dataset_name="ms_coco" --batch=4 --seeds="0-3" --grid=True \
--prompt="a photograph of an astronaut riding a horse" \
$SOLVER_FLAGS $SCHEDULE_FLAGS $ADDITIONAL_FLAGS $GUIDANCE_FLAGS
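The Stable Diffusion example above uses classifier-free guidance (guidance_type=cfg with guidance_rate=7.5), which is why 1 step = 2 NFE: each step evaluates the network once with the prompt and once unconditionally, then blends the two predictions. Below is a minimal sketch of that combination; the denoiser and its call signature are placeholders, not this codebase's actual API.

```python
import torch

def cfg_denoise(denoiser, x, t, cond, uncond, guidance_rate=7.5):
    """Classifier-free guidance: two model evaluations per sampling step."""
    pred_cond = denoiser(x, t, cond)      # NFE 1: prompt-conditioned prediction
    pred_uncond = denoiser(x, t, uncond)  # NFE 2: unconditional prediction
    # Push the output away from the unconditional prediction by the guidance rate.
    return pred_uncond + guidance_rate * (pred_cond - pred_uncond)

# Dummy usage with a fake denoiser, just to show the two evaluations per step.
fake_denoiser = lambda x, t, c: x * 0.9
x = torch.randn(1, 4, 64, 64)                      # e.g. a Stable Diffusion latent
out = cfg_denoise(fake_denoiser, x, t=1.0, cond="prompt emb", uncond="empty emb")
```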
The generated images will be stored at "./samples" by default. To compute the Fréchet inception distance (FID) for a given model and sampler, compare the generated 50k images against the dataset reference statistics using fid.py:
# FID evaluation
python fid.py calc --images=path/to/images --ref=path/to/fid/stat
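fid.py handles this end to end. For reference, FID is the Fréchet distance between Gaussian fits of Inception-V3 features of the generated images and of the reference set. Below is a minimal sketch of the distance itself, assuming the statistics files store the feature mean 'mu' and covariance 'sigma'; the file names are placeholders.

```python
import numpy as np
import scipy.linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    mean_term = np.square(mu1 - mu2).sum()
    covmean, _ = scipy.linalg.sqrtm(np.dot(sigma1, sigma2), disp=False)  # matrix sqrt
    return float(np.real(mean_term + np.trace(sigma1 + sigma2 - 2 * covmean)))

# Hypothetical usage, assuming both .npz files hold 'mu' (2048,) and
# 'sigma' (2048, 2048) computed from Inception-V3 features:
#   gen, ref = np.load('gen_stats.npz'), np.load('ref_stats.npz')
#   print(frechet_distance(gen['mu'], gen['sigma'], ref['mu'], ref['sigma']))
```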
Calculate the CLIP score on 30k images generated by Stable Diffusion with the provided prompts:
# CLIP score
python clip_score.py calc --images=path/to/images
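The CLIP score measures the similarity between each image and its prompt in CLIP embedding space (commonly the cosine similarity scaled by 100). Below is a rough sketch using the Hugging Face transformers CLIP API; the backbone and scaling convention are assumptions, and clip_score.py may differ.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumed backbone; clip_score.py may use a different CLIP variant.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(image_path, prompt):
    """Cosine similarity between CLIP image and text embeddings, scaled by 100."""
    inputs = processor(text=[prompt], images=Image.open(image_path),
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
        txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                          attention_mask=inputs["attention_mask"])
    return 100.0 * torch.nn.functional.cosine_similarity(img_emb, txt_emb).item()

# Example (path and prompt are placeholders):
#   clip_score("samples/000000.png", "a photograph of an astronaut riding a horse")
```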
| Name | Parameter | Default | Description |
|---|---|---|---|
| General options | dataset_name | None | One in ['cifar10', 'ffhq', 'afhqv2', 'imagenet64', 'lsun_bedroom', 'imagenet256', 'lsun_bedroom_ldm', 'ffhq_ldm', 'ms_coco'] |
| | batch | 64 | Total batch size |
| | seeds | 0-63 | Specify a different random seed for each image |
| | grid | False | Organize the generated images as a grid |
| SOLVER_FLAGS | solver | None | One in ['euler', 'heun', 'dpm', 'dpmpp', 'unipc', 'deis', 'ipndm', 'ipndm_v'] |
| | num_steps | 6 | Number of timestamps. When num_steps=N, there are N-1 sampling steps; the exact NFE depends on the chosen solver |
| | afs | False | Whether to use AFS, which saves the first model evaluation |
| | denoise_to_zero | False | Whether to denoise from the last timestamp (>0) to 0. Requires one more sampling step |
| SCHEDULE_FLAGS | schedule_type | 'polynomial' | Time discretization schedule. One in ['polynomial', 'logsnr', 'time_uniform', 'discrete'] |
| | schedule_rho | 7 | Time step exponent. Needs to be specified when schedule_type is in ['polynomial', 'time_uniform', 'discrete'] |
| ADDITIONAL_FLAGS | max_order | None | Option for multi-step solvers. 1<=max_order<=4 for iPNDM, iPNDM_v and DEIS; 1<=max_order<=3 for DPM-Solver++ and UniPC |
| | predict_x0 | True | Option for DPM-Solver++ and UniPC. Whether to use the data prediction formulation |
| | lower_order_final | True | Option for DPM-Solver++ and UniPC. Whether to lower the order at the final stages of sampling |
| | variant | 'bh2' | Option for UniPC. One in ['bh1', 'bh2'] |
| | deis_mode | 'tab' | Option for DEIS. One in ['tab', 'rhoab'] |
| GUIDANCE_FLAGS | guidance_type | None | One in ['cg', 'cfg', 'uncond', None]. 'cg' for classifier guidance, 'cfg' for classifier-free guidance (used in Stable Diffusion), and 'uncond' for unconditional sampling (used in LDM) |
| | guidance_rate | None | Guidance rate |
| | prompt | None | Prompt for Stable Diffusion sampling |
Name | Max Order | Source |
---|---|---|
Euler | 1 | Denoising Diffusion Implicit Models |
Heun | 2 | Elucidating the Design Space of Diffusion-Based Generative Models |
DPM-Solver-2 | 2 | DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps |
DPM-Solver++ | 3 | DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models |
UniPC | 3 | UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models |
DEIS | 4 | Fast Sampling of Diffusion Models with Exponential Integrator |
iPNDM | 4 | Fast Sampling of Diffusion Models with Exponential Integrator |
iPNDM_v | 4 | The variable-step version of the Adams–Bashforth methods |
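As a point of reference for the table above, the first-order Euler (DDIM) update on the EDM probability-flow ODE uses one denoiser evaluation per step. Below is a schematic sketch, where denoiser(x, t) stands for the denoised prediction D(x; t) and is a placeholder rather than this codebase's actual API.

```python
import torch

def euler_sampler(denoiser, x, t_steps):
    """First-order Euler (DDIM) sampling along the EDM probability-flow ODE.

    denoiser(x, t) is assumed to return the denoised prediction D(x; t);
    one evaluation per step, so NFE = len(t_steps) - 1.
    """
    for t_cur, t_next in zip(t_steps[:-1], t_steps[1:]):
        d = (x - denoiser(x, t_cur)) / t_cur   # dx/dt of the probability-flow ODE
        x = x + (t_next - t_cur) * d           # Euler step from t_cur to t_next
    return x

# Dummy usage: 7 timestamps -> 6 steps -> 6 NFE for a first-order solver.
# The noise levels below are illustrative only.
fake_denoiser = lambda x, t: torch.zeros_like(x)
x_T = 80.0 * torch.randn(1, 3, 32, 32)         # start from pure noise at sigma_max
x_0 = euler_sampler(fake_denoiser, x_T, t_steps=[80.0, 30.0, 10.0, 3.0, 1.0, 0.1, 0.002])
```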
We perform sampling on a variety of pre-trained diffusion models from different codebases, including EDM, ADM, Consistency Models, LDM, and Stable Diffusion. The tested pre-trained models are listed below:
Codebase | dataset_name | Resolution | Pre-trained Models | Description |
---|---|---|---|---|
EDM | cifar10 | 32 | edm-cifar10-32x32-uncond-vp.pkl | |
EDM | ffhq | 64 | edm-ffhq-64x64-uncond-vp.pkl | |
EDM | afhqv2 | 64 | edm-afhqv2-64x64-uncond-vp.pkl | |
EDM | imagenet64 | 64 | edm-imagenet-64x64-cond-adm.pkl | |
Consistency Models | lsun_bedroom | 256 | edm_bedroom256_ema.pt | Pixel-space |
ADM | imagenet256 | 256 | 256x256_diffusion.pt and 256x256_classifier.pt | Classifier-guidance. |
LDM | lsun_bedroom_ldm | 256 | lsun_bedrooms.zip | Latent-space |
LDM | ffhq_ldm | 256 | ffhq.zip | Latent-space |
Stable Diffusion | ms_coco | 512 | stable-diffusion-v1-5 | Classifier-free-guidance |
To facilitate the FID evaluation of diffusion models, we provide FID statistics for various datasets. They were either collected from the Internet or computed by ourselves following the EDM codebase.
You can compute the reference statistics for your own datasets as follows:
python fid.py ref --data=path/to/my-dataset.zip --dest=path/to/save/my-dataset.npz
- For Euler, Heun, DPM-Solver-2, iPNDM, and iPNDM_v, we use schedule_type='polynomial' and schedule_rho=7, as recommended in the EDM paper (https://arxiv.org/abs/2206.00364).
- For DPM-Solver++ and UniPC, we use schedule_type='logsnr', predict_x0=True and lower_order_final=True. We use variant='bh2' for the UniPC solver.
- For DEIS, we use schedule_type='time_uniform', schedule_rho=2 and deis_mode='tab' for better results.
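For reference, the polynomial schedule with schedule_rho=7 follows the EDM time discretization: the noise levels interpolate between sigma_max and sigma_min in the rho-th root domain. Below is a small sketch of that formula; the sigma_min/sigma_max values are the EDM defaults and are only illustrative, since the codebase sets them per dataset.

```python
import numpy as np

def polynomial_schedule(num_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """EDM-style time discretization: interpolate in the rho-th root domain."""
    i = np.arange(num_steps)
    t = (sigma_max ** (1 / rho)
         + i / (num_steps - 1) * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho
    return t  # t[0] = sigma_max, ..., t[-1] = sigma_min

print(polynomial_schedule(num_steps=7))  # 7 timestamps -> 6 sampling steps
```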
Solver | NFE=3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|
Euler | 93.36 | 66.76 | 49.66 | 35.62 | 27.93 | 22.32 | 18.43 | 15.69 |
Heun | - | 319.87 | - | 99.74 | - | 38.06 | - | 15.93 |
DPM-Solver-2 | - | 145.98 | - | 60.00 | - | 10.30 | - | 5.01 |
DPM-Solver++(3M) | 110.03 | 46.52 | 24.97 | 11.99 | 6.74 | 4.54 | 3.42 | 3.00 |
UniPC-3 | 109.61 | 45.20 | 23.98 | 11.14 | 5.83 | 3.99 | 3.21 | 2.89 |
DEIS-tAB3 | 56.01 | 25.66 | 14.39 | 9.40 | 6.94 | 5.55 | 4.68 | 4.09 |
iPNDM-4 | 47.98 | 24.82 | 13.59 | 7.05 | 5.08 | 3.69 | 3.17 | 2.77 |
iPNDM_v-4 | 67.58 | 40.26 | 23.58 | 14.00 | 9.83 | 7.34 | 5.93 | 4.95 |
Solver | NFE=3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|
Euler | 78.21 | 57.48 | 43.93 | 35.22 | 28.86 | 24.39 | 21.01 | 18.37 |
Heun | - | 344.87 | - | 142.39 | - | 57.21 | - | 29.54 |
DPM-Solver-2 | - | 238.57 | - | 83.17 | - | 22.84 | - | 9.46 |
DPM-Solver++(3M) | 86.45 | 45.95 | 22.51 | 13.74 | 8.44 | 6.04 | 4.77 | 4.12 |
UniPC-3 | 86.43 | 44.78 | 21.40 | 12.85 | 7.44 | 5.50 | 4.47 | 3.84 |
DEIS-tAB3 | 54.52 | 28.31 | 17.36 | 12.25 | 9.37 | 7.59 | 6.39 | 5.56 |
iPNDM-4 | 45.98 | 28.29 | 17.17 | 10.03 | 7.79 | 5.52 | 4.58 | 3.98 |
iPNDM_v-4 | 60.45 | 36.80 | 22.66 | 15.62 | 11.57 | 9.21 | 7.65 | 6.55 |
Solver | NFE=3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|
Euler | 82.96 | 58.43 | 43.81 | 34.03 | 27.46 | 22.59 | 19.27 | 16.72 |
Heun | - | 249.41 | - | 89.63 | - | 37.65 | - | 16.46 |
DPM-Solver-2 | - | 129.75 | - | 44.83 | - | 12.42 | - | 6.84 |
DPM-Solver++(3M) | 91.52 | 56.34 | 25.49 | 15.06 | 10.14 | 7.84 | 6.48 | 5.67 |
UniPC-3 | 91.38 | 55.63 | 24.36 | 14.30 | 9.57 | 7.52 | 6.34 | 5.53 |
DEIS-tAB3 | 44.51 | 23.53 | 14.75 | 12.57 | 8.20 | 6.84 | 5.97 | 5.34 |
iPNDM-4 | 58.53 | 33.79 | 18.99 | 12.92 | 9.17 | 7.20 | 5.91 | 5.11 |
iPNDM_v-4 | 65.65 | 40.20 | 24.36 | 16.68 | 12.23 | 9.50 | 7.89 | 6.76 |
If you find this repository useful, please consider citing the following paper:
@article{zhou2023fast,
title={Fast ODE-based Sampling for Diffusion Models in Around 5 Steps},
author={Zhou, Zhenyu and Chen, Defang and Wang, Can and Chen, Chun},
journal={arXiv preprint arXiv:2312.00094},
year={2023}
}