Sicheng Mo1*, Fangzhou Mu2*, Kuan Heng Lin1, Yanli Liu3, Bochen Guan3, Yin Li2, Bolei Zhou1
1 UCLA, 2 University of Wisconsin-Madison, 3 Innopeak Technology, Inc
* Equal contribution
Computer Vision and Pattern Recognition (CVPR), 2024
This is the official implementation of FreeControl, a generative AI algorithm for controllable text-to-image generation using pre-trained diffusion models.
- 10/21/2024: Added SDXL pipeline (thanks to @shirleyzhu233).
- 02/19/2024: Initial code release. The paper is accepted to CVPR 2024.
Environment Setup
- We provide a conda env file for environment setup.
conda env create -f environment.yml
conda activate freecontrol
pip install -U diffusers
pip install -U gradio
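After installation, a quick sanity check can confirm that the upgraded packages are visible from the activated environment. The snippet below is a minimal sketch and is not part of this repository: the helper name `installed_versions` is hypothetical, and it relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The two packages upgraded by the pip commands above.
    for name, version in installed_versions(["diffusers", "gradio"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If either package is reported as missing, make sure the `freecontrol` environment is active before re-running the pip commands.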
Sample Semantic Bases
- We provide three sample scripts in the scripts folder (one for each base model) to showcase how to compute target semantic bases.
- You may also download pre-computed bases from Google Drive. Put them in the dataset folder, then launch the Gradio demo.
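The scripts in the scripts folder are the authoritative way to compute bases. For intuition only, the sketch below assumes that semantic bases are the leading principal components of a set of diffusion features, and it uses random data as a placeholder for real features. The function name `compute_semantic_bases`, the array shapes, and the number of bases are illustrative assumptions, not the repository's API.

```python
import numpy as np


def compute_semantic_bases(features, num_bases):
    """Return the top `num_bases` principal directions of `features`.

    features: array of shape (num_samples, feature_dim), a stand-in for
    flattened diffusion features (assumption for illustration).
    """
    # Center the features so the SVD yields principal components.
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of `vt` are orthonormal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:num_bases]


# Placeholder data: 256 feature vectors of dimension 64 (not real model features).
rng = np.random.default_rng(0)
fake_features = rng.normal(size=(256, 64))

bases = compute_semantic_bases(fake_features, num_bases=8)

# Project the centered features onto the bases to get low-dimensional coordinates.
projected = (fake_features - fake_features.mean(axis=0)) @ bases.T
```

Here `bases` has one row per basis vector, and `projected` holds the coordinates of each feature vector in that basis.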
Gradio Demo
- We provide a graphical user interface (GUI) for users to try out FreeControl. Run the following command to start the demo.
python gradio_app.py
We are building a gallery of images generated with FreeControl. You are welcome to share your generated images with us.
@article{mo2023freecontrol,
title={FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition},
author={Mo, Sicheng and Mu, Fangzhou and Lin, Kuan Heng and Liu, Yanli and Guan, Bochen and Li, Yin and Zhou, Bolei},
journal={arXiv preprint arXiv:2312.07536},
year={2023}
}