Retrieval-Augmented Image Sequence Colorization
Authors: Junhao Zhuang, Xuan Ju, Zhaoyang Zhang, Yong Liu, Shiyi Zhang, Chun Yuan, Ying Shan
Your star means a lot to us as we develop this project! ⭐
Automatic black-and-white image sequence colorization while preserving character and object identity (ID) is a complex task with significant market demand, such as in cartoon or comic series colorization. Despite advancements in visual colorization using large-scale generative models like diffusion models, challenges with controllability and identity consistency persist, making current solutions unsuitable for industrial application.
To address this, we propose ColorFlow, a three-stage diffusion-based framework tailored for image sequence colorization in industrial applications. Unlike existing methods that require per-ID finetuning or explicit ID embedding extraction, we propose a novel robust and generalizable Retrieval Augmented Colorization pipeline for colorizing images with relevant color references.
Our pipeline also features a dual-branch design: one branch for color identity extraction and the other for colorization, leveraging the strengths of diffusion models. We utilize the self-attention mechanism in diffusion models for strong in-context learning and color identity matching.
To evaluate our model, we introduce ColorFlow-Bench, a comprehensive benchmark for reference-based colorization. Results show that ColorFlow outperforms existing models across multiple metrics, setting a new standard in sequential image colorization and potentially benefiting the art industry.
- Update Date: December 23, 2024 - We have released the weights for the Sketch_Shading model, along with updates to the related code and demo. You can access the model weights in our Hugging Face model repository and explore the updated demo here. 🎉🔥
- Release Date: December 17, 2024 - The inference code and model weights have also been released! 🎉
- ✅ Release inference code and model weights
- ⬜️ Release training code
Follow these steps to set up and run ColorFlow on your local machine:
- Clone the Repository

  Download the code from our GitHub repository:

      git clone https://github.com/TencentARC/ColorFlow
      cd ColorFlow
- Set Up the Python Environment

  Ensure you have Anaconda or Miniconda installed, then create and activate a Python environment and install the required dependencies:

      conda create -n colorflow python=3.8.5
      conda activate colorflow
      pip install -r requirements.txt
- Run the Application

  You can launch the Gradio interface for ColorFlow by running the following command:

      python app.py
- Access ColorFlow in Your Browser

  Open your browser and go to http://localhost:7860. If you're running the app on a remote server, replace localhost with your server's IP address or domain name. To use a custom port, update the server_port parameter in the demo.launch() function of app.py.
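
If you would rather not edit the port by hand each time, one option is to read the host and port from environment variables and pass them to `demo.launch()`. The following is a minimal sketch, not the code shipped in app.py: the helper `launch_kwargs` and the variable names `COLORFLOW_HOST` and `COLORFLOW_PORT` are hypothetical. `server_name` and `server_port` are the Gradio launch parameters; setting `server_name` to `0.0.0.0` is what makes the app reachable from another machine.

```python
import os


def launch_kwargs(env=None):
    """Build keyword arguments for Gradio's demo.launch().

    COLORFLOW_HOST and COLORFLOW_PORT are hypothetical variable names
    chosen for this sketch; they are not defined by the ColorFlow repo.
    """
    env = os.environ if env is None else env
    host = env.get("COLORFLOW_HOST", "127.0.0.1")  # local-only by default
    port = int(env.get("COLORFLOW_PORT", "7860"))  # port used in this README
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return {"server_name": host, "server_port": port}


# In app.py, the existing launch call could then become (assumption):
#     demo.launch(**launch_kwargs())
```

With this in place, running `COLORFLOW_HOST=0.0.0.0 COLORFLOW_PORT=8080 python app.py` would serve the demo on port 8080 for remote access.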
You can try the demo of ColorFlow on Hugging Face Space.
The overview of ColorFlow. This figure presents the three primary components of our framework: the Retrieval-Augmented Pipeline (RAP), the In-context Colorization Pipeline (ICP), and the Guided Super-Resolution Pipeline (GSRP). Each component is essential for maintaining the color identity of instances across black-and-white image sequences while ensuring high-quality colorization.
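
To give a rough intuition for the retrieval step, the sketch below ranks a pool of color references by similarity to a black-and-white query and keeps the closest few. It is an illustration under stated assumptions, not the ColorFlow implementation: this README does not specify the image encoder or the similarity measure, so the embeddings are treated as plain lists of floats, cosine similarity is assumed, and the names `cosine_similarity` and `retrieve_references` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_references(query_embedding, reference_pool, top_k=3):
    """Return the names of the top_k references most similar to the query.

    reference_pool is a list of (name, embedding) pairs. How the embeddings
    are produced is an assumption left outside this sketch.
    """
    scored = [
        (cosine_similarity(query_embedding, embedding), name)
        for name, embedding in reference_pool
    ]
    # Highest similarity first; the sort is stable, so ties keep pool order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy usage with made-up 2-D embeddings.
pool = [("page_01", [1.0, 0.0]), ("page_02", [0.0, 1.0]), ("page_03", [1.0, 1.0])]
selected = retrieve_references([1.0, 0.1], pool, top_k=2)
```

The retrieved references would then be handed to the colorization stage as context; that hand-off is not modeled here.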
🤗 We welcome your feedback, questions, or collaboration opportunities. Thank you for trying ColorFlow!
We would like to acknowledge the following open-source projects that have inspired and contributed to the development of ColorFlow:
- ScreenStyle: https://github.com/msxie92/ScreenStyle
- MangaLineExtraction_PyTorch: https://github.com/ljsabc/MangaLineExtraction_PyTorch
We are grateful for the valuable resources and insights provided by these projects.
- Junhao Zhuang
Email: zhuangjh23@mails.tsinghua.edu.cn
@misc{zhuang2024colorflow,
title={ColorFlow: Retrieval-Augmented Image Sequence Colorization},
author={Junhao Zhuang and Xuan Ju and Zhaoyang Zhang and Yong Liu and Shiyi Zhang and Chun Yuan and Ying Shan},
year={2024},
eprint={2412.11815},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.11815},
}
Please refer to our license file for more details.