Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

This version was migrated from an internal implementation at Alibaba Group. Feel free to open an issue if you run into any problems!

Environment

conda create -n arldm python=3.8
conda activate arldm
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch-lts
git clone https://github.com/Flash-321/ARLDM.git
cd ARLDM
pip install -r requirements.txt

Data Preparation

Download the PororoSV dataset here.
Download the FlintstonesSV dataset here.
Download the VIST-SIS URL links here.
Download the VIST-DII URL links here.
Download the VIST images by running:

python data_script/vist_img_download.py \
    --json_dir /path/to/dii_json_files \
    --img_dir /path/to/save_images \
    --num_process 32
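
For readers curious about what a parallel download of this kind involves, the following is a minimal sketch using only the Python standard library. It is not the repository's `vist_img_download.py`. The function names (`build_jobs`, `download`, `download_all`) are hypothetical, and the JSON field names (`images`, `id`, `url_o`) are assumptions about the layout of the VIST annotation files, so adjust them to match the files you actually downloaded.

```python
import json
import urllib.request
from multiprocessing import Pool
from pathlib import Path
from urllib.parse import urlparse


def build_jobs(json_dir, img_dir):
    """Collect (url, destination) pairs from every JSON file in json_dir."""
    jobs = []
    for json_path in sorted(Path(json_dir).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            data = json.load(f)
        # Assumed layout: {"images": [{"id": ..., "url_o": ...}, ...]}
        for image in data.get("images", []):
            url = image.get("url_o")
            if not url:
                continue  # skip entries without a usable link
            # Keep the extension from the URL path, defaulting to .jpg
            suffix = Path(urlparse(url).path).suffix or ".jpg"
            jobs.append((url, str(Path(img_dir) / f"{image['id']}{suffix}")))
    return jobs


def download(job):
    """Fetch one image; return True on success, False on any failure."""
    url, dest = job
    if Path(dest).exists():
        return True  # already on disk, so a rerun resumes cheaply
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            payload = resp.read()
        Path(dest).write_bytes(payload)
        return True
    except Exception:
        return False  # dead links are expected; count them and move on


def download_all(json_dir, img_dir, num_process=32):
    """Download every listed image with a pool of worker processes."""
    Path(img_dir).mkdir(parents=True, exist_ok=True)
    jobs = build_jobs(json_dir, img_dir)
    with Pool(num_process) as pool:
        results = pool.map(download, jobs)
    return sum(results), len(jobs)
```

When calling `download_all` from a script, place the call under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. The function returns the number of successful downloads and the total number of jobs, which makes it easy to see how many links have expired.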

To accelerate I/O, use the following scripts to convert the downloaded data to HDF5:

python data_script/pororo_hdf5.py \
    --data_dir /path/to/pororo_data \
    --save_path /path/to/save_hdf5_file

python data_script/flintstones_hdf5.py \
    --data_dir /path/to/flintstones_data \
    --save_path /path/to/save_hdf5_file

python data_script/vist_hdf5.py \
    --sis_json_dir /path/to/sis_json_files \
    --dii_json_dir /path/to/dii_json_files \
    --img_dir /path/to/vist_images \
    --save_path /path/to/save_hdf5_file

Training

Specify your directory and device configuration in config.yaml and run:

python main.py

Sample

Specify your directory and device configuration in config.yaml and run:

python main.py

Acknowledgment

Thanks a lot to @adymaharana for kindly sharing the FlintstonesSV and PororoSV datasets (and the code), as well as the PororoSV pretrained checkpoint and the Flintstones sampled results of StoryDALL·E.

Citation

If you find this code useful for your research, please cite our paper:

@article{pan2022synthesizing,
  title={Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models},
  author={Pan, Xichen and Qin, Pengda and Li, Yuhong and Xue, Hui and Chen, Wenhu},
  journal={arXiv preprint arXiv:2211.10950},
  year={2022}
}
