Semantically Multi-modal Image Synthesis
---

### [Project page](http://seanseattle.github.io/SMIS) / [Paper](https://arxiv.org/abs/2003.12697) / [Demo](https://www.youtube.com/watch?v=uarUonGi_ZU&t=2s) \
Semantically Multi-modal Image Synthesis (CVPR 2020). \
Zhen Zhu, Zhiliang Xu, Ansheng You, Xiang Bai

### Requirements
---

- torch>=1.0.0
- torchvision
- dominate
- dill
- scikit-image
- tqdm
- opencv-python

All of these can be installed with pip; see the example command at the end of this README.

### Getting Started
---

#### Data Preparation

**DeepFashion** \
**Note:** We provide an example of the [DeepFashion](https://drive.google.com/open?id=1ckx35-mlMv57yzv47bmOCrWTm5l2X-zD) dataset. It differs slightly from the DeepFashion data used in our paper due to the impact of COVID-19.

**Cityscapes** \
The Cityscapes dataset can be downloaded [here](https://www.cityscapes-dataset.com/).

**ADE20K** \
The ADE20K dataset can be downloaded [here](http://sceneparsing.csail.mit.edu/).

#### Test/Train the models

Download the tar archive of the pretrained models from the [Google Drive Folder](https://drive.google.com/open?id=1og_9By_xdtnEd9-xawAj4jYbXR6A9deG), save it in `checkpoints/`, and extract it.

The `scripts` folder contains `deepfashion.sh`, `cityscapes.sh`, and `ade20k.sh`. Change parameters such as `--dataroot`, then comment or uncomment the relevant lines to test or train a model. You can also specify `--test_mask` for the SMIS test. Example commands are sketched at the end of this README.

### Acknowledgments
---

Our code is based on the popular [SPADE](https://github.com/NVlabs/SPADE).
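To install the dependencies listed under Requirements, a minimal sketch using pip (package names are exactly those in the list above; adjust the torch/torchvision versions for your CUDA setup as needed):

```bash
# Install all listed requirements in one shot.
pip install "torch>=1.0.0" torchvision dominate dill scikit-image tqdm opencv-python
```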
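A sketch of unpacking the pretrained models into `checkpoints/`, assuming the downloaded archive is named `smis_pretrained.tar` (a hypothetical name; use the actual filename from Google Drive):

```bash
# Create the checkpoints directory and extract the (hypothetically named) archive into it.
mkdir -p checkpoints
tar -xvf smis_pretrained.tar -C checkpoints/
```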
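Once a script's parameters (e.g. `--dataroot`) are edited and the test or train lines are commented or uncommented as needed, it can be run directly, for example:

```bash
# Run the DeepFashion script from the repository root.
bash scripts/deepfashion.sh
```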