Cloth Interactive Transformer for Virtual Try-On
Bin Ren1, Hao Tang1, Fanyang Meng2, Runwei Ding3, Ling Shao4, Philip H.S. Torr5, Nicu Sebe1,6.
1University of Trento, Italy,
2Peng Cheng Laboratory, China,
3Peking University Shenzhen Graduate School, China,
4Inception Institute of AI, UAE,
5University of Oxford, UK,
6Huawei Research Ireland, Ireland.
This repository provides the official PyTorch implementation of our paper. The code and pre-trained models are tested with PyTorch 0.4.1, torchvision 0.2.1, opencv-python 4.1, and pillow 5.4 (Python 3.6).
🦖News!!! We have updated the pre-trained model (June 5th, 2021)!
In the meantime, check out our recent papers XingGAN and XingVTON.
The pipeline consists of consecutive training and testing of two networks: the GMM, which is based on the Cloth Interactive Transformer (CIT) Matching block, and the TOM, which is based on the CIT Reasoning block. GMM generates the warped clothes according to the target human. TOM then blends the warped clothes output by GMM with the target human properties to generate the final try-on output.
- Install the requirements
- Download/Prepare the dataset
- Train the CIT Matching block based GMM network
- Get warped clothes for the training set with the trained GMM network, and copy the warped clothes & masks into the data/train directory
- Train the CIT Reasoning block based TOM network
- Test the CIT Matching block based GMM on the testing set
- Get warped clothes for the testing set, and copy the warped clothes & masks into the data/test directory
- Test the CIT Reasoning block based TOM on the testing set
This implementation is built and tested in PyTorch 0.4.1.
We recommend installing PyTorch and torchvision with conda: conda install pytorch=0.4.1 torchvision=0.2.1 -c pytorch
To install all packages, run pip install -r requirements.txt
To train and test on the VITON dataset, download our full, processed dataset here: https://1drv.ms/u/s!Ai8t8GAHdzVUiQQYX0azYhqIDPP6?e=4cpFTI. After downloading, unzip it into your own data directory ./data/.
Run python train.py with your specific usage options for the GMM and TOM stages.
For example, for GMM: python train.py --name GMM --stage GMM --workers 4 --save_count 5000 --shuffle
Then run test.py for the GMM network on the training dataset. This generates the warped clothes and masks in the "warp-cloth" and "warp-mask" folders inside the "result/GMM/train/" directory.
Copy the "warp-cloth" and "warp-mask" folders into your data directory, for example into the "data/train" folder.
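The copy step can also be scripted. The following is a minimal sketch, not part of the repository: the helper name copy_warp_outputs is hypothetical, and the folder names and default paths are taken from the text above. It avoids copytree options that are unavailable in Python 3.6.

```python
import os
import shutil


def copy_warp_outputs(result_dir, data_dir, folders=("warp-cloth", "warp-mask")):
    """Copy the GMM output folders from result_dir into data_dir.

    Hypothetical helper; returns the number of files copied.
    """
    copied = 0
    for folder in folders:
        src = os.path.join(result_dir, folder)
        dst = os.path.join(data_dir, folder)
        if not os.path.isdir(src):
            raise FileNotFoundError("missing GMM output folder: " + src)
        # Create the destination folder if it does not exist yet.
        os.makedirs(dst, exist_ok=True)
        for name in sorted(os.listdir(src)):
            src_file = os.path.join(src, name)
            if os.path.isfile(src_file):
                # copy2 overwrites an existing file of the same name.
                shutil.copy2(src_file, os.path.join(dst, name))
                copied += 1
    return copied
```

For the training set this would be called as copy_warp_outputs("result/GMM/train", "data/train") from the repository root; the testing set would use "result/GMM/test" and "data/test" in the same way.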
Then run the TOM stage: python train.py --name TOM --stage TOM --workers 4 --save_count 5000 --shuffle
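If you prefer to launch both training stages from a script, the two commands above differ only in the stage name. The sketch below builds the argument list for either stage; build_train_command is a hypothetical helper, and it uses only the flags and values quoted in this README.

```python
import sys

# Values copied from the training commands quoted in this README.
DEFAULT_WORKERS = 4
DEFAULT_SAVE_COUNT = 5000


def build_train_command(stage, workers=DEFAULT_WORKERS, save_count=DEFAULT_SAVE_COUNT):
    """Return the argv list for one training stage ("GMM" or "TOM")."""
    if stage not in ("GMM", "TOM"):
        raise ValueError("stage must be 'GMM' or 'TOM', got %r" % (stage,))
    return [
        sys.executable, "train.py",
        "--name", stage,
        "--stage", stage,
        "--workers", str(workers),
        "--save_count", str(save_count),
        "--shuffle",
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root. Remember that the GMM test run and the copy of the warped clothes and masks must happen between the GMM and TOM training stages.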
We adopt four evaluation metrics to evaluate the performance of the proposed method: Jaccard score (JS), structural similarity index measure (SSIM), learned perceptual image patch similarity (LPIPS), and Inception score (IS).
Note that JS is used for the same-clothing retry-on cases (where ground truth is available) in the first geometric matching stage, while SSIM and LPIPS are used for the same-clothing retry-on cases (where ground truth is available) in the second try-on stage. In addition, IS is used for different-clothing try-on (where no ground truth is available).
To compute JS:
- Step1: Run
python test.py --name GMM --stage GMM --workers 4 --datamode test --data_list test_pairs_same.txt --checkpoint checkpoints/GMM_pretrained/gmm_final.pth
The parsed segmentation area of the current upper clothing is then used as the reference image, together with the generated warped clothing mask.
- Step2: Run
python metrics/getJS.py
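For intuition, the Jaccard score of two binary masks is the size of their intersection divided by the size of their union. The sketch below illustrates this definition in plain Python; it is not the code in metrics/getJS.py, and both the function name and the convention of returning 1.0 for two empty masks are assumptions made for this example.

```python
def jaccard_score(mask_a, mask_b):
    """Intersection over union of two same-sized binary masks.

    Masks are nested sequences (rows of pixels); any non-zero pixel
    counts as foreground.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        if len(row_a) != len(row_b):
            raise ValueError("masks must have the same shape")
        for a, b in zip(row_a, row_b):
            a_on, b_on = bool(a), bool(b)
            intersection += a_on and b_on
            union += a_on or b_on
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


reference = [[0, 1, 1],
             [0, 1, 0]]
warped = [[0, 1, 0],
          [1, 1, 0]]
print(jaccard_score(reference, warped))  # 2 shared pixels / 4 in the union = 0.5
```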
After running test.py for the GMM network on the testing dataset, the warped clothes and masks are generated in the "warp-cloth" and "warp-mask" folders inside the "result/GMM/test/" directory. Copy the "warp-cloth" and "warp-mask" folders into your data directory, for example into the "data/test" folder. Then:
- Step1: Run TOM stage test
python test.py --name TOM --stage TOM --workers 4 --datamode test --data_list test_pairs_same.txt --checkpoint checkpoints/TOM_pretrained/tom_final.pth
The original target human image is then used as the reference image, together with the generated retry-on image.
- Step2: Run
python metrics/getSSIM.py
To compute LPIPS:
- Step1: Create a new virtual environment, then install PyTorch 1.0+ and torchvision
- Step2: Run
sh metrics/PerceptualSimilarity/testLPIPS.sh
To compute IS:
- Step1: Run TOM stage test
python test.py --name TOM --stage TOM --workers 4 --datamode test --data_list test_pairs.txt --checkpoint checkpoints/TOM_pretrained/tom_final.pth
- Step2: Run
python metrics/getIS.py
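The evaluation steps above can be summarized as a small lookup table that pairs each metric with the test run that produces its inputs. This is only a sketch: the names EVALUATION_PLAN and evaluation_commands are hypothetical, all flags and paths are copied from the commands above, and the LPIPS entry assumes that it reuses the same-clothing TOM outputs, which the README implies but does not state as a command.

```python
# Checkpoint paths as quoted in the test commands of this README.
CHECKPOINTS = {
    "GMM": "checkpoints/GMM_pretrained/gmm_final.pth",
    "TOM": "checkpoints/TOM_pretrained/tom_final.pth",
}

# metric -> (stage to test, pair list, metric command)
EVALUATION_PLAN = {
    "JS": ("GMM", "test_pairs_same.txt", "python metrics/getJS.py"),
    "SSIM": ("TOM", "test_pairs_same.txt", "python metrics/getSSIM.py"),
    # Assumption: LPIPS is computed on the same TOM outputs as SSIM.
    "LPIPS": ("TOM", "test_pairs_same.txt", "sh metrics/PerceptualSimilarity/testLPIPS.sh"),
    "IS": ("TOM", "test_pairs.txt", "python metrics/getIS.py"),
}


def evaluation_commands(metric):
    """Return (test command, metric command) as strings for one metric."""
    stage, data_list, metric_cmd = EVALUATION_PLAN[metric]
    test_cmd = (
        "python test.py --name {s} --stage {s} --workers 4 "
        "--datamode test --data_list {d} --checkpoint {c}"
    ).format(s=stage, d=data_list, c=CHECKPOINTS[stage])
    return test_cmd, metric_cmd
```

Note that the TOM test runs require the warped clothes and masks from the GMM test run to be copied into the data directory first, and that LPIPS is run in a separate environment with PyTorch 1.0+.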
The pre-trained models are provided here. Download the pre-trained models and put them in this project (./checkpoints). Then follow the same steps as in Evaluation to test or run inference with our model.
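Before running the evaluation commands, it can help to confirm that the downloaded files ended up where the commands expect them. The sketch below assumes the layout implied by the --checkpoint arguments used in Evaluation (GMM_pretrained/gmm_final.pth and TOM_pretrained/tom_final.pth under ./checkpoints); missing_checkpoints is a hypothetical helper, not part of the repository.

```python
import os

# Relative paths taken from the --checkpoint arguments in the Evaluation commands.
EXPECTED_CHECKPOINTS = (
    os.path.join("GMM_pretrained", "gmm_final.pth"),
    os.path.join("TOM_pretrained", "tom_final.pth"),
)


def missing_checkpoints(checkpoint_dir="checkpoints"):
    """Return the expected checkpoint files that are not present."""
    return [
        rel for rel in EXPECTED_CHECKPOINTS
        if not os.path.isfile(os.path.join(checkpoint_dir, rel))
    ]
```

An empty list means both files are in place; otherwise the returned paths show which download or unzip step still needs attention.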
This source code is inspired by CP-VTON and CP-VTON+. We are extremely grateful for their public implementations.
If you use this code for your research, please consider giving a star ⭐ and citing our paper 🦖:
CIT
@article{ren2021cloth,
title={Cloth Interactive Transformer for Virtual Try-On},
author={Ren, Bin and Tang, Hao and Meng, Fanyang and Ding, Runwei and Shao, Ling and Torr, Philip HS and Sebe, Nicu},
journal={arXiv preprint arXiv:2104.05519},
year={2021}
}
If you have any questions, comments, or bug reports, feel free to open a GitHub issue, submit a pull request, or e-mail the author Bin Ren (bin.ren@unitn.it).