This paper has been accepted by the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR 2023).
This is the source code of the PyTorch implementation of FashionSAP.
More details about the project will be added ...
- Install the dependencies listed in requirements.txt.
-
  - Download the raw file and extract it into the path data_root.
  - Change data_root and split in prepare_dataset.py, then run it to get the assistance file.
-
  - Download the raw file and extract it into the path data_root.
  - Put the captions and images directories from the raw file in data_root. In addition to the original files, we merge all of the train files into a single cap.train.json file in captions, and do the same for val.
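The merge of the train and val caption files described above can be sketched as follows. This is a minimal illustration and not the project's actual preprocessing code: the function name merge_caption_files, the per-category file pattern cap.*.{split}.json, and the assumption that each file holds a JSON list are all hypothetical.

```python
import json
from pathlib import Path


def merge_caption_files(captions_dir, split):
    """Merge every per-category caption file for `split` into cap.{split}.json.

    Assumption (hypothetical): the input files are named
    cap.<category>.<split>.json and each contains a JSON list of entries.
    """
    captions_dir = Path(captions_dir)
    merged = []
    # sorted() keeps the merge order deterministic across runs.
    # The pattern does not match the output file cap.{split}.json itself,
    # so re-running the function does not duplicate entries.
    for path in sorted(captions_dir.glob(f"cap.*.{split}.json")):
        with path.open(encoding="utf-8") as f:
            merged.extend(json.load(f))
    out_path = captions_dir / f"cap.{split}.json"
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(merged, f)
    return out_path
```

Under these assumptions, calling merge_caption_files on the captions directory with "train" and then with "val" would produce the two merged files.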
- We define three downstream task names, referred to as downstream_name:
  - retrieval: covers two downstream tasks, text-to-image retrieval and image-to-text retrieval.
  - catereg: fashion-domain category recognition and subcategory recognition.
  - tgir: text-guided image retrieval, also called text-modified image retrieval.
- Run bash run_pretrain.sh to start the pre-training stage.
- Run bash run_{downstream_name}.sh to train and evaluate the corresponding downstream task.
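To show how the run_{downstream_name}.sh pattern expands, here is a small sketch that maps a downstream name to its script. The helper downstream_command and the constant DOWNSTREAM_NAMES are hypothetical and not part of the repository; only the three task names and the script naming pattern come from the text above.

```python
# The three downstream names defined in this README.
DOWNSTREAM_NAMES = ("retrieval", "catereg", "tgir")


def downstream_command(downstream_name):
    """Return the argument list that launches the given downstream task.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if downstream_name not in DOWNSTREAM_NAMES:
        raise ValueError(
            f"unknown downstream_name {downstream_name!r}; "
            f"expected one of {DOWNSTREAM_NAMES}"
        )
    return ["bash", f"run_{downstream_name}.sh"]
```

Passing the returned list to subprocess.run with check=True from the repository root would launch the script; validating the name first avoids calling a script that does not exist.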
- Our pre-trained model can be downloaded from Google Drive.
If you find this code useful for your research, please cite:
@inproceedings{FashionSAP,
title={FashionSAP: Symbols and Attributes Prompt for Fine-grained Fashion Vision-Language Pre-training},
author={Han, Yunpeng and Zhang, Lisai and Chen, Qingcai and Chen, Zhijian and Li, Zhonghua and Yang, Jianxin and Cao, Zhao},
year={2023},
booktitle={CVPR}
}
Some utility code is adapted from the ALBEF project.