This repository contains the Fashion-MMT dataset and the PyTorch implementation of our paper Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training (ACMMM 2021 Oral).
- Python 3.6
- Java 15.0.2
- PyTorch 1.1
- numpy, tqdm, h5py, scipy, six
Annotations for the Fashion-MMT(C) and Fashion-MMT(L) datasets can be downloaded from BaiduNetdisk (code: i55n).
JSON Format:
[
{
"id": int,
"split": str,
"en": str,
"zh": str,
"images": [str],
"category": str,
"attr": [str]
}
]
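As a minimal sketch of reading this format, the helper below loads an annotation file and groups its records by the "split" field. The function name `load_annotations` and any file path you pass to it are illustrative assumptions, not part of this repository.

```python
import json
from collections import defaultdict


def load_annotations(path):
    """Load an annotation file in the JSON format above and group records by split.

    Hypothetical helper for illustration; `path` is whatever file you downloaded.
    """
    with open(path, encoding="utf-8") as f:
        records = json.load(f)  # a list of dicts, one per product description

    by_split = defaultdict(list)
    for rec in records:
        by_split[rec["split"]].append(rec)
    return dict(by_split)
```

Each record in the returned lists keeps all of its fields, so `rec["en"]` and `rec["zh"]` give the parallel English and Chinese descriptions for one product.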
The images can be downloaded from URLs of the form https://n.nordstrommedia.com/id/sr3/image_name, where image_name is replaced by the name of each image.
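The URL pattern above can be sketched in a few lines. This assumes that the strings in a record's "images" field are the image names and can be appended to the base URL unchanged, which the text suggests but does not state explicitly; `image_urls` is a hypothetical helper name.

```python
BASE_URL = "https://n.nordstrommedia.com/id/sr3/"


def image_urls(record):
    """Build one download URL per image name in an annotation record.

    Assumption: entries of record["images"] are the image names to append.
    """
    return [BASE_URL + name for name in record["images"]]
```

The resulting URLs can then be passed to any downloader of your choice.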
We also provide image features extracted with a ResNet101 pretrained on ImageNet and fine-tuned on Fashion-MMT, available from BaiduNetdisk (code: i55n, ~57 GB). Decompress and merge the downloaded features into one folder:
$ cat resnet101.finetune.tar.gz* | tar -xz
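Where `cat` and `tar` are not available, the same merge-and-extract step can be approximated in Python. The sketch below streams the parts in sorted filename order through `tarfile`, so no merged copy is written to disk. The names `merge_and_extract` and `_PartsReader` are illustrative, and the sketch assumes that sorting the part names reproduces the order the shell glob would use.

```python
import glob
import tarfile


class _PartsReader:
    """Minimal file-like object that reads several files back to back."""

    def __init__(self, paths):
        self._paths = iter(paths)
        self._current = None

    def read(self, size=-1):
        while True:
            if self._current is None:
                path = next(self._paths, None)
                if path is None:
                    return b""  # all parts consumed
                self._current = open(path, "rb")
            chunk = self._current.read(size)
            if chunk:
                return chunk
            # Current part is exhausted: move on to the next one.
            self._current.close()
            self._current = None

    def close(self):
        if self._current is not None:
            self._current.close()
            self._current = None


def merge_and_extract(pattern, out_dir):
    """Treat all files matching `pattern` as one .tar.gz and extract it."""
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError("no files match {!r}".format(pattern))
    reader = _PartsReader(parts)
    try:
        # "r|gz" reads the archive as a forward-only gzip stream.
        with tarfile.open(fileobj=reader, mode="r|gz") as tar:
            tar.extractall(out_dir)
    finally:
        reader.close()
    return parts
```

For example, `merge_and_extract("resnet101.finetune.tar.gz*", ".")` would mirror the shell command above.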
- Pre-train the model with three pre-training tasks
$ CUDA_VISIBLE_DEVICES=0 python train.py ../results/pretrain/model.json ../results/pretrain/path.json --is_train
- Fine-tune the model to MMT
$ CUDA_VISIBLE_DEVICES=0 python train.py ../results/finetune/model.json ../results/finetune/path.json --is_train --resume_file ../results/pretrain/model/step.*.th
- Evaluate the fine-tuned model on the validation set
$ CUDA_VISIBLE_DEVICES=0 python train.py ../results/finetune/model.json ../results/finetune/path.json --eval_set val --resume_file ../results/finetune/model/step.*.th
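The `step.*.th` pattern in the commands above may match several checkpoints, in which case the shell expands it to more than one path. A small helper can pick a single file to pass to `--resume_file`. This sketch assumes checkpoints are named `step.<integer>.th` and that the highest step is the one you want. Neither is confirmed by this README, and you may prefer the checkpoint with the best validation score instead. `latest_checkpoint` is a hypothetical name.

```python
import glob
import os
import re

# Assumed naming scheme: step.<integer>.th
_STEP_RE = re.compile(r"^step\.(\d+)\.th$")


def latest_checkpoint(model_dir):
    """Return the path of the checkpoint with the largest step, or None."""
    best_step, best_path = -1, None
    for path in glob.glob(os.path.join(model_dir, "step.*.th")):
        match = _STEP_RE.match(os.path.basename(path))
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

Comparing the step as an integer avoids the lexicographic trap where `step.900.th` would sort after `step.2000.th`.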
If you find this repo helpful, please consider citing:
@inproceedings{song2021FashionMMT,
title={Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training},
author={Song, Yuqing and Chen, Shizhe and Jin, Qin and Luo, Wei and Xie, Jun and Huang, Fei},
booktitle={Proceedings of the 29th {ACM} International Conference on Multimedia},
year={2021}
}