Data | Description | Download |
---|---|---|
Visual Genome | `ln -s VG/images data/vg/images` | Official |
MSCOCO 2014 | `ln -s coco2014/images data/refcoco/images` | Official |
Converted annotations | `unzip data.zip` | OneDrive |
Meteor package | `unzip meteor.zip -d controlcap/common/evaluation/` | OneDrive |
Pre-trained ControlCap weights and logs (Optional) | `mv <your_path>/ckpts/* ckpts/` | OneDrive |

P.S. For the files on BaiduDrive, the password is 3g1k.
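
For setups where a script is more convenient than typing the commands by hand, the `ln -s` and `unzip` steps from the table can be sketched in Python. This is a minimal sketch, not part of the repository: the helper names `link_dir` and `unzip` are hypothetical, and the source paths (`VG/images`, `coco2014/images`) are the placeholders from the table, so replace them with your actual download locations.

```python
import os
import zipfile
from pathlib import Path


def link_dir(source: str, link: str) -> Path:
    """Create a directory symlink at `link` pointing to `source` (like `ln -s`)."""
    # Resolve to an absolute path so the link stays valid from any working directory.
    src = Path(source).resolve()
    dst = Path(link)
    dst.parent.mkdir(parents=True, exist_ok=True)
    # Skip creation if something already exists at the link location.
    if not dst.is_symlink() and not dst.exists():
        os.symlink(src, dst, target_is_directory=True)
    return dst


def unzip(archive: str, dest: str = ".") -> None:
    """Extract `archive` into `dest` (like `unzip archive -d dest`)."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)


# Example calls, run from the ControlCap/ root (paths are placeholders from the table):
# link_dir("VG/images", "data/vg/images")
# link_dir("coco2014/images", "data/refcoco/images")
# unzip("data.zip")
# unzip("meteor.zip", "controlcap/common/evaluation/")
```

The example calls are left commented out so that nothing is touched until the paths have been adjusted.
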
To train and evaluate ControlCap, download the files in the table and arrange them according to the file tree below. (Uploading)

    |--ControlCap/
      |--data/
        |--vg/
          |--controlcap/
          |--images/
            |--1000.jpg
            |--1001.jpg
            ...
        |--refcoco/
          |--controlcap/
          |--images/
            |--COCO_train2014_000000000009.jpg
            |--COCO_train2014_000000000025.jpg
            ...
        ...
      |--ckpts/
        |--vg1.2_refcocog_5e.pth
        |--refcocog_gt.pth
      |--configs/
      |--controlcap/
      |--docs/
      |--scripts/
      |--train.py
      |--eval.py

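
Before launching training or evaluation, it can help to confirm that the layout matches the tree. The following is a minimal sketch, assuming the nesting implied by the `ln -s` targets in the table (`data/vg/images`, `data/refcoco/images`); the function `missing_paths` and the `REQUIRED`/`OPTIONAL` lists are hypothetical names introduced here and are not part of the repository.

```python
from pathlib import Path
from typing import List

# Entries listed in the file tree, relative to the ControlCap/ root.
REQUIRED = [
    "data/vg/controlcap",
    "data/vg/images",
    "data/refcoco/controlcap",
    "data/refcoco/images",
    "configs",
    "controlcap",
    "docs",
    "scripts",
    "train.py",
    "eval.py",
]

# Pre-trained weights are marked as optional in the download table.
OPTIONAL = [
    "ckpts/vg1.2_refcocog_5e.pth",
    "ckpts/refcocog_gt.pth",
]


def missing_paths(root: str, include_optional: bool = False) -> List[str]:
    """Return the expected entries that do not exist under `root`."""
    expected = REQUIRED + (OPTIONAL if include_optional else [])
    base = Path(root)
    # Path.exists() follows symlinks, so linked image folders count as present.
    return [rel for rel in expected if not (base / rel).exists()]
```

Calling `missing_paths(".")` from the `ControlCap/` root returns an empty list when every listed entry is in place; pass `include_optional=True` to also check for the pre-trained checkpoints.
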
P.S. The converted annotations are generated using `data.sh`; the original annotations are as follows:

- Annotations of Visual Genome for dense captioning.
- `test_caption.json` and `mdetr_annotations` of GlaMM for evaluating referring expression generation.