Official implementation of IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning. ACM MM 2020

Victarry/IR-GAN-Code

IR-GAN Code

Code for our ACM MM 2020 paper, IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning.

Setup

1. Generate data folder for CoDraw and i-CLEVR datasets

See GeNeVA - Datasets - Generation Code

2. Install Miniconda

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
rm Miniconda3-latest-Linux-x86_64.sh

3. Create a conda environment for this repository

conda env create -f environment.yml

4. Activate the environment

conda activate irgan

5. Run visdom

visdom

Training progress for all experiments can be tracked in visdom, which by default is served at http://localhost:8097/.
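Before launching a long training run, it can help to confirm that the visdom server is actually listening. The following is a minimal sketch using only the Python standard library; the helper name `port_is_open` is an illustration and not part of this repository, and the default host and port simply mirror the visdom default mentioned above.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8097, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it only checks that something accepts connections
    on the port, not that the listener is really a visdom server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False
```

Calling `port_is_open()` with no arguments checks the default visdom address; pass a different `port` if visdom was started on another one.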

Training the object detector and localizer

python scripts/train_object_detector_localizer.py --num-classes=24 --train-hdf5=../GeNeVA_datasets/data/iCLEVR/clevr_obj_train.h5 --valid-hdf5=../GeNeVA_datasets/data/iCLEVR/clevr_obj_val.h5 --cuda-enabled  # for i-CLEVR
python scripts/train_object_detector_localizer.py --num-classes=58 --train-hdf5=../GeNeVA_datasets/data/CoDraw/codraw_obj_train.h5 --valid-hdf5=../GeNeVA_datasets/data/CoDraw/codraw_obj_val.h5 --cuda-enabled  # for CoDraw

Note: The script accepts several other options, defined in the Python script itself, that you may need to set. The batch size (--batch-size) is the total across all GPUs, not the per-GPU size.
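To make the batch-size note concrete, the arithmetic below shows how a combined batch might be divided among GPUs. This is only an illustrative sketch that assumes the batch is split as evenly as possible; the exact split depends on how the training script parallelizes the model, and `per_gpu_batch_sizes` is a hypothetical helper, not a function from this repository.

```python
from typing import List


def per_gpu_batch_sizes(batch_size: int, num_gpus: int) -> List[int]:
    """Split a combined batch size across GPUs as evenly as possible.

    Assumption: an even split, with any remainder given to the first GPUs.
    """
    if batch_size <= 0 or num_gpus <= 0:
        raise ValueError("batch_size and num_gpus must be positive")
    base, extra = divmod(batch_size, num_gpus)
    # The first `extra` GPUs receive one additional sample each.
    return [base + (1 if i < extra else 0) for i in range(num_gpus)]


print(per_gpu_batch_sizes(64, 4))  # [16, 16, 16, 16]
print(per_gpu_batch_sizes(10, 4))  # [3, 3, 2, 2]
```

So with `--batch-size=64` on four GPUs, each GPU would process roughly 16 samples per step under this assumption.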

This trains the object detector and localizer model used for evaluating GeNeVA-GAN on the Precision, Recall, F1-Score, and rsim metrics. To compare against the results in our paper, skip training the model yourself and download the pre-trained models (iclevr_inception_best_checkpoint.pth and codraw_inception_best_checkpoint.pth) from the GeNeVA Project Page.
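For readers unfamiliar with the first three metrics, the sketch below computes Precision, Recall, and F1-Score from two sets of object labels. It is a generic illustration of the standard definitions, not the evaluation code of this repository: the function name and the example object labels are hypothetical, and the rsim metric is not covered.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], reference: Set[str]) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 over two sets of labels.

    `predicted` stands for objects detected in a generated image and
    `reference` for objects detected in the ground-truth image (assumption).
    """
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 2 of 3 predicted objects are correct, 2 of 4 reference objects are found.
p, r, f1 = precision_recall_f1({"cube", "sphere", "cylinder"},
                               {"cube", "sphere", "cone", "torus"})
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.667 0.5 0.571
```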

Training on CoDraw

Modify geneva/config.yml and args/irgan-codraw.args if needed and run:

python train.py @args/irgan-codraw.args

When training multiple times, remember to change exp_name and results_paths in the args file so that each run gets its own name and output location.
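The `@args/...` form on the command line is consistent with the `fromfile_prefix_chars` feature of Python's `argparse`, where each line of the referenced file is read as one command-line argument. The sketch below demonstrates that mechanism in isolation. It assumes the training script uses `argparse` in this way, and the parser and the flag spellings `--exp_name` and `--results_path` are stand-ins for illustration; check the args files in the repository for the real option names.

```python
import argparse
import os
import tempfile

# Hypothetical parser; the real options are defined in the repository's code.
parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
parser.add_argument("--exp_name")
parser.add_argument("--results_path")


def parse_args_file(text: str) -> argparse.Namespace:
    """Write `text` to a temporary args file and parse it via the '@file' syntax."""
    with tempfile.NamedTemporaryFile("w", suffix=".args", delete=False) as f:
        f.write(text)
        path = f.name
    try:
        # argparse replaces '@path' with the lines of the file, one argument per line.
        return parser.parse_args(["@" + path])
    finally:
        os.remove(path)


run2 = parse_args_file("--exp_name=irgan_run2\n--results_path=./results/run2\n")
print(run2.exp_name, run2.results_path)  # irgan_run2 ./results/run2
```

Under this assumption, giving every run its own args file (or editing these two lines between runs) keeps a new run from reusing the previous run's name and output location.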

Training on i-CLEVR

Modify geneva/config.yml and args/irgan-iclevr.args if needed and run:

python train.py @args/irgan-iclevr.args

Evaluating a trained model on CoDraw test set

You will have to add the line --load_snapshot=</path/to/trained/model> to args/irgan-codraw.args to specify the checkpoint to load from and then run:

python test.py @args/irgan-codraw.args 

Evaluating a trained model on i-CLEVR test set

You will have to add the line --load_snapshot=</path/to/trained/model> to args/irgan-iclevr.args to specify the checkpoint to load from and then run:

python test.py @args/irgan-iclevr.args
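Instead of editing the training args file by hand, an evaluation copy with the `--load_snapshot` line can be generated. This is a minimal sketch: `write_eval_args` is a hypothetical helper, not part of this repository, and it assumes the args file holds one option per line.

```python
from pathlib import Path


def write_eval_args(src: str, dst: str, snapshot: str) -> None:
    """Copy an args file to `dst`, setting exactly one --load_snapshot line.

    Blank lines and any existing --load_snapshot line are dropped so the
    new checkpoint path is the only one present.
    """
    lines = [
        ln for ln in Path(src).read_text().splitlines()
        if ln.strip() and not ln.strip().startswith("--load_snapshot")
    ]
    lines.append(f"--load_snapshot={snapshot}")
    Path(dst).write_text("\n".join(lines) + "\n")
```

For example, `write_eval_args("args/irgan-codraw.args", "args/irgan-codraw-eval.args", "</path/to/trained/model>")` would produce a separate file (the `-eval` name is only a suggestion) that can be passed to `test.py` with the same `@` syntax, leaving the training args untouched.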

Reference

Zhenhuan Liu, Jincan Deng, Liang Li, Shaofei Cai, Qianqian Xu, Shuhui Wang, Qingming Huang. 2020. IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning. In Proceedings of the 28th ACM International Conference on Multimedia (MM ’20).

@InProceedings{Liu_2020_ACMMM,
    author    = {Zhenhuan Liu and Jincan Deng and Liang Li and Shaofei Cai and Qianqian Xu and Shuhui Wang and Qingming Huang},
    title     = {IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning},
    booktitle = {Proceedings of the 28th ACM International Conference on Multimedia (MM '20)},
    month     = {Oct},
    year      = {2020}
}

Acknowledgements

Our code is inspired by GeNeVA.