VLC-BERT

VLC-BERT is a vision-language-commonsense transformer model that incorporates contextualized commonsense knowledge for external-knowledge visual question answering tasks, OK-VQA and A-OKVQA.

Note: This repository contains the code for the VLC-BERT transformer model. For knowledge generation and selection (generating the final commonsense inferences that go into VLC-BERT), please refer to this project.

Citing VLC-BERT

@InProceedings{Ravi_2023_WACV,
    author    = {Ravi, Sahithya and Chinchure, Aditya and Sigal, Leonid and Liao, Renjie and Shwartz, Vered},
    title     = {VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2023},
    pages     = {1155-1165}
}

Setup

Please follow the instructions in the SETUP.md file, which also provides links to download pretrained models.

Train and Eval

Configuration files under the ./cfgs folder can be edited to suit your needs. They are currently set up for single-GPU training on an RTX 2080 Ti (11 GB memory).

To run OK-VQA training:

./scripts/dist_run_single.sh 1 okvqa/train_end2end.py cfgs/okvqa/semQO-5-weak-attn.yaml ./

To run A-OKVQA training:

./scripts/dist_run_single.sh 1 aokvqa/train_end2end.py cfgs/aokvqa/semQO-5-weak-attn.yaml ./
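
The two training commands differ only in the task name, so they can be generated from one template. The sketch below is illustrative and not part of the repository: `build_train_command` is a hypothetical helper, and the reading of the positional arguments (number of GPUs, training script, config file, output directory) is an assumption based on the launcher convention of VL-BERT, on which this code is built.

```python
from typing import List

# Tasks for which a training command is shown above.
TASKS = ("okvqa", "aokvqa")


def build_train_command(task: str, num_gpus: int = 1, output_dir: str = "./") -> List[str]:
    """Return the launcher argument list for one task (hypothetical helper).

    Assumption: the launcher takes <num_gpus> <train script> <config> <output dir>.
    """
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return [
        "./scripts/dist_run_single.sh",
        str(num_gpus),
        f"{task}/train_end2end.py",
        f"cfgs/{task}/semQO-5-weak-attn.yaml",
        output_dir,
    ]


for task in TASKS:
    # Print each command; run it from the repository root.
    print(" ".join(build_train_command(task)))
```

With the default arguments, the printed lines match the two commands given above.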

To run evaluation (example):

python aokvqa/test.py \
  --cfg cfgs/aokvqa/base/semQO-5-weak-attn.yaml \
  --ckpt output/vlc-bert/aokvqa/base/semQO-5-weak-attn/train2017_train/vlc-bert_base_aokvqa-latest.model \
  --split test2017 \
  --gpus 0
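
When evaluating several checkpoints, it can help to assemble the evaluation command programmatically. The following is a minimal sketch, not part of the repository: `build_eval_command` is a hypothetical helper, and it uses only the flags that appear in the example above (`--cfg`, `--ckpt`, `--split`, `--gpus`).

```python
import shlex
from typing import List


def build_eval_command(cfg: str, ckpt: str, split: str = "test2017", gpus: str = "0") -> List[str]:
    """Return the argument list for the A-OKVQA evaluation example (hypothetical helper)."""
    return [
        "python", "aokvqa/test.py",
        "--cfg", cfg,
        "--ckpt", ckpt,
        "--split", split,
        "--gpus", gpus,
    ]


cmd = build_eval_command(
    cfg="cfgs/aokvqa/base/semQO-5-weak-attn.yaml",
    ckpt="output/vlc-bert/aokvqa/base/semQO-5-weak-attn/train2017_train/vlc-bert_base_aokvqa-latest.model",
)

# Print a copy-pasteable shell command. To execute it instead, pass the list
# to subprocess.run(cmd, check=True) from the repository root.
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems if a checkpoint path contains spaces.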

Acknowledgement

We built VLC-BERT on top of VL-BERT: https://github.com/jackroos/VL-BERT

In addition, we would like to acknowledge that we use the following works extensively:
