We propose a Dual-Visual Graph Reasoning Unit (DualVGR) which reasons over videos in an end-to-end fashion. The first contribution of our DualVGR is the design of an explainable Query Punishment Module, which can filter out irrelevant visual features through multiple cycles of reasoning. The second contribution is the proposed Video-based Multi-view Graph Attention Network, which captures the relations between appearance and motion features.
Illustrations of the DualVGR unit and the whole framework of our DualVGR Network for VideoQA (panels: DualVGR Unit | DualVGR Architecture).
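The following is a minimal, illustrative PyTorch sketch of these two ideas: query-conditioned punishment (soft masking) of clip features, followed by attention over fully connected appearance and motion graphs. All class names, shapes, and the cross-view coupling here are hypothetical simplifications for illustration only; refer to the paper and the code in this repository for the actual DualVGR unit.

```python
import torch
import torch.nn as nn


class QueryPunishment(nn.Module):
    """Scores each clip against the question and softly suppresses irrelevant clips."""

    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, clips, query):
        # clips: (batch, num_clips, dim); query: (batch, dim)
        q = query.unsqueeze(1).expand_as(clips)
        gate = torch.sigmoid(self.score(torch.cat([clips, q], dim=-1)))
        return clips * gate, gate


class GraphAttention(nn.Module):
    """Single-head attention over a fully connected graph of clip nodes."""

    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, nodes):
        # nodes: (batch, num_clips, dim)
        scores = self.q(nodes) @ self.k(nodes).transpose(1, 2) / nodes.size(-1) ** 0.5
        return torch.softmax(scores, dim=-1) @ self.v(nodes)


class DualVGRUnitSketch(nn.Module):
    """One reasoning cycle: punish irrelevant clips, then pass messages in both views."""

    def __init__(self, dim):
        super().__init__()
        self.punish_app = QueryPunishment(dim)
        self.punish_mot = QueryPunishment(dim)
        self.gat_app = GraphAttention(dim)
        self.gat_mot = GraphAttention(dim)

    def forward(self, appearance, motion, query):
        app, _ = self.punish_app(appearance, query)
        mot, _ = self.punish_mot(motion, query)
        # Crude cross-view coupling: each view also sees the other view's punished
        # features (a simplification of the multi-view graph attention in the paper).
        return self.gat_app(app + mot), self.gat_mot(mot + app)


# Example usage with toy shapes (batch of 2 videos, 8 clips, 512-d features):
# unit = DualVGRUnitSketch(dim=512)
# app_out, mot_out = unit(torch.randn(2, 8, 512), torch.randn(2, 8, 512), torch.randn(2, 512))
```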
- Download the SVQA, MSRVTT-QA, and MSVD-QA datasets and edit the absolute paths in preprocess/preprocess_features.py and preprocess/preprocess_questions.py according to where your data is located. In addition, for the SVQA dataset, you have to split the dataset according to our official split.
Comparison with SoTA on MSVD-QA and MSRVTT-QA datasets
Comparison with SoTA on SVQA dataset
Be careful to set your feature file paths correctly! The following commands run experiments on the SVQA dataset; replace svqa with msvd-qa or msrvtt-qa to run on the other datasets.
- To extract appearance features:
python preprocess/preprocess_features.py --gpu_id 0 --dataset svqa --model resnet101 --num_clips {num_clips}
- To extract motion features:
- Download the ResNeXt-101 pretrained model (resnext-101-kinetics.pth) and place it in data/preprocess/pretrained/.
python preprocess/preprocess_features.py --dataset svqa --model resnext101 --image_height 112 --image_width 112 --num_clips {num_clips}
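Once feature extraction finishes, you can sanity-check the generated HDF5 files. The file path and dataset keys in the sketch below are assumptions for illustration (they depend on how preprocess/preprocess_features.py names its outputs), so adjust them to the paths the script actually reports:

```python
import h5py

# Assumed output path -- replace with the .h5 file produced by preprocess_features.py.
with h5py.File('data/svqa/svqa_appearance_feat.h5', 'r') as f:
    for key in f.keys():
        # Expect something like (num_videos, num_clips, ..., feat_dim) for the feature arrays.
        print(key, f[key].shape)
```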
- To extract textual features:
- Download the pretrained 300d GloVe word vectors to data/glove/ and process them into a pickle file:
python txt2pickle.py
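For reference, the sketch below shows one plausible way the GloVe text file gets converted into the pickle expected by the next command; the input file name and the token-to-vector dictionary layout are assumptions, so check txt2pickle.py for the authoritative version:

```python
import pickle

import numpy as np

# Build a {token: 300-d vector} dict from the raw GloVe text file (assumed file name).
glove = {}
with open('data/glove/glove.840B.300d.txt', 'r', encoding='utf-8') as f:
    for line in f:
        parts = line.rstrip().split(' ')
        glove[parts[0]] = np.asarray(parts[1:], dtype=np.float32)

# Output path matches the --glove_pt argument used in the commands below.
with open('data/glove/glove.840.300d.pkl', 'wb') as f:
    pickle.dump(glove, f)
```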
- Process questions:
python preprocess/preprocess_questions.py --dataset msrvtt-qa --glove_pt data/glove/glove.840.300d.pkl --mode train
python preprocess/preprocess_questions.py --dataset msrvtt-qa --mode val
python preprocess/preprocess_questions.py --dataset msrvtt-qa --mode test
python train.py --cfg configs/svqa_DualVGR_20.yml --alpha {alpha} --beta {beta} --unit_layers {unit_layers}
First, make sure the file paths are set correctly. Then, to evaluate the trained model, run the following:
python validate.py --cfg configs/svqa_DualVGR_20.yml --unit_layers {unit_layers}