An easy-to-use tool for real-time Visual Question Answering

yangdsh/VQA-BUTD-demo

Demo for Visual Question Answering with BUTD

This is a user-friendly demo for visual question answering. It is essentially a pipeline that combines an image feature extraction tool and a fast attention implementation. These two repos implement the BUTD system described in "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" (https://arxiv.org/abs/1707.07998) and "Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge" (https://arxiv.org/abs/1708.02711).

We modify the BUTD code from both of the above repos so that it can be applied to any new image found on the web. In addition, we redraw the image to visualize the attention over it. We further improve the above models in the following ways:

  • Make use of position information in the attention model
  • Add a layer to the attention model, which improves the accuracy
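
To illustrate the first point, here is a minimal sketch of how box position information could be fed into attention over image regions. It is not the code used in this repo: the function names, the five-value position encoding, and the plain dot-product score (standing in for the learned scoring network) are all assumptions made for illustration.

```python
import math


def box_position_features(box, image_w, image_h):
    """Normalize a pixel box (x1, y1, x2, y2) to [0, 1] and append its relative area.

    This encoding is an assumption, not necessarily the one used in the demo.
    """
    x1, y1, x2, y2 = box
    nx1, ny1 = x1 / image_w, y1 / image_h
    nx2, ny2 = x2 / image_w, y2 / image_h
    return [nx1, ny1, nx2, ny2, (nx2 - nx1) * (ny2 - ny1)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_feats, boxes, query, image_w, image_h):
    """Attend over regions whose features are extended with position features.

    region_feats: one appearance vector (list of floats) per region
    boxes:        one (x1, y1, x2, y2) pixel box per region
    query:        question vector with len(appearance vector) + 5 entries
    Returns (attention weights, attended feature vector).
    """
    # Concatenate appearance and position features for every region.
    augmented = [
        list(feat) + box_position_features(box, image_w, image_h)
        for feat, box in zip(region_feats, boxes)
    ]
    # Hypothetical scoring: a plain dot product with the query vector.
    scores = [sum(a * q for a, q in zip(vec, query)) for vec in augmented]
    weights = softmax(scores)
    # Weighted sum of the augmented region vectors.
    dim = len(augmented[0])
    attended = [
        sum(w * vec[d] for w, vec in zip(weights, augmented)) for d in range(dim)
    ]
    return weights, attended
```

With this setup, two regions that look identical can still receive different attention weights if the query puts weight on the position entries, which is the intuition behind using position in the attention model.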

We include the pre-trained attention model as a tar.gz archive in this repo; it is the last missing piece of the pre-trained models needed by those two repos. Users need to decompress it. Users also need to follow the installation instructions in the two sub-folders and download the pre-trained models and dictionaries.
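
As a convenience, the archive can also be decompressed from Python using the standard `tarfile` module. The sketch below is only one way to do it. The helper name `extract_model_archive` and the file name in the usage comment are hypothetical, so substitute the actual archive name found in this repo.

```python
import tarfile
from pathlib import Path


def extract_model_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    root = dest_dir.resolve()
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        # Refuse members that would be written outside dest_dir.
        for name in names:
            target = (dest_dir / name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {name}")
        archive.extractall(dest_dir)
    return names


# Usage (hypothetical file name, replace with the real one):
# extract_model_archive("attention_model.tar.gz", ".")
```

The pre-trained models and dictionaries required by the two sub-folders still have to be downloaded separately, as described in their own installation instructions.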
