Image captioning with a CNN encoder (ResNet-101) and an LSTM decoder with attention and beam search
The notebook re-uses a pre-trained model and part of the code developed by Sagar Vinodababu in his tutorial available on GitHub: https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning
The principles are described in the paper "Show, Attend and Tell" (https://arxiv.org/abs/1502.03044).
The model takes an image as input, encodes its key features with an encoder built from the convolutional blocks of ResNet-101, and uses an LSTM decoder with attention to generate a caption for the image word by word, with beam search used to find the most likely word sequence.
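Below is a minimal sketch of the two building blocks named above, assuming torchvision's pre-trained ResNet-101. The class names, dimensions, and the soft-attention formulation are illustrative and not necessarily identical to the notebook's implementation.

```python
import torch
import torch.nn as nn
import torchvision

class Encoder(nn.Module):
    """Encode an image into a grid of feature vectors using ResNet-101."""
    def __init__(self, encoded_size=14):
        super().__init__()
        resnet = torchvision.models.resnet101(pretrained=True)  # weights= in newer torchvision
        # Drop the average-pooling and fully-connected classification layers;
        # keep only the convolutional blocks that produce spatial features.
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        # Resize the feature map to a fixed spatial size for the decoder.
        self.pool = nn.AdaptiveAvgPool2d((encoded_size, encoded_size))

    def forward(self, images):                       # images: (batch, 3, H, W)
        features = self.backbone(images)             # (batch, 2048, H/32, W/32)
        features = self.pool(features)               # (batch, 2048, 14, 14)
        # Flatten the spatial grid so the decoder can attend over its positions.
        return features.permute(0, 2, 3, 1)          # (batch, 14, 14, 2048)

class Attention(nn.Module):
    """Soft attention over the encoder's spatial feature grid."""
    def __init__(self, feature_dim=2048, hidden_dim=512, attn_dim=512):
        super().__init__()
        self.feat_proj = nn.Linear(feature_dim, attn_dim)
        self.hidden_proj = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, features, hidden):
        # features: (batch, num_pixels, feature_dim), hidden: (batch, hidden_dim)
        scores = self.score(torch.tanh(self.feat_proj(features) +
                                       self.hidden_proj(hidden).unsqueeze(1)))
        alpha = torch.softmax(scores, dim=1)          # attention weight per pixel
        context = (features * alpha).sum(dim=1)       # weighted feature vector
        return context, alpha.squeeze(-1)
```

At each decoding step, the LSTM's hidden state and the attended context vector are combined to predict the next word, which is why the attention weights can later be visualised per word.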
Additionally, the areas of the picture most relevant to the prediction of each word are highlighted and displayed as part of the result.
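A minimal sketch of how the attention weights for one predicted word can be overlaid on the input image is shown below; the function name, the upsampling via scikit-image, and the plotting choices are assumptions, not the notebook's exact code.

```python
import matplotlib.pyplot as plt
import skimage.transform

def show_attention(image, word, alpha):
    """image: H x W x 3 array, alpha: attention weights over the 14 x 14 feature grid."""
    # Upsample the coarse attention map to the image resolution.
    alpha_big = skimage.transform.resize(alpha, (image.shape[0], image.shape[1]))
    plt.imshow(image)
    plt.imshow(alpha_big, alpha=0.6, cmap='Greys_r')  # semi-transparent overlay
    plt.title(word)
    plt.axis('off')
    plt.show()
```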
You can visit Sagar Vinodababu's tutorial for a detailed walkthrough.
The model uses pre-trained weights and a word map (dictionary) made available by the author here: https://drive.google.com/open?id=189VY65I_n4RTpQnmLGj7IzVnOF6dmePC
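A minimal sketch of loading the downloaded checkpoint and word map is shown below; the file names and checkpoint keys are assumptions based on the author's tutorial and may need adjusting to match the files you actually download.

```python
import json
import torch

# File names are assumptions; replace them with the names of the files you downloaded.
checkpoint = torch.load('BEST_checkpoint_coco_5_cap_per_img_5_min_word_freq.pth.tar',
                        map_location='cpu')
encoder = checkpoint['encoder'].eval()   # assumes the checkpoint stores the full modules
decoder = checkpoint['decoder'].eval()

with open('WORDMAP_coco_5_cap_per_img_5_min_word_freq.json') as f:
    word_map = json.load(f)                                         # token -> index
rev_word_map = {index: token for token, index in word_map.items()}  # index -> token
```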
To run the notebook on your own images:
- collect the pre-trained weights and word map
- download the notebook
- recreate the folder structure expected by the notebook
- insert your own images in an "images" folder (a preprocessing sketch is shown below)
- run the notebook (PyTorch required).
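A minimal sketch of the preprocessing typically applied to one of your own images before it is passed to the encoder; the file name is a placeholder, and the normalisation statistics are the usual ImageNet values for a pre-trained ResNet.

```python
from PIL import Image
import torchvision.transforms as transforms

preprocess = transforms.Compose([
    transforms.Resize((256, 256)),                      # size expected by the encoder
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],    # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open('images/my_photo.jpg').convert('RGB')   # placeholder file name
image_tensor = preprocess(image).unsqueeze(0)               # add a batch dimension
# image_tensor can now be fed to the encoder, and the decoder run with beam search.
```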