It takes in a picture and describes what is going on in the picture in plain English.

Faizan-E-Mustafa/Image-Captioning

  • Most of the code was taken from the original repo, but the model creation is different, so the results differ.

Keras implementation of Image-Captioning

Image captioning is a task that involves both computer vision and natural language processing. The model takes an image and describes what is going on in it in plain English.

  • Keras with TensorFlow back-end
  • InceptionV3 for encoding
  • LSTM for decoding
  • Greedy search as well as beam search were used for decoding.
  • Hyperparameters used:

| Hyperparameter  | Value |
|-----------------|-------|
| Embedding size  | 300   |
| Vocabulary size | 8256  |
| Dropout         | 0.5   |
| Batch size      | 128   |
| LSTM 1 Output   | 256   |
| LSTM 1 Output   | 1000  |
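
To illustrate how the two decoding strategies mentioned above differ, here is a minimal sketch in plain Python. It is not the code from this repo: `TOY_MODEL`, `next_token_probs`, `greedy_search` and `beam_search` are hypothetical names, and the hard-coded probability table only stands in for the softmax output that the LSTM decoder would produce for a partial caption.

```python
import math

START, END = "<start>", "<end>"

# Hypothetical toy "decoder": maps a partial caption to next-token
# probabilities. In the real project this would be the LSTM's softmax output.
TOY_MODEL = {
    (START,): {"a": 0.6, "the": 0.4},
    (START, "a"): {"dog": 0.55, "cat": 0.45},
    (START, "the"): {"dog": 0.9, "cat": 0.1},
}


def next_token_probs(prefix):
    """Return {token: probability}; unknown prefixes simply end the caption."""
    return TOY_MODEL.get(tuple(prefix), {END: 1.0})


def greedy_search(max_len=10):
    """Pick the single most probable token at every step."""
    caption = [START]
    for _ in range(max_len):
        probs = next_token_probs(caption)
        token = max(probs, key=probs.get)
        caption.append(token)
        if token == END:
            break
    return caption


def beam_search(beam_width=3, max_len=10):
    """Keep the beam_width best partial captions, ranked by log-probability."""
    beams = [(0.0, [START])]  # (log-probability, tokens)
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == END:
                # Finished captions are carried over unchanged.
                candidates.append((score, tokens))
                continue
            for token, p in next_token_probs(tokens).items():
                candidates.append((score + math.log(p), tokens + [token]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(tokens[-1] == END for _, tokens in beams):
            break
    return beams[0][1]


if __name__ == "__main__":
    print("greedy:", " ".join(greedy_search()))
    print("beam  :", " ".join(beam_search(beam_width=2)))
```

In this toy table, greedy search commits to "a" because it is the most probable first word, while beam search with a width of 2 also keeps "the" alive and ends up with a caption whose overall probability is higher. With a beam width of 1 the two strategies coincide.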

I have also written a blog post describing my experience of implementing the project. You can find it here.

If you want to use pretrained weights for the LSTM model, you can download them here.

The Flickr8k dataset can be downloaded here.

Results:

[Images: upload_2, upload_1]

Sometimes beam search does a great job.

[Image: upload_3]

Dependencies:

  • Keras 2.1.6
  • Tensorflow 1.7.0
  • Numpy
  • Pandas
  • Pickle
  • PIL
  • Tqdm

Note: It is recommended to use the versions of Keras and Tensorflow listed above.

References

1) CS231n Winter 2016 Lesson 10 Recurrent Neural Networks, Image Captioning and LSTM

2) Another implementation of image captioning model.
