This project is a part of my individual projects.
The purpose of the project is to generate captions for the given image.
- Deep Learning
- CNN + RNN
- Python
- Pytorch
- Numpy, Pandas
- Jupyter
- Text Generation
The project aims to generate text on processing the given image. Project is buit using state-of-the-art CNN and RNN deep learning models.
The dataset was acquired frorm kaggle. Dataset contains
- Images Directory containing images
- Captions.txt file containing image_id and corresponding caption.
- Downloading dataset: Link to download dataset
- Dataset visualization: Visualizing an image with its caption from the dataset.
- Preparing custom dataset ready to feed into our CNN architecture.
- Creating CNN + RNN architectures. Giving the output of CNN to RNN, to generate captions.
- Training and testing our model.
- SOS and EOS are the start of string and end of string labels.
- The model can be trained further to get better captions.