A web app that generates captions for images using a CNN-RNN model.
Application Link: https://dev228-afk-image-caption-generator-app-c6ckdt.streamlitapp.com/
The model is a CNN-RNN network built with the Keras Sequential API. It consists of the following components:
- CNN encoder: a pretrained CNN that extracts feature vectors from the input and training images. The encoder is an Xception model, used via transfer learning with its pretrained weights.
- Word embedding layer: converts caption tokens into word embedding vectors. Its input/output dimensions are (32, 256).
- LSTM decoder: an LSTM handles text sequence processing in the encoder-decoder architecture. It takes an input pair, the image feature vector and a partial caption, and returns the predicted caption for the input image.
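As a rough illustration of the input pair described above, the sketch below expands one captioned image into (feature, partial caption, next word) training examples. It is not taken from the notebooks: the helper `make_training_pairs`, the `startseq`/`endseq` markers, the toy vocabulary, the left-padding with id 0, and the 3-element feature vector are all assumptions made for the example.

```python
def make_training_pairs(feature, caption_ids, max_len):
    """Expand one (image, caption) example into
    (feature, padded partial caption, next word id) triples."""
    pairs = []
    for i in range(1, len(caption_ids)):
        # Everything before position i is the partial caption (truncated to max_len).
        partial = caption_ids[:i][-max_len:]
        # Left-pad with 0 so every partial caption has the same length (assumed convention).
        padded = [0] * (max_len - len(partial)) + partial
        # The word at position i is what the decoder should predict.
        pairs.append((feature, padded, caption_ids[i]))
    return pairs


# Hypothetical toy vocabulary; id 0 is reserved for padding.
vocab = {"startseq": 1, "a": 2, "dog": 3, "runs": 4, "endseq": 5}
caption = [vocab[w] for w in "startseq a dog runs endseq".split()]
feature = [0.1, 0.7, 0.2]  # stand-in for the CNN encoder's feature vector

pairs = make_training_pairs(feature, caption, max_len=6)
for _, partial, target in pairs:
    print(partial, "->", target)
```

A caption of five tokens yields four examples, one per word to be predicted, each sharing the same image feature vector.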
- Flickr8k (images and their text descriptions)
- Dataset link: https://academictorrents.com/details/9dea07ba660a722ae1008c4c8afdd303b6f6e53b
- Some of the Captions Generated by this model are as follows:
- TensorFlow
- Pandas
- NumPy
- Pillow
- Keras
- h5py
- Use the Training_model.ipynb notebook to train the model.
- Use the Model_Testing.ipynb notebook to test the model.
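Testing a captioning model on a new image usually means asking the decoder for one word at a time and feeding each prediction back in. The sketch below shows that greedy loop in plain Python. It is an assumption about the general approach, not the code in the notebooks: `generate_caption`, `toy_predict`, the `startseq`/`endseq` markers, and the toy vocabulary are hypothetical, and `toy_predict` merely stands in for the trained decoder.

```python
def generate_caption(predict_next, feature, word_to_id, max_len):
    """Greedy decoding: repeatedly pick the highest-scoring next word."""
    id_to_word = {i: w for w, i in word_to_id.items()}
    words = ["startseq"]
    for _ in range(max_len):
        ids = [word_to_id[w] for w in words]
        scores = predict_next(feature, ids)  # one score per word id
        next_id = max(range(len(scores)), key=scores.__getitem__)
        word = id_to_word.get(next_id)
        # Stop on the end marker or on an id with no word (e.g. padding).
        if word is None or word == "endseq":
            break
        words.append(word)
    return " ".join(words[1:])  # drop the start marker


def toy_predict(feature, ids):
    # Stand-in for the trained decoder: always scores the id after the last one highest.
    scores = [0.0] * 6
    scores[min(ids[-1] + 1, 5)] = 1.0
    return scores


vocab = {"startseq": 1, "a": 2, "dog": 3, "runs": 4, "endseq": 5}
feature = [0.1, 0.7, 0.2]  # stand-in for the CNN encoder's feature vector

caption_text = generate_caption(toy_predict, feature, vocab, max_len=10)
print(caption_text)
```

With a real model, `predict_next` would wrap the decoder's prediction call on the image features and the padded partial caption. `max_len` caps the caption length in case the end marker is never predicted.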
If this repository helped you, please star the repo.