This research project trains an expandable image classification system for place categorization. The system, proposed in [1], can recognize new classes and is used to overcome the closed-set limitation of Convolutional Neural Networks (convnets). The state-of-the-art Places365 convnet is trained on the Places365 dataset and combined with a one-vs-all random forest classifier that outputs place labels.
Convnets are a popular choice of image classification model because they generalize well and do not require re-training for a given environment. However, they can only recognize the classes on which they have been trained. This major limitation is called the closed-set limitation, and it makes convnets unsuitable for dynamic environments. To overcome it, the expandable image classification system is designed to recognize new labels on which it has not been trained.
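The closed-set limitation can be seen in a small sketch. The label names and raw scores below are hypothetical and are not taken from this repository. The point is only that a softmax output always spreads its probability over the trained labels, so an image of an unknown place is still assigned one of them.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical label set the network was trained on.
known_labels = ["kitchen", "office", "corridor"]

# Hypothetical raw scores for an image of a place outside that label set.
logits_unseen = [0.2, 0.1, 0.3]

probs = softmax(logits_unseen)
predicted = known_labels[probs.index(max(probs))]

# The prediction is always one of the known labels, even for an unknown place.
print(predicted, [round(p, 3) for p in probs])
```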
pip install -r requirements.txt
The Places365 dataset is used for training and testing of the place classification model. It consists of RGB images belonging to
The preprocessing of the Places365 dataset is defined in preprocess_data.py. It is done in two ways:
- Standard data shuffling, cleaning and transformation
- Data augmentation using ImageDataGenerator
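The two preprocessing modes can be sketched roughly as follows. This is a NumPy stand-in and not the code in preprocess_data.py: the array sizes, function names, and the choice of a horizontal flip are assumptions made for illustration, and the flip only mimics one of the transformations that ImageDataGenerator can apply.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical stand-in data: 8 RGB images of 32x32 pixels with integer labels.
images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
labels = np.arange(8)


def shuffle_and_scale(x, y, rng):
    """Shuffle images and labels with one permutation, then scale pixels to [0, 1]."""
    order = rng.permutation(len(x))
    return x[order].astype(np.float32) / 255.0, y[order]


def augment_with_flips(x, y):
    """Append a horizontally flipped copy of every image; labels are repeated."""
    flipped = x[:, :, ::-1, :]  # reverse the width axis
    return np.concatenate([x, flipped]), np.concatenate([y, y])


x_train, y_train = shuffle_and_scale(images, labels, rng)
x_aug, y_aug = augment_with_flips(x_train, y_train)
print(x_train.shape, x_aug.shape)
```

Using a single permutation for both arrays keeps every image paired with its label after shuffling.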
The Places365 convnet uses the AlexNet architecture and is trained on the Places365 dataset. The implementation in placescnn.py uses the standard data preprocessing procedure with no augmentation, while placescnn_augment.py uses data augmentation as part of preprocessing.
For the expandable image classification system, the conventional model is modified by removing the final softmax layer that outputs class probabilities. Instead, the output of fully connected layer 2 gives a feature vector representation of size 4096. The model is trained to learn those representations. In effect, the model acts as a feature extractor that converts images to feature vectors.
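The idea of dropping the softmax layer can be sketched with a toy network. This is a NumPy illustration under stated assumptions, not the model in placescnn.py: the layer sizes are shrunk (16 features instead of 4096), the weights are random rather than trained, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy sizes (assumptions); the model described above uses a 4096-d feature layer.
INPUT_DIM, FEATURE_DIM, NUM_CLASSES = 64, 16, 5

# Randomly initialised weights stand in for trained parameters.
w_fc = rng.normal(size=(INPUT_DIM, FEATURE_DIM))
w_out = rng.normal(size=(FEATURE_DIM, NUM_CLASSES))


def extract_features(x):
    """Forward pass up to the last hidden layer (ReLU activations)."""
    return np.maximum(x @ w_fc, 0.0)


def classify(x):
    """Closed-set model: the same features followed by a softmax output layer."""
    logits = extract_features(x) @ w_out
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


batch = rng.normal(size=(3, INPUT_DIM))
features = extract_features(batch)  # shape (3, FEATURE_DIM)
probs = classify(batch)             # shape (3, NUM_CLASSES)
print(features.shape, probs.shape)
```

The feature output has no dependence on the number of classes, which is what lets a separate classifier be trained on top of it.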
The feature vectors are then fed as input to a one-vs-all random forest classifier, which learns to predict place labels from them.
In this way, the system can make predictions on new data that it has not seen during training.
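A minimal sketch of how a one-vs-all setup can be extended with a new label is shown below, using scikit-learn. It is an illustration under assumptions, not this repository's code: the features are synthetic 32-d clusters instead of 4096-d convnet outputs, and the label names and the helpers make_features, train_one_vs_all, predict, and add_class are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(seed=0)
FEATURE_DIM = 32  # toy size; the feature vectors described above have 4096 entries


def make_features(center, n=40):
    """Hypothetical stand-in for convnet features of one place category."""
    return center + 0.1 * rng.normal(size=(n, FEATURE_DIM))


def train_one_vs_all(label, features):
    """Fit a binary forest: samples of `label` are positive, all others negative."""
    x = np.concatenate(list(features.values()))
    y = np.concatenate([np.full(len(v), int(k == label)) for k, v in features.items()])
    return RandomForestClassifier(n_estimators=25, random_state=0).fit(x, y)


def predict(classifiers, x):
    """Pick the label whose binary forest gives the highest positive probability."""
    names = list(classifiers)
    scores = np.column_stack([classifiers[n].predict_proba(x)[:, 1] for n in names])
    return [names[i] for i in scores.argmax(axis=1)]


def add_class(classifiers, features, label, new_features):
    """Register a new category by training only its own binary forest."""
    features[label] = new_features
    classifiers[label] = train_one_vs_all(label, features)


centers = {name: rng.normal(size=FEATURE_DIM) for name in ["kitchen", "office", "corridor"]}

# Initial system: one forest per known category.
features = {name: make_features(centers[name]) for name in ["kitchen", "office"]}
classifiers = {name: train_one_vs_all(name, features) for name in features}
kitchen_forest = classifiers["kitchen"]

# Later: a new category appears; the existing forests are left untouched.
add_class(classifiers, features, "corridor", make_features(centers["corridor"]))
print(sorted(classifiers))
```

Because each label has its own binary classifier, adding a label trains one new forest and leaves the feature extractor and the existing forests as they were.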
Note: The code is not well structured!!!
[1] Niko Suenderhauf, Feras Dayoub, Sean McMahon, Ben Talbot, Ruth Schulz, Peter Corke, Gordon Wyeth, Ben Upcroft, and Michael Milford. Place categorization and semantic mapping on a mobile robot. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 5729–5736. IEEE, 2016.