Although the model has been optimized for a specific image classification task, this code can be used as a generic image classifier for other problems.
This project was built using Ubuntu 16.10, Anaconda, Keras, and TensorFlow. The code has been tested on both CPU (a machine with 64 GB of RAM) and GPU (2x Nvidia GeForce GTX 1080). To clone one of the two environments provided, follow the instructions below.
- Download Anaconda
- Download Dataset AND Models: only the dogs vs. cats data is provided, since the text_images data is not public.
- Clone Environment (CPU):
  conda env create -f cpu-environment.yml
- Clone Environment (GPU):
  conda env create -f gpu-environment.yml
The code contains two training scripts and one classification script.
- fine_tune.py: trains a CNN to classify dogs vs. cats.
- fine_tune_text_images: trains a CNN to classify handwritten vs. typed text images.
- classify.py: classifies new samples using either trained model.
Both the dogs_cats and text_images classifiers have already been trained, and their best models will be in the model directory once you have downloaded the models. If you do not wish to re-train the models, feel free to skip this step and go straight to classifying new samples.
image-classifier$ source activate image-classifier-cpu
image-classifier$ python code/fine_tune.py <data_dir/> <model_dir/>
Example: python code/fine_tune.py data/dogs_cats/ model/dogs_cats/
- Make sure to include the / at the end of every directory for the example to work.
- Replace image-classifier-cpu with image-classifier-gpu if you're using a GPU.
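If the scripts are driven from other Python code, the trailing-slash requirement can be met programmatically. The sketch below is only an illustration and not part of the project; the helper name with_trailing_slash is hypothetical. It relies on the standard behaviour of os.path.join, which appends a separator when the final component is empty.

```python
import os


def with_trailing_slash(path):
    """Return path with a trailing separator appended if it lacks one.

    Hypothetical helper, not part of this repository.
    """
    # Joining with an empty last component adds the separator only when needed.
    return os.path.join(path, "")


print(with_trailing_slash("data/dogs_cats"))
print(with_trailing_slash("model/dogs_cats/"))
```

The result can then be passed as <data_dir/> or <model_dir/> when building the command line.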
The training script will save the JSON model file model.json and the weights file model_weights.h5 in the specified <model_dir/>.
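To reuse the saved files outside classify.py, the architecture and weights can be loaded back with Keras. This is a minimal sketch, assuming the two file names above and a working Keras installation; model_paths and load_trained_model are hypothetical helpers, and this is not necessarily how classify.py loads the model.

```python
import os


def model_paths(model_dir):
    """Build the paths of the two files the training script is described as saving."""
    json_path = os.path.join(model_dir, "model.json")
    weights_path = os.path.join(model_dir, "model_weights.h5")
    return json_path, weights_path


def load_trained_model(model_dir):
    """Rebuild the network from model.json, then load model_weights.h5 into it.

    Hypothetical helper; Keras is imported lazily so the path logic above
    can be used without Keras installed.
    """
    from keras.models import model_from_json

    json_path, weights_path = model_paths(model_dir)
    with open(json_path) as f:
        model = model_from_json(f.read())  # architecture only
    model.load_weights(weights_path)       # learned parameters
    return model
```

For example, load_trained_model("model/dogs_cats/") would return a Keras model that is ready for predict calls, provided both files exist in that directory.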
For the text_images dataset, the parameters nb_test_samples=10, img_width=600, and img_height=150 must be changed according to your model and needs. The values provided are just defaults.
For the dogs vs. cats dataset, just leave the default values.
image-classifier$ source activate image-classifier-cpu
image-classifier$ python code/classify.py <model_dir/> <test_dir/> <results_dir/>
Example: python code/classify.py model/dogs_cats/ data/dogs_cats/test/ results/dogs_cats/
- Make sure to include the / at the end of every directory for the example to work.
- Replace image-classifier-cpu with image-classifier-gpu if you're using a GPU.
<test_dir/> should contain a subfolder that holds the data, and NOT the data directly. Example: test_dir/test
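The layout requirement can be checked before running the classifier. The following sketch is an assumption-laden illustration, not project code: check_test_dir is a hypothetical helper that accepts a directory only if it contains at least one subfolder and no files directly inside it (hidden files are ignored).

```python
import os
import tempfile


def check_test_dir(test_dir):
    """Return the subfolders of test_dir, or raise ValueError on a bad layout.

    Hypothetical helper, not part of this repository.
    """
    names = sorted(n for n in os.listdir(test_dir) if not n.startswith("."))
    entries = [os.path.join(test_dir, n) for n in names]
    subfolders = [e for e in entries if os.path.isdir(e)]
    loose_files = [e for e in entries if os.path.isfile(e)]
    if loose_files:
        raise ValueError("files found directly in test_dir; move them into a subfolder")
    if not subfolders:
        raise ValueError("test_dir has no subfolder such as test_dir/test")
    return subfolders


# Demo on a throwaway directory that mimics the test_dir/test layout.
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, "test"))
    found = [os.path.basename(p) for p in check_test_dir(demo_dir)]
    print(found)
```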
The classify script will save a predictions.csv file in the specified <results_dir/>.
The model directory contains the best models already trained for both classification tasks.
The results directory contains the test results for both classification tasks.
Best values:
- Dogs vs. Cats: 99.38% validation accuracy, 0.02 validation log loss
- Handwritten vs. Typed: 100% validation accuracy
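For readers unfamiliar with the metric, validation log loss here presumably means binary cross-entropy averaged over the validation samples. The sketch below shows that formula in plain Python; the labels and probabilities are made up for illustration and are not the project's data.

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy for labels in {0, 1} and predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities so log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Made-up example: four confident, correct predictions give a small loss.
labels = [1, 0, 1, 0]
probs = [0.98, 0.03, 0.99, 0.01]
print(round(binary_log_loss(labels, probs), 4))
```

Lower is better: a model that is both accurate and confident in its correct predictions drives the value toward zero.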
The data directory is empty and is provided as a place to store the dataset, if desired. Download the dataset listed in the requirements section and place it inside this folder.
Example: data/dogs_cats/..
A significant portion of this script came from the Keras blog example "Building powerful image classification models using very little data". My main contribution is making it work with all Keras pre-trained application models and adding a higher level of abstraction.