dpkpathak/deepDocClassifier

DeepDocClassifier - Paper Implementation

This repository implements a document classifier based on the paper "Deepdocclassifier: Document classification with deep Convolutional Neural Network" (DCNN), trained and evaluated on the Tobacco-3482 dataset. To assess the network, I have split the dataset into five partitions that differ in the number of samples used for training and validation: 20, 40, 60, 80, and 100 images per class. In each partition, the remaining images are used for testing. This evaluation protocol follows the original paper so that results can be compared. As in the original paper, the DCNN results are also compared against the approach from "Convolutional Neural Networks for Document Image Classification" (CNN).
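The partitioning scheme can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used in this repository: the function name `make_partition`, the fixed seed, and the toy label list are assumptions made for the example, and the further split of the pool into training and validation subsets is left out.

```python
import random
from collections import defaultdict


def make_partition(labels, n_per_class, seed=0):
    """Split sample indices into a train/validation pool and a test set.

    labels[i] is the class of sample i; n_per_class samples of every class
    go to the train/validation pool and all remaining samples go to test.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_val, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        train_val.extend(indices[:n_per_class])
        test.extend(indices[n_per_class:])
    return sorted(train_val), sorted(test)


# Toy stand-in data: 3 classes with 150 samples each (not Tobacco-3482 itself).
labels = [c for c in ("Email", "Letter", "Memo") for _ in range(150)]
for n in (20, 40, 60, 80, 100):
    train_val, test = make_partition(labels, n)
    print(n, len(train_val), len(test))
```

Larger partitions leave fewer images for testing, so the test sets of the five partitions are not the same size.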

Note: For the DCNN, I have not used the original AlexNet architecture given in the paper, because I couldn't find a pre-trained version of it in PyTorch. Instead, I have used the updated pre-trained AlexNet provided by TorchVision, which is based on the paper "One weird trick for parallelizing convolutional neural networks". For completeness, the original architecture is also provided as a Python module adapted from the Image-Classification-PyTorch repo. To train with it, change the config parameter model_param.version from torchvision to original.
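As a rough illustration of that switch, the snippet below updates a nested key addressed by a dotted name in an in-memory config. It assumes that `version` is nested under `model_param` in the YAML file, which is how the dotted name reads; the helper `set_dotted` and the dictionary are hypothetical and only stand in for the real config file.

```python
from copy import deepcopy

# The two values named in the note above.
ALLOWED_VERSIONS = {"torchvision", "original"}


def set_dotted(config, dotted_key, value):
    """Return a copy of config with the nested key 'a.b.c' set to value."""
    updated = deepcopy(config)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Create intermediate dictionaries if they are missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated


# Hypothetical in-memory view of the YAML config (other keys omitted).
config = {"model_param": {"version": "torchvision"}}
new_config = set_dotted(config, "model_param.version", "original")

assert new_config["model_param"]["version"] in ALLOWED_VERSIONS
print(new_config)
```

In practice the same change can be made by editing the value directly in the YAML file.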

In this repository, DCNN and CNN are implemented using PyTorch and PyTorch Lightning. The data partitions with different training and validation sizes are created with a LightningDataModule, which is consumed by the PyTorch Lightning Trainer class.

Both DCNN and CNN are trained locally using:

  • Processor: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz 2.30 GHz
  • GPU: NVIDIA GeForce MX230

Before training the neural network, install the package from this repository using its setup file. First clone the repository and install the requirements from the requirement.txt file, then run the command below from the root of the repository:

pip install .

Once the package is installed locally, the DCNN is trained using the configuration in the YAML file config/deepDocConfig.yaml. To train the network, run the Python module from the command line:

python -m deepDocClassifier.main -c <path of config file>

After a model is trained on a given partition of training and validation data, the remaining samples of that partition are used as test data. Every model is tested this way, and a CSV file is generated for further evaluation. The training process is logged with TensorBoard and pushed to TensorBoard.dev for reference.
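To show how a test accuracy can be derived from such a CSV file, here is a minimal sketch using only the standard library. The column names `image`, `true_label`, and `predicted_label` and the sample rows are assumptions for illustration; the files produced by this repository may use a different layout.

```python
import csv
import io


def accuracy_from_csv(csv_text):
    """Return the percentage of rows whose predicted label matches the true label."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    correct = sum(1 for row in rows if row["true_label"] == row["predicted_label"])
    return 100.0 * correct / len(rows)


# Hypothetical test results: 3 of the 4 predictions are correct.
sample = """image,true_label,predicted_label
a.jpg,Email,Email
b.jpg,Letter,Memo
c.jpg,Memo,Memo
d.jpg,Email,Email
"""

print(accuracy_from_csv(sample))  # prints 75.0
```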

Using the TensorBoard logs and test results, I have generated the learning curves and compared the results of DCNN and CNN on the test data. Finally, a confusion matrix is plotted for the DCNN on the partition that uses 100 samples.

The plots below show the learning curves for the five partitions, with 20, 40, 60, 80, and 100 samples used for training and validation.

  • DCNN: [plot: learning curves]

  • CNN: [plot: learning curves]

Test Accuracy

Partition size   DCNN (%)    CNN (%)
20               47.288239   45.917124
40               61.940299   49.545750
60               62.768910   58.362248
80               61.036540   60.402685
100              66.800967   59.790492
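The gap between the two methods is easier to read as a difference per partition. The short script below only re-uses the numbers from the table above; the variable names are chosen for this example.

```python
# Test accuracy (%) per partition size, copied from the table: (DCNN, CNN).
results = {
    20: (47.288239, 45.917124),
    40: (61.940299, 49.545750),
    60: (62.768910, 58.362248),
    80: (61.036540, 60.402685),
    100: (66.800967, 59.790492),
}

# Difference in percentage points for each partition size.
gaps = {n: dcnn - cnn for n, (dcnn, cnn) in results.items()}
for n, gap in gaps.items():
    print(f"{n:>3}: DCNN - CNN = {gap:+.2f} points")

# Partition size where the DCNN leads by the widest margin.
best = max(gaps, key=gaps.get)
print("largest gap at partition size", best)
```

DCNN scores higher than CNN in all five partitions. The margin is widest at 40 samples (about 12.4 points) and narrowest at 80 samples (about 0.6 points).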

Comparison of the classification results on Tobacco-3482:

[plot: DCNN vs. CNN test results]

The class confusion matrix for the partition that uses 100 images from each class for training and validation, with the rest of the images used for testing:

[plot: class confusion matrix]
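For readers unfamiliar with the plot, a confusion matrix can be computed from true and predicted labels as sketched below. The orientation (rows are true classes, columns are predicted classes) is a convention chosen for this example and may differ from the plot; the labels are toy data, not results from this repository.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions: matrix[i][j] = samples of class i predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


# Toy example with three class names and six samples.
classes = ["Email", "Letter", "Memo"]
y_true = ["Email", "Email", "Letter", "Memo", "Memo", "Letter"]
y_pred = ["Email", "Letter", "Letter", "Memo", "Email", "Letter"]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(f"{name:<7}", row)
```

The diagonal holds the correctly classified samples, so dividing its sum by the total number of samples gives the overall accuracy.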

A notebook that reads these TensorBoard event files and CSV files to generate the plots above is also provided.
