DeepDocClassifier - Paper Implementation

In this repository, I implement a document classifier based on the paper "Deepdocclassifier: Document classification with deep Convolutional Neural Network" (DCNN), trained and evaluated on the Tobacco-3482 dataset. To assess the network, I split the dataset into five partitions that differ in how many samples are used for training and validation: 20, 40, 60, 80, and 100. In each partition, the remaining images are used for testing. This evaluation protocol follows the original paper so that results can be compared. As in the original paper, the results are also compared with "Convolutional Neural Networks for Document Image Classification" (CNN).
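
The partitioning scheme can be sketched in plain Python as follows. This is an illustration, not the repository's code: the function name `make_partition`, the per-class sampling for every partition size, the 80/20 split between training and validation, and the fixed seed are all assumptions.

```python
import random
from collections import defaultdict


def make_partition(labels, n_per_class, val_fraction=0.2, seed=0):
    """Split sample indices into train/val/test lists.

    Assumption: `n_per_class` samples are drawn from every class for
    training plus validation, and everything else becomes test data.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        picked, rest = indices[:n_per_class], indices[n_per_class:]
        n_val = int(len(picked) * val_fraction)  # hypothetical 80/20 split
        val.extend(picked[:n_val])
        train.extend(picked[n_val:])
        test.extend(rest)
    return train, val, test


# Toy labels standing in for the dataset: 3 classes, 150 images each.
labels = [c for c in range(3) for _ in range(150)]
partitions = {n: make_partition(labels, n) for n in (20, 40, 60, 80, 100)}
```

Each entry of `partitions` holds three disjoint index lists, so the test set shrinks as the training and validation budget grows.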

Note: For the DCNN, I have not used the original AlexNet architecture given in the paper because I could not find a pre-trained version of it in PyTorch. Instead, I use the updated pre-trained AlexNet provided by TorchVision, which is based on the paper "One weird trick for parallelizing convolutional neural networks". For completeness, the original architecture is also provided as a Python module adapted from the Image-Classification-PyTorch repo. It can be used for training by changing the parameter model_param.version in the config file from torchvision to original.
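
As a rough sketch of how such a switch might look, the snippet below selects the AlexNet variant from a version string. The function name `build_alexnet` and its arguments are hypothetical and not taken from the repository. The TorchVision branch uses the `weights=` API, which assumes torchvision 0.13 or newer, and the default of 10 output classes matches the number of classes in Tobacco-3482.

```python
def build_alexnet(version="torchvision", num_classes=10):
    """Return an AlexNet whose last layer predicts `num_classes` classes."""
    if version == "torchvision":
        # Imported lazily so the function can be defined without torch installed.
        from torch import nn
        from torchvision import models

        # ImageNet weights; the `weights=` API needs torchvision >= 0.13.
        model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
        # classifier[6] is the final 4096 -> 1000 ImageNet layer; replace it.
        in_features = model.classifier[6].in_features
        model.classifier[6] = nn.Linear(in_features, num_classes)
        return model
    if version == "original":
        # The repository ships its own module for this variant; its import
        # path is not shown here, so this sketch leaves the branch open.
        raise NotImplementedError("plug in the repository's original AlexNet")
    raise ValueError(f"unknown model_param.version: {version!r}")
```

Replacing only the last classifier layer keeps the pre-trained feature extractor intact, which is the usual starting point for fine-tuning on a small dataset.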

Both DCNN and CNN are implemented with PyTorch and PyTorch Lightning. A LightningDataModule creates the data partitions with different training and validation sizes and is passed to the PyTorch Lightning Trainer class.

Both DCNN and CNN were trained locally on the following hardware:

Processor: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz 2.30 GHz
GPU: NVIDIA GeForce MX230

Before training the neural network, the package must be installed using the setup file. Clone the repository, install the requirements listed in requirement.txt, and then run the command below from the root of the repository:

pip install .

After the package is installed locally, the DCNN is trained using the configuration in the YAML file config/deepDocConfig.yaml. Training is started by running the Python module from the command line:

python -m deepDocClassifier.main -c <path of config file>
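
For orientation, an entry point that accepts this flag could be structured roughly as below. This is a sketch under assumptions, not the repository's main module: the function names, the `--config` long alias, and the use of PyYAML to read the file are all hypothetical.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="deepDocClassifier.main")
    # `-c` is the flag from the command above; `--config` is an assumed alias.
    parser.add_argument("-c", "--config", required=True,
                        help="path to the YAML configuration file")
    return parser.parse_args(argv)


def load_config(path):
    # PyYAML is assumed here; it is imported lazily because it is third-party.
    import yaml

    with open(path, "r", encoding="utf-8") as handle:
        return yaml.safe_load(handle)


# Parse an explicit argument list instead of sys.argv for demonstration.
args = parse_args(["-c", "config/deepDocConfig.yaml"])
print(args.config)
```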

After each model is trained on its partition of training and validation data, it is tested on the remaining samples, and a CSV file with the test results is generated for further evaluation. The training process is logged with TensorBoard, and the logs are pushed to TensorBoard.dev for reference.
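
The evaluation step might be sketched as follows, using only the standard library. The helper names and the CSV column names are hypothetical, and the labels are toy values rather than output from the trained models.

```python
import csv
import io


def accuracy(y_true, y_pred):
    """Percentage of predictions that match the ground truth."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def write_results(rows, handle):
    """Write (partition_size, test_accuracy) rows as CSV."""
    writer = csv.writer(handle)
    writer.writerow(["partition_size", "test_accuracy"])  # hypothetical header
    writer.writerows(rows)


# Toy predictions, not results from the repository.
rows = [(20, accuracy([0, 1, 2, 2], [0, 1, 1, 2]))]
buffer = io.StringIO()  # stands in for an open CSV file
write_results(rows, buffer)
csv_text = buffer.getvalue()
```

Writing one row per partition keeps the file easy to load later when the plots and comparison table are generated.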

Using the TensorBoard logs and test results, I have generated the learning curves and a comparison of the test results for DCNN and CNN. Finally, a confusion matrix is plotted for the DCNN on the partition that uses 100 samples.

The plots below show the learning curves for the five partitions, in which the number of samples used for training and validation is 20, 40, 60, 80, and 100.

DCNN:

CNN:

Test Accuracy

Partition size	DCNN (%)	CNN (%)
20	47.288239	45.917124
40	61.940299	49.545750
60	62.768910	58.362248
80	61.036540	60.402685
100	66.800967	59.790492
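
To make the comparison easier to read, the gap between the two models can be computed directly from the table. The snippet below is a small arithmetic sketch; the variable names are illustrative only.

```python
# Test accuracy (%) copied from the table above, keyed by partition size.
dcnn = {20: 47.288239, 40: 61.940299, 60: 62.768910, 80: 61.036540, 100: 66.800967}
cnn = {20: 45.917124, 40: 49.545750, 60: 58.362248, 80: 60.402685, 100: 59.790492}

# Gain of DCNN over CNN in percentage points for each partition.
gain = {size: dcnn[size] - cnn[size] for size in dcnn}
best_size = max(gain, key=gain.get)

for size in sorted(gain):
    print(f"{size:>3}: {gain[size]:+.2f} points")
```

On these numbers DCNN is ahead at every partition size, by the widest margin (about 12.4 points) at 40 samples and the narrowest (about 0.6 points) at 80.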

Comparison of the classification results on Tobacco-3482

Class confusion matrix for the partition that uses 100 images from each class for training and validation, with the remaining images used for testing.
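
For reference, a confusion matrix of this kind can be computed from true and predicted labels as in the sketch below. The helper is illustrative, the labels are toy values for three classes, and the row/column orientation is an assumption rather than a statement about the plot.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# Toy labels for three classes, not predictions from the repository.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
```

The diagonal counts the correctly classified samples per class, and every off-diagonal cell shows which class a sample was mistaken for.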

A notebook that extracts these TensorBoard event files and CSV files to generate the plots above is given here.

dpkpathak/deepDocClassifier
