
A basic ML project on classifying data from the MNIST database using various methods.


prchandr/MNIST-Classification


MNIST-Classification - ENEE436 Project 1

These are basic examples of classification on the MNIST dataset. The four programs use a Naive Bayes classifier, a k-Nearest Neighbors classifier, Linear Discriminant Analysis, and Principal Component Analysis. The libraries used are scikit-learn and NumPy. The first three programs are written so that other programs can import them, but running any of them as the main program performs the expected classification.

The parse_data program reads the MNIST files and parses them into a NumPy array that scikit-learn can use.
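As a rough illustration of what such parsing involves, the sketch below reads an uncompressed IDX file (the container format the MNIST files use) into a NumPy array. The function names `read_idx` and `load_images` are hypothetical and are not taken from parse_data; the sketch also assumes the files hold unsigned bytes, which is the case for the MNIST image and label files.

```python
import struct

import numpy as np


def read_idx(path):
    """Read an uncompressed IDX file of unsigned bytes into a NumPy array."""
    with open(path, "rb") as f:
        # Header: two zero bytes, a data-type code, and the number of dimensions.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError("expected an IDX file of unsigned bytes")
        # Each dimension size is a 4-byte big-endian unsigned integer.
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)


def load_images(path):
    """Flatten each image into one row: the (n_samples, n_features) layout
    that scikit-learn estimators expect."""
    images = read_idx(path)
    return images.reshape(images.shape[0], -1)
```

Label files can go through `read_idx` directly, since they are one-dimensional and need no flattening.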

Results

PCA onto 2D

[Figure: PCA2D]

PCA onto 3D

[Figure: PCA3D]

Usage

The libraries used were NumPy, matplotlib, and scikit-learn. The 64-bit version of Python 3.8 is recommended to prevent memory issues. Once the required libraries are installed, keep the extracted MNIST files in the same directory as the programs. Running a program then performs the classification and prints the error rates on the training and test sets.

Examples

e.g. Running the Naive Bayes Classifier program

python naive.py

Training: 26106 incorrectly classified out of 60000 images. (43.510% error rate)
Testing: 4442 incorrectly classified out of 10000 images. (44.420% error rate)
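A minimal sketch of how a report in this style could be produced is shown below. It is not the code in naive.py: the helper `error_report` and the function `run_naive_bayes` are hypothetical names, `GaussianNB` is an assumption about which Naive Bayes variant is meant, and scikit-learn's small bundled digits dataset stands in for MNIST so that the example runs without the MNIST files.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB


def error_report(split, y_true, y_pred):
    """Format misclassification counts in the style of the output above."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    total = len(y_true)
    return (f"{split}: {wrong} incorrectly classified out of {total} images. "
            f"({100 * wrong / total:.3f}% error rate)")


def run_naive_bayes(seed=0):
    """Fit Gaussian Naive Bayes and report training and testing error."""
    # Stand-in data: 8x8 digit images bundled with scikit-learn, not MNIST.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed)
    clf = GaussianNB().fit(X_train, y_train)
    return (error_report("Training", y_train, clf.predict(X_train)),
            error_report("Testing", y_test, clf.predict(X_test)))


# Importable by other programs, but also usable as a main program.
if __name__ == "__main__":
    for line in run_naive_bayes():
        print(line)
```

Keeping the classification in a function and the printing behind the `__main__` guard is one common way to make a script both importable and directly runnable.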

e.g. Running the Fisher Linear Discriminant program

python fisher.py

LDA for: [0, 9]
Training: 59 incorrectly classified out of 11872 images. (0.497% error rate)
Testing: 23 incorrectly classified out of 1989 images. (1.156% error rate)
LDA for: [0, 8]
Training: 133 incorrectly classified out of 11774 images. (1.130% error rate)
Testing: 20 incorrectly classified out of 1954 images. (1.024% error rate)
LDA for: [1, 7]
Training: 91 incorrectly classified out of 13007 images. (0.700% error rate)
Testing: 23 incorrectly classified out of 2163 images. (1.063% error rate)
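The image counts in this output are well below 60000 and 10000, which suggests that each run is restricted to the two listed digits. The sketch below shows one way to set up such a two-class experiment with scikit-learn's `LinearDiscriminantAnalysis`. It is an illustration under assumptions, not the code in fisher.py: `select_pair` and `lda_error_rates` are hypothetical names, and the bundled digits dataset again stands in for MNIST.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split


def select_pair(X, y, pair):
    """Keep only the samples whose label is one of the two digits in `pair`."""
    mask = np.isin(y, pair)
    return X[mask], y[mask]


def lda_error_rates(pair, seed=0):
    """Return (training error, testing error) for LDA on one digit pair."""
    # Stand-in data: 8x8 digit images bundled with scikit-learn, not MNIST.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y)
    X_tr, y_tr = select_pair(X_train, y_train, pair)
    X_te, y_te = select_pair(X_test, y_test, pair)
    lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
    train_err = float(np.mean(lda.predict(X_tr) != y_tr))
    test_err = float(np.mean(lda.predict(X_te) != y_te))
    return train_err, test_err


if __name__ == "__main__":
    for pair in ([0, 9], [0, 8], [1, 7]):
        train_err, test_err = lda_error_rates(pair)
        print(f"LDA for: {pair} train={train_err:.3%} test={test_err:.3%}")
```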

e.g. Running the PCA program

python pca.py

n: 5
Naive Bayes Classifier:
Training: 21276 incorrectly classified out of 60000 images. (35.460% error rate)
Testing: 3420 incorrectly classified out of 10000 images. (34.200% error rate)
Nearest Neighbors Classifier:
5 Training: 11181 incorrectly classified out of 60000 images. (18.635% error rate)
5 Testing: 2526 incorrectly classified out of 10000 images. (25.260% error rate)

n: 10
Naive Bayes Classifier:
Training: 13777 incorrectly classified out of 60000 images. (22.962% error rate)
Testing: 2218 incorrectly classified out of 10000 images. (22.180% error rate)
Nearest Neighbors Classifier:
5 Training: 2729 incorrectly classified out of 60000 images. (4.548% error rate)
5 Testing: 724 incorrectly classified out of 10000 images. (7.240% error rate)

n: 20
Naive Bayes Classifier:
Training: 9539 incorrectly classified out of 60000 images. (15.898% error rate)
Testing: 1468 incorrectly classified out of 10000 images. (14.680% error rate)
Nearest Neighbors Classifier:
5 Training: 1137 incorrectly classified out of 60000 images. (1.895% error rate)
5 Testing: 306 incorrectly classified out of 10000 images. (3.060% error rate)

n: 50
Naive Bayes Classifier:
Training: 7750 incorrectly classified out of 60000 images. (12.917% error rate)
Testing: 1225 incorrectly classified out of 10000 images. (12.250% error rate)
Nearest Neighbors Classifier:
5 Training: 841 incorrectly classified out of 60000 images. (1.402% error rate)
5 Testing: 249 incorrectly classified out of 10000 images. (2.490% error rate)

n: 100
Naive Bayes Classifier:
Training: 7825 incorrectly classified out of 60000 images. (13.042% error rate)
Testing: 1199 incorrectly classified out of 10000 images. (11.990% error rate)
Nearest Neighbors Classifier:
5 Training: 954 incorrectly classified out of 60000 images. (1.590% error rate)
5 Testing: 276 incorrectly classified out of 10000 images. (2.760% error rate)
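The output above reads as a sweep over the number of principal components n, with both classifiers evaluated on the reduced data. A hedged sketch of that kind of experiment follows. It is not the code in pca.py: `pca_sweep` is a hypothetical name, `GaussianNB` is an assumed Naive Bayes variant, k = 5 is a guess based on the "5" printed next to the nearest-neighbors results, and the bundled 64-feature digits dataset stands in for MNIST, so n stays below 64.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier


def pca_sweep(components=(5, 10, 20), k=5, seed=0):
    """For each n, project onto n principal components and return the
    (training error, testing error) of each classifier."""
    # Stand-in data: 8x8 digit images bundled with scikit-learn, not MNIST.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed)
    results = {}
    for n in components:
        # Fit the projection on the training data only, then apply it to both sets.
        pca = PCA(n_components=n).fit(X_train)
        Z_train, Z_test = pca.transform(X_train), pca.transform(X_test)
        row = {}
        classifiers = (("naive_bayes", GaussianNB()),
                       ("knn", KNeighborsClassifier(n_neighbors=k)))
        for name, clf in classifiers:
            clf.fit(Z_train, y_train)
            # score() returns accuracy, so the error rate is 1 - accuracy.
            row[name] = (1.0 - clf.score(Z_train, y_train),
                         1.0 - clf.score(Z_test, y_test))
        results[n] = row
    return results


if __name__ == "__main__":
    for n, row in pca_sweep().items():
        print(f"n: {n}")
        for name, (train_err, test_err) in row.items():
            print(f"  {name}: train={train_err:.3%} test={test_err:.3%}")
```

Fitting PCA on the training set alone keeps the test set out of the projection, so the test error remains an estimate on unseen data.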
