Digit Recognizer
The simplest machine learning model for recognizing handwritten digits.
- Run the following commands to download the datasets:
wget https://myawsbucket003.s3.ap-south-1.amazonaws.com/AI+ML/Digit+Recognition/datasets/mnist_test.csv
wget https://myawsbucket003.s3.ap-south-1.amazonaws.com/AI+ML/Digit+Recognition/datasets/mnist_train.csv
MNIST consists of 60,000 handwritten digit images covering the digits zero through nine. Each image comes with a label giving the digit it shows.
Each image contains a single grayscale digit drawn by hand, stored as a 784-dimensional vector (28 pixels in both height and width, flattened) of floating-point numbers where each value represents a pixel's brightness.
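As a quick illustration, one row of the CSV can be reshaped back into a 28 × 28 image. This is a minimal sketch assuming the standard MNIST CSV layout (first column is the label, the next 784 columns are pixel values):

```python
import numpy as np
import pandas as pd

# Read a single row of the downloaded CSV (pass header=None if the file has no header row).
row = pd.read_csv("mnist_train.csv", nrows=1).values[0]

label, pixels = row[0], row[1:].astype(np.float32)
image = pixels.reshape(28, 28)   # 784-dimensional vector -> 28x28 grid
print(label, image.shape)        # the digit label and (28, 28)
```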
The datasets are also provided in the notebook.
After downloading, update the paths train_data_file and test_data_file in the __init__ method.
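Below is a hedged sketch of what that loading step might look like. The class name DigitRecognizer and the array attribute names are assumptions for illustration; only train_data_file and test_data_file come from the project itself:

```python
import numpy as np
import pandas as pd

class DigitRecognizer:
    def __init__(self):
        # Update these paths to wherever the downloaded CSVs live.
        self.train_data_file = "mnist_train.csv"
        self.test_data_file = "mnist_test.csv"

        # Each CSV row: first column is the label, remaining 784 columns are pixels.
        # Pass header=None to read_csv if the files have no header row.
        train = pd.read_csv(self.train_data_file).values
        test = pd.read_csv(self.test_data_file).values

        self.train_labels = train[:, 0]
        self.train_images = train[:, 1:].astype(np.float32)
        self.test_labels = test[:, 0]
        self.test_images = test[:, 1:].astype(np.float32)
```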
K Nearest Neighbors (KNN) is a classification algorithm: it classifies a new data point (the test input) into one of the known categories.
To do so, it looks at the new data point's distance from all the data points in the training set.
Then, out of the k closest training data points, the class in the majority is assigned to the new test data point.
Pretty simple, right? :)
The Ln norm gives the distance between two points: d(x, y) = (Σᵢ |xᵢ − yᵢ|ⁿ)^(1/n).
If n is 1, the Ln norm is the Manhattan distance.
If n is 2, it is the Euclidean distance.
Using the Ln norm, we can find the distance between the test input and every training input, as in the sketch below.
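A minimal vectorised sketch of that distance computation with NumPy; the array names follow the loader sketch above and are assumptions:

```python
import numpy as np

def ln_distances(test_image, train_images, n):
    """L_n norm distance between one test image and every training image.

    d(x, y) = (sum_i |x_i - y_i|**n) ** (1/n)
    n = 1 gives Manhattan distance, n = 2 gives Euclidean distance.
    """
    diff = np.abs(train_images - test_image)          # broadcast over all training rows
    return np.power(np.sum(diff ** n, axis=1), 1.0 / n)
```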
After finding the distances between the test input and all training inputs, we take the k nearest neighbours to decide which class the test input belongs to.
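Selecting those neighbours is a matter of sorting the distances; a small sketch, assuming the distance array from the previous step:

```python
import numpy as np

def k_nearest_labels(distances, train_labels, k):
    """Return the labels and distances of the k training points closest to the test input."""
    nearest_idx = np.argsort(distances)[:k]   # indices of the k smallest distances
    return train_labels[nearest_idx], distances[nearest_idx]
```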
Among the k nearest neighbours there can be a voting tie.
One way to break the tie is Distance-Weighted KNN: for each predicted label, sum the inverses of the distances of its nearest neighbours and pick the label with the largest sum.
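A hedged sketch of that inverse-distance weighting (the small epsilon is an assumption added to avoid dividing by zero when a distance is exactly 0):

```python
import numpy as np

def distance_weighted_vote(labels, distances, eps=1e-8):
    """Break ties by weighting each neighbour's vote with the inverse of its distance."""
    scores = {}
    for label, dist in zip(labels, distances):
        scores[label] = scores.get(label, 0.0) + 1.0 / (dist + eps)
    return max(scores, key=scores.get)   # label with the largest summed weight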
Another way is Majority-Based KNN: out of the predicted labels of the k nearest neighbours, the label that is repeated most often is chosen.
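A one-function sketch of that majority vote:

```python
from collections import Counter

def majority_vote(labels):
    """Pick the label that occurs most often among the k nearest neighbours."""
    return Counter(labels).most_common(1)[0][0]
```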
Finally, we choose the values of n and k that give the best accuracy.
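One simple way to do this is a grid search over a few candidate values. This sketch reuses the helper functions from the earlier sketches (ln_distances, k_nearest_labels, majority_vote); the candidate value lists are assumptions, and evaluating on the full test set is slow, so a subsample is often used in practice:

```python
def best_n_and_k(train_images, train_labels, test_images, test_labels,
                 n_values=(1, 2), k_values=(1, 3, 5, 7)):
    """Try every (n, k) pair and keep the one with the highest accuracy on the test data."""
    best_n, best_k, best_acc = None, None, 0.0
    for n in n_values:
        for k in k_values:
            correct = 0
            for image, label in zip(test_images, test_labels):
                distances = ln_distances(image, train_images, n)
                neighbour_labels, _ = k_nearest_labels(distances, train_labels, k)
                if majority_vote(neighbour_labels) == label:
                    correct += 1
            accuracy = correct / len(test_labels)
            if accuracy > best_acc:
                best_n, best_k, best_acc = n, k, accuracy
    return best_n, best_k, best_acc
```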