Simple perceptron in C.
- with ascii plot:
clang *.c -fopenmp; ./a.out
- without ascii plot and openmp:
clang *.c; ./a.out
A perceptron is a machine learning model that performs well at classifying linearly separable data.
If you didn't get that, this video explains it visually pretty well
There are two types of perceptrons:
- single-layer perceptron - what this code and the rest of the readme is about.
- multi-layer perceptron - a fully-connected neural network. Can have more layers and more complex layer functions.
Usually, "perceptron" on its own refers to a single-layer one.
The power of machine learning is letting a model optimize its own parameters (training). However, there are also hyperparameters: values we define ourselves to guide the model in its optimization process.
For perceptrons, there are two hyperparameters: the learning rate and error threshold.
The error threshold determines when training stops, and the learning rate scales the training steps.
A perceptron layer is composed of:
- neurons - which have weights that are fine-tuned using training data.
- threshold functions - which map an input x to a single binary value.
Threshold functions usually look like this:
if( w * x + b > 0 ) {
return 1;
} else {
return 0;
}
where w is the weight vector and b is the bias.
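As a rough sketch, the threshold function above could be written as a complete C function. The names `dot` and `threshold` and the use of `double` arrays are assumptions made for illustration; they are not taken from this repository's code.

```c
#include <stddef.h>

/* Dot product of two vectors of length n. */
double dot(const double *w, const double *x, size_t n) {
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum += w[i] * x[i];
    }
    return sum;
}

/* Threshold (step) function: 1 if w . x + b > 0, otherwise 0.
 * Hypothetical helper, named here only for illustration. */
int threshold(const double *w, const double *x, size_t n, double b) {
    return dot(w, x, n) + b > 0.0 ? 1 : 0;
}
```

For example, with w = {1, -1}, x = {2, 1} and b = 0, the weighted sum is 1, so the function returns 1.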
Each layer takes a vector as input, computes its dot product with the weights, and maps the result to a binary value using the threshold function.
By the way, to handle the bias, you can just append a 1 element to the input vector and append another weight element. That extra weight then acts as the bias.
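The bias trick might look something like this in C. The helpers `augment` and `predict_augmented` are hypothetical names chosen for this sketch, and the caller is assumed to provide an output buffer with room for one extra element.

```c
#include <stddef.h>

/* Copy x (length n) into out (length n + 1) and append a constant 1.
 * Hypothetical helper: out must have room for n + 1 elements. */
void augment(const double *x, size_t n, double *out) {
    for (size_t i = 0; i < n; i++) {
        out[i] = x[i];
    }
    out[n] = 1.0;
}

/* Threshold on an augmented input: the last weight multiplies the
 * appended 1, so it plays the role of the bias b. */
int predict_augmented(const double *w, const double *x_aug, size_t n_aug) {
    double sum = 0.0;
    for (size_t i = 0; i < n_aug; i++) {
        sum += w[i] * x_aug[i];
    }
    return sum > 0.0 ? 1 : 0;
}
```

With this layout no separate bias variable is needed: the update rule treats the last weight like any other.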
The output can be sent to the next layer or become the final result by being properly discretized.
update weights:
r = learning rate
y = true label
ŷ = predicted label
w[i] = w[i] + r * (y - ŷ) * x[i]
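To show how the update rule and the two hyperparameters fit together, here is a minimal training-loop sketch. It makes several assumptions that are not confirmed by this repository: the error threshold is read as the maximum number of misclassified samples allowed per epoch, a `max_epochs` cap is added as a safety stop, inputs already carry the appended 1 for the bias, and the names `predict`, `train` and `N_INPUTS` are made up for the example.

```c
#include <stddef.h>

#define N_INPUTS 3 /* assumed: two features plus the constant 1 for the bias */

/* Threshold function on an input that already has the 1 appended. */
int predict(const double w[N_INPUTS], const double x[N_INPUTS]) {
    double sum = 0.0;
    for (size_t i = 0; i < N_INPUTS; i++) {
        sum += w[i] * x[i];
    }
    return sum > 0.0 ? 1 : 0;
}

/* Repeats passes over the data until an epoch has at most max_errors
 * misclassified samples, or max_epochs passes have run.
 * Returns the number of misclassified samples in the last epoch. */
size_t train(double w[N_INPUTS], const double x[][N_INPUTS], const int *y,
             size_t n_samples, double r, size_t max_errors,
             size_t max_epochs) {
    size_t errors = 0;
    for (size_t epoch = 0; epoch < max_epochs; epoch++) {
        errors = 0;
        for (size_t s = 0; s < n_samples; s++) {
            int diff = y[s] - predict(w, x[s]); /* y - ŷ: -1, 0 or 1 */
            if (diff != 0) {
                errors++;
                for (size_t i = 0; i < N_INPUTS; i++) {
                    w[i] += r * diff * x[s][i]; /* w[i] = w[i] + r * (y - ŷ) * x[i] */
                }
            }
        }
        if (errors <= max_errors) {
            break; /* error threshold reached */
        }
    }
    return errors;
}
```

Called on the four rows of the AND truth table with zero-initialized weights, `max_errors` set to 0 and a generous `max_epochs`, this sketch should end with zero errors, since AND is linearly separable.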
Perceptrons can do binary classification using just the threshold function:
- returning 1 means one class.
- returning 0 means another class.
Multiclass perceptrons can classify more than two classes. However, training and classification differ from the binary perceptron, since each class has its own weight vector.
- Perform the dot product between the input and weight vector of every class
- The inferred class will be the one in which the dot product obtained the highest score
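The two classification steps above can be sketched as an argmax over per-class scores. The function name `predict_class` and the row-major weight layout (one row of `n_features` weights per class) are assumptions for this example, as is breaking ties in favor of the lowest class index.

```c
#include <stddef.h>

/* Returns the index of the class whose weight vector gives the highest
 * dot product with x. w is assumed to be stored row-major:
 * w[c * n_features + i] is weight i of class c.
 * Ties go to the lowest class index. */
size_t predict_class(const double *w, size_t n_classes, size_t n_features,
                     const double *x) {
    size_t best_class = 0;
    double best_score = 0.0;
    for (size_t c = 0; c < n_classes; c++) {
        double score = 0.0;
        for (size_t i = 0; i < n_features; i++) {
            score += w[c * n_features + i] * x[i];
        }
        if (c == 0 || score > best_score) {
            best_score = score;
            best_class = c;
        }
    }
    return best_class;
}
```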
Define f(x, y), a function that returns the feature vector for an input/output pair. Training iterates over these pairs.
0. Start with all weight vectors filled with zeros
- Predict class using current weights
ŷ = argmax_i(w[i] * f(x, y))
- If prediction is correct, nothing happens
- If prediction is incorrect:
- Lower the score of wrong answer
w[wrong] = w[wrong] - f(x, y)
- Raise the score of right answer
w[correct] = w[correct] + f(x, y)
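A single step of the multiclass training procedure above might be sketched as follows. This assumes f(x, y) is simply the raw feature vector x, that weights are stored row-major with one row per class, and that ties in the argmax go to the lowest class index; `train_step` is a hypothetical name.

```c
#include <stddef.h>

/* One training step for a sample x with known class `correct`.
 * w is assumed row-major: w[c * n_features + i] is weight i of class c.
 * Returns 1 if the weights were updated, 0 if the prediction was right. */
int train_step(double *w, size_t n_classes, size_t n_features,
               const double *x, size_t correct) {
    /* Predict the class using the current weights (argmax of the scores). */
    size_t predicted = 0;
    double best_score = 0.0;
    for (size_t c = 0; c < n_classes; c++) {
        double score = 0.0;
        for (size_t i = 0; i < n_features; i++) {
            score += w[c * n_features + i] * x[i];
        }
        if (c == 0 || score > best_score) {
            best_score = score;
            predicted = c;
        }
    }

    if (predicted == correct) {
        return 0; /* correct prediction: nothing happens */
    }

    for (size_t i = 0; i < n_features; i++) {
        w[predicted * n_features + i] -= x[i]; /* lower the wrong answer */
        w[correct * n_features + i] += x[i];   /* raise the right answer */
    }
    return 1;
}
```

Starting from all-zero weights, every class scores 0, so the first sample is assigned class 0 by the tie rule and triggers an update unless class 0 happens to be correct.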