###Neural Network Template
####Description
- This is a 3-layer regularized neural network for classification, implemented in MATLAB and Python.
- The logistic sigmoid is used as the activation function. Weights are learned by minimizing a squared-error cost function with fmincg or fminunc (MATLAB/Octave) or fmin_cg (Python).
- The MATLAB code is based on Ex. 4 of ml-class.org.
- The Python code is adapted from the MATLAB version and uses the numpy and scipy libraries.
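
As a rough illustration of the description above, the sketch below runs a forward pass through a 3-layer network with sigmoid activations and evaluates a regularized squared-error cost. It is not taken from nn_hf.py: the function names, the weight shapes (one row per unit, bias weight in column 0), the 1/(2m) scaling, and the choice to leave bias weights out of the penalty are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(theta1, theta2, X):
    """Forward pass: input layer -> hidden layer -> output layer."""
    m = X.shape[0]
    a1 = np.hstack([np.ones((m, 1)), X])      # add bias unit: (m, n_in + 1)
    a2 = sigmoid(a1 @ theta1.T)               # hidden activations: (m, n_hidden)
    a2 = np.hstack([np.ones((m, 1)), a2])     # add bias unit: (m, n_hidden + 1)
    return sigmoid(a2 @ theta2.T)             # output activations: (m, n_classes)

def cost(theta1, theta2, X, y, lam):
    """Regularized squared-error cost; y holds integer labels 1..K."""
    m = X.shape[0]
    k = theta2.shape[0]
    Y = np.eye(k)[y - 1]                      # one-hot targets (labels start at 1)
    h = forward(theta1, theta2, X)
    data_term = np.sum((h - Y) ** 2) / (2.0 * m)
    # Assumed convention: bias weights (column 0) are not penalized.
    reg_term = lam / (2.0 * m) * (
        np.sum(theta1[:, 1:] ** 2) + np.sum(theta2[:, 1:] ** 2)
    )
    return data_term + reg_term
```

In a real run the two weight matrices would be unrolled into a single vector so that an optimizer such as fmin_cg can work on them.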
####Script steps
1. Read in .csv data file
2. Randomize rows in dataset
3. Define features (X) and class (y)
4. Standardize features (subtract mean, divide by standard deviation)
5. Split into training and test sets
6. Define NN layers
7. Initialize NN weights
8. Minimize cost function
9. Compute performance metrics
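
The data-preparation and initialization steps above could be sketched as follows. This is a minimal example under stated assumptions, not the code in nn_template.py: the helper names, the 70/30 split, the population standard deviation, and the initialization range `epsilon=0.12` are all illustrative choices.

```python
import numpy as np

def load_csv(path):
    """Read a comma-separated numeric file into a 2-D array."""
    return np.loadtxt(path, delimiter=",")

def prepare_data(data, train_fraction=0.7, seed=0):
    """Shuffle rows, standardize features, and split into train/test sets."""
    rng = np.random.default_rng(seed)
    data = data[rng.permutation(data.shape[0])]   # randomize row order
    X = data[:, :-1]                              # features: all but last column
    y = data[:, -1].astype(int)                   # class: last column
    # Standardize each feature; a constant column would divide by zero here.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    n_train = int(round(train_fraction * X.shape[0]))
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

def init_weights(n_in, n_out, epsilon=0.12, seed=0):
    """Small random weights for one layer, including a bias column."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-epsilon, epsilon, size=(n_out, n_in + 1))
```

Calling `prepare_data(load_csv("fisher_iris.csv"))` would then return the four arrays needed for training and evaluation.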
####Usage for python script
    python nn_template.py fisher_iris.csv
####Example output for python script
    Script started at 10:39:04
    Using fisher_iris.csv
    Initial cost: 2.14566322791
    Training Neural Network...
    fmin results:
    Warning: Maximum number of iterations has been exceeded
    Current function value: 0.723884
    Iterations: 40
    Function evaluations: 8245
    Gradient evaluations: 97
    Accuracy on training set: 96.1905
    Accuracy on test set: 95.5556
    Confusion matrix:
    [[15  0  0]
     [ 0 14  2]
     [ 0  0 14]]
    Total run time: 7.6800 seconds
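
The accuracy and confusion matrix reported above can be related with a short sketch. The helper names are hypothetical, and the row/column orientation (rows as true classes, columns as predicted classes) is an assumption, since the output does not state it.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count (true, predicted) pairs; labels are integers 1..n_classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t - 1, p - 1] += 1     # shift 1-based labels to 0-based indices
    return cm

def accuracy(cm):
    """Percentage of samples on the diagonal (correctly classified)."""
    return 100.0 * np.trace(cm) / cm.sum()
```

Applied to the matrix in the example output, the diagonal holds 43 of 45 test samples, which matches the reported test accuracy of 95.5556.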
####Constraints on input data
- This template is suitable for data with a class variable designating 2 or more classes. The code expects numeric (not categorical) feature values.
- The class column must be the last column in the input.
- Classes must be designated with consecutive integers, starting from 1 (for example, {1,2,3,4} but not {1,2,4}). Due to Octave/MATLAB syntax, '0' cannot be used to designate a class.
- Each data row must be complete (no missing values), and every class must be represented in the training set.
- Comma-separated format is expected.
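
Most of these constraints can be checked before training. The function below is a hypothetical helper, not part of the template, that sketches such a check on an array already loaded from the .csv file. It cannot verify that every class appears in the training set, since that depends on the later split.

```python
import numpy as np

def check_input(data):
    """Raise ValueError if the data breaks one of the listed constraints."""
    # Non-numeric (categorical) entries fail the conversion with ValueError.
    data = np.asarray(data, dtype=float)
    if data.ndim != 2 or data.shape[1] < 2:
        raise ValueError("expected a 2-D table with features and a class column")
    if np.isnan(data).any():
        raise ValueError("missing values are not allowed")
    labels = data[:, -1]                       # class column must be last
    if not np.all(labels == np.round(labels)):
        raise ValueError("class labels must be integers")
    classes = np.unique(labels).astype(int)
    expected = np.arange(1, classes.size + 1)  # 1, 2, ..., K
    if classes.size < 2 or not np.array_equal(classes, expected):
        raise ValueError("classes must be consecutive integers starting from 1")
    return data
```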
####Datasets used in development and testing
- Fisher's Iris
- Wine
- [Breast Cancer Wisconsin (Diagnostic)](http://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic\))
- Vertebral Column
- Ionosphere
####Some processing notes for datasets used
- Fisher's Iris: The Iris classes {setosa, versicolor, virginica} were relabeled to {1,2,3}.
- Wine: The class column was moved to the last column of the dataset.
- Breast Cancer Wisconsin (Diagnostic): The classes {malignant, benign} were relabeled to {1,2}. The class column was moved to the end.
- Vertebral Column: the 3-class dataset was used. The classes {DH, SL, NO} were relabeled as {1,2,3}.
- Ionosphere: the 2nd feature column was removed, as all entries are zero. The classes {g,b} were relabeled as {1,2}.
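
The processing described in these notes could be done with a few small helpers. The sketch below is illustrative only: the function names are hypothetical, and it does not reproduce the steps actually used to prepare the files.

```python
import numpy as np

def relabel(labels, order):
    """Map class names to consecutive integers 1..K, following `order`."""
    mapping = {name: i + 1 for i, name in enumerate(order)}
    return np.array([mapping[name] for name in labels])

def move_column_to_end(data, col):
    """Return a copy of `data` with column `col` moved to the last position."""
    data = np.asarray(data)
    keep = [j for j in range(data.shape[1]) if j != col]
    return data[:, keep + [col]]

def drop_constant_columns(X):
    """Remove feature columns with zero variance (for example, all zeros)."""
    X = np.asarray(X, dtype=float)
    return X[:, X.std(axis=0) > 0]
```

For example, `relabel(names, ["setosa", "versicolor", "virginica"])` yields labels in {1,2,3}. Dropping constant columns also avoids a division by zero when features are standardized.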
####Files
- Python implementation script and helper functions: nn_template.py, nn_hf.py
- MATLAB/Octave implementation script: nn_template.m
- Functions used by main .m script: sigmoid.m, sigmoidGradient.m, randInitializeWeights.m, nnCostFunction.m, fmincg.m, predict.m