MNIST example #242

Merged
merged 55 commits into from
Feb 28, 2019
Changes from 11 commits
Commits
92b1169
WIP - Initial commit to new branch
ctrl-z-9000-times Dec 1, 2018
39c34ad
MNIST -> 80% accuracy
ctrl-z-9000-times Dec 10, 2018
cca50fc
Merge branch 'master' into mnist
ctrl-z-9000-times Dec 18, 2018
ec8c24e
Merge branch 'sp-stats' into mnist
ctrl-z-9000-times Dec 26, 2018
fbeabc7
Score 94%
ctrl-z-9000-times Dec 26, 2018
1dfe98f
SDR-Classifier fix missing header.
ctrl-z-9000-times Dec 26, 2018
81fd5fe
MNIST CMake materials
ctrl-z-9000-times Dec 26, 2018
752a122
Merge branch 'master_community' into mnist
breznak Jan 31, 2019
3193892
SpatialPooler: revert to upstream master version
breznak Feb 1, 2019
77acef0
MNIST: update example
breznak Feb 1, 2019
b944b90
Connections: add asserts
breznak Feb 1, 2019
ac3b020
Merge remote-tracking branch 'community/master' into mnist-example
breznak Feb 7, 2019
9a58b03
Update merge resolution
breznak Feb 7, 2019
f973f24
Pull MNIST experiment
ctrl-z-9000-times Feb 4, 2019
d05d392
MNIST: working example
breznak Feb 8, 2019
780083a
Connections: raisePermanenceToThreshold bug
breznak Feb 8, 2019
d2d4633
SpatialPooler: formating of doc
breznak Feb 9, 2019
49677ba
MNIST: try SP params
breznak Feb 9, 2019
1bf08eb
examples/Hotgym rename to HotgymMain
breznak Feb 9, 2019
af35e2a
example/MNIST make namespace examples
breznak Feb 9, 2019
c6fc009
SDRClassifier: add initialize() method
breznak Feb 9, 2019
25b076e
Serializable: include filesystem helper header
breznak Feb 9, 2019
97e204d
example: make classes MNIST, HelloSPTP, use namespace
breznak Feb 9, 2019
0a46993
CMake: download MNIST repo during configure WIP
breznak Feb 9, 2019
382cc84
fixed up the external part of this PR
dkeeney Feb 15, 2019
4e0f026
Made mnist-example optional.
dkeeney Feb 17, 2019
3221315
Merge remote-tracking branch 'community/master' into mnist-example
breznak Feb 18, 2019
26b12c8
MNIST: use 3rd party repo for data and data access methods
breznak Feb 18, 2019
4f03d5e
Revert "Made mnist-example optional."
breznak Feb 18, 2019
f2c8718
MNIST small fixes
breznak Feb 18, 2019
508282e
MNIST: external repo updated
breznak Feb 18, 2019
c0aaefc
fixes
breznak Feb 18, 2019
62a63a9
Merge branch 'master_community' into mnist-example
breznak Feb 18, 2019
7c1712a
Mnist cleanup
breznak Feb 19, 2019
f9eae0c
MNIST: try full headers
breznak Feb 22, 2019
c60fdb2
cleanup
breznak Feb 22, 2019
d5031ed
TMP Cmake disable shared so build
breznak Feb 22, 2019
684bf70
MNIST tuning params, WIP
breznak Feb 22, 2019
7250138
MNIST param tuning 2
breznak Feb 22, 2019
920eff7
MNIST scores > 30%
ctrl-z-9000-times Feb 24, 2019
acaca33
Merge branch 'master_community' into mnist-example
breznak Feb 25, 2019
149a113
cmake fixes
breznak Feb 25, 2019
f728658
Merge remote-tracking branch 'community/mnist-example' into mnist-exa…
breznak Feb 25, 2019
c93486b
CI: skip TP performance on Windows
breznak Feb 25, 2019
d0fe52e
debugging details
breznak Feb 26, 2019
d75188e
CMAke: do not bundle examples with main library
breznak Feb 26, 2019
b928fc3
fix: skip performance test for Windows in CI
breznak Feb 26, 2019
a14c5f4
Connections: resolve merge conflicts
breznak Feb 26, 2019
907fc3d
another try for passing performance CI on Windows
breznak Feb 26, 2019
4bf889b
Merge branch 'master_community' into mnist-example
breznak Feb 26, 2019
4cee012
MNIST: use 2D input, SP and smaller columns
breznak Feb 26, 2019
41a68ee
Revert "TMP Cmake disable shared so build"
breznak Feb 26, 2019
a4b2732
MNIST: tuned params 45%
breznak Feb 26, 2019
4cc6cd8
MNIST: stript unlearned cols on inference
breznak Feb 26, 2019
d26299a
MNIST: local inh 43%
breznak Feb 26, 2019
25 changes: 25 additions & 0 deletions src/CMakeLists.txt
@@ -325,6 +325,25 @@ add_custom_target(hotgym
VERBATIM)


#########################################################
## MNIST Spatial Pooler Example
#
set(src_executable_mnistsp mnist_sp)
add_executable(${src_executable_mnistsp} "examples/mnist/MNIST_SP.cpp")
target_link_libraries(${src_executable_mnistsp} ${core_library})
target_compile_options(${src_executable_mnistsp} PUBLIC ${INTERNAL_CXX_FLAGS})
target_compile_definitions(${src_executable_mnistsp} PRIVATE ${COMMON_COMPILER_DEFINITIONS})
target_include_directories(${src_executable_mnistsp} PRIVATE
        ${CORE_LIB_INCLUDES}
        ${EXTERNAL_INCLUDES}
        )
add_custom_target(mnist
                  COMMAND ${src_executable_mnistsp}
                  DEPENDS ${src_executable_mnistsp}
                  COMMENT "Executing ${src_executable_mnistsp}"
                  VERBATIM)


############ INSTALL ######################################
#
# Install targets into CMAKE_INSTALL_PREFIX
@@ -359,6 +378,12 @@ install(TARGETS
        LIBRARY DESTINATION lib
        ARCHIVE DESTINATION lib)

install(TARGETS
        ${src_executable_mnistsp}
        RUNTIME DESTINATION bin
        LIBRARY DESTINATION lib
        ARCHIVE DESTINATION lib)


#
# `make package` results in
272 changes: 272 additions & 0 deletions src/examples/mnist/MNIST_SP.cpp
@@ -0,0 +1,272 @@
/* ---------------------------------------------------------------------
* Copyright (C) 2018, David McDougall.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero Public License version 3 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
* See the GNU Affero Public License for more details.
*
* You should have received a copy of the GNU Affero Public License
* along with this program. If not, see http://www.gnu.org/licenses.
* ----------------------------------------------------------------------
*/

/**
* Solving the MNIST dataset with Spatial Pooler.
* TODO: DESCRIPTION
* TODO: RUN INSTRUCTIONS
*/

#include <algorithm>
#include <cmath>
#include <ctime>
#include <fstream>   // std::ifstream, used by the MNIST readers below
#include <iostream>
#include <string>
#include <vector>
#include <random>

#include "nupic/algorithms/SpatialPooler.hpp"
#include <nupic/algorithms/SDRClassifier.hpp>
#include <nupic/algorithms/ClassifierResult.hpp>
#include "nupic/ntypes/SdrMetrics.hpp"

using namespace std;
using namespace nupic;
using nupic::algorithms::spatial_pooler::SpatialPooler;
using nupic::algorithms::sdr_classifier::SDRClassifier;
using nupic::algorithms::cla_classifier::ClassifierResult;


#include <stdio.h>
#include <execinfo.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
void keyboard_interrupt_handler(int sig) {
  void *array[50];
  size_t size;

  // get void*'s for all entries on the stack
  size = backtrace(array, 50);

  // print out all the frames to stderr
  fprintf(stderr, "Error: signal %d:\n", sig);
  backtrace_symbols_fd(array, size, STDERR_FILENO);
  exit(1);
}


vector<UInt> read_mnist_labels(string path) {
  // Open in binary mode so the raw bytes are never translated.
  ifstream file(path, ios::binary);
  if( !file.is_open() ) {
    cerr << "ERROR: Failed to open file " << path << endl;
    exit(1);
  }
  int magic_number     = 0;
  int number_of_labels = 0;
  file.read( (char*) &magic_number,     4);
  file.read( (char*) &number_of_labels, 4);
  // The header is stored big-endian: swap bytes if it does not match as read.
  if(magic_number != 0x00000801) {
    std::reverse((char*) &magic_number,     (char*) &magic_number + 4);
    std::reverse((char*) &number_of_labels, (char*) &number_of_labels + 4);
  }
  if(magic_number != 0x00000801) {
    cerr << "ERROR: MNIST data is compressed or corrupt" << endl;
    exit(1);
  }
  vector<UInt> retval;
  for(int i = 0; i < number_of_labels; ++i) {
    unsigned char label = 0;
    file.read( (char*) &label, 1);
    retval.push_back((UInt) label);
  }
  return retval;
}
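
Review note on `read_mnist_labels`: the header could also be decoded without depending on host byte order, which removes the need for the conditional `std::reverse`. A minimal sketch, assuming only the header layout used above (4-byte big-endian fields, magic number `0x00000801` for label files). `read_be_u32`, `is_label_header` and `label_header_example` are hypothetical names, not part of this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not in this PR): read one 32-bit big-endian integer
// from a stream, independent of host endianness.
// Returns false if fewer than 4 bytes could be read.
inline bool read_be_u32(std::istream &in, std::uint32_t &out) {
  unsigned char bytes[4] = {0, 0, 0, 0};
  in.read(reinterpret_cast<char *>(bytes), 4);
  if (in.gcount() != 4) return false;
  out = (std::uint32_t(bytes[0]) << 24) | (std::uint32_t(bytes[1]) << 16) |
        (std::uint32_t(bytes[2]) << 8)  |  std::uint32_t(bytes[3]);
  return true;
}

// Validate a label-file header held in memory and extract the label count.
inline bool is_label_header(const std::string &raw, std::uint32_t &count) {
  std::istringstream in(raw);
  std::uint32_t magic = 0;
  return read_be_u32(in, magic) && magic == 0x00000801u &&
         read_be_u32(in, count);
}

// Usage check on a hand-built 8-byte header announcing 2 labels.
inline void label_header_example() {
  const char hdr[8] = {0, 0, 0x08, 0x01, 0, 0, 0, 2};
  std::uint32_t n = 0;
  assert(is_label_header(std::string(hdr, 8), n));
  assert(n == 2u);
}
```

Reading byte by byte also makes a truncated file detectable, since `gcount()` reports how many bytes actually arrived.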


vector<UInt*> read_mnist_images(string path) {
  // Open in binary mode so the raw bytes are never translated.
  ifstream file(path, ios::binary);
  if( !file.is_open() ) {
    cerr << "ERROR: Failed to open file " << path << endl;
    exit(1);
  }
  int magic_number     = 0;
  int number_of_images = 0;
  int n_rows           = 0;
  int n_cols           = 0;
  file.read( (char*) &magic_number,     4);
  file.read( (char*) &number_of_images, 4);
  file.read( (char*) &n_rows,           4);
  file.read( (char*) &n_cols,           4);
  // The header is stored big-endian: swap bytes if it does not match as read.
  if(magic_number != 0x00000803) {
    std::reverse((char*) &magic_number,     (char*) &magic_number + 4);
    std::reverse((char*) &number_of_images, (char*) &number_of_images + 4);
    std::reverse((char*) &n_rows,           (char*) &n_rows + 4);
    std::reverse((char*) &n_cols,           (char*) &n_cols + 4);
  }
  if(magic_number != 0x00000803) {
    cerr << "ERROR: MNIST data is compressed or corrupt" << endl;
    exit(1);
  }
  NTA_ASSERT(n_rows == 28);
  NTA_ASSERT(n_cols == 28);
  UInt img_size = n_rows * n_cols;
  vector<UInt*> retval;
  for(int i = 0; i < number_of_images; ++i) {
    auto data_raw = new unsigned char[img_size];
    file.read( (char*) data_raw, img_size);
    // Copy the data into an array of UInt's, two entries per pixel.
    auto data = new UInt[2 * img_size];
    // Apply a threshold to the image, yielding a B & W image.
    // Each pixel becomes an "on" bit followed by its complement.
    for(UInt pixel = 0; pixel < img_size; pixel++) {
      data[2 * pixel]     = data_raw[pixel] >= 128 ? 1 : 0;
      data[2 * pixel + 1] = 1 - data[2 * pixel];
    }
    retval.push_back(data);
    delete[] data_raw;
  }
  return retval;
}
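
Review note on the pixel encoding in `read_mnist_images`: each pixel is thresholded at 128 and written as an "on" bit followed by its complement, which is why the input SDR has a trailing dimension of 2. The same step can be sketched with `std::vector` instead of raw `new[]`, so nothing has to be freed by hand. `encode_on_off` and `count_active` are hypothetical names for illustration, not part of this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: binarize grayscale pixels and emit two bits per
// pixel, an "on" bit followed by its complement, as the loop above does.
inline std::vector<unsigned> encode_on_off(
    const std::vector<unsigned char> &pixels, unsigned char threshold = 128) {
  std::vector<unsigned> bits(2 * pixels.size());
  for (std::size_t i = 0; i < pixels.size(); ++i) {
    const unsigned on = pixels[i] >= threshold ? 1u : 0u;
    bits[2 * i]     = on;
    bits[2 * i + 1] = 1u - on;
  }
  return bits;
}

// Count the active bits of an encoded image.
inline std::size_t count_active(const std::vector<unsigned> &bits) {
  std::size_t n = 0;
  for (unsigned b : bits) {
    assert(b <= 1u && "encoded values must be 0 or 1");
    n += b;
  }
  return n;
}
```

With this encoding exactly one of the two bits per pixel is set, so every image activates the same number of input bits (784 for a 28 x 28 image) regardless of how much ink it contains.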


int main(int argc, char **argv) {
  UInt verbosity = 1;
  int train_dataset_iterations = 1;
  int opt;
  while ( (opt = getopt(argc, argv, "tv")) != -1 ) {  // for each option...
    switch ( opt ) {
      case 't':
        train_dataset_iterations += 1;
        break;
      case 'v':
        verbosity = 1;
        break;
      case '?':
        cerr << "Unknown option: '" << char(optopt) << "'!" << endl;
        break;
    }
  }
  UInt train_time = train_dataset_iterations * 60000;

  signal(SIGINT,  keyboard_interrupt_handler);  // install stack trace printout on error
  signal(SIGSEGV, keyboard_interrupt_handler);  // install stack trace printout on error

  SDR input({28, 28, 2});
  SpatialPooler sp(
    /* numInputs */                    input.dimensions,
    /* numColumns */                   {10, 10, 120},
    /* potentialRadius */              0,         // hardcoded elsewhere
    /* potentialPct */                 .0000001,  // hardcoded elsewhere
    /* globalInhibition */             true,
    /* localAreaDensity */             .015,
    /* numActiveColumnsPerInhArea */   -1,
    /* stimulusThreshold */            28,
    /* synPermInactiveDec */           .00928,
    /* synPermActiveInc */             .032,
    /* synPermConnected */             .422,
    /* minPctOverlapDutyCycles */      0.,
    /* dutyCyclePeriod */              1402,
    /* boostStrength */                0,
    /* CPP SP seed */                  0,
    /* spVerbosity */                  verbosity,
    /* wrapAround */                   0          // discarded
    );

  SDR columns({sp.getNumColumns()});
  SDR_Metrics columnStats(columns, 1402);

  if( verbosity ) {
    // Print the min/mean/max potential pool size
    UInt min  = 999999;
    Real mean = 0;
    UInt max  = 0;
    for( UInt cell = 0; cell < sp.getNumColumns(); cell++) {
      auto pool = sp.initMapPotential_( cell, false );
      UInt size = 0;
      for( auto presyn : pool ) { if( presyn ) size++; }
      min   = size < min ? size : min;
      max   = size > max ? size : max;
      mean += size;
    }
    mean /= sp.getNumColumns();
    cerr << "POTENTIAL POOL MIN / MEAN / MAX " << min << " / " << mean << " / " << max << endl;
  }

  SDRClassifier clsr(
    /* steps */         {0},
    /* alpha */         .001,
    /* actValueAlpha */ .3,
                        verbosity);

  // Train
  auto train_images = read_mnist_images("./mnist_data/train-images-idx3-ubyte");
  auto train_labels = read_mnist_labels("./mnist_data/train-labels-idx1-ubyte");
  if(verbosity)
    cout << "Training for " << train_time << " cycles ..." << endl;
  for(UInt i = 0; i < train_time; i++) {
    // Get the input & label
    UInt index  = rand() % train_labels.size();
    UInt *image = train_images[index];
    UInt label  = train_labels[index];

    // Compute & Train
    input.setDense( image );
    sp.compute(input, true, columns);
    ClassifierResult result;
    clsr.compute(sp.getIterationNum(), columns.getFlatSparse(),
      /* bucketIdxList */   {label},
      /* actValueList */    {(Real)label},
      /* category */        true,
      /* learn */           true,
      /* infer */           false,
                            &result);
    if( verbosity and i % 100 == 0 )
      cout << "." << flush;
  }
  if( verbosity ) cout << endl;

  cout << columnStats << endl;

  // Test
  auto test_images = read_mnist_images("./mnist_data/t10k-images-idx3-ubyte");
  auto test_labels = read_mnist_labels("./mnist_data/t10k-labels-idx1-ubyte");
  Real score     = 0;
  UInt n_samples = 0;
  if(verbosity)
    cout << "Testing for " << test_labels.size() << " cycles ..." << endl;
  for(UInt i = 0; i < test_labels.size(); i++) {
    // Get the input & label
    UInt *image = test_images[i];
    UInt label  = test_labels[i];

    // Compute
    input.setDense( image );
    sp.compute(input, false, columns);
    ClassifierResult result;
    clsr.compute(sp.getIterationNum(), columns.getFlatSparse(),
      /* bucketIdxList */   {},
      /* actValueList */    {},
      /* category */        true,
      /* learn */           false,
      /* infer */           true,
                            &result);
    // Check results
    for(auto iter : result) {
      if( iter.first == 0 ) {
        auto *pdf = iter.second;
        auto max  = std::max_element(pdf->begin(), pdf->end());
        UInt cls  = max - pdf->begin();
        if(cls == label)
          score += 1;
        n_samples += 1;
      }
    }
    if( verbosity and i % 100 == 0 )
      cout << "." << flush;
  }
  if( verbosity ) cout << endl;
  cout << "Score: " << score / n_samples << endl;
}
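
Review note on the scoring loop: the prediction is the index of the largest entry in the classifier's probability distribution, and the score is the fraction of test samples where that index equals the label. A minimal standalone sketch of that computation, using plain `std::vector<double>` in place of the classifier result. `argmax` and `accuracy` are hypothetical helper names, not part of this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the largest probability (the first one on ties), i.e. the
// predicted class.
inline std::size_t argmax(const std::vector<double> &pdf) {
  assert(!pdf.empty() && "cannot pick a class from an empty distribution");
  return static_cast<std::size_t>(
      std::distance(pdf.begin(), std::max_element(pdf.begin(), pdf.end())));
}

// Fraction of samples whose predicted class equals the label.
inline double accuracy(const std::vector<std::vector<double>> &pdfs,
                       const std::vector<std::size_t> &labels) {
  assert(pdfs.size() == labels.size());
  if (pdfs.empty()) return 0.0;  // avoid dividing by zero
  std::size_t correct = 0;
  for (std::size_t i = 0; i < pdfs.size(); ++i)
    if (argmax(pdfs[i]) == labels[i]) ++correct;
  return static_cast<double>(correct) / static_cast<double>(pdfs.size());
}
```

The explicit empty-input check is a design choice of this sketch: it returns 0 when no sample was scored instead of dividing by a zero sample count.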
16 changes: 12 additions & 4 deletions src/nupic/algorithms/Connections.cpp
@@ -466,7 +466,9 @@ void Connections::raisePermanencesToThreshold(
   if( segmentThreshold == 0 )
     return;

+  NTA_ASSERT(segment < segments_.size()) << "Accessing segment out of bounds.";
   auto &segData = segments_[segment];
+
   if( segData.numConnected >= segmentThreshold )
     return;

@@ -477,10 +479,12 @@ void Connections::raisePermanencesToThreshold(
   // connected synapses. Then calculate how much to increase the N'th synapse's
   // permanence by such that it becomes a connected synapse.

-  auto minPermSynPtr = synapses.begin() + segmentThreshold - 1;
+  NTA_ASSERT(segmentThreshold <= synapses.size()) << "Threshold too large";
+  NTA_ASSERT(segmentThreshold >= 1) << "Threshold must be >= 1, otherwise overflows here";
+  auto minPermSynPtr = synapses.begin() + segmentThreshold - 1; //FIXME I suspect in edge conditions this causes some err
   // Do a partial sort, it's faster than a full sort. Only minPermSynPtr is in
   // its final sorted position.
-  auto permanencesGreater = [&](Synapse &A, Synapse &B)
+  const auto permanencesGreater = [&](Synapse &A, Synapse &B)
     { return synapses_[A].permanence > synapses_[B].permanence; };
   std::nth_element(synapses.begin(), minPermSynPtr, synapses.end(), permanencesGreater);

@@ -489,15 +493,19 @@ void Connections::raisePermanencesToThreshold(
     return;            // Enough synapses are already connected.

   // Raise the permanence of all synapses in the potential pool uniformly.
-  for( const auto &syn : synapses )
+  for( const auto &syn : synapses ) {
+    NTA_ASSERT(syn < synapses_.size()) << "Synapse out of bounds.";
     updateSynapsePermanence(syn, synapses_[syn].permanence + increment);
+  }
 }


 void Connections::bumpSegment(const Segment segment, const Permanence delta) {
   const vector<Synapse> &synapses = synapsesForSegment(segment);
-  for( const auto &syn : synapses )
+  for( const auto &syn : synapses ) {
+    NTA_ASSERT(syn < synapses_.size()) << "Synapse out of range!";
     updateSynapsePermanence(syn, synapses_[syn].permanence + delta);
+  }
 }
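
Review note on `raisePermanencesToThreshold`: the iterator `synapses.begin() + segmentThreshold - 1` is only valid when `1 <= segmentThreshold <= synapses.size()`, so both bounds are worth checking before the partial sort. A simplified standalone sketch of the technique, assuming a plain vector of permanences instead of the `Connections` segment and synapse handles. `raise_to_threshold` is a hypothetical name, and sorting a copy is a choice of this sketch rather than what the library does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Simplified stand-in: raise every permanence by the same amount so that at
// least `threshold` of them reach the `connected` value.
inline void raise_to_threshold(std::vector<float> &perms, float connected,
                               std::size_t threshold) {
  if (threshold == 0) return;
  // begin() + threshold - 1 must be a valid iterator.
  assert(threshold <= perms.size() && "threshold exceeds synapse count");

  // Partial sort on a copy; afterwards *nth is the threshold-th largest
  // permanence. Only that element is guaranteed to be in sorted position.
  std::vector<float> sorted(perms);
  auto nth = sorted.begin() + (threshold - 1);
  std::nth_element(sorted.begin(), nth, sorted.end(), std::greater<float>());

  const float increment = connected - *nth;
  if (increment <= 0.0f) return;  // enough synapses are already connected

  for (float &p : perms) p += increment;  // uniform raise
}
```

For permanences `{0.1, 0.5, 0.3}`, a connected value of `0.4` and a threshold of 2, the second-largest value is `0.3`, so every permanence is raised by about `0.1`.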


2 changes: 1 addition & 1 deletion src/nupic/algorithms/SDRClassifier.cpp
@@ -30,7 +30,7 @@
 #include <stdio.h>
 #include <string>
 #include <vector>
-#include <algorithm>
+#include <algorithm> // sort


 #include <nupic/algorithms/ClassifierResult.hpp>