This repository has been archived by the owner on Oct 15, 2019. It is now read-only.


Minjie Wang edited this page Apr 19, 2015 · 14 revisions

Walkthrough: MNIST using Minerva

About MNIST

If you are not familiar with the MNIST dataset, please see here.

Classify MNIST using a 3-layer perceptron

The network used here consists of:

  • One input layer of size 784
  • One hidden layer of size 256 with a ReLU non-linearity
  • One classifier layer of size 10 with a softmax loss function

Suppose the minibatch size is 256. Each minibatch has already been converted into two matrices, data and label, of sizes 784x256 and 10x256 respectively (each column of label is a one-hot encoding of the true digit). Then,
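The matrix layout described above can be sketched in plain NumPy; the random data here is only a stand-in for a real MNIST minibatch:

```python
import numpy as np

# Hypothetical minibatch: 256 random "images" and integer class labels.
batch_size = 256
data = np.random.rand(784, batch_size)          # one 784-pixel image per column
digits = np.random.randint(0, 10, batch_size)   # ground-truth digits 0-9

# One-hot encode the digits into a 10 x 256 label matrix,
# matching the layout described above.
label = np.zeros((10, batch_size))
label[digits, np.arange(batch_size)] = 1.0

print(data.shape)   # (784, 256)
print(label.shape)  # (10, 256)
```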

  1. Initialization: Weight and bias matrices are initialized as follows:

    w1 = owl.randn([256, 784], 0.0, 0.01)
    w2 = owl.randn([10, 256], 0.0, 0.01)
    b1 = owl.zeros([256, 1])
    b2 = owl.zeros([10, 1])
  2. Feed-forward Propagation:

    a1 = owl.elewise.relu(w1 * data + b1)  # hidden layer
    a2 = owl.conv.softmax(w2 * a1 + b2)    # classifier layer
  3. Backward Propagation:

    s2 = a2 - label                                 # classifier layer
    s1 = owl.elewise.relu_back(w2.trans() * s2, a1) # hidden layer
    gw2 = s2 * a1.trans()                           # gradient of w2
    gw1 = s1 * data.trans()                         # gradient of w1
    gb2 = s2.sum(1)                                 # gradient of b2
    gb1 = s1.sum(1)                                 # gradient of b1
  4. Update:

    w1 -= lr * gw1
    w2 -= lr * gw2
    b1 -= lr * gb1
    b2 -= lr * gb2
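The four steps above can be sketched end-to-end in plain NumPy (a minimal sketch with synthetic data standing in for one (data, label) minibatch; the softmax helper is written here for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: initialization (same shapes and scale as the owl code above)
w1 = rng.normal(0.0, 0.01, (256, 784))
w2 = rng.normal(0.0, 0.01, (10, 256))
b1 = np.zeros((256, 1))
b2 = np.zeros((10, 1))

# Synthetic minibatch standing in for one (data, label) pair
data = rng.random((784, 256))
label = np.zeros((10, 256))
label[rng.integers(0, 10, 256), np.arange(256)] = 1.0

def softmax(x):
    e = np.exp(x - x.max(axis=0, keepdims=True))  # numerically stabilized
    return e / e.sum(axis=0, keepdims=True)

# Step 2: feed-forward propagation
a1 = np.maximum(w1 @ data + b1, 0.0)   # ReLU hidden layer
a2 = softmax(w2 @ a1 + b2)             # softmax classifier layer

# Step 3: backward propagation
s2 = a2 - label                        # softmax + cross-entropy gradient
s1 = (w2.T @ s2) * (a1 > 0)            # ReLU backward, masked by a1
gw2 = s2 @ a1.T                        # gradient of w2
gw1 = s1 @ data.T                      # gradient of w1
gb2 = s2.sum(axis=1, keepdims=True)    # gradient of b2
gb1 = s1.sum(axis=1, keepdims=True)    # gradient of b1

# Step 4: update
lr = 0.01
w1 -= lr * gw1
w2 -= lr * gw2
b1 -= lr * gb1
b2 -= lr * gb2
```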

Putting it all together, we get:

import owl
import owl.conv
import owl.elewise
import mnist_io, sys
# initialize the system
owl.initialize(sys.argv)
gpu = owl.create_gpu_device(0)
owl.set_device(gpu)
# training parameters and weights
MAX_EPOCH=10
lr = 0.01
w1 = owl.randn([256, 784], 0.0, 0.01)
w2 = owl.randn([10, 256], 0.0, 0.01)
b1 = owl.zeros([256, 1])
b2 = owl.zeros([10, 1])
(train_set, test_set) = mnist_io.load_mb_from_mat("mnist.dat", 256)
# training
for epoch in range(MAX_EPOCH):
  for (data, label) in train_set:
    # ff
    a1 = owl.elewise.relu(w1 * data + b1)  # hidden layer
    a2 = owl.conv.softmax(w2 * a1 + b2)    # classifier layer
    # bp
    s2 = a2 - label                                 # classifier layer
    s1 = owl.elewise.relu_back(w2.trans() * s2, a1) # hidden layer
    gw2 = s2 * a1.trans()                           # gradient of w2
    gw1 = s1 * data.trans()                         # gradient of w1
    gb2 = s2.sum(1)                                 # gradient of b2
    gb1 = s1.sum(1)                                 # gradient of b1
    # update
    w1 -= lr * gw1
    w2 -= lr * gw2
    b1 -= lr * gb1
    b2 -= lr * gb2
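After training, the prediction for each column of a2 is simply the class with the largest score, which can be compared against the label matrix. A minimal sketch of such a check (the accuracy helper is a hypothetical name, not part of the owl API):

```python
import numpy as np

def accuracy(a2, label):
    # Predicted digit = argmax over the 10 class scores in each column;
    # true digit = position of the 1 in the one-hot label column.
    pred = a2.argmax(axis=0)
    truth = label.argmax(axis=0)
    return (pred == truth).mean()

# e.g. a perfect prediction gives accuracy 1.0
label = np.eye(10)[:, :5]
print(accuracy(label, label))  # 1.0
```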

Classify MNIST using Convolution Neural Network
