Layers
All layer configurations go into the netconfig section, in the following format:
netconfig = start
layer[from->to] = layer_type:nick
netconfig = end
- from is an integer; 0 means the input data
- to is an integer; the largest integer in the layer configuration part is the output
- layer_type is one of the layer types described below
- nick is an optional name for the layer
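For example, a small sketch of two such lines (layer types such as fullc and relu are described later on this page; the nicks are arbitrary):
layer[0->1] = fullc:fc1
layer[1->2] = relu:rl1
Here node 0 is the input data, fullc and relu are layer types, and fc1 and rl1 are optional nicks.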
Layers that contain weights (Connection Layers, Convolution Layers) require random weight initialization. By default the following configuration is used globally:
random_type = gaussian
init_sigma = 0.01
We also provide the Xavier initialization method, which can be enabled with the configuration
random_type = xavier
The global setting can be overridden in the layer configuration, e.g.
# global setting
random_type = gaussian
netconfig = start
eta = 0.1
layer[0->1] = fullc:fc1
# local setting start
nhidden = 50
random_type = xavier
# local setting end
layer[1->2] = relu
layer[2->3] = fullc
# local setting start
nhidden = 6
init_sigma = 0.005
wmat:lr = 0.2
# local setting end
netconfig = end
With this configuration, the network globally initializes weights with the Gaussian method, but the fc1 layer overrides this and is initialized with the Xavier method. The fully connected layer without a nick is initialized with Gaussian random numbers with mu=0 and sigma=0.005, and its weight learning rate (wmat:lr = 0.2) also differs from the global setting.
This page introduces the layers supported by cxxnet, including
- Connection Layer
- Activation Layer
- Convolution and Pooling Layer
- Normalization Layer
Connection Layer
Connection Layers are used to connect two nodes. We provide three connection layers: Flatten Layer, Fully Connection Layer, and Drop Connection Layer.
- Flatten Layer is used to flatten the output of a convolution layer. After flattening, the convolution output can be used in a feed-forward neural network (a combined sketch follows this list). Here is an example:
layer[15->16] = flatten
- Fully Connection Layer is the basic element in a feed-forward neural network; nhidden sets the number of hidden units.
layer[18->19] = fullc
nhidden = 1024
- Drop Connection Layer is still experimental. It randomly drops connections between the two layers.
layer[18->19] = dropconn
threshold = 0.5
nhidden = 1024
- threshold is the probability threshold for dropping an edge.
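As a combined illustration, here is a minimal sketch (the node numbers, nick, and nhidden value are purely illustrative) that flattens a convolution output and feeds it into a fully connected layer:
layer[15->16] = flatten
layer[16->17] = fullc:fc_out
nhidden = 1024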
Activation Layer
We provide common activation layers including Softmax, Rectified Linear, Sigmoid, Tanh, Soft Plus, and so on. Here we treat Dropout as a special activation layer. Layer declarations follow the general configuration format; a combined sketch follows the list below.
layer[from_num->to_num] = layer_type:nick
- Rectified Linear needs to_num different from from_num, e.g.
layer[4->5] = relu:rl3
- Tanh needs to_num different from from_num, e.g.
layer[2->3] = tanh:th2
- Sigmoid needs to_num different from from_num, e.g.
layer[2->3] = sigmoid:sg2
- Soft Plus needs to_num different from from_num, e.g.
layer[2->3] = softplus:sp2
- Dropout Layer needs to_num equal to from_num, e.g.
layer[3->3] = dropout:dp
threshold = 0.5
- threshold is the probability threshold for dropping a unit.
- Softmax Layer needs to_num equal to from_num, e.g.
layer[5->5] = softmax:sm
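Putting these pieces together, a minimal sketch of a small feed-forward classifier (node numbers, nicks, and nhidden values chosen purely for illustration) could look like:
netconfig = start
layer[0->1] = fullc:fc1
nhidden = 128
layer[1->2] = relu
layer[2->2] = dropout
threshold = 0.5
layer[2->3] = fullc:fc2
nhidden = 10
layer[3->3] = softmax
netconfig = end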
Convolution and Pooling Layer
Our convolution implementation is the fastest so far, and it is extremely easy to use. The configuration looks like
layer[0->1] = conv
kernel_size = 11
stride = 4
nchannel = 96
- kernel_size is the convolution kernel size
- stride is the stride of the convolution operation
- nchannel is the number of output channels
- temp_col_max is the maximum size of the temporary buffer used in the convolution operation. The default value is 64, meaning the maximum size of temp_col is 64MB. Adjusting this variable may boost training speed, especially when the input size of the convolution network is small.
Currently we provide 3 pooling methods: Sum Pooling, Max Pooling, and Average Pooling. All pooling layers share the same options, kernel_size and stride; a combined convolution and pooling sketch follows the examples below.
- Sum Pooling needs to_num different from from_num, e.g.
layer[4->5] = sum_pooling
kernel_size = 3
stride = 2
- Max Pooling needs to_num different from from_num, e.g.
layer[4->5] = max_pooling
kernel_size = 3
stride = 2
- Average Pooling needs to_num different from from_num, e.g.
layer[4->5] = avg_pooling
kernel_size = 3
stride = 2
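For example, a minimal sketch (node numbers and parameter values are illustrative, loosely following the convolution example above) that stacks a convolution layer, an activation, and a max pooling layer:
layer[0->1] = conv
kernel_size = 11
stride = 4
nchannel = 96
layer[1->2] = relu
layer[2->3] = max_pooling
kernel_size = 3
stride = 2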
Normalization Layer
Currently we provide Local Response Normalization (LRN) for convolution layers. LRN normalizes the response across nearby kernels. Details can be found in Alex Krizhevsky's ImageNet (AlexNet) paper.
layer[3->4] = lrn
local_size = 5
alpha = 0.001
beta = 0.75
knorm = 1
- local_size is the number of nearby kernel maps included in the normalization
- alpha, beta, and knorm are the normalization parameters
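For reference, the LRN formula from the AlexNet paper is sketched below; mapping knorm, alpha, and beta onto the paper's constants k, alpha, and beta is our reading of the options above:
b^i_{x,y} = a^i_{x,y} / ( knorm + alpha * \sum_j (a^j_{x,y})^2 )^{beta}
where the sum runs over the local_size kernel maps j adjacent to map i at the same spatial position (x, y).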