
Introduction

All layer configurations come in the following form:

netconfig = start
layer[from->to] = layer_type:nick
netconfig = end
  • from is an integer; node 0 is the input data
  • to is an integer; the node with the largest index in the layer configuration is the output
  • layer_type is described below
  • nick is an optional name for the layer
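
For example, a minimal feed-forward network might be declared as follows (a sketch only; node 0 is the input data, node 3 is the output, and the fullc and relu layer types are described later on this page):

netconfig = start
layer[0->1] = fullc:fc1
  nhidden = 100
layer[1->2] = relu
layer[2->3] = fullc:fc2
  nhidden = 10
netconfig = end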

Layers that contain weights (connection layers, convolution layers) require random weight initialization. By default the following configuration is used globally:

random_type = gaussian
init_sigma = 0.01

We also provide the Xavier initialization method, which can be enabled with the configuration

random_type = xavier
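
For background, the Xavier method (Glorot & Bengio, 2010) scales the random weights by the layer's fan-in and fan-out instead of using a fixed sigma; in the common Gaussian variant (the exact variant cxxnet implements is not spelled out here) the variance is

Var(W) = 2 / (n_in + n_out)

where n_in and n_out are the input and output dimensions of the weight matrix.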

Global settings can be overridden in the layer configuration, e.g.

# global setting
random_type = gaussian
netconfig = start
eta = 0.1
layer[0->1] = fullc:fc1
  # local setting start
  nhidden = 50
  random_type = xavier
  # local setting end 
layer[1->2] = relu
layer[2->3] = fullc
  # local setting start
  nhidden = 6
  init_sigma = 0.005
  wmat:lr = 0.2
  # local setting end
netconfig = end

With this configuration, the fc1 layer is initialized with the Xavier method, while the fully connected layer without a nick is initialized with Gaussian random numbers with mu=0, sigma=0.005. The fully connected layer without a nick also uses a learning rate (wmat:lr = 0.2) that differs from the global setting.

In short, the network uses the Gaussian method to initialize weights globally, but in fc1 the weights are initialized with the Xavier method.

This page will introduce layers supported by cxxnet, including

  • Connection Layer
  • Activation Layer
  • Convolution and Pooling Layer
  • Normalization Layer

Connection Layer

A connection layer is used to connect two nodes. We provide three connection layers: the Flatten layer, the Fully Connected layer, and the Drop Connection layer.

Flatten Layer
  • The Flatten layer flattens the output of a convolution layer. After flattening, the convolution output can be used in the feed-forward part of the network, as shown in the sketch at the end of this section. Here is an example:
layer[15->16] = flatten
Fully Connection Layer
  • The fully connected layer is the basic building block of a feed-forward neural network.
layer[18->19] = fullc
  nhidden = 1024
Drop Connection Layer
  • The Drop Connection layer is still experimental. It randomly drops connections between two nodes.
layer[18->19] = dropconn
  threshold = 0.5
  nhidden = 1024
  • threshold is the threshold for dropping a connection (edge).
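
As an illustrative sketch of how connection layers fit together (node indices and parameter values here are arbitrary; the conv layer type is described below), a convolution output is flattened before being fed into a fully connected layer:

layer[14->15] = conv
  kernel_size = 3
  stride = 1
  nchannel = 32
layer[15->16] = flatten
layer[16->17] = fullc
  nhidden = 1024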

Activation Layer

We provide common activation layers including Softmax, Rectified Linear, Sigmoid, Tanh, Soft Plus, and so on. Here we treat Dropout as a special activation layer. Layer declarations should follow the general configuration format

layer[ from_num -> to_num ] = layer_type:nick

Rectified Linear
  • Rectified Linear needs to_num different from from_num, e.g.
layer[4->5] = relu:rl3
Tanh
  • Tanh needs to_num different from from_num, e.g.
layer[2->3] = tanh:th2
Sigmoid
  • Sigmoid needs to_num different from from_num, e.g.
layer[2->3] = sigmoid:sg2
Soft Plus
  • Soft Plus needs to_num different from from_num, e.g.
layer[2->3] = softplus:sp2
Dropout
  • The Dropout layer needs to_num equal to from_num, e.g.
layer[3->3] = dropout:dp
  threshold = 0.5
  • threshold is the threshold for dropping a unit's output.
Softmax
  • The Softmax layer needs to_num equal to from_num, e.g.
layer[5->5] = softmax:sm
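
For reference, the activations listed above follow the standard definitions (which cxxnet is assumed to implement in the usual way):

relu(x) = max(0, x)
sigmoid(x) = 1 / (1 + exp(-x))
tanh(x) = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
softplus(x) = log(1 + exp(x))
softmax(x)_i = exp(x_i) / sum_j exp(x_j)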

Convolution Layer

Our convolution implementation is the fastest so far, and it is extremely easy to use. The configuration looks like

layer[0->1] = conv
  kernel_size = 11
  stride = 4
  nchannel = 96
  • kernel_size is the convolution kernel size
  • stride is the stride of the convolution operation
  • nchannel is the number of output channels
  • temp_col_max is the maximum size of the temporary buffer used in the convolution operation. The default value is 64, meaning the maximum size of temp_col is 64MB. Adjusting this variable may boost training speed, especially when the input size of the convolution network is small.
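
As a sanity check on the spatial dimensions (assuming no padding, which this configuration does not expose), the output size of a convolution is

out = floor((in - kernel_size) / stride) + 1

so, for example, a 227x227 input with kernel_size = 11 and stride = 4 produces a 55x55 output map per channel.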

Pooling Layer

Currently we provide 3 pooling methods: Sum Pooling, Max Pooling, and Average Pooling. All pooling layers share the same options: stride and kernel_size.

Sum Pooling
  • Sum Pooling needs to_num different from from_num, e.g.
layer[4->5] = sum_pooling
  kernel_size = 3
  stride = 2
Max Pooling
  • Max Pooling needs to_num different from from_num, e.g.
layer[4->5] = max_pooling
  kernel_size = 3
  stride = 2
Average Pooling
  • Average Pooling needs to_num different from from_num, e.g.
layer[4->5] = avg_pooling
  kernel_size = 3
  stride = 2

Normalization Layer

Currently we provide Local Response Normalization (LRN) for convolution layers. LRN normalizes the responses of nearby kernels. Details can be found in Alex Krizhevsky's paper.

Local Response Normalization
layer[3->4] = lrn
  local_size = 5
  alpha = 0.001
  beta = 0.75
  knorm = 1
  • local_size sets the size of the nearby-kernel window over which the response is normalized
  • alpha, beta, and knorm are normalization parameters.
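
For reference, assuming cxxnet follows the formulation in Krizhevsky et al. (with local_size playing the role of n and knorm the role of k), the normalized response of kernel i at position (x, y) is

b^i_{x,y} = a^i_{x,y} / ( knorm + alpha * sum_j (a^j_{x,y})^2 )^beta

where the sum runs over the local_size kernel maps nearest to kernel i.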