Learning the Caffe Architecture (Part 1): Deep Network Definition Based on the Google Protocol Buffer Open-Source Project


   When studying deep learning, you inevitably have to choose a framework that suits you. The mainstream deep learning frameworks today include Caffe, Theano, and Torch7. Caffe, developed and maintained by Yangqing Jia and others, is attracting more and more attention from learners thanks to its concise and readable code, high runtime efficiency, simple CPU/GPU switching, and large user group.

   Caffe is written in C++, released under a BSD license with open source code, and provides MATLAB and Python interfaces. For a detailed introduction to Caffe, see the paper "Caffe: Convolutional Architecture for Fast Feature Embedding" and the website http://caffe.berkeleyvision.org/. This article first looks at how networks are defined in Caffe. The source code can be downloaded from GitHub and contains files of type .cpp, .prototxt, .sh, .m, .py, and so on; Caffe network definitions live in the .prototxt files, so we analyze those files first.

1. Vision Layers (header: ./include/caffe/vision_layers.hpp)

(i) Convolution Layer

Example:

layers {
  name: "conv1"
  type: CONVOLUTION  # layer type
  bottom: "data"
  top: "conv1"
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}
The parameters are explained below:

top and bottom: the output and input blobs

convolution_param:

       Required:

              num_output (c_o): the number of filters

              kernel_size (or kernel_h and kernel_w): specifies the height and width of each filter

       Strongly recommended:

              weight_filler [default type: 'constant' value: 0]: the filter weight initializer

       Optional:

              bias_term [default true]: whether to learn and apply a set of additive biases to the filter outputs

              pad (or pad_h and pad_w) [default 0]: the number of pixels to (implicitly) add to each side of the input

              stride (or stride_h and stride_w) [default 1]: the interval at which to apply the filters to the input

              group (g) [default 1]: if g > 1, restrict the connectivity of each filter to a subset of the input channels

 (ii) Pooling Layer

Example:

layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3 # pool over a 3x3 region
    stride: 2      # step two pixels (in the bottom blob) between pooling regions
  }
}
The parameters are explained below:

   top and bottom: the output and input blobs

   pooling_param:

         Required:

               kernel_size (or kernel_h and kernel_w): specifies the height and width of the pooling region

         Optional:

               pool [default MAX]: the pooling method; currently MAX, AVE, or STOCHASTIC

               pad (or pad_h and pad_w) [default 0]: the number of pixels to (implicitly) add to each side of the input

               stride (or stride_h and stride_w) [default 1]: the interval at which to apply pooling

(iii) Local Response Normalization (LRN)

The parameters are explained below:

   Layer type: LRN

   lrn_param:

          Optional:

                local_size [default 5]: the number of channels to sum over (for cross-channel LRN) or the side length of the square region to sum over (for within-channel LRN)

                alpha [default 1]: the scaling parameter

                beta [default 0.75]: the exponent

                norm_region [default ACROSS_CHANNELS]: whether to sum over adjacent channels (ACROSS_CHANNELS) or nearby spatial locations (WITHIN_CHANNEL)

(In effect, the LRN operation performs a kind of lateral inhibition over local input regions: each input value is divided by (1 + (alpha/n) * sum_i(x_i^2))^beta, where n is the size of the local region.)
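
A typical definition looks like the following, a sketch in the AlexNet style (the blob names and parameter values here are illustrative, not mandatory):

layers {
  name: "norm1"
  type: LRN   # layer type
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}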

2. Loss Layers

The loss layer is what drives learning in a neural network: the forward pass computes the loss, and the backward pass uses the loss to compute the gradient.

(i) Softmax

    Layer type: SOFTMAX_LOSS
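
A minimal sketch (the blob names "fc8" and "label" are assumed for illustration); the layer takes predicted scores and ground-truth labels and computes the multinomial logistic loss:

layers {
  name: "loss"
  type: SOFTMAX_LOSS   # layer type
  bottom: "fc8"        # predicted scores
  bottom: "label"      # ground-truth labels
  top: "loss"
}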

(ii)Sum-of-Squares / Euclidean

     Layer type: EUCLIDEAN_LOSS
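
The Euclidean loss computes the sum of squared differences between its two inputs, 1/(2N) * sum ||x1 - x2||^2. A minimal sketch (blob names assumed):

layers {
  name: "loss"
  type: EUCLIDEAN_LOSS   # layer type
  bottom: "pred"
  bottom: "target"
  top: "loss"
}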

(iii)Hinge / Margin

Example:

# L1 Norm
layers {
  name: "loss"
  type: HINGE_LOSS   # layer type
  bottom: "pred"
  bottom: "label"
}

# L2 Norm
layers {
  name: "loss"
  type: HINGE_LOSS
  bottom: "pred"
  bottom: "label"
  top: "loss"
  hinge_loss_param {
    norm: L2
  }
}
    Optional parameters:

          norm [default L1]: the norm used; currently L1 and L2 are available

(IV)Sigmoid Cross-Entropy

     Layer type: SIGMOID_CROSS_ENTROPY_LOSS
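
A minimal sketch (blob names assumed); this loss is often used for predicting multiple independent probability targets, as in Caffe's autoencoder example:

layers {
  name: "loss"
  type: SIGMOID_CROSS_ENTROPY_LOSS   # layer type
  bottom: "pred"
  bottom: "target"
  top: "loss"
}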

(V)Infogain

     Layer type: INFOGAIN_LOSS
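
A minimal sketch. INFOGAIN_LOSS generalizes the multinomial logistic loss with an information-gain matrix H, loaded from a binaryproto file; the path shown here is a hypothetical placeholder:

layers {
  name: "loss"
  type: INFOGAIN_LOSS   # layer type
  bottom: "prob"
  bottom: "label"
  top: "loss"
  infogain_loss_param {
    source: "infogain.binaryproto"   # hypothetical path to the H matrix
  }
}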

(VI)Accuracy and Top-k
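
The ACCURACY layer scores the output as the fraction of instances that are classified correctly; it has no backward step and is typically included only in the TEST phase. A minimal sketch (blob names assumed; an optional accuracy_param with top_k scores top-k correctness instead):

layers {
  name: "accuracy"
  type: ACCURACY   # layer type
  bottom: "fc8"
  bottom: "label"
  top: "accuracy"
  include: { phase: TEST }
}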


3. Activation / Neuron Layers

(i)ReLU / Rectified-Linear and Leaky-ReLU

Example:

layers {
  name: "relu1"
  type: RELU  # layer type
  bottom: "conv1"
  top: "conv1"
}
The parameters are explained below:

relu_param:

           Optional:

            negative_slope [default 0]: specifies the slope applied to negative inputs (for x < 0 the output is negative_slope * x)

(ReLU is a commonly used activation function because it converges quickly and does not saturate easily. Given an input x, it outputs x if x > 0 and negative_slope * x otherwise; when negative_slope is not set, this is equivalent to the standard ReLU. It also supports in-place computation, meaning the bottom and top blobs can be the same, which saves memory.)



(ii)Sigmoid

Example:

layers {
  name: "encode1neuron"
  bottom: "encode1"
  top: "encode1neuron"
  type: SIGMOID   # layer type
}
(iii)TanH / Hyperbolic Tangent

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: TANH  # layer type
}
(IV)Absolute Value

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: ABSVAL  # layer type
}
(V)Power

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: POWER  # layer type
  power_param {
    power: 1
    scale: 1
    shift: 0
  }
}
The parameters are explained below:

power_param:

       Optional:

              power [default 1], scale [default 1], shift [default 0]

(The output equals (shift + scale * x) ^ power.)

(VI) BNLL (binomial normal log likelihood)

Example:

layers {
  name: "layer"
  bottom: "in"
  top: "out"
  type: BNLL  # layer type
}
(The output equals log(1 + exp(x)).)


4. Data Layers

The parameters differ according to how data is fed into the network.

(i)Database

Example:

layers {
  name: "mnist"
  # the DATA layer loads a LevelDB or LMDB database for high-throughput input
  type: DATA   # layer type
  # the 1st top is the data itself: the name is only convention
  top: "data"
  # the 2nd top is the ground truth: the name is only convention
  top: "label"
  # the DATA layer configuration
  data_param {
    # path to the DB
    source: "examples/mnist/mnist_train_lmdb"
    # database type: LEVELDB or LMDB (LMDB supports concurrent reads)
    backend: LMDB
    # batch size
    batch_size: 64
  }
  # common data transformations
  transform_param {
    # feature scaling coefficient: this maps the [0, 255] MNIST data to [0, 1]
    scale: 0.00390625  # 1/256
  }
}
The parameters are explained below:

       Required:

              source: the name of the database directory

              batch_size: the number of inputs to process at a time

        Optional:

              backend [default LEVELDB]: choose whether to use a LEVELDB or LMDB database

              rand_skip: skip up to this number of inputs at the beginning; useful for asynchronous SGD

(ii) In-Memory

         Layer type: MEMORY_DATA

         Required parameters: batch_size, channels, height, width: specify the size of the data chunks to read from memory

(Reads data directly from memory without copying it. To use it, call MemoryDataLayer::Reset (from C++) or Net.set_input_arrays (from Python) to specify the source data.)
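
A minimal sketch of the layer definition (the dimensions shown are illustrative):

layers {
  name: "data"
  type: MEMORY_DATA   # layer type
  top: "data"
  top: "label"
  memory_data_param {
    batch_size: 32
    channels: 3
    height: 227
    width: 227
  }
}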

(iii) HDF5 Input

          Layer type: HDF5_DATA

          Required parameters: source: the name of a text file listing the HDF5 filenames to read; batch_size
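
A minimal sketch (the source path is a hypothetical placeholder; the file it names should list one HDF5 file per line):

layers {
  name: "data"
  type: HDF5_DATA   # layer type
  top: "data"
  top: "label"
  hdf5_data_param {
    source: "examples/hdf5/train_files.txt"   # hypothetical list of HDF5 files
    batch_size: 64
  }
}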

(IV) HDF5 Output (performs the opposite function of the other data layers: it writes its input blobs to disk)

           Layer type: HDF5_OUTPUT

           Required parameters:

          file_name: the name of the file to write
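
A minimal sketch (the output filename is a hypothetical placeholder):

layers {
  name: "output"
  type: HDF5_OUTPUT   # layer type
  bottom: "data"
  bottom: "label"
  hdf5_output_param {
    file_name: "output.h5"   # hypothetical output path
  }
}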

(V) Images

            Layer type: IMAGE_DATA

            Parameters:

                 Required:

                       source: the name of a text file, each line giving an image filename and a label

                       batch_size: the number of images to batch together

                  Optional:

                       shuffle [default false]

                       new_height, new_width: if provided, resize all images to this size
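
A minimal sketch (the source path is a hypothetical placeholder):

layers {
  name: "data"
  type: IMAGE_DATA   # layer type
  top: "data"
  top: "label"
  image_data_param {
    source: "data/train_list.txt"   # hypothetical list: image path + label per line
    batch_size: 32
    new_height: 256   # resize all images to 256 x 256
    new_width: 256
  }
}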

(VI) Windows

              Layer type: WINDOW_DATA

(VII) Dummy

              Layer type: DUMMY_DATA (DUMMY_DATA is for development and debugging; see DummyDataParameter.)


5. Common Layers

(I) Inner Product (fully connected layer)

Example:

layers {
  name: "fc8"
  type: INNER_PRODUCT  # layer type
  blobs_lr: 1          # learning rate multiplier for the filters
  blobs_lr: 2          # learning rate multiplier for the biases
  weight_decay: 1      # weight decay multiplier for the filters
  weight_decay: 0      # weight decay multiplier for the biases
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
  bottom: "fc7"
  top: "fc8"
}
The parameters are explained below:

inner_product_param:

          Required:

          num_output (c_o): the number of filters

          Strongly recommended:

          weight_filler [default type: 'constant' value: 0]

          Optional:

          bias_filler [default type: 'constant' value: 0]

          bias_term [default true]: whether to learn and apply a set of additive biases to the filter outputs

(ii) Splitting

           Layer type: SPLIT

(Splits an input blob into multiple output blobs, for cases where a blob is consumed by multiple output layers; see the sketch below.)
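
A minimal sketch (blob names assumed); in practice Caffe usually inserts SPLIT layers automatically when one top blob feeds several layers:

layers {
  name: "split"
  type: SPLIT   # layer type
  bottom: "data"
  top: "data1"
  top: "data2"
}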

(iii) Flattening

           Layer type: FLATTEN

(Flattens an input of shape n * c * h * w into a simple vector output of shape n * (c*h*w) * 1 * 1.)
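
A minimal sketch (blob names assumed):

layers {
  name: "flatten"
  type: FLATTEN   # layer type
  bottom: "conv1"
  top: "flat1"
}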

(IV)Concatenation

Example:

layers {
  name: "concat" 
  bottom: "in1"
  bottom: "in2"
  top: "out"
  type: CONCAT   # layer type
  concat_param {
    concat_dim: 1
  }
}
Optional parameters:

      concat_dim [default 1]: 0 to concatenate along num, 1 to concatenate along channels

(V) Slicing (divides an input blob along a given dimension (num or channel) into multiple output blobs)

Example:

layers {
  name: "slicer_label"
  type: SLICE   #层类型
  bottom: "label"
  ## Example of label with a shape N x 3 x 1 x 1
  top: "label1"
  top: "label2"
  top: "label3"
  slice_param {
      slice_dim: 1    # target dimension: 0 for num, 1 for channel
      # each slice_point is an index in the selected dimension; the number of
      # slice points must equal the number of top blobs minus one
      slice_point: 1
      slice_point: 2
  }
}
(VI)Elementwise Operations

      Layer type: ELTWISE
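
(Performs element-wise operations on its inputs. A minimal sketch; the blob names are assumed, and the operation in eltwise_param can be PROD, SUM, or MAX:)

layers {
  name: "sum"
  type: ELTWISE   # layer type
  bottom: "in1"
  bottom: "in2"
  top: "out"
  eltwise_param {
    operation: SUM   # element-wise sum; PROD and MAX are also available
  }
}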

(VII)Argmax

      Layer type: ARGMAX
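
(Computes the index of the maximum value of its input, typically placed after a classification output. A minimal sketch; the blob names are assumed, and argmax_param with top_k [default 1] is optional:)

layers {
  name: "argmax"
  type: ARGMAX   # layer type
  bottom: "prob"
  top: "argmax"
  argmax_param {
    top_k: 1
  }
}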

(VIII)Softmax

      Layer type: SOFTMAX
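
(Computes the softmax of its input without a loss, which is useful for producing probabilities at deployment time. A minimal sketch, blob names assumed:)

layers {
  name: "prob"
  type: SOFTMAX   # layer type
  bottom: "fc8"
  top: "prob"
}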

(IX) Mean-Variance Normalization

      Layer type: MVN
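
(Normalizes its input to zero mean and, by default, unit variance. A minimal sketch, blob names assumed:)

layers {
  name: "mvn1"
  type: MVN   # layer type
  bottom: "data"
  top: "mvn1"
}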


From the definitions above we can also see that the Caffe framework is not yet complete and needs the sustained effort of more contributors, especially talented young people. This article will be updated in the future.

Original article: http://blog.csdn.net/linzertling/article/details/44648737
