Introduction
Inception v2 and Inception v3 were presented in a single paper, so we cover both of them in a single blog post as well.
Google's Inception family of models was proposed to tackle two problems in CNN classifiers. The first is how to keep classification performance improving as network depth grows, rather than hitting the performance plateau that a plain deep stack such as VGG runs into beyond a certain depth (the same problem ResNet targets). The second is how to cut a model's compute and memory costs substantially while still improving accuracy, or at least not losing any. Of the two, the authors care especially about the second: in today's mobile-first world, deploying a complex CNN on devices where both compute and storage are limited, and running it efficiently, is where the real practical value lies.
In Inception v1, the GoogLeNet model used in the competition had only about 5 million parameters. By comparison, AlexNet used about 60 million and VGG as many as 180 million (mostly, of course, because the latter two employ very large FC layers).
In Inception v2 the authors explored further ways to reduce computation and trainable parameters, and the resulting model achieved higher state-of-the-art classification accuracy with fewer training parameters.
General principles for model design
For the 'alchemy' of designing and training CNN models, the Googlers distilled their experiments and reflection into the following as-yet-unfalsified 'alchemical rules'. The design of Inception v2/v3 follows directly from them.
Use bottlenecks with care
If classification accuracy is the priority, avoid bottleneck modules where you can (see my humble earlier post on Inception v1 for background), and above all do not use them in the early stages of the model. The authors view a CNN as a DAG (directed acyclic graph) in which information flows from bottom to top, and every bottleneck discards part of that information. So when you do use bottlenecks to save compute and memory, go easy on them: do not let a 1x1 conv slash the channel count of the feature maps too drastically, and if a reduction module is truly necessary, place it in the later layers of the model.
Wider layers do help
Setting aside the extra compute and memory they cost, adding kernels to each conv layer helps the model capture local information and represent it more effectively; more parameters per layer buys more expressive power (as the saying goes, an inch longer, an inch stronger). It also makes the model converge faster, in the sense of fewer iterations overall, not necessarily less total wall-clock training time.
Deeper layers with small feature maps can be compressed hard
This one is a purely empirical finding. For the feature maps late in the network, you can use a bottleneck-style module to shrink the channel count before an expensive op such as a 3x3 conv. The 1x1 reduction not only leaves accuracy intact but even speeds up convergence. The authors' tentative explanation is that on the small feature maps of the later stages, adjacent units (i.e., channels) are strongly correlated (there is plenty of redundancy), so the output channels can be squeezed aggressively without fear of losing information; if anything, the information comes out better organized. (I admit this paragraph reads like hand-waving at first pass, and I am still somewhat in the fog myself. Since when did SCI papers get as hazy as Li Shangyin's poetry?)
Balance the network's depth and width
The Googlers treat network design as an optimization problem: given fixed compute and memory budgets, combine and stack layers and modules so as to maximize classification accuracy. This, naturally, is also the core idea of the currently fashionable AutoML.
Their conclusion (again distilled from experiments) is that a successful CNN must grow its depth and width together: a tall-and-skinny or a short-and-fat network does worse than a well-proportioned one.
Changes introduced in Inception v2
Further factorizing convolutions with large kernels
The sparse-representation idea of the inception module in v1 carries over nicely into v2. If a sparse inception module can represent multi-scale information so effectively, why not go one step further and factorize its large-kernel conv layers as well? The Network in Network paper argued that an expressive complex network can be composed of simpler small networks, so increasing the compositional depth a little more might bring us that much closer to the combinatorial complexity of biological neurons. The figure below shows the inception module used in Inception v1.
Factorizing a large kernel into stacked smaller kernels
The first step was to replace a single 5x5 conv with two stacked 3x3 convs, which cuts the cost to about (3x3 + 3x3)/(5x5) = 72% of the original. The figure below shows the effectiveness of this replacement.
Applying it directly upgrades the inception module we used in Inception v1 into the new module shown below.
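To make the 72% figure concrete, here is a quick sanity check in Python (my own back-of-the-envelope sketch, not from the paper's code; it assumes C input channels and C output channels at every layer, so a kxk conv costs k*k*C*C multiply-accumulates per output position):

# Cost comparison: one 5x5 conv vs. two stacked 3x3 convs, C channels throughout.
def conv_cost(kh, kw, channels):
    """Multiply-accumulates per output position for a kh x kw conv layer."""
    return kh * kw * channels * channels

C = 64  # any value works; the ratio is independent of C
cost_5x5 = conv_cost(5, 5, C)
cost_two_3x3 = conv_cost(3, 3, C) + conv_cost(3, 3, C)
print(cost_two_3x3 / cost_5x5)  # 18/25 = 0.72, i.e. ~72% of the 5x5 cost

Note that the two stacked 3x3 convs also cover the same 5x5 receptive field as the conv they replace.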
Factorizing symmetric convolutions into asymmetric ones
Next, a 3x3 conv is split into a 1x3 conv followed by a 3x1 conv, which likewise costs only about (1x3 + 3x1)/(3x3) = 67% of the original. The figure below shows the effectiveness of the replacement. Pushing the idea further, the authors observe that any nxn conv can be replaced by a 1xn conv plus an nx1 conv to save compute and memory.
This yields yet another, newer inception module variant, shown in the figure below.
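The same arithmetic generalizes (again, an illustrative sketch of mine): a 1xn plus an nx1 conv costs 2*n*C*C per output position versus n*n*C*C for the full nxn, so the saving grows with n:

# Ratio of (1xn conv + nx1 conv) cost to a full nxn conv, C channels throughout.
def asymmetric_ratio(n):
    return (1 * n + n * 1) / (n * n)  # simplifies to 2/n

for n in (3, 5, 7):
    print(n, round(asymmetric_ratio(n), 3))
# n=3 -> 0.667 (the ~67% above); n=7 -> 0.286, which is why the paper applies
# the 1x7/7x1 factorization on its medium-sized (17x17) feature maps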
What the auxiliary classifiers actually do
In Inception v1, to reduce the vanishing-gradient problem during backpropagation in a deep model, the authors added two extra classification loss layers at the middle and lower parts of the network.
Inception v2 keeps these extra loss layers, but the authors revisit their earlier justification and frankly retract it, today's self overruling yesterday's. Their view now (that is, in 2015) is that the real contribution of the extra losses is to regularize the trainable parameters. Accordingly they tried adding BN or dropout to the FC layers of these auxiliary heads, found that classification results did improve, and cheerfully reported this latest 'major' finding.
A more efficient downsampling scheme
Deep CNNs repeatedly apply pooling layers to shrink the feature-map size, which inevitably discards information along the way. To limit that loss, standard practice is to enlarge the channel count in the same proportion whenever a pooling layer shrinks the spatial size (a pattern plainly visible in VGG and followed by CNN designs ever since).
There are two conventional ways to implement this. One is to expand the channels first (usually with a 1x1 conv) and only then pool down the feature-map size; but the 1x1 conv then runs at full resolution and is very expensive. The other is to pool first and then use a 1x1 conv to expand the channels; but pooling first forces an unavoidable hard loss of information, and expanding the channels afterwards is rather like mending the fold after the sheep are gone. The figure below illustrates these two traditional options.
The authors' remedy is genuinely novel: run a pooling branch and a conv branch in parallel, each directly reducing the feature-map size, then concatenate the two sets of resulting feature maps. Neat enough to raise a glass to. The figure below illustrates the method.
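Here is a minimal PyTorch sketch of this parallel reduction (a sketch of mine; the channel counts are illustrative placeholders, not the paper's numbers). A stride-2 conv branch and a stride-2 pool branch each halve the spatial size, and their outputs are concatenated along the channel axis:

import torch
import torch.nn as nn

class GridReduction(nn.Module):
    """Parallel stride-2 conv and stride-2 max pool, concatenated on channels."""
    def __init__(self, in_ch, conv_ch):
        super().__init__()
        # Conv branch: downsamples and computes new features at the same time.
        self.conv = nn.Conv2d(in_ch, conv_ch, kernel_size=3, stride=2, padding=1)
        # Pool branch: downsamples while keeping the existing channels.
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):
        # Both branches halve H and W; concat yields in_ch + conv_ch channels.
        return torch.cat([self.conv(x), self.pool(x)], dim=1)

x = torch.randn(1, 64, 32, 32)
print(GridReduction(64, 96)(x).shape)  # torch.Size([1, 160, 16, 16])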
The final Inception v2 / Inception v3 models
With all of the above in place, Inception v2/v3 practically writes itself. See the table below.
The only structural difference between v2 and v3 noted here is that Inception v3 applies BN inside the auxiliary loss head as regularization (the paper's 'BN-auxiliary'; the full v3 recipe in the paper additionally includes RMSProp, label smoothing, and a factorized 7x7 stem).
Regularizing the model with label smoothing
The authors argue that the softmax loss pushes the model too hard toward the correct label and too far away from every incorrect one, which can hurt how well the trained model generalizes to new data (i.e., it invites overfitting). They therefore correct the loss with a prior distribution over the labels: the one-hot target is replaced by q'(k) = (1 − ε)·δ(k, y) + ε/K, where y is the true label, K the number of classes, and ε a small constant (0.1 in the paper). The loss they finally use is shown below.
Experimental results
The figure below compares Inception v3 against other models.
Code walkthrough
Let us examine the model design through the Inception v3 prototxt file in Intel Caffe, focusing on the two newly introduced inception module designs. Reading the raw prototxt is awkward, since the configuration file is very long; I recommend importing it into Netscope to explore the network graphically.
The model definition lives in Intel Caffe at: models/intel_optimized_models/benchmark/googlenet_v3/train_val.prototxt.
The first excerpt below is the mixed_3 block. Note that every branch here uses stride 2 and one branch is a max pool: this is in fact the efficient downsampling module just described, rather than a plain inception module.
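# Branch 1: 3x3 conv with stride 2 (384 channels), downsampling directly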
layer {
name: "mixed_3_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_2_chconcat"
top: "mixed_3_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 384
bias_term: false
pad: 0
kernel_size: 3
stride: 2
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_3_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_3_conv_conv2d"
top: "mixed_3_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_3_conv_relu"
type: "ReLU"
bottom: "mixed_3_conv_conv2d_bn"
top: "mixed_3_conv_conv2d_relu"
}
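# Branch 2: 1x1 reduce (64) -> 3x3 (96) -> 3x3 with stride 2 (96)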
layer {
name: "mixed_3_tower_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_2_chconcat"
top: "mixed_3_tower_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 64
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_3_tower_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_3_tower_conv_conv2d"
top: "mixed_3_tower_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_3_tower_conv_relu"
type: "ReLU"
bottom: "mixed_3_tower_conv_conv2d_bn"
top: "mixed_3_tower_conv_conv2d_relu"
}
layer {
name: "mixed_3_tower_conv_1_conv2d"
type: "Convolution"
bottom: "mixed_3_tower_conv_conv2d_relu"
top: "mixed_3_tower_conv_1_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 96
bias_term: false
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_3_tower_conv_1_batchnorm"
type: "BatchNorm"
bottom: "mixed_3_tower_conv_1_conv2d"
top: "mixed_3_tower_conv_1_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_3_tower_conv_1_relu"
type: "ReLU"
bottom: "mixed_3_tower_conv_1_conv2d_bn"
top: "mixed_3_tower_conv_1_conv2d_relu"
}
layer {
name: "mixed_3_tower_conv_2_conv2d"
type: "Convolution"
bottom: "mixed_3_tower_conv_1_conv2d_relu"
top: "mixed_3_tower_conv_2_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 96
bias_term: false
pad: 0
kernel_size: 3
stride: 2
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_3_tower_conv_2_batchnorm"
type: "BatchNorm"
bottom: "mixed_3_tower_conv_2_conv2d"
top: "mixed_3_tower_conv_2_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_3_tower_conv_2_relu"
type: "ReLU"
bottom: "mixed_3_tower_conv_2_conv2d_bn"
top: "mixed_3_tower_conv_2_conv2d_relu"
}
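# Branch 3: 3x3 max pool with stride 2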
layer {
name: "max_pool_mixed_3_pool"
type: "Pooling"
bottom: "ch_concat_mixed_2_chconcat"
top: "max_pool_mixed_3_pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
pad: 0
}
}
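# (The Concat layer that joins the three branch outputs along the channel axis is omitted from this excerpt.)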
The next excerpt, the mixed_2 block, is a regular stride-1 inception module from the early part of the network; note that in this Intel Caffe definition its second branch still uses a 5x5 conv rather than the factorized two-3x3 form discussed above.
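# Branch 1: 1x1 conv (64 channels)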
layer {
name: "mixed_2_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_1_chconcat"
top: "mixed_2_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 64
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_2_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_2_conv_conv2d"
top: "mixed_2_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_2_conv_relu"
type: "ReLU"
bottom: "mixed_2_conv_conv2d_bn"
top: "mixed_2_conv_conv2d_relu"
}
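# Branch 2: 1x1 reduce (48) -> 5x5 conv (64)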
layer {
name: "mixed_2_tower_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_1_chconcat"
top: "mixed_2_tower_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 48
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_2_tower_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_2_tower_conv_conv2d"
top: "mixed_2_tower_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_2_tower_conv_relu"
type: "ReLU"
bottom: "mixed_2_tower_conv_conv2d_bn"
top: "mixed_2_tower_conv_conv2d_relu"
}
layer {
name: "mixed_2_tower_conv_1_conv2d"
type: "Convolution"
bottom: "mixed_2_tower_conv_conv2d_relu"
top: "mixed_2_tower_conv_1_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 64
bias_term: false
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_2_tower_conv_1_batchnorm"
type: "BatchNorm"
bottom: "mixed_2_tower_conv_1_conv2d"
top: "mixed_2_tower_conv_1_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_2_tower_conv_1_relu"
type: "ReLU"
bottom: "mixed_2_tower_conv_1_conv2d_bn"
top: "mixed_2_tower_conv_1_conv2d_relu"
}
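# Branch 3: 1x1 reduce (64) -> 3x3 (96) -> 3x3 (96)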
layer {
name: "mixed_2_tower_1_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_1_chconcat"
top: "mixed_2_tower_1_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 64
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_2_tower_1_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_2_tower_1_conv_conv2d"
top: "mixed_2_tower_1_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_2_tower_1_conv_relu"
type: "ReLU"
bottom: "mixed_2_tower_1_conv_conv2d_bn"
top: "mixed_2_tower_1_conv_conv2d_relu"
}
layer {
name: "mixed_2_tower_1_conv_1_conv2d"
type: "Convolution"
bottom: "mixed_2_tower_1_conv_conv2d_relu"
top: "mixed_2_tower_1_conv_1_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 96
bias_term: false
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_2_tower_1_conv_1_batchnorm"
type: "BatchNorm"
bottom: "mixed_2_tower_1_conv_1_conv2d"
top: "mixed_2_tower_1_conv_1_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_2_tower_1_conv_1_relu"
type: "ReLU"
bottom: "mixed_2_tower_1_conv_1_conv2d_bn"
top: "mixed_2_tower_1_conv_1_conv2d_relu"
}
layer {
name: "mixed_2_tower_1_conv_2_conv2d"
type: "Convolution"
bottom: "mixed_2_tower_1_conv_1_conv2d_relu"
top: "mixed_2_tower_1_conv_2_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 96
bias_term: false
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_2_tower_1_conv_2_batchnorm"
type: "BatchNorm"
bottom: "mixed_2_tower_1_conv_2_conv2d"
top: "mixed_2_tower_1_conv_2_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_2_tower_1_conv_2_relu"
type: "ReLU"
bottom: "mixed_2_tower_1_conv_2_conv2d_bn"
top: "mixed_2_tower_1_conv_2_conv2d_relu"
}
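# Branch 4: 3x3 average pool (stride 1) -> 1x1 conv (64)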
layer {
name: "AVE_pool_mixed_2_pool"
type: "Pooling"
bottom: "ch_concat_mixed_1_chconcat"
top: "AVE_pool_mixed_2_pool"
pooling_param {
pool: AVE
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "mixed_2_tower_2_conv_conv2d"
type: "Convolution"
bottom: "AVE_pool_mixed_2_pool"
top: "mixed_2_tower_2_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 64
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_2_tower_2_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_2_tower_2_conv_conv2d"
top: "mixed_2_tower_2_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_2_tower_2_conv_relu"
type: "ReLU"
bottom: "mixed_2_tower_2_conv_conv2d_bn"
top: "mixed_2_tower_2_conv_conv2d_relu"
}
Finally, the mixed_9 block below shows the second inception module variant, the one built on asymmetric factorization: after the 1x1 (and, in the third branch, a 3x3) conv, the 1x3 and 3x1 convs are applied in parallel and their outputs concatenated.
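# Branch 1: 1x1 conv (320 channels)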
layer {
name: "mixed_9_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_8_chconcat"
top: "mixed_9_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 320
bias_term: false
pad: 0
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
kernel_h: 1
kernel_w: 1
}
}
layer {
name: "mixed_9_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_conv_conv2d"
top: "mixed_9_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_conv_relu"
type: "ReLU"
bottom: "mixed_9_conv_conv2d_bn"
top: "mixed_9_conv_conv2d_relu"
}
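# Branch 2: 1x1 (384) -> parallel 1x3 and 3x1 convs (384 channels each)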
layer {
name: "mixed_9_tower_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_8_chconcat"
top: "mixed_9_tower_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 384
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_9_tower_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_conv_conv2d"
top: "mixed_9_tower_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_conv_relu"
type: "ReLU"
bottom: "mixed_9_tower_conv_conv2d_bn"
top: "mixed_9_tower_conv_conv2d_relu"
}
layer {
name: "mixed_9_tower_mixed_conv_conv2d"
type: "Convolution"
bottom: "mixed_9_tower_conv_conv2d_relu"
top: "mixed_9_tower_mixed_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 384
bias_term: false
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
pad_h: 0
pad_w: 1
kernel_h: 1
kernel_w: 3
}
}
layer {
name: "mixed_9_tower_mixed_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_mixed_conv_conv2d"
top: "mixed_9_tower_mixed_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_mixed_conv_relu"
type: "ReLU"
bottom: "mixed_9_tower_mixed_conv_conv2d_bn"
top: "mixed_9_tower_mixed_conv_conv2d_relu"
}
layer {
name: "mixed_9_tower_mixed_conv_1_conv2d"
type: "Convolution"
bottom: "mixed_9_tower_conv_conv2d_relu"
top: "mixed_9_tower_mixed_conv_1_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 384
bias_term: false
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
pad_h: 1
pad_w: 0
kernel_h: 3
kernel_w: 1
}
}
layer {
name: "mixed_9_tower_mixed_conv_1_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_mixed_conv_1_conv2d"
top: "mixed_9_tower_mixed_conv_1_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_mixed_conv_1_relu"
type: "ReLU"
bottom: "mixed_9_tower_mixed_conv_1_conv2d_bn"
top: "mixed_9_tower_mixed_conv_1_conv2d_relu"
}
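# Branch 3: 1x1 (448) -> 3x3 (384) -> parallel 1x3 and 3x1 convs (384 each)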
layer {
name: "mixed_9_tower_1_conv_conv2d"
type: "Convolution"
bottom: "ch_concat_mixed_8_chconcat"
top: "mixed_9_tower_1_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 448
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_9_tower_1_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_1_conv_conv2d"
top: "mixed_9_tower_1_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_1_conv_relu"
type: "ReLU"
bottom: "mixed_9_tower_1_conv_conv2d_bn"
top: "mixed_9_tower_1_conv_conv2d_relu"
}
layer {
name: "mixed_9_tower_1_conv_1_conv2d"
type: "Convolution"
bottom: "mixed_9_tower_1_conv_conv2d_relu"
top: "mixed_9_tower_1_conv_1_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 384
bias_term: false
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_9_tower_1_conv_1_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_1_conv_1_conv2d"
top: "mixed_9_tower_1_conv_1_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_1_conv_1_relu"
type: "ReLU"
bottom: "mixed_9_tower_1_conv_1_conv2d_bn"
top: "mixed_9_tower_1_conv_1_conv2d_relu"
}
layer {
name: "mixed_9_tower_1_mixed_conv_conv2d"
type: "Convolution"
bottom: "mixed_9_tower_1_conv_1_conv2d_relu"
top: "mixed_9_tower_1_mixed_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 384
bias_term: false
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
pad_h: 0
pad_w: 1
kernel_h: 1
kernel_w: 3
}
}
layer {
name: "mixed_9_tower_1_mixed_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_1_mixed_conv_conv2d"
top: "mixed_9_tower_1_mixed_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_1_mixed_conv_relu"
type: "ReLU"
bottom: "mixed_9_tower_1_mixed_conv_conv2d_bn"
top: "mixed_9_tower_1_mixed_conv_conv2d_relu"
}
layer {
name: "mixed_9_tower_1_mixed_conv_1_conv2d"
type: "Convolution"
bottom: "mixed_9_tower_1_conv_1_conv2d_relu"
top: "mixed_9_tower_1_mixed_conv_1_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 384
bias_term: false
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
pad_h: 1
pad_w: 0
kernel_h: 3
kernel_w: 1
}
}
layer {
name: "mixed_9_tower_1_mixed_conv_1_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_1_mixed_conv_1_conv2d"
top: "mixed_9_tower_1_mixed_conv_1_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_1_mixed_conv_1_relu"
type: "ReLU"
bottom: "mixed_9_tower_1_mixed_conv_1_conv2d_bn"
top: "mixed_9_tower_1_mixed_conv_1_conv2d_relu"
}
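# Branch 4: 3x3 average pool (stride 1) -> 1x1 conv (192 channels)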
layer {
name: "AVE_pool_mixed_9_pool"
type: "Pooling"
bottom: "ch_concat_mixed_8_chconcat"
top: "AVE_pool_mixed_9_pool"
pooling_param {
pool: AVE
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "mixed_9_tower_2_conv_conv2d"
type: "Convolution"
bottom: "AVE_pool_mixed_9_pool"
top: "mixed_9_tower_2_conv_conv2d"
param {
lr_mult: 1.0
decay_mult: 1.0
}
convolution_param {
num_output: 192
bias_term: false
pad: 0
kernel_size: 1
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
}
}
layer {
name: "mixed_9_tower_2_conv_batchnorm"
type: "BatchNorm"
bottom: "mixed_9_tower_2_conv_conv2d"
top: "mixed_9_tower_2_conv_conv2d_bn"
batch_norm_param {
}
}
layer {
name: "mixed_9_tower_2_conv_relu"
type: "ReLU"
bottom: "mixed_9_tower_2_conv_conv2d_bn"
top: "mixed_9_tower_2_conv_conv2d_relu"
}
References
- Christian Szegedy et al., Rethinking the Inception Architecture for Computer Vision, 2015 (arXiv:1512.00567)
- https://github.com/intel/caffe
- http://ethereon.github.io/netscope/#/editor