I have recently been reading the tf-faster-rcnn code and noticed that the definition of the VGG16 network uses a module called slim, so I looked it up. The notes below are compiled from several blog posts and organized in my own way, for easier understanding and future reference.
The slim module was introduced in 2016 mainly for so-called "code slimming": it eliminates much of the repetitive boilerplate in raw TensorFlow, making code more compact and readable. In addition, slim provides many well-known computer vision models (VGG, ResNet, etc.) whose pretrained weights (checkpoints, i.e. .ckpt files) can be downloaded and used directly.
import tensorflow.contrib.slim as slim
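Since those downloadable checkpoints come up again later, here is a minimal sketch of restoring pretrained weights with slim's helper. The path 'vgg_16.ckpt' is a hypothetical local file, and a network with matching variable names must already have been built with slim layers:
import tensorflow as tf

# Build the network first (e.g. the vgg16() function defined later), then:
init_fn = slim.assign_from_checkpoint_fn(
    'vgg_16.ckpt',               # hypothetical path to a downloaded checkpoint
    slim.get_model_variables(),  # the variables created by the slim layers
    ignore_missing_vars=True)
with tf.Session() as sess:
    init_fn(sess)  # copies the pretrained weights into the graph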
Below I introduce slim's code-slimming features and some of its commonly used functions.
I. Code slimming:
1. First, let's look at how a layer is written in raw TensorFlow, e.g. a convolutional layer:
input = ...
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)
And the slim version:
input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')  # creates weights and biases and applies ReLU by default
2. The repeat operation
Suppose we stack three identical convolutional layers followed by a pooling layer:
net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
With slim's repeat operation this collapses to:
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')  # unrolls to scopes conv3/conv3_1 ... conv3/conv3_3
net = slim.max_pool2d(net, [2, 2], scope='pool2')
3. The stack operation: for layers whose kernel sizes or output sizes differ
Suppose we define three FC layers one by one:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')
With slim's stack operation:
x = slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
The same works for convolutional layers:
# Layer by layer:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')
# The stack equivalent:
x = slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
4. slim's arg_scope: when a network has many layers sharing the same arguments, such as:
net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')
With arg_scope this tidies up to:
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')
Tips: within an arg_scope, the listed layer types get the given default arguments; to give a particular layer different values, just pass them explicitly, which overrides the default (as in the second-to-last line above, where 'conv2' overrides padding with 'VALID'). What if the network contains other layer types besides convolutions? Then nest two arg_scopes:
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                          weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                          scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')
Writing two nested arg_scopes like this is all it takes.
After all this "slimming", a complete VGG16 can be defined in about twenty lines:
def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.flatten(net, scope='flatten5')  # flatten so fully_connected gets a 2-D tensor
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
    return net
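Calling it is then straightforward. A minimal sketch, assuming VGG16's standard 224x224 RGB input:
inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
logits = vgg16(inputs)  # shape [batch, 1000], class scores before softmax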
II. Common functions:
1. slim.arg_scope: defines default values for some functions' arguments; within the scope, repeated calls to those functions need not write out all the arguments every time.
net = ...
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    trainable=True,
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0001)):
    with slim.arg_scope([slim.conv2d],
                        kernel_size=[3, 3],
                        padding='SAME',
                        normalizer_fn=slim.batch_norm):
        net = slim.conv2d(net, 64, scope='conv1')
        net = slim.conv2d(net, 128, scope='conv2')
        net = slim.conv2d(net, 256, [5, 5], scope='conv3')
This example shows essentially all of slim.arg_scope's usage. A single slim.arg_scope can set defaults for several functions at once by passing them as a list (provided they all accept those arguments), and arg_scopes may be nested. Functions called inside the scope need not repeat those arguments (e.g. kernel_size=[3, 3]), but can still override them (as in the last line, where the kernel size becomes [5, 5]).
2. slim.conv2d: convolutional layer, typically called as follows:
net = slim.conv2d(inputs, 256, [3, 3], stride=1, scope='conv1_1')
The first three arguments are the input tensor, the number of output channels, and the kernel size; stride is the convolution stride. Besides these, several other arguments are frequently used (a combined sketch follows the list):
padding : the padding mode, e.g. 'SAME'
activation_fn : the activation function, defaults to nn.relu, which is exactly what VGG16 uses
normalizer_fn : normalization function, defaults to None; set it to slim.batch_norm to get batch normalization
normalizer_params : the arguments for slim.batch_norm, given as a dict
weights_initializer : the weight initializer, defaults to initializers.xavier_initializer()
weights_regularizer : the weight regularizer, not used that often here
biases_initializer : the bias initializer; if batch norm is applied via normalizer_fn, this and the next argument can be ignored (the bias is not created then)
biases_regularizer : the bias regularizer
trainable : whether the parameters are trainable, defaults to True
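To make the list concrete, here is a minimal sketch combining several of these arguments; the specific values (decay, regularization weight, etc.) are illustrative only:
net = slim.conv2d(inputs, 64, [3, 3],
                  padding='SAME',
                  activation_fn=tf.nn.relu,
                  normalizer_fn=slim.batch_norm,
                  normalizer_params={'is_training': True, 'decay': 0.95},
                  weights_initializer=slim.xavier_initializer(),
                  weights_regularizer=slim.l2_regularizer(0.0005),
                  scope='conv1')
Since batch norm is supplied via normalizer_fn, no bias arguments are given: slim skips creating the bias in this case.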
3. slim.max_pool2d: max-pooling layer (average pooling has its own function; see below), used as follows:
net = slim.max_pool2d(net, [2, 2], scope='pool1')
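Average pooling uses the analogous slim.avg_pool2d:
net = slim.avg_pool2d(net, [2, 2], scope='pool1')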
4. slim.fully_connected: fully connected layer; the first two arguments are the input tensor and the number of output units.
x = slim.fully_connected(x, 128, scope='fc1')