TensorFlow-Slim

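TF-Slim is a high-level TensorFlow library for defining, training, and evaluating complex models. It can be mixed freely with native TensorFlow and with other high-level frameworks such as tf.contrib.learn.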

Using TF-Slim

import tensorflow as tf
import tensorflow.contrib.slim as slim

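Why use TF-Slim?

TF-Slim is a high-level library that simplifies building, training, and evaluating neural networks:

It eliminates boilerplate code, so users can define models more compactly. This is done through argument scoping and a large number of high-level layers and variables. These tools improve readability and maintainability, reduce the likelihood of errors from copying and pasting hyperparameter values, and simplify hyperparameter tuning.
It makes model development simpler by providing built-in regularizers.
Several widely used image models (e.g., VGG, AlexNet) are already implemented. Users can treat these models as black boxes, or extend them in various ways, for example by adding "multiple heads" to different internal layers.
Slim makes it easy to extend complex models and to shorten training time by warm-starting from pre-trained parameters.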

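What are the components of TF-Slim?

TF-Slim is composed of several independent parts. The main ones are:

arg_scope: provides a scope in which users can define default arguments for specific operation types.
data: contains TF-Slim's dataset definition, data providers, parallel reader, and decoding utilities.
evaluation: contains routines for evaluating models.
layers: contains high-level layers for building TensorFlow models.
learning: contains routines for training models.
losses: contains commonly used loss functions.
metrics: contains commonly used evaluation metrics.
nets: contains popular network definitions such as VGG and AlexNet.
queues: provides a context manager for safely starting and closing QueueRunners.
regularizers: contains weight regularizers.
variables: provides convenience wrappers for creating and manipulating variables.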

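Defining Models

Models can be defined succinctly with TF-Slim by combining its variables, layers, and scopes. Each of these elements is described below.

Variables

Creating a variable in native TensorFlow requires either a predefined value or an initialization mechanism (e.g., sampling from a Gaussian or uniform distribution). Furthermore, if a variable needs to be placed on a specific device, such as a GPU or CPU, the placement must be specified explicitly. To reduce the code needed for variable creation, TF-Slim provides a set of thin wrapper functions in variables.py. For example, to create a weights variable, initialize it from a truncated normal distribution, regularize it with an L2 loss, and place it on the CPU, only the following is needed: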

weights=slim.variable('weights',
                      shape=[10,10,3,3],
                      initializer=tf.truncated_normal_initializer(stddev=0.1),
                      regularizer=slim.l2_regularizer(0.05),
                      device='/CPU:0')

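Note that native TensorFlow has two types of variables: regular variables and local (transient) variables. The vast majority of variables are regular variables: once created, they can be saved to disk with a saver. Local variables exist only for the duration of a session and are not saved to disk.
TF-Slim further differentiates variables by defining model variables, which represent the parameters of a model. Model variables are trained or fine-tuned during learning and are loaded from a checkpoint during evaluation or inference. Examples include the variables created by slim.fully_connected or slim.conv2d. Non-model variables are all other variables that are used during learning or evaluation but are not required for inference. For example, global_step is used during training but is not part of the final model. Similarly, moving average variables may mirror model variables, but the moving averages are not themselves model variables.
Both model variables and regular variables can be created and retrieved through TF-Slim: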

# Model variables
model_weights = slim.model_variable('model_weights',
                                    shape=[10, 10, 3, 3],
                                    initializer=tf.truncated_normal_initializer(stddev=0.1),
                                    regularizer=slim.l2_regularizer(0.05),
                                    device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()

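How does this work? When you create a model variable through TF-Slim's layers or directly through the slim.model_variable function, TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have your own custom layers or variable creation routine but still want TF-Slim to manage your model variables? TF-Slim provides a convenience function for adding a custom variable to the model variable collection: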

my_model_variable = CreateViaCustomCode()
# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)
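
The bookkeeping described above can be pictured as a pair of named collections. The following is a minimal plain-Python sketch of that idea, not TF-Slim's implementation: variables are represented only by their names, and every function and constant here is a hypothetical stand-in that mirrors the naming of the slim calls.

```python
from collections import defaultdict

# Hypothetical stand-ins for the graph collections; names are assumptions.
MODEL_VARIABLES = "model_variables"  # plays the role of tf.GraphKeys.MODEL_VARIABLES
ALL_VARIABLES = "variables"          # plays the role of the regular-variable collection

_collections = defaultdict(list)


def variable(name):
    """Create a regular variable: it is registered only in ALL_VARIABLES."""
    _collections[ALL_VARIABLES].append(name)
    return name


def add_model_variable(var):
    """Register an existing variable as a model variable (no duplicates)."""
    if var not in _collections[MODEL_VARIABLES]:
        _collections[MODEL_VARIABLES].append(var)


def model_variable(name):
    """Create a variable and also register it as a model variable."""
    var = variable(name)
    add_model_variable(var)
    return var


def get_model_variables():
    return list(_collections[MODEL_VARIABLES])


def get_variables():
    return list(_collections[ALL_VARIABLES])


model_variable("model_weights")
variable("my_var")
custom = variable("custom_layer/weights")  # created by custom code
add_model_variable(custom)                 # now tracked as a model variable

print(get_model_variables())  # ['model_weights', 'custom_layer/weights']
print(get_variables())        # ['model_weights', 'my_var', 'custom_layer/weights']
```

In this sketch, as in the text, a model variable is also a regular variable, while a regular variable becomes a model variable only once it is explicitly registered.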

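Layers

While the set of TensorFlow operations is quite extensive, developers of neural networks typically think in terms of higher-level concepts such as "layers", "losses", "metrics", and "networks". A layer, such as a convolutional layer, a fully connected layer, or a BatchNorm layer, is more abstract than a single TensorFlow operation and typically involves several operations. Furthermore, unlike more primitive operations, a layer usually has tunable variables associated with it. For example, a convolutional layer is composed of the following low-level operations:

Creating the weight and bias variables
Convolving the weights with the output of the previous layer
Adding the biases to the result of the convolution
Applying an activation function
Implementing this with plain TensorFlow code takes considerable effort: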

input = ...
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)

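To avoid duplicating this code, TF-Slim provides a number of convenient operations defined at the more abstract level of neural network layers. For example, the code above corresponds to the following TF-Slim code:

input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')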
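
To make the four steps of a convolutional layer concrete, here is a small plain-Python sketch. It is an illustration under simplifying assumptions, not what slim.conv2d does internally: it handles a single channel and a single filter, uses no padding (unlike the 'SAME' padding in the snippets above), and uses fixed weights instead of random initialization. The helper names conv2d_valid and conv_layer are made up for this example.

```python
def conv2d_valid(image, kernel):
    """Stride-1, unpadded 2-D cross-correlation (the 'convolution' of deep learning)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def conv_layer(image, kernel, bias):
    conv = conv2d_valid(image, kernel)                     # step 2: convolve
    biased = [[v + bias for v in row] for row in conv]     # step 3: add the bias
    return [[max(0.0, v) for v in row] for row in biased]  # step 4: ReLU activation


# Step 1: create the weights and the bias (fixed values, for reproducibility).
kernel = [[1.0, 0.0],
          [0.0, -1.0]]
bias = 0.5

image = [[1.0, 0.0, 2.0],
         [0.0, 3.0, 0.0],
         [2.0, 0.0, 1.0]]

output = conv_layer(image, kernel, bias)
print(output)  # [[0.0, 0.5], [0.5, 2.5]]
```

The top-left output is 1 - 3 + 0.5 = -1.5 before activation, which the ReLU clamps to 0.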

TF-Slim provides standard implementations for numerous high-level neural network components. These include:

Layer	TF-Slim
BiasAdd	slim.bias_add
BatchNorm	slim.batch_norm
Conv2d	slim.conv2d
Conv2dInPlane	slim.conv2d_in_plane
Conv2dTranspose(Deconv)	slim.conv2d_transpose
FullyConnected	slim.fully_connected
AvgPool2D	slim.avg_pool2d
Dropout	slim.dropout
Flatten	slim.flatten
MaxPool2D	slim.max_pool2d
OneHotEncoding	slim.one_hot_encoding
SeparableConv2	slim.separable_conv2d
UnitNorm	slim.unit_norm

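TF-Slim also provides two meta-operations, repeat and stack, that allow users to perform the same operation repeatedly. For example, consider the following part of the VGG network, which consists of several convolutional layers followed by a pooling layer: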

net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')

One way to reduce this code duplication is a for loop:

net = ...
for i in range(3):
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i + 1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')

TF-Slim's repeat operation makes this even simpler:

net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')

Notice that slim.repeat does more than apply the same arguments on each iteration: it also adjusts the scope of each layer created by slim.conv2d according to the iteration number. Concretely, the scopes in the example above are named 'conv3/conv3_1', 'conv3/conv3_2', and 'conv3/conv3_3'.
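
The scope unrolling can be sketched in a few lines of plain Python. This is an assumption-laden illustration rather than the library's code: the real slim.repeat works with TensorFlow variable scopes, whereas this sketch only formats scope strings, and fake_conv2d is a hypothetical layer that just records the scope it was given.

```python
def repeat(inputs, repetitions, layer, *args, scope):
    """Apply `layer` `repetitions` times, naming each call '<scope>/<scope>_<i>'."""
    outputs = inputs
    for i in range(repetitions):
        outputs = layer(outputs, *args, scope="%s/%s_%d" % (scope, scope, i + 1))
    return outputs


created_scopes = []


def fake_conv2d(inputs, num_outputs, kernel_size, scope):
    """Hypothetical layer: records its scope and wraps its input in a description."""
    created_scopes.append(scope)
    return "%s(%s)" % (scope, inputs)


net = repeat("net", 3, fake_conv2d, 256, [3, 3], scope="conv3")

print(created_scopes)  # ['conv3/conv3_1', 'conv3/conv3_2', 'conv3/conv3_3']
print(net)             # conv3/conv3_3(conv3/conv3_2(conv3/conv3_1(net)))
```
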
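Furthermore, TF-Slim's slim.stack operation lets the caller apply the same operation repeatedly with different arguments to create a stack (or tower) of layers. slim.stack also creates a new tf.variable_scope for each operation. For example, a multi-layer perceptron can be created as follows: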

# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')
# Equivalent, TF-Slim way using slim.stack:
slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

In this example, slim.stack calls slim.fully_connected three times, passing the output of each call to the next. The number of hidden units differs between calls (32, 64, and 128). Similarly, stack can be used to simplify a tower of several convolutions:

#verbose way:
x=slim.conv2d(x, 32, [3,3], scope='core/core_1')
x=slim.conv2d(x, 32, [1,1], scope='core/core_2')
x=slim.conv2d(x, 64, [3,3], scope='core/core_3')
x=slim.conv2d(x, 64, [1,1], scope='core/core_4')
#using stack:
slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
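
The way stack threads one output into the next call while varying the arguments can be sketched in plain Python. This is a simplified model and not the library's source: scopes are plain strings, and fake_layer is a hypothetical layer that records how it was called. The sketch assumes that each entry of the argument list is either a single value or a tuple of positional arguments, as in the two examples above.

```python
def stack(inputs, layer, stack_args, scope):
    """Call `layer` once per entry of `stack_args`, chaining outputs to inputs."""
    outputs = inputs
    for i, layer_args in enumerate(stack_args):
        # A bare value such as 32 is treated as a one-element argument list.
        if not isinstance(layer_args, (list, tuple)):
            layer_args = [layer_args]
        outputs = layer(outputs, *layer_args,
                        scope="%s/%s_%d" % (scope, scope, i + 1))
    return outputs


calls = []


def fake_layer(inputs, num_outputs, kernel_size=None, scope=None):
    """Hypothetical layer: records (scope, num_outputs, kernel_size)."""
    calls.append((scope, num_outputs, kernel_size))
    return inputs


stack("x", fake_layer, [32, 64, 128], scope="fc")
stack("x", fake_layer, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])],
      scope="core")

for call in calls:
    print(call)
# ('fc/fc_1', 32, None)
# ('fc/fc_2', 64, None)
# ('fc/fc_3', 128, None)
# ('core/core_1', 32, [3, 3])
# ('core/core_2', 32, [1, 1])
# ('core/core_3', 64, [3, 3])
# ('core/core_4', 64, [1, 1])
```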

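Scopes

In addition to TensorFlow's own scope mechanisms (name_scope, variable_scope), TF-Slim adds a new scoping mechanism called arg_scope. This scope lets the user specify one or more operations and a set of arguments that will be passed to each of those operations inside the arg_scope. This is best illustrated by example. Consider the following snippet: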

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')

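Clearly, these three convolutional layers share many hyperparameters. Two have the same padding, and all three have the same weights_initializer and weights_regularizer. This code is hard to read and contains many repeated values that should be factored out. One solution is to define the default values as variables: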

padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')

This approach ensures that all three convolutions share exactly the same parameter values, but it does not reduce the amount of code much. With arg_scope, we can reduce the code while still guaranteeing that each layer uses the same values:

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')

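As the example shows, arg_scope makes the code cleaner, simpler, and easier to maintain. Note that although argument values are specified in the arg_scope, they can still be overridden locally. In particular, padding is set to 'SAME' as the default, but the second convolution overrides it with 'VALID'.
arg_scopes can also be nested, and a single arg_scope can apply to several operations. For example: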

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                      scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')

In this example, the first arg_scope applies the same activation_fn, weights_initializer, and weights_regularizer to the conv2d and fully_connected layers in its scope. The second arg_scope specifies additional default arguments for conv2d only.

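Working Example: Specifying the VGG16 Layers

By combining TF-Slim variables, operations, and scopes, we can write a normally very complex network in very few lines of code. For example, the entire VGG architecture can be defined with the following snippet: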

def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                       activation_fn=tf.nn.relu,
                       weights_initializer=tf.truncated_normal_initializer(0.0,0.01),
                       weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
    return net

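Training Models

Training a TensorFlow model requires a model, a loss function, the gradient computation, and a training routine that iteratively computes the gradients of the loss with respect to the model weights and updates the weights accordingly. TF-Slim provides both common loss functions and a set of helper functions that run the training and evaluation routines.

Losses

The loss function defines a real-valued quantity that we want to minimize. For classification problems, it is typically the cross entropy between the labels and the predicted probability distribution. For regression problems, it is often the sum of squared differences between the predicted and true values.
Some models, such as multi-task models, require several loss functions at once. In other words, the quantity ultimately being minimized is a weighted sum of the individual losses. For example, for a model that predicts both the scene type of an image and the depth of each pixel, the loss would be the weighted sum of the classification loss and the depth prediction loss.
TF-Slim provides an easy-to-use mechanism for defining and keeping track of loss functions through its losses module. Consider the case of the VGG model: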

import tensorflow as tf

slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg

#load the images and labels.
images, labels = ...

#create the model.
predictions,_ = vgg.vgg_16(images)

#define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)

In this example, we start by creating the model (using TF-Slim's VGG implementation) and then add the standard classification loss. Now consider the case of a multi-task model:

# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)

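In this example, we have two losses, added by calling slim.losses.softmax_cross_entropy and slim.losses.sum_of_squares. We can obtain the total loss either by adding them together or by calling slim.losses.get_total_loss(). How does this work? Whenever you create a loss through TF-Slim, TF-Slim adds it to a collection of losses. This lets you either manage the total loss manually or let TF-Slim manage it for you.
What if you have a custom loss function but still want TF-Slim to manage the losses? loss_ops.py provides a function that adds a custom loss to the collection. For example: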

# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()

In this example, we can again either compute the total loss manually or let TF-Slim know about the additional loss and let it handle the losses.
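
The equivalence of the two ways of computing the total loss follows from the loss collection. The following plain-Python sketch models that collection with two lists; it is a simplification for illustration, with made-up numeric loss values, and the function names merely mirror the slim.losses calls rather than reproduce them.

```python
# Hypothetical stand-ins for the loss collections.
_losses = []
_regularization_losses = []


def add_loss(value):
    """Register a loss term so that get_total_loss() can find it."""
    _losses.append(value)
    return value


def add_regularization_loss(value):
    """Register a regularization term (kept in a separate collection)."""
    _regularization_losses.append(value)
    return value


def get_total_loss(add_regularization_losses=True):
    """Sum all registered losses, optionally including regularization."""
    total = sum(_losses)
    if add_regularization_losses:
        total += sum(_regularization_losses)
    return total


# Made-up values standing in for evaluated loss tensors.
classification_loss = add_loss(0.75)
sum_of_squares_loss = add_loss(1.5)
pose_loss = add_loss(0.25)  # the custom loss, registered explicitly
regularization_loss = add_regularization_loss(0.125)

total_loss1 = (classification_loss + sum_of_squares_loss
               + pose_loss + regularization_loss)
total_loss2 = get_total_loss()

print(total_loss1, total_loss2)                         # 2.625 2.625
print(get_total_loss(add_regularization_losses=False))  # 2.5
```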

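Training Loop

TF-Slim provides a simple but powerful set of tools for training models in learning.py. These include a train function that repeatedly measures the loss, computes gradients, and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once the model, the loss function, and the optimization scheme have been specified, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization: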

g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)

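In this example, slim.learning.train is given the train_op, which is used to compute the loss and apply the gradient step. logdir specifies the directory where checkpoints and event files are stored. The number of gradient steps can be limited to any value; here we ask for 1000 steps. Finally, save_summaries_secs=300 means that summaries are computed every 300 seconds, and save_interval_secs=600 means that a model checkpoint is saved every 600 seconds.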
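
What such a training loop does on each step can be sketched in plain Python: evaluate the loss and its gradient, apply the update, and periodically save. This is a toy illustration under several assumptions, not slim.learning.train: the model is a single scalar with a hand-written quadratic loss, "saving" just records the step number, and checkpoints are triggered by step count instead of wall-clock seconds so that the example is deterministic.

```python
def train(loss_and_grad, params, learning_rate, number_of_steps,
          save_every_steps, save_fn):
    """Plain gradient descent with periodic 'checkpoints'."""
    for step in range(1, number_of_steps + 1):
        loss, grads = loss_and_grad(params)          # measure loss, compute gradients
        params = [p - learning_rate * g              # apply the gradient step
                  for p, g in zip(params, grads)]
        if step % save_every_steps == 0:
            save_fn(step, params)                    # periodic save
    return params


def quadratic(params):
    """Toy loss (w - 3)^2 and its gradient 2 * (w - 3)."""
    (w,) = params
    return (w - 3.0) ** 2, [2.0 * (w - 3.0)]


checkpoints = []
final_params = train(quadratic, [0.0], learning_rate=0.1, number_of_steps=1000,
                     save_every_steps=250,
                     save_fn=lambda step, params: checkpoints.append(step))

print(checkpoints)                # [250, 500, 750, 1000]
print(round(final_params[0], 6))  # 3.0
```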

Working Example: Training the VGG16 Model

To illustrate this, let's look at the following example of training the VGG network:

import tensorflow as tf

slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg

...

train_log_dir=...
if not tf.gfile.Exists(train_log_dir):
    tf.gfile.MakeDirs(train_log_dir)
    
with tf.Graph().as_default():
    #set up the data loading:
    images, labels = ...
    
    # Define the model:
    predictions, _ = vgg.vgg_16(images, is_training=True)

    # Specify the loss function:
    slim.losses.softmax_cross_entropy(predictions, labels)
    total_loss = slim.losses.get_total_loss()
    tf.summary.scalar('losses/total_loss', total_loss)

    # Specify the optimization scheme:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)
    
    #create_train_op that ensures that when we evaluate it to get the loss,
    #the update_ops are done and the gradient updates are computed.
    train_tensor = slim.learning.create_train_op(total_loss, optimizer)
    
    #actually runs training
    slim.learning.train(train_tensor, train_log_dir)

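Fine-Tuning Existing Models

Brief Recap on Restoring Variables from a Checkpoint

After a model has been trained, it can be restored using tf.train.Saver(), which restores variables from a given checkpoint. In many cases, tf.train.Saver() provides a simple mechanism for restoring all or just some of the variables.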

#create some variables.
v1 = tf.Variable(..., name='v1')
v2 = tf.Variable(..., name='v2')
...

#add ops to restore all the variables
restorer = tf.train.Saver()

#add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, '/tmp/model.ckpt')
    print('Model restored.')
    # Do some work with the model
    ...

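For more details, see the Restoring Variables and Choosing which Variables to Save and Restore sections of the Variables documentation.

Partially Restoring Models

It is often desirable to fine-tune a pre-trained model on a new dataset or even a new task. In these situations, TF-Slim's helper functions can be used to select a subset of variables to restore: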

# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)
...

# Get list of variables to restore (which contains only 'v2'). These are all
# equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt") 
    print('Model restored.')
    # Do some work with the model
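
The include and exclude filters can be modeled on plain variable names. The sketch below is an approximation for illustration, not the library's matching logic: it assumes a scope matches a variable when it equals the name or is a leading path component of it, and the helper _in_scope is a made-up name.

```python
def _in_scope(name, scope):
    """True if `name` equals `scope` or lives under it, e.g. 'nested/v2' in 'nested'."""
    scope = scope.rstrip("/")
    return name == scope or name.startswith(scope + "/")


def get_variables_to_restore(all_names, include=None, exclude=None):
    """Keep names under any `include` scope, then drop those under any `exclude` scope."""
    if include is None:
        selected = list(all_names)
    else:
        selected = [n for n in all_names
                    if any(_in_scope(n, s) for s in include)]
    if exclude:
        selected = [n for n in selected
                    if not any(_in_scope(n, s) for s in exclude)]
    return selected


names = ["v1", "nested/v2"]

print(get_variables_to_restore(names, include=["nested"]))  # ['nested/v2']
print(get_variables_to_restore(names, exclude=["v1"]))      # ['nested/v2']
print(get_variables_to_restore(names))                      # ['v1', 'nested/v2']
```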

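Restoring Models with Different Variable Names

When restoring variables from a checkpoint, the Saver matches the variable names in the checkpoint file to the variables in the current graph. Above, we created a saver by passing it a list of variables. In that case, the names to look up in the checkpoint were obtained implicitly from each variable's var.op.name.
This works well when the variable names in the checkpoint file match those in the graph. Sometimes, however, we want to restore a model from a checkpoint whose variable names differ from those in the current graph. In that case, we must give the Saver a dictionary that explicitly maps each checkpoint variable name to the corresponding graph variable. In the following example, the checkpoint variable names are obtained with a simple function:

# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights'
def name_in_checkpoint(var):
    return 'vgg16/' + var.op.name

# Alternatively, assuming that 'conv1/weights' and 'conv1/bias' should be restored from 'conv1/params1' and 'conv1/params2'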
def name_in_checkpoint(var):
    if "weights" in var.op.name:
        return var.op.name.replace("weights", "params1")
    if "bias" in var.op.name:
        return var.op.name.replace("bias", "params2")

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var):var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
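
To see what the mapping dictionary looks like, the two renaming rules can be applied to plain name strings. This is a simplified sketch: variables are represented by their names only, the checkpoint is a hypothetical dictionary of made-up values, and the function names add_prefix and rename_params are introduced here for illustration.

```python
def add_prefix(var_name):
    """'conv1/weights' is stored in the checkpoint as 'vgg16/conv1/weights'."""
    return "vgg16/" + var_name


def rename_params(var_name):
    """'weights' and 'bias' are stored as 'params1' and 'params2'."""
    if "weights" in var_name:
        return var_name.replace("weights", "params1")
    if "bias" in var_name:
        return var_name.replace("bias", "params2")
    return var_name


graph_variables = ["conv1/weights", "conv1/bias"]

# Maps checkpoint name -> graph variable, like the dict passed to the Saver.
prefixed = {add_prefix(v): v for v in graph_variables}
renamed = {rename_params(v): v for v in graph_variables}

print(prefixed)  # {'vgg16/conv1/weights': 'conv1/weights', 'vgg16/conv1/bias': 'conv1/bias'}
print(renamed)   # {'conv1/params1': 'conv1/weights', 'conv1/params2': 'conv1/bias'}

# "Restoring": look up each checkpoint entry and assign it to the graph variable.
checkpoint = {"conv1/params1": [0.1, 0.2], "conv1/params2": [0.0]}  # made-up values
restored = {graph_name: checkpoint[ckpt_name]
            for ckpt_name, graph_name in renamed.items()}
print(restored)  # {'conv1/weights': [0.1, 0.2], 'conv1/bias': [0.0]}
```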

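Fine-Tuning a Model on a Different Task

Consider a pre-trained VGG16 model. The model was trained on the ImageNet dataset, which has 1000 classes. However, we would like to apply it to the Pascal VOC dataset, which has only 20 classes. To do so, we can initialize our new model with the pre-trained parameters while leaving out the final fully connected layers: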

# Load the Pascal VOC data
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model
predictions, _ = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live:
log_dir = '/path/to/my_pascal_model_dir'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)

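Evaluating Models

Once a model has been trained (or even while it is still training), we would like to know how well it performs in practice. This is done by choosing a set of evaluation metrics and scoring the model with them. The evaluation code loads some test data, performs inference, compares the results to the ground truth, and records the evaluation scores. This step may be performed once or repeated periodically.

Metrics

We define a metric as a measure of model performance that is not a loss function (losses are optimized directly during training) but that we still care about when judging the model. For example, we might minimize log loss during training, while the metric of interest is the F1 score (test accuracy) or Intersection Over Union (IOU), which is not differentiable and therefore cannot be used as a loss.
TF-Slim provides a set of metric operations that make evaluating models easier. In general, computing the value of a metric can be divided into three steps:

Initialization: initialize the variables used to compute the metric.
Aggregation: perform the operations (sums, etc.) used to compute the metric.
Finalization: (optional) perform any final operation needed to compute the metric value, such as taking a mean, minimum, or maximum.
For example, to compute mean_absolute_error, two variables (a count and a total variable) are initialized to zero. During aggregation, we observe a set of predictions and labels, compute their absolute differences, and add them to total. Each time we observe another sample, count is incremented. Finally, during finalization, total is divided by count to obtain the mean.
The following example demonstrates the API for declaring metrics. Because metrics are usually computed on a test set, which is different from the training set, we assume the data below comes from a test set: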

images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels)
pl_value_op, pl_update_op = slim.metrics.percentage_less(mean_relative_errors, 0.3)

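As the example shows, creating a metric returns two values: a value_op and an update_op. The value_op is an idempotent operation that returns the current value of the metric. The update_op performs the aggregation step described above and also returns the value of the metric.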
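
The three steps and the value/update split can be sketched with a small plain-Python class. This is a minimal model of the idea, not the streaming_mean_absolute_error implementation: the class name and its methods are made up for this example, with update playing the role of the update_op and value the role of the value_op.

```python
class StreamingMeanAbsoluteError:
    """Running mean of |prediction - label| over all samples seen so far."""

    def __init__(self):
        # Initialization: both accumulators start at zero.
        self.total = 0.0
        self.count = 0

    def update(self, predictions, labels):
        # Aggregation: fold one batch into the accumulators.
        for prediction, label in zip(predictions, labels):
            self.total += abs(prediction - label)
            self.count += 1
        return self.value()

    def value(self):
        # Finalization: idempotent read of the current mean.
        return self.total / self.count if self.count else 0.0


metric = StreamingMeanAbsoluteError()
print(metric.update([1.0, 2.0], [1.5, 4.0]))  # 1.25  (errors 0.5 and 2.0)
print(metric.update([3.0, 0.0], [3.0, 1.5]))  # 1.0   (errors 0.0 and 1.5 added)
print(metric.value(), metric.value())         # 1.0 1.0  (reading changes nothing)
```
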
Keeping track of each value_op and update_op can be laborious, so TF-Slim provides two convenience functions:

# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

Working Example: Tracking Multiple Metrics

import tensorflow as tf

slim = tf.contrib.slim
vgg = tf.contrib.slim.nets.vgg

# Load the data
images, labels = load_data(...)

# Define the network
predictions, _ = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    for batch_id in range(num_batches):
        sess.run(list(names_to_updates.values()))

    metric_values = sess.run(list(names_to_values.values()))
    for metric, value in zip(names_to_values.keys(), metric_values):
        print('Metric %s has value: %f' % (metric, value))

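Evaluation Loop

TF-Slim provides an evaluation module (evaluation.py) that contains helper functions for writing model evaluation scripts using the metrics defined in metric_ops.py. This includes a function for periodically running evaluations and for summarizing and printing the results. For example:

import math

import tensorflow as tf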

slim = tf.contrib.slim

# Load the data
images, labels = load_data(...)

# Define the network
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.accuracy(predictions, labels),
    'precision': slim.metrics.precision(predictions, labels),
    'recall': slim.metrics.recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
    op = tf.summary.scalar(metric_name, metric_value)
    op = tf.Print(op, [metric_value], metric_name)
    summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = math.ceil(num_examples / float(batch_size))

# Setup the global step.
slim.get_or_create_global_step()

output_dir = ... # Where the summaries are stored.
eval_interval_secs = ... # How often to run the evaluation.
slim.evaluation.evaluation_loop(
    'local',
    checkpoint_dir,
    log_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)

Authors

Sergio Guadarrama and Nathan Silberman

Last edited: 2017.12.11 03:47:06


