In the previous article we used TensorFlow and PyTorch to automatically differentiate functions and to optimize parameters both manually and automatically. In this article we use the classic MNIST handwritten-digit dataset to compare how the two frameworks build and train a fully connected neural network that classifies handwritten digits.
To get the code for this article, follow the WeChat public account "tensor_torch" (QR code at the end of the post).
1. Data Loading
Both TensorFlow and PyTorch can download classic datasets such as MNIST directly, and both ship convenient loading utilities.
TensorFlow loads the data with tf.keras.datasets.mnist.load_data(), which returns numpy.ndarray arrays. PyTorch loads it from torchvision.datasets.MNIST, where samples come back as PIL Image objects that cannot be used directly; setting transform=transforms.ToTensor() converts them to tensors. The transform argument does more than convert formats: passing transforms.Compose() lets you chain a list of transformations, and the code below uses it to normalize the data as well. On the TensorFlow side, tf.data.Dataset.from_tensor_slices() builds the dataset object, .map() applies a custom preprocess function, and .shuffle() and .batch() handle shuffling and batching. In PyTorch, torch.utils.data.DataLoader builds the loader and takes care of batching and shuffling. Note the resulting shapes, which the sanity check after the code confirms: TensorFlow image shape [b, 28, 28] and label shape [b]; PyTorch image shape [b, 1, 28, 28] and label shape [b].
In PyTorch, the test split is selected with train=False (on datasets.MNIST, not on the DataLoader itself), so the test data can never be trained on by accident; in TensorFlow the split is just a pair of arrays, and keeping training away from the test set is entirely up to how you write the training code.
# ------------------------Tensorflow -----------------------------
import tensorflow as tf
from tensorflow import keras

batch_size = 128   # example value

(x, y), (x_test, y_test) = keras.datasets.mnist.load_data()
ds_train = tf.data.Dataset.from_tensor_slices((x, y))
ds_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))

def preprocess(x, y):
    # scale to [0, 1], then normalize with the MNIST mean/std so the
    # result matches PyTorch's Normalize((0.1307,), (0.3081,)) below
    x = (tf.cast(x, tf.float32) / 255 - 0.1307) / 0.3081
    y = tf.cast(y, tf.int32)
    # y = tf.one_hot(y, depth=10)
    return x, y

ds_train = ds_train.map(preprocess).shuffle(1000).batch(batch_size)
ds_test = ds_test.map(preprocess).shuffle(1000).batch(batch_size)
# ------------------------PyTorch --------------------------------
import torch
from torchvision import datasets, transforms

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=False,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=batch_size, shuffle=True)
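With the loaders in place, a minimal sanity check (assuming the objects defined above) confirms the shapes mentioned earlier: TensorFlow batches come out as [b, 28, 28], while torchvision's ToTensor adds a channel dimension, giving [b, 1, 28, 28].

# peek at one batch from each pipeline to verify the shapes
x_batch, y_batch = next(iter(ds_train))
print(x_batch.shape, y_batch.shape)    # (b, 28, 28) (b,)
x_batch, y_batch = next(iter(train_loader))
print(x_batch.shape, y_batch.shape)    # [b, 1, 28, 28] [b]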
2. Building the Network by Hand
2.1 Parameter Initialization
We begin with how to build a fully connected network by hand; the hard part is initializing and managing the parameters. Our model has three fully connected layers, so we need to initialize three pairs of w and b. Note the shape of each pair (the input is a flattened 28x28 = 784-dimensional image):
Network: [b, 784] -> [b, 200] -> [b, 100] -> [b, 10]
w1: [784, 200], b1: [200]
w2: [200, 100], b2: [100]
w3: [100, 10], b3: [10]
# ------------------------Tensorflow -----------------------------
# three pairs of weights and biases, uniform init in [0, 1)
# (see the sketch after this block for a more stable alternative)
w1 = tf.Variable(tf.random.uniform([28*28, 200]))
b1 = tf.Variable(tf.zeros([200]))
w2 = tf.Variable(tf.random.uniform([200, 100]))
b2 = tf.Variable(tf.zeros([100]))
w3 = tf.Variable(tf.random.uniform([100, 10]))
b3 = tf.Variable(tf.zeros([10]))
# ------------------------PyTorch --------------------------------
# requires_grad=True marks the tensors as trainable
w1 = torch.rand(28*28, 200, requires_grad=True)
b1 = torch.zeros(200, requires_grad=True)
w2 = torch.rand(200, 100, requires_grad=True)
b2 = torch.zeros(100, requires_grad=True)
w3 = torch.rand(100, 10, requires_grad=True)
b3 = torch.zeros(10, requires_grad=True)
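One caveat: uniform weights drawn from [0, 1) produce large pre-activations for a 784-dimensional input and can make this network slow or unstable to train. A common refinement, shown here as a hedged sketch (the 0.1 scale is illustrative, not from the original code), is to shrink the initial weights:

# ------------------------Tensorflow -----------------------------
w1 = tf.Variable(tf.random.truncated_normal([28*28, 200], stddev=0.1))
# ------------------------PyTorch --------------------------------
# build the scaled tensor first, then mark it trainable, so the
# optimizer still receives a leaf tensor
w1 = (torch.randn(28*28, 200) * 0.1).requires_grad_()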
2.2 Building the Network
Here we build the network with custom functions in both frameworks, and this part differs little between the two. We define a three-layer network by hand: the first two layers apply a relu activation, and the last layer has none. (A quick shape check follows the code.)
# ------------------------Tensorflow -----------------------------
# forward func
def model(x):
    x = tf.nn.relu(x @ w1 + b1)
    x = tf.nn.relu(x @ w2 + b2)
    x = x @ w3 + b3
    return x
# ------------------------PyTorch --------------------------------
import torch.nn.functional as F

# forward func
def forward(x):
    x = F.relu(x @ w1 + b1)
    x = F.relu(x @ w2 + b2)
    x = x @ w3 + b3
    return x
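A quick shape check, assuming the parameters from 2.1 are in scope: both forward functions should map a flattened batch to 10 logits per sample.

x_tf = tf.random.normal([4, 28*28])
print(model(x_tf).shape)       # (4, 10)
x_pt = torch.randn(4, 28*28)
print(forward(x_pt).shape)     # [4, 10]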
2.3 Training the Network
This part follows the automatic differentiation and parameter optimization recipe from the previous article; the usual pattern applies, with a few points to note:
- For a fully connected network, the data must first be flattened; both TensorFlow and PyTorch do this with reshape.
- To stay consistent with PyTorch's torch.nn.CrossEntropyLoss(), the labels in TensorFlow are not one-hot encoded, and the cross entropy is computed with tf.losses.sparse_categorical_crossentropy() instead; the small check after this list confirms the two losses agree.
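Here is that small check, a hedged sketch on a made-up toy batch (the logits and labels below are arbitrary values, not from the original code); both frameworks should report the same mean cross entropy.

import numpy as np
logits = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]], dtype=np.float32)
labels = np.array([0, 1], dtype=np.int64)
# TensorFlow: sparse cross entropy computed straight from logits
tf_loss = tf.reduce_mean(
    tf.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True))
# PyTorch: CrossEntropyLoss = log_softmax + negative log likelihood
pt_loss = torch.nn.CrossEntropyLoss()(
    torch.from_numpy(logits), torch.from_numpy(labels))
print(tf_loss.numpy(), pt_loss.item())   # the two numbers should match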
# ------------------------Tensorflow -----------------------------
learning_rate = 1e-3   # example hyperparameters
epochs = 10

optimizer = tf.optimizers.Adam(learning_rate)
for epoch in range(epochs):
    for step, (x, y) in enumerate(ds_train):
        x = tf.reshape(x, [-1, 28*28])   # flatten the images
        with tf.GradientTape() as tape:
            logits = model(x)
            losses = tf.losses.sparse_categorical_crossentropy(
                y, logits, from_logits=True)
            loss = tf.reduce_mean(losses)
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
        optimizer.apply_gradients(zip(grads, [w1, b1, w2, b2, w3, b3]))
# ------------------------PyTorch --------------------------------
optimizer = torch.optim.Adam([w1, b1, w2, b2, w3, b3],
                             lr=learning_rate)
criteon = torch.nn.CrossEntropyLoss()
for epoch in range(epochs):
    for step, (x, y) in enumerate(train_loader):
        x = x.reshape(-1, 28*28)         # flatten the images
        logits = forward(x)
        loss = criteon(logits, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
3. Building the Network with High-Level APIs
The advantage of building the network by hand is that everything happens at the lowest level, so the whole process is transparent and controllable. The downside is that every parameter must be managed manually, which becomes error-prone as the network grows more complex.
Both TensorFlow and PyTorch can instead define the model as a class:
- TensorFlow subclasses tf.keras.Model; PyTorch subclasses torch.nn.Module.
- In a TensorFlow model the forward pass goes in call(); in PyTorch it goes in forward().
- In the training loop, simply replace the hand-written forward function with the instantiated model object.
# ------------------------Tensorflow -----------------------------
from tensorflow.keras import layers

class FC_model(keras.Model):
    def __init__(self):
        super().__init__()
        self.model = keras.Sequential(
            [layers.Dense(200),
             layers.ReLU(),
             layers.Dense(100),
             layers.ReLU(),
             layers.Dense(10)]
        )
    def call(self, x):
        x = self.model(x)
        return x

model = FC_model()
# ------------------------PyTorch --------------------------------
from torch import nn

class FC_NN(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(28*28, 200),
            nn.ReLU(inplace=True),
            nn.Linear(200, 100),
            nn.ReLU(inplace=True),
            nn.Linear(100, 10)
        )
    def forward(self, x):
        x = self.model(x)
        return x

network = FC_NN().to(device)   # device is created in section 4 below
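As a quick hedged sanity check, a throwaway instance of each class should map a flattened batch to 10 logits (the Dense layers in the Keras model infer their input size on the first call):

print(FC_model()(tf.zeros([4, 28*28])).shape)   # (4, 10)
print(FC_NN()(torch.zeros(4, 28*28)).shape)     # [4, 10]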
4. Accelerating Training with a GPU
If the training environment has a GPU, both TensorFlow and PyTorch can use it to accelerate computation. With the tensorflow-gpu build installed, TensorFlow uses the GPU automatically with no extra code.
PyTorch requires creating device = torch.device('cuda:0') and moving the network and tensors onto that device.
...
device = torch.device('cuda:0')
network = FC_NN().to(device)
criteon = torch.nn.CrossEntropyLoss().to(device)
...
for epoch in range(epochs):
    ...
    # move each batch onto the same device as the network
    x, y = x.to(device), y.to(device)
    ...
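Since hard-coding cuda:0 fails on CPU-only machines, a common defensive pattern (an addition, not in the original code) is to check for a GPU first:

# TensorFlow: a non-empty list means a GPU is visible
print(tf.config.list_physical_devices('GPU'))
# PyTorch: fall back to the CPU when no GPU is available
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')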
5. Model Evaluation
Once the model is trained, it should be evaluated on the test set. Here we simply measure accuracy:
accuracy = number of correctly predicted samples / total number of samples
The code looks verbose, but it comes down to a few steps:
- Feed all the test data through the trained model to get predictions.
- Compare the predictions with the true labels.
- Accumulate the number of correct predictions and the total sample count.
- Compute the accuracy with the formula above.
TensorFlow can also compute this with tf.keras.metrics, which an earlier article covered; a brief sketch follows.
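A minimal sketch with tf.keras.metrics.SparseCategoricalAccuracy, assuming the model and ds_test defined above; it is equivalent to the manual count below.

acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()
for x_test, y_test in ds_test:
    x_test = tf.reshape(x_test, [-1, 28*28])
    acc_metric.update_state(y_test, model(x_test))   # accepts raw logits
print('accuracy:', acc_metric.result().numpy())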
# ------------------------Tensorflow -----------------------------
# inside the training loop: print the loss every 100 steps
if step % 100 == 0:
    print("epoch:{}, step:{} loss:{}".
          format(epoch, step, loss.numpy()))

# after each epoch: test accuracy
total_correct = 0
total_num = 0
for x_test, y_test in ds_test:
    x_test = tf.reshape(x_test, [-1, 28*28])
    y_pred = tf.argmax(model(x_test), axis=1)
    y_pred = tf.cast(y_pred, tf.int32)
    correct = tf.cast(y_pred == y_test, tf.int32)
    correct = tf.reduce_sum(correct)
    total_correct += int(correct)
    total_num += x_test.shape[0]
accuracy = total_correct / total_num
print('accuracy: ', accuracy)
# ------------------------PyTorch --------------------------------
# inside the training loop: print the loss every 100 steps
if step % 100 == 0:
    print("epoch:{}, step:{}, loss:{}".
          format(epoch, step, loss.item()))

# after each epoch: test accuracy
total_correct = 0
total_num = 0
for x_test, y_test in test_loader:
    x_test = x_test.reshape(-1, 28*28)
    x_test, y_test = x_test.to(device), y_test.to(device)
    y_pred = network(x_test)
    y_pred = torch.argmax(y_pred, dim=1)
    correct = (y_pred == y_test)
    correct = correct.sum()
    total_correct += correct
    total_num += x_test.shape[0]
acc = total_correct.float() / total_num
print("accuracy: ", acc.item())
Related Articles
[Tutorial] TensorFlow vs PyTorch - Automatic Differentiation
[Tutorial] TensorFlow vs PyTorch - Mathematical Operations
[Tutorial] TensorFlow vs PyTorch - Basic Tensor Operations
TensorFlow 2 vs PyTorch: the Comparative Tutorial Series Begins
TensorFlow 2.0 - ResNet in Practice on the CIFAR100 Dataset
TensorFlow 2.0 - the TensorBoard Visualization Tool
TensorFlow 2.0 - Data Loading and Preprocessing
TensorFlow 2.0 Quick Start - Custom Models with Keras
TensorFlow 2.0 Quick Start - Automatic Differentiation and Linear Regression
TensorFlow 2.0 - Transfer Learning Made Easy
Getting Started with TensorFlow - Eager Mode: as Clean and Elegant as Native Python
TensorFlow 2.0 - Deep Integration with Keras