Preface
I have been learning PyTorch for a week now and have typed out most of the example code on the official PyTorch site, covering tensor usage, datasets, network definition, and so on. Compared with my earlier, rather painful experience learning Caffe, PyTorch wraps all the common operations, so you just follow the workflow. Building on that, this post trains a simple CNN on the CIFAR-10 dataset.
Goals
- Train a simple CNN on the CIFAR-10 dataset.
- Save the trained model and test it.
- Train on the GPU.
Development/experiment environment
- Ubuntu 18.04
- PyTorch 1.0
- Anaconda3, Python 3.6
- PyCharm
The CIFAR dataset
http://www.cs.toronto.edu/~kriz/cifar.html
The CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
The CIFAR dataset comes in two flavours, CIFAR-10 and CIFAR-100: CIFAR-10 contains 10 classes, CIFAR-100 contains 100 classes.
CIFAR-10
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
Key facts:
- 32x32 colour images
- 10 classes
- 60,000 images in total
- 50,000 training samples + 10,000 test samples
- 6,000 images per class (10 x 6,000 = 60,000)
The 10 classes:
- airplane
- automobile
- bird
- cat
- deer
- dog
- frog
- horse
- ship
- truck
Download links
CIFAR-10 python version
CIFAR-10 Matlab version
CIFAR-10 binary version (suitable for C programs)
Tips
There is no need to download these files manually; the torchvision dataset API in PyTorch downloads CIFAR-10 automatically.
Experiment
Preparing the dataset
This step is very convenient in PyTorch: the common datasets are already prepared for us and only need to be imported. They live in the torchvision.datasets package.
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
torchvision.datasets.CIFAR10 is a class; instantiating it gives an object through which the dataset can be used.
Parameters:
- root ----- path where the downloaded dataset is saved
- train ----- training set or test set
- download ----- whether to download automatically
- transform ----- transforms applied to the images; the raw images usually need at least ToTensor() and Normalize()
After that, wrap the dataset in the DataLoader class so it is convenient to read and use: it supports mini-batch loading and multi-worker (multi-threaded) loading.
# -------------------- Prepare the dataset ------------------
# Dataset, DataLoader
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       transform=transform, download=True)

trainloader = DataLoader(dataset=trainset, batch_size=4, shuffle=True, num_workers=4)
testloader = DataLoader(dataset=testset, batch_size=4, shuffle=True, num_workers=4)

# show one random mini-batch
dataiter = iter(trainloader)
images, labels = dataiter.next()
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
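Note that imshow and classes are used above but never defined in this listing; they need to appear before that snippet runs. A minimal version, in the spirit of the official PyTorch CIFAR-10 tutorial, could be:
def imshow(img):
    # undo Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) and show a grid of images
    img = img / 2 + 0.5
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # CHW -> HWC for matplotlib
    plt.show()

# CIFAR-10 class names in label order
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')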
This displays one random mini-batch of images (batch_size=4) together with their printed labels.
With that, the dataset is ready. Next, define a network.
Defining the CNN
To keep things simple I use the LeNet architecture, with the first conv layer's input channels changed to 3, because CIFAR-10 images are 3-channel colour images.
# Define a simple network
# LeNet-5
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool1(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # reshape the tensor before the fully connected layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
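To see where the 16 * 5 * 5 in fc1 comes from, here is a quick sanity check with a dummy input (not part of the original code): a 32x32 image becomes 28x28 after the first 5x5 conv, 14x14 after pooling, 10x10 after the second conv, and 5x5 after the second pooling.
# Throwaway shape check (illustrative only)
probe = Net()
dummy = torch.randn(1, 3, 32, 32)            # one fake CIFAR-10 image
x = probe.pool1(F.relu(probe.conv1(dummy)))  # -> [1, 6, 14, 14]
x = probe.pool1(F.relu(probe.conv2(x)))      # -> [1, 16, 5, 5]
print(x.shape)                               # torch.Size([1, 16, 5, 5])
print(probe(dummy).shape)                    # torch.Size([1, 10]), one score per class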
Setting up the loss, the optimizer, and the training loop
Training a CNN is essentially the problem of minimizing an objective (loss) function. For general convex functions the classic optimizers are gradient descent, Newton's method, and so on (besides these there are also heuristic searches such as genetic algorithms). For neural network training, the usual choice is stochastic gradient descent (SGD).
- Define the loss function and the optimization method
Use the cross-entropy loss.
Use SGD with momentum for the optimization (a toy sketch of what this update does follows the code below).
# Define the loss function and the optimizer
# Cross-entropy loss, SGD with momentum
net = Net()  # instantiate the network before building the optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
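For intuition, the update that optim.SGD with momentum=0.9 applies to each parameter is roughly "velocity = momentum * velocity + gradient; parameter -= lr * velocity". A tiny hand-written toy of that idea (just an illustration, not PyTorch's actual implementation):
# Toy SGD-with-momentum on a 1-D problem with loss (p - 3)^2
lr, momentum = 0.001, 0.9
p = torch.tensor(1.0, requires_grad=True)
buf = torch.zeros_like(p)                 # velocity buffer

for _ in range(3):
    loss = (p - 3.0) ** 2
    loss.backward()
    with torch.no_grad():
        buf = momentum * buf + p.grad     # accumulate a smoothed gradient
        p -= lr * buf                     # step against it
        p.grad.zero_()
    print(p.item())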
- Run the training loop
Back when I was learning Caffe I finally got these two terms straight:
Iteration ------ one forward + backward pass over a single mini-batch
Epoch ------ one complete pass over all of the training data
With 50,000 training images and batch_size=4, one epoch is 12,500 iterations, so the averaged loss below is printed 6 times per epoch (once every 2,000 iterations). Here I train for 20 epochs in total.
# Train the network
# loop over epochs
for epoch in range(20):

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)  # compute the loss
        loss.backward()                    # backpropagation
        optimizer.step()                   # update the parameters

        # print statistics
        running_loss += loss.item()  # tensor.item() extracts the Python number
        if i % 2000 == 1999:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))  # average loss over the last 2000 iterations
            running_loss = 0.0

print('Finished Training')
Training output:
The loss keeps going down. Since I trained on the CPU, I only ran 20 epochs.
Saving the model
# -------- Save the model -----------
torch.save(net, './model/model_cfair10_2.pth')  # saves the whole model; the file is relatively large
# torch.save(net.state_dict(), './model/model_cfair10.pth')  # saves only the parameters (smaller file)
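The commented-out line saves only the parameters (the state_dict), which gives a much smaller file. If that variant were used, loading would look roughly like this (the path is the one from the comment above; net2 is just an illustrative name):
# Loading from a state_dict: the Net class must be defined first
net2 = Net()
net2.load_state_dict(torch.load('./model/model_cfair10.pth'))
net2.eval()  # switch to evaluation mode before testing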
Testing the model
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
CIFAR-10 contains 10 classes in total:
CFAIR10_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
Load an RGB image; it should belong to one of the classes above, otherwise it cannot be recognised correctly.
# load an image
image = Image.open('/xxxx/image/dog.jpg')
Apply the same transforms to the image:
transform = transforms.Compose(
    [transforms.Resize((32, 32)),
     transforms.ToTensor(),
     transforms.Normalize(
         mean=(0.5, 0.5, 0.5),
         std=(0.5, 0.5, 0.5)
     )])

image_transformed = transform(image)
print(image_transformed.size())
Things to note
The CNN expects a 4D input tensor (N x C x H x W), so the transformed image needs an extra batch dimension. Calling tensor.unsqueeze(0) adds that dimension, which turns the input into a 1x3x32x32 tensor.
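A quick way to confirm what unsqueeze(0) does to the shape (illustrative only):
x = torch.randn(3, 32, 32)   # C x H x W, what the transform produces
print(x.unsqueeze(0).shape)  # torch.Size([1, 3, 32, 32]), i.e. N x C x H x W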
The rest of the test script re-defines the same Net class. This is needed because the model was saved as a whole with torch.save(net, ...): torch.load can only restore it if the Net class is defined (or importable) in the script that loads it.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool1(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # reshape the tensor before the fully connected layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = torch.load('./model/model_cfair10.pth')
# print(net)
image_transformed = image_transformed.unsqueeze(0)
output = net(image_transformed)
predict_value, predict_idx = torch.max(output, 1)  # max over dim 1: returns the max value and its index
plt.figure()
plt.imshow(np.array(image))
plt.title(CFAIR10_names[predict_idx])
plt.axis('off')
plt.show()
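torch.max(output, 1) only gives the raw top score and its index. If a confidence value is also wanted, the logits can optionally be passed through a softmax, for example:
# Optional: turn the logits into probabilities for a confidence score
probs = F.softmax(output, dim=1)
print('%s: %.3f' % (CFAIR10_names[predict_idx], probs[0, predict_idx].item()))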
Test result:
GPU training and learning-rate adjustment
Problems:
After training for 20 epochs on the CPU the loss dropped to about 0.6; another 20 epochs on top of that model left the loss sitting between 0.5 and 0.6.
Training on the CPU really is slow: 20 epochs took over an hour (I don't remember the exact time), which is a long wait.
Training the model on the GPU
- Machine: GTX 1080 GPU
First, install the GPU build of PyTorch; the concrete install steps are on the PyTorch website. Training on the GPU only requires a few small changes to the code.
Step 1: in the code, first use PyTorch's built-in check to see whether a GPU is available
is_support = torch.cuda.is_available()
if is_support:
    device = torch.device('cuda:0')
    # device = torch.device('cuda:1')
else:
    device = torch.device('cpu')
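The same check is often written as a one-liner (equivalent, just more compact):
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')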
Step 2: move the computation from the CPU onto the GPU
net = Net()
net.to(device)  # needed for GPU mode

# Train the network
# loop over epochs
for epoch in range(20):

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data
        inputs = inputs.to(device)  # move the batch to the GPU
        labels = labels.to(device)  # move the labels to the GPU

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)  # compute the loss
        loss.backward()                    # backpropagation
        optimizer.step()                   # update the parameters

        # print statistics
        running_loss += loss.item()  # tensor.item() extracts the Python number
        if i % 2000 == 1999:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))  # average loss over the last 2000 iterations
            running_loss = 0.0

print('Finished Training')
Run it and the iterations fly: 20 epochs finish in roughly 10 minutes, which is very fast.
Learning-rate adjustment
- An important hyperparameter of SGD is the learning rate.
The code above uses a fixed learning rate, lr=0.001. Early in training the learning rate can be larger so that convergence is fast; as the iterations go on it should be reduced to keep the loss from oscillating.
To keep things simple, I lowered the learning rate to lr=0.0001 and trained another 20 epochs on top of the previous model. The loss clearly dropped further, down through roughly 0.3, 0.2, 0.1 (a scheduler-based alternative to this manual change is sketched below).
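Instead of editing the learning rate by hand and restarting, a learning-rate scheduler can decay it automatically. A minimal sketch with torch.optim.lr_scheduler.StepLR (the step size and gamma here are arbitrary choices of mine, not values from this experiment):
from torch.optim import lr_scheduler

optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)  # multiply lr by 0.1 every 10 epochs

for epoch in range(40):
    for i, data in enumerate(trainloader, 0):
        pass  # same training step as in the loop above
    scheduler.step()  # decay the learning rate once per epoch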
Test results of the GPU-trained model
Even though the model was trained on the GPU with lr reduced to 0.0001 and the (training-set) loss went down, in testing one horse was still classified as a deer and a bird as a cat. To train a really good model, more is needed: other training strategies, and possibly a different network architecture.
Evaluating the model on the whole test set
- Compute the accuracy (Acc)
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
CFAIR10_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
# -------------- Test dataset ------------------------------
transform = transforms.Compose(
    [transforms.Resize((32, 32)),
     transforms.ToTensor(),
     transforms.Normalize(
         mean=(0.5, 0.5, 0.5),
         std=(0.5, 0.5, 0.5)
     )])

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=4)
# ----------------- Network model -------------------------------
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=10)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool1(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # reshape the tensor before the fully connected layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = torch.load('./model/model_cfair10_20.pth',map_location='cpu')
# ------------ Evaluate on the whole test set -----------------------------------------
correct = 0
total = 0
count = 0

with torch.no_grad():
    for sample_batch in testloader:
        images = sample_batch[0]
        labels = sample_batch[1]

        # forward
        out = net(images)

        # predicted class = index of the largest output
        _, pred = torch.max(out, 1)
        correct += (pred == labels).sum().item()
        total += labels.size(0)
        print('batch:{}'.format(count + 1))
        count += 1

# Accuracy
accuracy = float(correct) / total
print('Acc = {:.5f}'.format(accuracy))
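The overall accuracy hides which classes get confused (the horse-to-deer and bird-to-cat mistakes mentioned earlier). A per-class breakdown, in the style of the official tutorial, could look like this:
# Per-class accuracy on the test set (illustrative addition)
class_correct = [0.0] * 10
class_total = [0.0] * 10

with torch.no_grad():
    for images, labels in testloader:
        out = net(images)
        _, pred = torch.max(out, 1)
        c = (pred == labels)
        for j in range(labels.size(0)):
            label = labels[j].item()
            class_correct[label] += c[j].item()
            class_total[label] += 1

for k in range(10):
    print('Acc of %-10s : %.2f %%' % (CFAIR10_names[k],
                                      100 * class_correct[k] / class_total[k]))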
My own model, trained on the GPU with the learning rate lowered once part-way through:
End
This complete example taught me quite a lot. The GPU usage here is only a basic, single-GPU application; I have not used multiple GPUs to speed up training yet. Next up: GPU-accelerated training and CNN hyperparameter tuning.
References:
https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html