Neural Networks
This post introduces a feedforward neural network trained with backpropagation and uses the algorithm to recognize handwritten digits.
Cost Function
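For a network with K output classes and m training examples, the cost the code below computes is the standard cross-entropy cost over one-hot encoded labels:

J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \Big[ y_k^{(i)} \log\big(h_\Theta(x^{(i)})\big)_k + \big(1 - y_k^{(i)}\big) \log\big(1 - (h_\Theta(x^{(i)}))_k\big) \Big]

where h_\Theta(x^{(i)}) is the vector of output-layer activations for example i. The regularized form adds a penalty on the non-bias weights (see Regularization below).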
Backpropagation
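Backpropagation computes the gradient of the cost by first running forward propagation and then propagating the error backwards through the layers. In the usual column-vector notation the per-example error terms and gradient accumulators are

\delta^{(3)} = a^{(3)} - y, \qquad \delta^{(2)} = \big((\Theta^{(2)})^T \delta^{(3)}\big) \odot g'(z^{(2)}) \quad \text{(bias term removed)}

\Delta^{(l)} \leftarrow \Delta^{(l)} + \delta^{(l+1)} \big(a^{(l)}\big)^T, \qquad \frac{\partial J}{\partial \Theta^{(l)}} = \frac{1}{m} \Delta^{(l)}

The code below uses the row-vector equivalents, so sigma3, sigma2, delta2 and delta1 correspond to \delta^{(3)}, \delta^{(2)}, \Delta^{(2)} and \Delta^{(1)}.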
Regularization
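Regularization penalizes large weights but excludes the bias terms (the first column of each Theta matrix):

J_{reg}(\Theta) = J(\Theta) + \frac{\lambda}{2m} \Big( \sum_{j} \sum_{k \ge 1} \big(\Theta^{(1)}_{jk}\big)^2 + \sum_{j} \sum_{k \ge 1} \big(\Theta^{(2)}_{jk}\big)^2 \Big), \qquad \frac{\partial J_{reg}}{\partial \Theta^{(l)}_{jk}} = \frac{\partial J}{\partial \Theta^{(l)}_{jk}} + \frac{\lambda}{m} \Theta^{(l)}_{jk} \ (k \ge 1)

In the code this is implemented by zeroing the first column of Theta1 and Theta2 before computing the penalty and the gradient term.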
Code Implementation
function [J grad] = nnCostFunction(nn_params, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, ...
                                   X, y, lambda)
%NNCOSTFUNCTION Implements the cost function for a classification neural
%network with one input layer, one hidden layer and one output layer.
%   [J grad] = NNCOSTFUNCTION(nn_params, input_layer_size, hidden_layer_size, ...
%                             num_labels, X, y, lambda)
%   The network parameters are passed in nn_params as an unrolled vector and
%   must be reshaped back into the weight matrices Theta inside this function.
%   The returned grad is the unrolled vector of the network's partial derivatives.
%
% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for our 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
                 hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
                 num_labels, (hidden_layer_size + 1));
% Setup some useful variables
m = size(X, 1);
% Initialize the values to be returned
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
% ====================== YOUR CODE HERE ======================
a1 = X;
z2 = [ones(size(a1,1),1) a1]*Theta1'; % add the bias column; the same computation runs over all 5000 training examples at once
a2 = sigmoid(z2);
z3 = [ones(size(a2,1),1) a2]*Theta2';
a3 = sigmoid(z3);
h = a3;
yNew = zeros(m,size(Theta2,1)); % convert the labels into a one-hot matrix, one row per example: y = [0,0,...,0,1,0,...,0]
for i = 1:m
    yNew(i,y(i))=1;
end
y = yNew;
% Cost function (unregularized)
J = (y.*log(h)+(1-y).*log(1-h))/(-m);
% Regularization
Theta1(:,1)=0; % the bias (first) column is not regularized
Theta2(:,1)=0;
regularization = (sum(sum(Theta1.*Theta1)) + sum(sum(Theta2.*Theta2)))*(lambda/(2*m));
% Regularized cost function
J = sum(sum(J))+regularization;
% Backpropagation
delta1 = 0;
delta2 = 0;
% For-loop implementation (much slower than the vectorized version below)
%for i = 1:m
%    sigma3 = a3(i,:) - y(i,:);
%    sigma2 = (sigma3*Theta2)(1,2:end).*sigmoidGradient(z2(i,:));
%
%    delta2 += sigma3'*[1 a2(i,:)];
%    delta1 += sigma2'*[1 a1(i,:)];
%end
% Vectorized implementation
sigma3 = a3 - y;
sigma2 = (sigma3*Theta2)(:,2:end).*sigmoidGradient(z2);
delta2 = sigma3'*[ones(m,1) a2]; % remember to prepend the bias column!
delta1 = sigma2'*[ones(m,1) a1]; % remember to prepend the bias column!
% Gradients
Theta1_grad = 1/m .* delta1;
Theta2_grad = 1/m .* delta2;
% Regularized gradients (the bias columns of Theta1/Theta2 were zeroed above, so bias terms are not regularized)
Theta1_grad += lambda /m *Theta1;
Theta2_grad += lambda /m *Theta2;
% -------------------------------------------------------------
% =========================================================================
% Unroll gradients
grad = [Theta1_grad(:) ; Theta2_grad(:)];
end
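A minimal usage sketch for the function above. It assumes sigmoid.m is on the path and that X (an m x 400 pixel matrix) and y (an m x 1 label vector) have already been loaded from the digit data set; the layer sizes below are example values taken from that setup, not part of the function itself. The sigmoidGradient helper, which nnCostFunction calls, is also shown since it is not defined above.

% sigmoidGradient.m (separate file): g'(z) = g(z).*(1 - g(z)) for the sigmoid activation
function g = sigmoidGradient(z)
    g = sigmoid(z) .* (1 - sigmoid(z));
end

% Example call (sizes are assumptions for illustration)
input_layer_size  = 400;  % 20x20 input images
hidden_layer_size = 25;   % hidden units
num_labels        = 10;   % digit classes 1..10
lambda            = 1;

% Small random initialization to break symmetry, then unroll into one vector
epsilon = 0.12;
Theta1 = rand(hidden_layer_size, input_layer_size + 1) * 2 * epsilon - epsilon;
Theta2 = rand(num_labels, hidden_layer_size + 1) * 2 * epsilon - epsilon;
nn_params = [Theta1(:) ; Theta2(:)];

% One evaluation of the cost and the unrolled gradient
[J, grad] = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                           num_labels, X, y, lambda);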