Vectorized formula for the output layer of a neural network

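I have a neural network and want to use the trained network to evaluate a set of test data. I am struggling to write the formulas for the hidden layer and the output layer. My goal is a vectorized formula, but I would also be happy with a loop-based implementation.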

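I believe I now have the correct formula for the hidden layer and only need one for the output layer, but I would appreciate it if someone could confirm that the hidden-layer formula is correctly vectorized.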

% Variables
% Xtest  - test data
% thetah - trained weights for inputs to hidden layer
% thetao - trained weights for hidden layer to outputs
% ytest  - output

htest = (1 ./ (1 + exp(-(thetah * Xtest'))))' ; % FORMULA FOR HIDDEN LAYER 
ytest = ones(mtest, num_outputs) ; % FORMULA FOR OUTPUT LAYER

Source

2015-12-27 Jean de Toit

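Which formula should be confirmed? The ytest expression in your code only initializes a new matrix and is definitely not correct. Could you post what you have so far? What are the dimensions of Xtest? Is it a single vector or a set of input vectors? – Anton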

htest is the one that should be confirmed. For now ytest is just placeholder code that produces the correct dimensions. Xtest is 6x7, the rest are 6x6. –

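Below you can find a vectorized and a loop implementation of forward propagation. Because of differences in notation and in the way you store your data in matrices, you may have to adapt your input data to the code below.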

You need to add bias units to the input layer and to the hidden layer.
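
As a minimal sketch of what that means for the formulas in the question, the two layers could be written as below. The sizes and the random weights are hypothetical placeholders, and the sketch assumes that the first column of thetah and thetao holds the bias weights and that both layers use sigmoid units.

```matlab
% Hypothetical sizes, for illustration only
mtest = 6; n = 7; hl_size = 5; num_outputs = 3;

Xtest  = rand(mtest, n);                  % one example per row
thetah = randn(hl_size, n + 1);           % assumed layout: column 1 = bias weights
thetao = randn(num_outputs, hl_size + 1); % assumed layout: column 1 = bias weights

sigmoid = @(z) 1 ./ (1 + exp(-z));        % element-wise sigmoid

% Hidden layer: prepend a column of ones (bias unit), then apply the weights
htest = sigmoid([ones(mtest, 1) Xtest] * thetah');   % [mtest x hl_size]

% Output layer: same pattern, using the hidden activations as input
ytest = sigmoid([ones(mtest, 1) htest] * thetao');   % [mtest x num_outputs]
```

If your weight matrices were trained without bias terms, drop the columns of ones and the shapes change accordingly.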

To simplify implementation and debugging, I took some data from an open-source machine learning repository and trained a network on the wine classification task.

Xtest - input data [178x13]
y - output classes [178x1]
thetah - parameters of the hidden layer [15x14]
thetao - parameters of the output layer [3x16]

The network classifies the input data with a success rate of 97.7%.

Here is the code:

function [] = nn_fp() 

    load('Xtest.mat'); %input data 178x13 
    load('y.mat'); %output data 178x1 
    load('thetah.mat'); %Parameters of the hidden layer 15x14 
    load('thetao.mat'); %Parameters of the output layer 3x16 

    predict_simple(Xtest, y, thetah, thetao); 

    predict_vectorized(Xtest, y, thetah, thetao); 
end 

function predict_simple(Xtest, y, thetah, thetao) 

    mtest = size(Xtest, 1); %number of input examples 
    n = size(Xtest, 2); %number of features 
    hl_size = size(thetah, 1); %size of the hidden layer (without the bias unit) 
    num_outputs = size(thetao, 1); %size of the output layer 

    %add a bias unit to the input layer 
    a1 = [ones(mtest, 1) Xtest]; %[mtest x (n+1)] 

    %compute activations of the hidden layer 
    z2 = zeros(mtest, hl_size); %[mtest x hl_size] 
    a2 = zeros(mtest, hl_size); %[mtest x hl_size] 

    for i=1:mtest 
     for j=1:hl_size 
      for k=1:n+1 
       z2(i, j) = z2(i, j) + a1(i, k)*thetah(j, k); 
      end 

      a2(i, j) = sigmoid_simple(z2(i, j)); 
     end 
    end 

    %add a bias unit to the hidden layer 
    a2 = [ones(mtest, 1) a2]; %[mtest x (hl_size+1)] 

    %compute activations of the output layer 
    z3 = zeros(mtest, num_outputs); %[mtest x num_outputs] 
    h = zeros(mtest, num_outputs); %[mtest x num_outputs] 

    for i=1:mtest 
     for j=1:num_outputs 
      for k=1:hl_size+1 
       z3(i, j) = z3(i, j) + a2(i, k)*thetao(j, k); 
      end 

      h(i, j) = sigmoid_simple(z3(i, j)); %the hypothesis 
     end 
    end 

    %calculate predictions for each input example based on the maximum term 
    %of the hypothesis h 
    p = zeros(size(y)); 

    for i=1:mtest 
     max_ind = 1; 
     max_value = h(i, 1); 
     for j=2:num_outputs 
      if (h(i, j) > max_value) 
       max_ind = j; 
       max_value = h(i, j); 
      end 
     end 

     p(i) = max_ind; 
    end 

    %calculate the success rate of the prediction 
    correct_count = 0; 
    for i=1:mtest 
     if (p(i) == y(i)) 
      correct_count = correct_count + 1; 
     end 
    end 

    rate = correct_count/mtest*100; 

    display(['simple version rate:', num2str(rate)]); 
end 

function predict_vectorized(Xtest, y, thetah, thetao) 

    mtest = size(Xtest, 1); %number of input examples 

    %add a bias unit to the input layer 
    a1 = [ones(mtest, 1) Xtest]; 

    %compute activations of the hidden layer 
    z2 = a1*thetah'; 
    a2 = sigmoid_universal(z2); 

    %add a bias unit to the hidden layer 
    a2 = [ones(mtest, 1) a2]; 

    %compute activations of the output layer 
    z3 = a2*thetao'; 
    h = sigmoid_universal(z3); %the hypothesis 

    %calculate predictions for each input example based on the maximum term 
    %of the hypothesis h 
    [~,p] = max(h, [], 2); 
    %calculate the success rate of the prediction 
    rate = mean(double((p == y))) * 100; 
    display(['vectorized version rate:', num2str(rate)]); 
end 

function [ s ] = sigmoid_simple(z) 
    s = 1/(1+exp(-z)); 
end 

function [ s ] = sigmoid_universal(z) 
    s = 1./(1+exp(-z)); 
end
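
One way to convince yourself that the two versions agree is to compare the loop result with the matrix product on small random data. The snippet below is only a sketch with hypothetical sizes. It checks the hidden-layer pre-activation z2, and the same check applies to z3.

```matlab
% Hypothetical sizes, for illustration only
mtest = 4; n = 3; hl_size = 5;

a1     = [ones(mtest, 1) rand(mtest, n)];  % inputs with bias column
thetah = randn(hl_size, n + 1);

% Loop version: z_loop(i, j) = sum over k of a1(i, k) * thetah(j, k)
z_loop = zeros(mtest, hl_size);
for i = 1:mtest
    for j = 1:hl_size
        for k = 1:n+1
            z_loop(i, j) = z_loop(i, j) + a1(i, k) * thetah(j, k);
        end
    end
end

% Vectorized version: the same sum written as a matrix product
z_vec = a1 * thetah';

% The two results should differ only by floating-point rounding
max_diff = max(abs(z_loop(:) - z_vec(:)));
disp(max_diff);
```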

Source

2015-12-30 00:29:10 Anton

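Assuming your Xtest has dimensions N by M, where N is the number of examples and M is the number of features, thetah is an M by H1 matrix, where H1 is the number of units in the first hidden layer, and thetao is an H1 by O matrix, where O is the number of output classes, you do the following: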

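a1 = Xtest * thetah; 
z1 = 1 ./ (1 + exp(-a1)); % assuming sigmoid units; ./ divides element-wise 

a2 = z1 * thetao; 
z2 = softmax(a2); % must normalize each row (one example per row); a column-wise softmax needs softmax(a2')'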

Read more about softmax here.
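
If no softmax function is available, or the one you have normalizes along columns, a row-wise version can be sketched by hand. The input matrix below is a made-up example with one example per row. Subtracting the row maximum before exponentiating is a common trick to avoid overflow and does not change the result.

```matlab
% Made-up pre-activations: 2 examples (rows), 3 classes (columns)
a2 = [1 2 3; 1000 1001 1002];

% Shift each row by its maximum so that exp() cannot overflow
shifted = bsxfun(@minus, a2, max(a2, [], 2));

% Exponentiate and normalize each row so that it sums to 1
e  = exp(shifted);
z2 = bsxfun(@rdivide, e, sum(e, 2));

% Predicted class = index of the largest probability in each row
[~, p] = max(z2, [], 2);
```

Without the shift, exp(1000) in the second row would overflow to Inf and the division would produce NaN.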

Source

2015-12-27 15:38:20 Amir
