How to update the bias in an RPROP neural network?

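I am implementing a neural network for some classification problems. I initially tried backpropagation, but it takes longer to converge, so I thought I would use RPROP. In my test setup, RPROP works fine for AND gate simulation but never converges for OR and XOR gate simulation.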

How and when should I update the bias for RPROP?
Here is my weight update logic:

for (int l_index = 1; l_index < _total_layers; l_index++) {
    Layer* curr_layer = get_layer_at(l_index);

    //iterate through each neuron
    for (unsigned int n_index = 0; n_index < curr_layer->get_number_of_neurons(); n_index++) {
        Neuron* jth_neuron = curr_layer->get_neuron_at(n_index);

        double change = jth_neuron->get_change();

        double curr_gradient = jth_neuron->get_gradient();
        double last_gradient = jth_neuron->get_last_gradient();

        int grad_sign = sign(curr_gradient * last_gradient);

        //iterate through each weight of the neuron
        for (int w_index = 0; w_index < jth_neuron->get_number_of_weights(); w_index++) {
            double current_weight = jth_neuron->give_weight_at(w_index);
            double last_update_value = jth_neuron->give_update_value_at(w_index);

            double new_update_value = last_update_value;
            if (grad_sign > 0) {
                new_update_value = min(last_update_value*1.2, 50.0);
                change = sign(curr_gradient) * new_update_value;
            } else if (grad_sign < 0) {
                new_update_value = max(last_update_value*0.5, 1e-6);
                change = -change;
                curr_gradient = 0.0;
            } else if (grad_sign == 0) {
                change = sign(curr_gradient) * new_update_value;
            }

            //Update neuron values
            jth_neuron->set_change(change);
            jth_neuron->update_weight_at((current_weight + change), w_index);
            jth_neuron->set_last_gradient(curr_gradient);
            jth_neuron->update_update_value_at(new_update_value, w_index);

            double current_bias = jth_neuron->get_bias();
            jth_neuron->set_bias(current_bias + _learning_rate * jth_neuron->get_delta());
        }
    }
}


2015-09-25 puru020

In principle, you don't treat the bias any differently than you did with backpropagation. It's learning_rate * delta, which is what you seem to be doing.

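One source of error may be that the sign of the weight change depends on how you calculate your error. There are different conventions, and using (t_i - y_i) instead of (y_i - t_i) should result in returning (new_update_value * sgn(grad)) instead of -(new_update_value * sgn(grad)), so try switching the sign. I'm also not sure how exactly you implemented everything, because a lot isn't shown here. But here's a snippet from a Java implementation of mine that might help: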

// gradient didn't change sign: 
if(weight.previousErrorGradient * errorGradient > 0) 
    weight.lastUpdateValue = Math.min(weight.lastUpdateValue * step_pos, update_max); 
// changed sign: 
else if(weight.previousErrorGradient * errorGradient < 0) 
{ 
    weight.lastUpdateValue = Math.max(weight.lastUpdateValue * step_neg, update_min); 
} 
else 
    weight.lastUpdateValue = weight.lastUpdateValue; // no change   

// Depending on language, you should check for NaN here. 

// multiply this with -1 depending on your error signal's sign: 
return (weight.lastUpdateValue * Math.signum(errorGradient));
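
For comparison with the question's code, the snippet above can be sketched in C++ roughly as follows. This is only a minimal sketch under assumptions: `RpropState`, `rprop_step`, `sign_of`, the `descent_sign` parameter, and the initial step size of 0.1 are hypothetical names and values that appear nowhere in the thread. The idea it illustrates is that the previous gradient and the update value are stored per weight, as in `weight.previousErrorGradient` above, and that the sign flip is a single explicit factor.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical per-weight RPROP state; names are illustrative only.
struct RpropState {
    double prev_gradient = 0.0;  // gradient from the previous step, 0 at the start
    double update_value  = 0.1;  // current step size (assumed initial value)
};

// Returns -1, 0 or +1.
inline int sign_of(double x) { return (x > 0.0) - (x < 0.0); }

// Returns the change to add to one weight.
// With descent_sign = -1.0 the weight moves against the sign of `gradient`;
// pass +1.0 if your error signal already carries the opposite sign
// (for example when it is built from t_i - y_i).
inline double rprop_step(RpropState& s, double gradient,
                         double descent_sign = -1.0,
                         double step_pos = 1.2, double step_neg = 0.5,
                         double update_max = 50.0, double update_min = 1e-6) {
    const double product = s.prev_gradient * gradient;
    if (product > 0.0) {
        // gradient didn't change sign: grow the step
        s.update_value = std::min(s.update_value * step_pos, update_max);
    } else if (product < 0.0) {
        // gradient changed sign: shrink the step
        s.update_value = std::max(s.update_value * step_neg, update_min);
    }
    // product == 0: keep the step size unchanged
    s.prev_gradient = gradient;  // remember for the next call (assumed bookkeeping)
    return descent_sign * sign_of(gradient) * s.update_value;
}
```

Each weight would own one `RpropState`, so a sign change on one weight does not affect the step size of another.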

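Also, keep in mind that 50.0, 1e-6, and especially 0.5 and 1.2 are empirically gathered values, so they might need to be adjusted. You should definitely print out the gradients and weight changes to see whether something weird is going on (e.g. exploding gradients -> NaN, although you're only testing AND/XOR). Your last_gradient value should also be initialized to 0 at the first time step.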
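
One way to do that printing is a small helper called once per weight update. The sketch below is an assumption, not part of either poster's code: `log_and_check` and its parameters are hypothetical names. It prints the gradient and the change, and reports whether both are finite.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical debugging helper: prints one weight's gradient and change,
// and returns false if either value is NaN or infinite.
inline bool log_and_check(int layer, int neuron, int weight,
                          double gradient, double change) {
    std::printf("layer %d neuron %d weight %d: grad=%g change=%g\n",
                layer, neuron, weight, gradient, change);
    return std::isfinite(gradient) && std::isfinite(change);
}
```

Stopping training as soon as the helper returns false makes it easier to see which weight went wrong first.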


2015-09-27 14:03:54 runDOSrun

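Thanks for the reply. I am using the (t_i - y_i) convention. My last_gradient is initialized to 0 in the class constructor. I will go through the code again and see whether I find what you suggested. – puru020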

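So I fixed the issue and it works now (at least partially). In 6-7 out of 10 runs, the error either goes down at first and then stalls, or stalls for a while before going down. What could be the reason? I don't think this is normal? – puru020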

Because the gradient descent steps are too large, it overshoots the local minimum. Try adjusting the parameters. – runDOSrun
