Tensorflow与1-hdden层预测神经网络不会改变 - 回归

我是新来TensorFlow和神经网络在一般情况下，我想开发一个神经网络，可以预测一个属性的值（这是在Kaggle.com上开始的比赛），我知道使用神经网络可能不是解决回归问题的最佳模型，但我决定尝试一下。Tensorflow与1-hdden层预测神经网络不会改变 - 回归

当使用单层神经网络（没有隐藏层，这可能是一个线性回归）时，模型实际上预测值接近实际值，但是当我添加一个隐藏层时，预测的所有值都是相同的批次的20个输入张量：

('real', array([[ 181000.], 
     [ 128900.], 
     [ 161500.], 
     [ 180500.], 
     [ 181000.], 
     [ 183900.], 
     [ 122000.], 
     [ 378500.], 
     [ 381000.], 
     [ 144000.], 
     [ 260000.], 
     [ 185750.], 
     [ 137000.], 
     [ 177000.], 
     [ 139000.], 
     [ 137000.], 
     [ 162000.], 
     [ 197900.], 
     [ 237000.], 
     [ 68400.]])) 
('prediction ', array([[ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687], 
     [ 4995.10597687]]))

更新： 我注意到，预测值只反射输出层的偏见，而两者的隐含层和输出层的重量没有变化，并始终零点

为了进一步检查发生了什么问题，我生成了模型的图（一次使用隐藏层，另一次使用隐藏层）比较两个图，看看是否有某些东西丢失，不幸的是它们都是看起来是正确的我，但我还是不明白，为什么样板工程时，有没有隐藏的图层并采用了隐藏层

图的工作模式（中间没有隐藏层）时不工作：

图中的不工作的模型（具有隐含层和输出层）

我的全代码如下：

# coding: utf-8 
import tensorflow as tf 
import numpy as np 
def loadDataFromCSV(fileName , numberOfFields , numberOfOutputFields , numberOfRecords): 
    XsArray = np.ndarray([numberOfRecords ,(numberOfFields-numberOfOutputFields)] , dtype=np.float64) 
    YsArray = np.ndarray([numberOfRecords ,numberOfOutputFields] , dtype=np.float64) 
    fileQueue = tf.train.string_input_producer(fileName) 
    defaultValues = [[0]]*numberOfFields 
    decodedLine = [[None]]*numberOfFields 
    reader = tf.TextLineReader() 
    key , singleLine = reader.read(fileQueue) 
    decodedLine = tf.decode_csv(singleLine,record_defaults=defaultValues) 
    inputFeatures = decodedLine[0:numberOfFields-numberOfOutputFields] 
    outputFeatures =decodedLine[numberOfFields-numberOfOutputFields:numberOfFields] 
    with tf.Session() as session : 
     tf.global_variables_initializer().run() 
     coor = tf.train.Coordinator() 
     threads = tf.train.start_queue_runners(coord=coor) 
     for i in range(numberOfRecords) : 
      XsArray[i,:] ,YsArray[i,:] = session.run([inputFeatures , outputFeatures]) 
     coor.request_stop() 
     coor.join(threads) 
    return XsArray , YsArray 
x , y =loadDataFromCSV(['/Users/mousaalsulaimi/Downloads/convertcsv.csv'] , 289 , 1, 1460) 
num_steps = 10000 
batch_size = 20 

graph = tf.Graph() 
with graph.as_default() : 
    with tf.name_scope('input'): 
     inputProperties = tf.placeholder(tf.float32 , shape=(batch_size ,287)) 
    with tf.name_scope('realPropertyValue') : 
     outputValues = tf.placeholder(tf.float32,shape=(batch_size,1)) 
    with tf.name_scope('weights'): 
     hidden1_w = tf.Variable(tf.truncated_normal([287,1000],stddev=math.sqrt(3/(287+1000)) , dtype=tf.float32)) 
    with tf.name_scope('baises'): 
     hidden1_b = tf.Variable(tf.zeros([1000] , dtype=tf.float32)) 
    with tf.name_scope('hidden_layer'): 
     hidden1 =tf.matmul(inputProperties,hidden1_w) + hidden1_b 
    #hidden1_relu = tf.nn.relu(hidden1) 
    #hidden1_dropout = tf.nn.dropout(hidden1_relu,.5) 
    with tf.name_scope('layer2_weights'): 
     output_w = tf.Variable(tf.truncated_normal([1000,1],stddev=math.sqrt(3/(1000+1)) , dtype=tf.float32)) 
    with tf.name_scope('layer2_baises'): 
     output_b = tf.Variable(tf.zeros([1] , dtype=tf.float32)) 
    with tf.name_scope('layer_2_predictions'): 
     output =tf.matmul(hidden1,output_w) + output_b 
    with tf.name_scope('predictions'): 
     predictedValues = (output) 
    loss = tf.sqrt(tf.reduce_mean(tf.square(predictedValues-outputValues))) 
    loss_l2 = tf.nn.l2_loss(hidden1_w) 
    with tf.name_scope('minimization') : 
     minimum = tf.train.AdamOptimizer(.5).minimize(loss+.004*loss_l2) 

with tf.Session(graph=graph) as session: 
    tf.global_variables_initializer().run() 
    print("Initialized") 
    for step in range(num_steps): 
     # Pick an offset within the training data, which has been randomized. 
     # Note: we could use better randomization across epochs. 
     offset = (step * batch_size) % (y.shape[0] - batch_size) 
     # Generate a minibatch. 
     batch_data = x[offset:(offset + batch_size), 1:] 
     batch_labels = y[offset:(offset + batch_size), :] 
     print("real" , batch_labels) 
     # Prepare a dictionary telling the session where to feed the minibatch. 
     # The key of the dictionary is the placeholder node of the graph to be fed, 
     # and the value is the numpy array to feed to it. 
     feed_dict = {inputProperties : batch_data, outputValues : batch_labels} 
     _, l, predictions , inp = session.run([minimum, loss, predictedValues ,inputProperties ], feed_dict=feed_dict) 
     print("prediction " , predictions) 
     print("loss : " , l) 
     print("----------") 

     print('+++++++++++')

也是我在的情况下上传数据文件convertcsv.csv here要看一看。

我很感谢任何帮助弄清楚我做错了什么。

谢谢

来源

2017-04-22 mousa alsulaimi

我不认为这些是导致性能不佳的原因，但我注意到了3件事：首先，您使用'hidden1'而不是'hidden_dropout'来定义'output'，所以您基本上只是在做线性回归，因为层之间没有激活功能。其次，您可能想要将'output_w'的正则化添加到'loss_l2'。最后，32位通常绰绰有余，因此明确使用64位浮点数可能没有什么区别。 – Styrke

您也可以尝试权重的初始化。如果使用Xavier初始化，标准偏差应该是'sqrt（3。/（in + out））'。对于'output_w'，''hidden1_w'和'sqrt（3。/（1000 + 1））'是'sqrt（3。/（287 + 1000））'。 – Styrke

谢谢Styrke，我删除了relu激活函数和退出，因为我认为他们导致问题的地方，我刚刚返回他们，我也尝试了Xavier initalization，正如您所建议的那样，但没有改变，输出层仍然不能正确预测任何事物。 –

好了，我终于得到了什么问题是，并且预计它在神经网络中的权重，我也投入了一些预处理，以增强预测：

import tensorflow as tf 
import numpy as np 
import math 
from sklearn import preprocessing 

def loadDataFromCSV(fileName , numberOfFields , numberOfOutputFields , numberOfRecords): 
    XsArray = np.ndarray([numberOfRecords ,(numberOfFields-numberOfOutputFields)] , dtype=np.float64) 
    YsArray = np.ndarray([numberOfRecords ,numberOfOutputFields] , dtype=np.float64) 
    fileQueue = tf.train.string_input_producer(fileName) 
    defaultValues = [[0]]*numberOfFields 
    decodedLine = [[None]]*numberOfFields 
    reader = tf.TextLineReader() 
    key , singleLine = reader.read(fileQueue) 
    decodedLine = tf.decode_csv(singleLine,record_defaults=defaultValues) 
    inputFeatures = decodedLine[0:numberOfFields-numberOfOutputFields] 
    outputFeatures =decodedLine[numberOfFields-numberOfOutputFields:numberOfFields] 
    with tf.Session() as session : 
     tf.global_variables_initializer().run() 
     coor = tf.train.Coordinator() 
     threads = tf.train.start_queue_runners(coord=coor) 
     for i in range(numberOfRecords) : 
      XsArray[i,:] ,YsArray[i,:] = session.run([inputFeatures , outputFeatures]) 
     coor.request_stop() 
     coor.join(threads) 
    return XsArray , YsArray 
x , y =loadDataFromCSV(['/Users/mousaalsulaimi/Downloads/convertcsv.csv'] , 289 , 1, 1460) 
num_steps = 10000 
batch_size = 20 



graph = tf.Graph() 
beta = .00009 
with graph.as_default() : 
    keepprop = tf.placeholder(tf.float32 , shape=([1])) 
    with tf.name_scope('input'): 
     inputProperties = tf.placeholder(tf.float32 , shape=(None ,287)) 
    with tf.name_scope('realPropertyValue') : 
     outputValues = tf.placeholder(tf.float32,shape=(None,1)) 
    with tf.name_scope('weights'): 
     hidden1_w = tf.Variable(tf.truncated_normal([287,2000],stddev=math.sqrt(3/(1)) , dtype=tf.float32)) 
    with tf.name_scope('baises'): 
     hidden1_b = tf.Variable(tf.zeros([2000] , dtype=tf.float32)) 
    with tf.name_scope('hidden_layer'): 
     hidden1 =tf.matmul(inputProperties,hidden1_w) + hidden1_b 
     hidden1_relu = tf.nn.relu(hidden1) 
     hidden1_dropout = tf.nn.dropout(hidden1_relu,keep_prob=keepprop[0]) 
    with tf.name_scope('layer2_weights'): 
     hidden2_w = tf.Variable(tf.truncated_normal([2000,500],stddev=math.sqrt(3/(1)) , dtype=tf.float32)) 
    with tf.name_scope('layer2_baises'): 
     hidden2_b = tf.Variable(tf.zeros([500] , dtype=tf.float32)) 
    with tf.name_scope('layer_2'): 
     hidden2 =tf.matmul(hidden1_dropout,hidden2_w) + hidden2_b 
     hidden2_relu = tf.nn.relu(hidden2) 
    hidden2_dropout= tf.nn.dropout(hidden2_relu,keepprop[0]) 
    with tf.name_scope('output_layer_weights'): 
     output_w = tf.Variable(tf.truncated_normal([500,1],stddev=math.sqrt(3/(1)) , dtype=tf.float32)) 
    with tf.name_scope('outout_layer_baises'): 
     output_b = tf.Variable(tf.zeros([1] , dtype=tf.float32)) 
    with tf.name_scope('output_layer'): 
     output = tf.matmul(hidden2_dropout,output_w) + output_b  
    with tf.name_scope('predictions'): 
     predictedValues = tf.nn.relu(output) 
    loss = tf.sqrt(tf.reduce_mean(tf.square((predictedValues)-(outputValues)))) 
    loss_l2 = tf.nn.l2_loss(hidden1_w) + tf.nn.l2_loss(hidden2_w) + tf.nn.l2_loss(output_w) + tf.reduce_mean(output_w) + tf.reduce_mean(hidden2_w) + tf.reduce_mean(hidden1_w) 
    global_step = tf.Variable(0,trainable=False) 
    start_step = .5 
    learning_rate = tf.train.exponential_decay(start_step ,global_step , 100 , .94 , staircase=True) 
    with tf.name_scope('minimization') : 
     minimum = tf.train.AdadeltaOptimizer(learning_rate).minimize(loss+beta*loss_l2 , global_step=global_step) 

with tf.Session(graph=graph) as session: 
    tf.global_variables_initializer().run() 
    '''writer = tf.summary.FileWriter('/Users/mousaalsulaimi/Downloads/21' , graph=graph)''' 
    num_steps = 1000 
    batch_size = 730 
    print("Initialized") 
    for step in range(num_steps): 
     # Pick an offset within the training data, which has been randomized. 
     # Note: we could use better randomization across epochs. 
     offset = (step * batch_size) % (y.shape[0] - batch_size) 
      # Generate a minibatch. 

     batch_data_ss = preprocessing.MinMaxScaler().fit(x[offset:(offset + batch_size), 1:]) 
     batch_data = batch_data_ss.transform(x[offset:(offset + batch_size), 1:]) 
     batch_labels = y[offset:(offset + batch_size), :] 
     # Prepare a dictionary telling the session where to feed the minibatch. 
     # The key of the dictionary is the placeholder node of the graph to be fed, 
     # and the value is the numpy array to feed to it. 
     feed_dict = {keepprop:[.65], inputProperties : batch_data, outputValues : batch_labels } 
     _, l, predictions , inp , w_l = session.run([minimum, loss, predictedValues ,inputProperties , hidden1_w ], feed_dict=feed_dict) 
     print("loss2 : " , l) 
     print("loss : " , accuricy((predictions) ,(batch_labels)))

下面

是从预测的样本

('loss : ', 0.15377927727091956) 
('loss2 : ', 29109.197) 
('loss : ', 0.1523804301893735) 
('loss2 : ', 29114.414) 
('loss : ', 0.15479254974665729) 
('loss2 : ', 30617.834) 
('loss : ', 0.15270011182205656) 
('loss2 : ', 29519.598) 
('loss : ', 0.15641723449772593) 
('loss2 : ', 29307.811) 
('loss : ', 0.15460120852074882) 
('loss2 : ', 27985.998) 
('loss : ', 0.14993038617463786) 
('loss2 : ', 28811.738) 
('loss : ', 0.1549284462882819) 
('loss2 : ', 29157.725) 
('loss : ', 0.15402833737387819) 
('loss2 : ', 27079.215) 
('loss : ', 0.14974744509723023) 
('loss2 : ', 26622.93) 
('loss : ', 0.1419577502544874

预测也并不完美，但它取得了一些进展，你可以看到他们是关闭的$ 30,000每个prope rty

来源

2017-05-02 06:34:32

Tensorflow与1-hdden层预测神经网络不会改变 - 回归

回答

相关问题