TensorFlow random values

I am taking my first steps in deep learning and TensorFlow, so I have a few questions.

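Following the tutorials and getting-started guides, I built a DNN with hidden layers as well as a simple softmax model. I used the dataset from https://archive.ics.uci.edu/ml/datasets/wine and split it into a training and a test dataset.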

from __future__ import print_function 
import tensorflow as tf 


num_attributes = 13 
num_types = 3 


def read_from_cvs(filename_queue): 
    reader = tf.TextLineReader() 
    key, value = reader.read(filename_queue) 
    record_defaults = [[] for col in range(
     num_attributes + 1)] 
    attributes = tf.decode_csv(value, record_defaults=record_defaults) 
    features = tf.stack(attributes[1:], name="features") 
    labels = tf.one_hot(tf.cast(tf.stack(attributes[0], name="labels"), tf.uint8), num_types + 1, name="labels-onehot") 
    return features, labels 


def input_pipeline(filename='wine_train.csv', batch_size=30, num_epochs=None): 
    filename_queue = tf.train.string_input_producer([filename], num_epochs=num_epochs, shuffle=True) 
    features, labels = read_from_cvs(filename_queue) 

    min_after_dequeue = 2 * batch_size 
    capacity = min_after_dequeue + 3 * batch_size 
    feature_batch, label_batch = tf.train.shuffle_batch(
     [features, labels], batch_size=batch_size, capacity=capacity, 
     min_after_dequeue=min_after_dequeue) 
    return feature_batch, label_batch 


def train_and_test(hidden1, hidden2, learning_rate, epochs, train_batch_size, test_batch_size, test_interval): 
    examples_train, labels_train = input_pipeline(filename="wine_train.csv", batch_size=train_batch_size) 
    examples_test, labels_test = input_pipeline(filename="wine_train.csv", batch_size=test_batch_size) 

    with tf.name_scope("first_layer"):  # scope names must not contain spaces
     x = tf.placeholder(tf.float32, [None, num_attributes], name="input") 
     weights1 = tf.Variable(
      tf.random_normal(shape=[num_attributes, hidden1], stddev=0.1), name="weights") 
     bias = tf.Variable(tf.constant(0.0, shape=[hidden1]), name="bias") 
     activation = tf.nn.relu(
      tf.matmul(x, weights1) + bias, name="relu_act") 
     tf.summary.histogram("first_activation", activation) 

    with tf.name_scope("second_layer"): 
     weights2 = tf.Variable(
      tf.random_normal(shape=[hidden1, hidden2], stddev=0.1), 
      name="weights") 
     bias2 = tf.Variable(tf.constant(0.0, shape=[hidden2]), name="bias") 
     activation2 = tf.nn.relu(
      tf.matmul(activation, weights2) + bias2, name="relu_act") 
     tf.summary.histogram("second_activation", activation2) 

    with tf.name_scope("output_layer"): 
     weights3 = tf.Variable(
      tf.random_normal(shape=[hidden2, num_types + 1], stddev=0.5), name="weights") 
     bias3 = tf.Variable(tf.constant(1.0, shape=[num_types+1]), name="bias") 
     output = tf.add(
      tf.matmul(activation2, weights3, name="mul"), bias3, name="output") 
     tf.summary.histogram("output_activation", output) 

    y_ = tf.placeholder(tf.float32, [None, num_types+1]) 

    with tf.name_scope("loss"): 
     cross_entropy = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=output)) 
     tf.summary.scalar("cross_entropy", cross_entropy) 
    with tf.name_scope("train"): 
     train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy) 

    with tf.name_scope("tests"): 
     correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_, 1)) 
     accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 
     tf.summary.scalar("accuracy", accuracy) 

    summary_op = tf.summary.merge_all() 
    sess = tf.InteractiveSession() 
    writer = tf.summary.FileWriter("./wineDnnLow", sess.graph) 
    tf.global_variables_initializer().run() 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord, sess=sess) 


    try: 
     step = 0 
     while not coord.should_stop() and step < epochs: 
      # train 
      ex, lab = sess.run([examples_train, labels_train]) 
      _ = sess.run([train_step], feed_dict={x: ex, y_: lab}) 
      # test 
      if step % test_interval == 0: 
       ex, lab = sess.run([examples_test, labels_test]) 
       summary, test_accuracy = sess.run([summary_op, accuracy], feed_dict={x: ex, y_: lab}) 
       writer.add_summary(summary, step) 
       # "{0:f} ... {}" mixes manual and automatic field numbering and raises ValueError
       print("accuracy = {:f} on step {}".format(test_accuracy, step)) 
      step += 1 
    except tf.errors.OutOfRangeError: 
     print("Done training for %d steps" % (step)) 

    coord.request_stop() 
    coord.join(threads) 
    sess.close() 



def main(): 
    train_and_test(10, 20, 0.5, 700, 30, 10, 1) 


if __name__ == '__main__': 
    main()

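The problem is that the accuracy does not converge and seems to take random values. However, when I try tf.contrib.learn.DNNClassifier, my data is classified quite well. Can anyone give me a hint about what is wrong with the DNN I built myself?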

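I also have a second question. During training I pass train_step to session.run(), but not during testing. Does this ensure that the weights are not affected, so that the graph does not learn from the test data?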

Edit: if I use the MNIST dataset and its batching instead, my net behaves well. So I think the problem is caused by the input_pipeline.


2017-08-29 user98765

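Lower the learning rate and reduce the stddev of all layers. More generally, how did you come up with all these constants? It looks as if you picked a random initial value for every variable. – lejlot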

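I tried different learning rates, but the problem stays the same. Also, if I use the MNIST dataset with its batching, the network works fine. So I think the cause must be my input_pipeline. – user98765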

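A quick look at the dataset tells me that the first thing I would do is normalize it (subtract the mean, divide by the standard deviation). That said, it is still a very small dataset compared to MNIST, so don't expect everything to behave exactly the same.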
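As a rough illustration of the normalization step, here is a minimal sketch in plain Python. The helper name `standardize` and the sample rows are made up for this example; it assumes the features have already been parsed into lists of floats. The statistics are taken from the training rows only and reused for the test rows.

```python
import statistics


def standardize(train_rows, test_rows):
    """Scale each feature column to zero mean and unit standard deviation.

    Mean and standard deviation are computed on the training rows only and
    then reused for the test rows, so no test information leaks in.
    """
    columns = list(zip(*train_rows))
    means = [statistics.mean(col) for col in columns]
    # Guard against constant columns, whose standard deviation is 0.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)]
                for row in rows]

    return scale(train_rows), scale(test_rows)


# Invented rows with very different feature scales, for illustration only.
train = [[14.0, 1000.0], [12.0, 500.0], [13.0, 750.0]]
test = [[13.0, 1250.0]]
train_scaled, test_scaled = standardize(train, test)
```

After this step every training column has mean 0 and standard deviation 1, so no single large-valued feature dominates the first layer's pre-activations.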

If you are unsure about the input pipeline, simply load all the data into memory instead of using an input pipeline.
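One way this could look, as a sketch rather than a drop-in replacement: the helpers `load_csv` and `batches` are hypothetical names, and the sample rows are invented stand-ins for the real file. It assumes the class label is the first column and takes the values 1 to 3, and it maps labels to three one-hot columns, which differs from the `num_types + 1` columns used in the question.

```python
import csv
import io
import random

NUM_TYPES = 3  # assumption: labels in the first column are 1, 2 or 3


def load_csv(lines, num_types=NUM_TYPES):
    """Parse rows of 'label,feature,...' into feature lists and one-hot labels."""
    features, labels = [], []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        one_hot = [0.0] * num_types
        one_hot[int(row[0]) - 1] = 1.0  # shift 1-based labels to 0-based indices
        features.append([float(v) for v in row[1:]])
        labels.append(one_hot)
    return features, labels


def batches(features, labels, batch_size, rng=random):
    """Yield shuffled mini-batches that cover the data exactly once."""
    order = list(range(len(features)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [features[i] for i in idx], [labels[i] for i in idx]


# Stand-in for open("wine_train.csv"); the rows are invented for illustration.
sample = io.StringIO("1,14.2,1.7\n2,12.4,2.1\n3,13.0,3.3\n1,13.9,1.9\n")
features, labels = load_csv(sample)
for ex, lab in batches(features, labels, batch_size=2):
    pass  # e.g. sess.run(train_step, feed_dict={x: ex, y_: lab})
```

With the data in ordinary Python lists it is also easy to print a batch and check that features and labels still line up, which is hard to do with the queue-based pipeline.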

Some general notes:

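Your input pipeline is not saving you any time. Your dataset is small, so I would just use feed_dict; if it were large, you would be better off removing the placeholders and using the output of input_pipeline directly (and building a separate graph for testing).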

For common layer types, use the tf.layers API. For example, your inference section can be reduced to the following three lines.

activation = tf.layers.dense(x, hidden1, activation=tf.nn.relu) 
activation2 = tf.layers.dense(activation, hidden2, activation=tf.nn.relu) 
output = tf.layers.dense(activation2, num_types+1)

(You won't get the same initialization, but you can specify it with optional arguments. The defaults are a good starting point, though.)

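GradientDescentOptimizer is very primitive. My current favorite is AdamOptimizer, but experiment with others. If that seems too complicated, MomentumOptimizer usually offers a good trade-off between complexity and performance benefit.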

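Check out the tf.estimator.Estimator API. It will make much of what you are doing easier, and it forces you to separate data loading from the model itself (a good thing).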

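Check out the tf.contrib.data.Dataset API for data preprocessing. Queues have been in TensorFlow for a while, so that is what most tutorials are written with, but in my opinion the Dataset API is more intuitive and simpler. Again, it is overkill for this case, where you can easily load all the data into memory. For a question on how to use a Dataset starting from a CSV file, see this.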
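To make the optimizer note above a little more concrete, the following toy sketch compares plain gradient descent with a momentum update on a one-dimensional quadratic. It uses the accumulation rule documented for tf.train.MomentumOptimizer, but the function `minimize`, the toy loss and the hyperparameters are assumptions chosen for illustration, and the result says nothing about how either optimizer behaves on the wine model.

```python
def minimize(grad, w0, learning_rate, momentum=0.0, steps=50):
    """Gradient descent with an optional momentum term.

    Update rule:
        accumulation = momentum * accumulation + gradient
        w -= learning_rate * accumulation
    With momentum=0.0 this is plain gradient descent.
    """
    w, accumulation = w0, 0.0
    for _ in range(steps):
        accumulation = momentum * accumulation + grad(w)
        w -= learning_rate * accumulation
    return w


def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, whose minimum is at w = 3.
    return 2.0 * (w - 3.0)


plain = minimize(grad, w0=0.0, learning_rate=0.01)
with_momentum = minimize(grad, w0=0.0, learning_rate=0.01, momentum=0.9)
```

On this toy problem the momentum run ends closer to the minimum after the same number of steps, at the cost of one extra hyperparameter to tune.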


2017-08-29 23:51:30 DomJack

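Thanks. To be clear, I used the overkill input_pipeline on this small dataset because I want to use larger datasets later; I thought it would be easier to learn on a small dataset while already using the "proper" approach. – user98765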

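Commendable, but I would get the simplest thing working first and then elaborate :). Bonus points if you go and convert to 'tfrecords' rather than parsing every CSV record each time you run through the dataset. Whatever you use (csv, tfrecords), you should not be doing 2 session runs per training step (1 to fetch the data, 1 to feed it into the main graph); you should connect the two to avoid transferring data around unnecessarily. – DomJack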

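To avoid doing 2 runs per training step, do I have to remove the placeholders and pass the tensors in directly? "And build a separate graph for testing": how do I get an additional graph that has my trained state? Do I have to use tf.train.Saver to save and restore it, or is there another way? – user98765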
