LSTM backpropagation in TensorFlow

In the Truncated Backpropagation section of Google's official PTB tutorial, there is an implementation that uses BasicLSTMCell and creates a for loop to unroll the graph for num_steps steps.

# Placeholder for the inputs in a given iteration. 
words = tf.placeholder(tf.int32, [batch_size, num_steps]) 

lstm = rnn_cell.BasicLSTMCell(lstm_size) 
# Initial state of the LSTM memory. 
initial_state = state = tf.zeros([batch_size, lstm.state_size]) 

for i in range(num_steps): 
    # The value of state is updated after processing each batch of words. 
    output, state = lstm(words[:, i], state) 

# The rest of the code. 
# ... 

final_state = state

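I have to predict a time series using BasicLSTMCell. For that, I have an implementation where I do not use any loop in the graph; instead I loop in the program execution, updating the LSTM cell's state on each iteration. Here is the code: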

input_layer = tf.placeholder(tf.float32, [input_width, input_dim * 1]) 
lstm_cell1 = tf.nn.rnn_cell.BasicLSTMCell(input_dim * input_width) 
lstm_state1 = tf.Variable(tf.zeros([input_width,lstm_cell1.state_size])) 
lstm_output1, lstm_state_output1 = lstm_cell1(input_layer, lstm_state1, scope='LSTM1') 
lstm_update_op1 = lstm_state1.assign(lstm_state_output1) 

for i in range(39000): 
    input_v, output_v = get_new_input_output(i, A) 
    _, _, network_output = sess.run([lstm_update_op1, train_step, final_output], 
           feed_dict={input_layer: input_v, correct_output: output_v})

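How does the second implementation perform backpropagation through time, and is it a correct use of the LSTM cell in TensorFlow? Personally I prefer the second implementation because I find it clearer and it can also handle streaming data. But the fact that Google presents the first implementation makes me suspect I am doing something wrong.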

Source

2016-07-28 Marios Mourelatos

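To perform backpropagation through time during training, the graph needs to store the values of all tensors during the forward pass so that it can use them to compute the gradients during the backward pass. In your code, the forward pass should be fine (although I have not tested it), but the backward pass cannot work properly, because the graph does not hold on to the tensor values from earlier forward passes (because of the assign() op). Each run only sees the state stored in the variable, so gradients cannot flow back through earlier time steps.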
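To make this concrete, here is a minimal sketch in plain Python rather than TensorFlow. It assumes a hypothetical scalar RNN, h_t = tanh(w*h_{t-1} + u*x_t), standing in for the LSTM cell; every name in it is made up for illustration. It compares the gradient obtained by backpropagating through all time steps with the gradient obtained when the previous state is treated as a constant, which is roughly what happens when the state comes from a variable filled in by an earlier run.

```python
import math

def forward(w, u, xs, h0=0.0):
    """Run the toy RNN h_t = tanh(w*h_{t-1} + u*x_t); return [h_0, ..., h_T]."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, xs, target):
    """Squared error on the final state only."""
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2

def bptt_grad_w(w, u, xs, target):
    """dL/dw with the gradient flowing back through every time step."""
    hs = forward(w, u, xs)
    dh = hs[-1] - target                 # dL/dh_T
    grad = 0.0
    for t in range(len(xs), 0, -1):
        da = dh * (1.0 - hs[t] ** 2)     # back through tanh at step t
        grad += da * hs[t - 1]           # direct use of w at step t
        dh = da * w                      # pass the gradient on to h_{t-1}
    return grad

def one_step_grad_w(w, u, xs, target):
    """dL/dw when h_{T-1} is treated as a constant (no history kept)."""
    hs = forward(w, u, xs)
    da = (hs[-1] - target) * (1.0 - hs[-1] ** 2)
    return da * hs[-2]

def numeric_grad_w(w, u, xs, target, eps=1e-6):
    """Central finite difference, used as a reference for the true gradient."""
    return (loss(w + eps, u, xs, target) - loss(w - eps, u, xs, target)) / (2 * eps)

# Arbitrary illustrative values, not taken from the question.
xs = [0.5, -0.3, 0.8, 0.1]
w, u, target = 0.9, 0.7, 0.2

full = bptt_grad_w(w, u, xs, target)
one_step = one_step_grad_w(w, u, xs, target)
numeric = numeric_grad_w(w, u, xs, target)

print("full BPTT gradient:", full)
print("one-step gradient: ", one_step)
print("finite difference: ", numeric)
```

In this toy setup the full gradient agrees with the finite-difference reference, while the one-step gradient drops the contributions that pass through earlier states. Unrolling num_steps steps inside the graph, as the tutorial does, is what keeps those contributions.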

I suggest you take a look at this great post by Danijar Hafner. It explains how to use the dynamic_rnn() function to do what you want.

Source

2016-09-16 15:46:24 MiniQuark
