How can I train different LSTMs in the same TensorFlow session?

I want to train two different LSTMs so that they interact in a dialogue setting (i.e. one produces a sequence, which is then used as the context for the second RNN, which answers in turn, and so on). However, I don't know how to train them separately in TensorFlow (I think I haven't fully understood the logic of TF graphs). When I run my code, I get the following error:

Variable rnn/basic_lstm_cell/weights already exists, disallowed. Did you mean to set reuse=True in VarScope?

The error occurs when I create the second RNN. Do you know how to fix this?

My code is the following:

#User LSTM 
no_units=100 
_seq_user = tf.placeholder(tf.float32, [batch_size, max_length_user, user_inputShapeLen], name='seq') 
_seq_length_user = tf.placeholder(tf.int32, [batch_size], name='seq_length') 

cell = tf.contrib.rnn.BasicLSTMCell(no_units)

output_user, hidden_states_user = tf.nn.dynamic_rnn(
    cell, 
    _seq_user, 
    dtype=tf.float32, 
    sequence_length=_seq_length_user 
) 
out2_user = tf.reshape(output_user, shape=[-1, no_units]) 
out2_user = tf.layers.dense(out2_user, user_outputShapeLen) 

out_final_user = tf.reshape(out2_user, shape=[-1, max_length_user, user_outputShapeLen]) 
y_user_ = tf.placeholder(tf.float32, [None, max_length_user, user_outputShapeLen]) 


softmax_user = tf.nn.softmax(out_final_user, dim=-1) 
loss_user = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_final_user, labels=y_user_)) 
optimizer = tf.train.AdamOptimizer(learning_rate=10**-4) 
minimize = optimizer.minimize(loss_user) 

init = tf.global_variables_initializer() 
sess = tf.Session() 
sess.run(init) 

for i in range(epoch): 
    print 'Epoch: ', i 
    batch_X, batch_Y, batch_sizes = lstm.batching(user_train_X, user_train_Y, sizes_user_train) 
    for data_, target_, size_ in zip(batch_X, batch_Y, batch_sizes): 
        sess.run(minimize, {_seq_user: data_, _seq_length_user: size_, y_user_: target_})

#System LSTM 
no_units_system=100 
_seq_system = tf.placeholder(tf.float32, [batch_size, max_length_system, system_inputShapeLen], name='seq_') 
_seq_length_system = tf.placeholder(tf.int32, [batch_size], name='seq_length_') 

cell_system = tf.contrib.rnn.BasicLSTMCell(no_units_system)

output_system, hidden_states_system = tf.nn.dynamic_rnn(
    cell_system, 
    _seq_system, 
    dtype=tf.float32, 
    sequence_length=_seq_length_system 
) 
out2_system = tf.reshape(output_system, shape=[-1, no_units_system])
out2_system = tf.layers.dense(out2_system, system_outputShapeLen) 

out_final_system = tf.reshape(out2_system, shape=[-1, max_length_system, system_outputShapeLen]) 
y_system_ = tf.placeholder(tf.float32, [None, max_length_system, system_outputShapeLen]) 

softmax_system = tf.nn.softmax(out_final_system, dim=-1) 
loss_system = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_final_system, labels=y_system_)) 
optimizer = tf.train.AdamOptimizer(learning_rate=10**-4) 
minimize = optimizer.minimize(loss_system) 

for i in range(epoch): 
    print 'Epoch: ', i 
    batch_X, batch_Y, batch_sizes = lstm.batching(system_train_X, system_train_Y, sizes_system_train) 
    for data_, target_, size_ in zip(batch_X, batch_Y, batch_sizes): 
        sess.run(minimize, {_seq_system: data_, _seq_length_system: size_, y_system_: target_})

Answers


Regarding the variable scope error, try setting a different variable scope for each graph.

with tf.variable_scope('User_LSTM'):
    # ... your user_lstm graph ...

with tf.variable_scope('System_LSTM'):
    # ... your system_lstm graph ...
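
Applied to the code in the question, that might look like the following sketch (only the two variable-scope wrappers are new; the cell and placeholder names are taken from the question). With the scopes in place, the weights are created under distinct prefixes such as User_LSTM/rnn/basic_lstm_cell/... and System_LSTM/rnn/basic_lstm_cell/..., so the names no longer collide:

with tf.variable_scope('User_LSTM'):
    cell = tf.contrib.rnn.BasicLSTMCell(no_units)
    output_user, hidden_states_user = tf.nn.dynamic_rnn(
        cell, _seq_user, dtype=tf.float32, sequence_length=_seq_length_user)
    # building the dense projection inside the scope keeps its weights under the same prefix
    out2_user = tf.layers.dense(tf.reshape(output_user, [-1, no_units]), user_outputShapeLen)

with tf.variable_scope('System_LSTM'):
    cell_system = tf.contrib.rnn.BasicLSTMCell(no_units_system)
    output_system, hidden_states_system = tf.nn.dynamic_rnn(
        cell_system, _seq_system, dtype=tf.float32, sequence_length=_seq_length_system)
    out2_system = tf.layers.dense(tf.reshape(output_system, [-1, no_units_system]), system_outputShapeLen)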

Also, you should avoid using the same name for different Python objects (e.g. optimizer): the second assignment overrides the first, which will confuse you when you use TensorBoard. By the way, I recommend training the model in an end-to-end fashion rather than running two sessions separately. Try feeding the output tensor of the first LSTM into the second LSTM, with a single optimizer and loss function.
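
A minimal sketch of that end-to-end idea, assuming the system LSTM consumes the user LSTM's per-step outputs as its input sequence (the shapes, dimensions, and placeholder names below are illustrative, not the asker's exact setup):

import tensorflow as tf

batch_size, max_len, user_dim, sys_out_dim, no_units = 32, 20, 50, 50, 100

_seq_user = tf.placeholder(tf.float32, [batch_size, max_len, user_dim])
_seq_length = tf.placeholder(tf.int32, [batch_size])
y_system_ = tf.placeholder(tf.float32, [batch_size, max_len, sys_out_dim])

with tf.variable_scope('User_LSTM'):
    user_cell = tf.contrib.rnn.BasicLSTMCell(no_units)
    user_out, _ = tf.nn.dynamic_rnn(user_cell, _seq_user, dtype=tf.float32,
                                    sequence_length=_seq_length)

with tf.variable_scope('System_LSTM'):
    system_cell = tf.contrib.rnn.BasicLSTMCell(no_units)
    # the user LSTM's outputs serve as the context sequence for the system LSTM
    system_out, _ = tf.nn.dynamic_rnn(system_cell, user_out, dtype=tf.float32,
                                      sequence_length=_seq_length)

logits = tf.layers.dense(tf.reshape(system_out, [-1, no_units]), sys_out_dim)
logits = tf.reshape(logits, [-1, max_len, sys_out_dim])

# a single loss and a single optimizer update both LSTMs through backpropagation
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_system_))
train_op = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)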


In short, to solve your problem (Variable rnn/basic_lstm_cell/weights already exists), what you need are two separate variable scopes (as @J-min mentioned). In TensorFlow, variables are organized by their names, and by keeping the two sets of variables in these two scopes, TensorFlow is able to distinguish them from each other.

And by "train them separately on tensorflow", I guess you want to define two distinct loss functions and optimize the two LSTM networks with two optimizers, each corresponding to one of the loss functions.

Under such circumstances, you need to get the lists of these two sets of variables and pass them to your optimizers, like this:

opt1 = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt_op1 = opt1.minimize(loss1, var_list=<list of variables from scope 1>)

opt2 = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt_op2 = opt2.minimize(loss2, var_list=<list of variables from scope 2>)
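
One way to obtain those lists (a sketch, assuming the scope names 'User_LSTM' and 'System_LSTM' from the first answer and the loss_user / loss_system tensors defined in the question) is to filter the trainable variables by scope prefix. Note that layers built outside those scopes, such as the tf.layers.dense projections, would have to be created inside them (or appended to the lists) to be trained as well:

# collect the trainable variables created under each variable scope
user_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='User_LSTM')
system_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='System_LSTM')

# each optimizer only updates the variables of its own network
opt1 = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt_op1 = opt1.minimize(loss_user, var_list=user_vars)

opt2 = tf.train.GradientDescentOptimizer(learning_rate=0.1)
opt_op2 = opt2.minimize(loss_system, var_list=system_vars)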