TL;DR
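It seems that, in the code from the question, the weights and biases of the cells are reused correctly. The multicells lstm1 and lstm2 will behave identically, while the cells inside each MultiRNNCell keep independent weights and biases. In pseudocode: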
lstm1._cells[0].weights == lstm2._cells[0].weights
lstm1._cells[1].weights == lstm2._cells[1].weights
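Longer version
This is not a definitive answer yet, but it is what my research has turned up so far.
It looks like a hack, but we can override the get_variable method to see which variables are accessed. For example: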
import tensorflow as tf
from tensorflow.python.ops import variable_scope as vs

def verbose(original_function):
    # Wrap original_function so that each call first prints the full name of the requested variable
    def new_function(*args, **kwargs):
        print('get variable:', '/'.join((tf.get_variable_scope().name, args[0])))
        result = original_function(*args, **kwargs)
        return result
    return new_function

vs.get_variable = verbose(vs.get_variable)
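The wrapping trick itself does not depend on TensorFlow. Below is a minimal sketch of the same idea in plain Python; the get_variable function here is a hypothetical stand-in (not the TensorFlow one), and log_calls and accessed are names made up for illustration. Instead of printing, it collects the requested names in a list so they can be inspected afterwards.

```python
import functools


def log_calls(original_function, log):
    """Wrap original_function so that the requested name is appended to log."""
    @functools.wraps(original_function)
    def new_function(*args, **kwargs):
        # The name may arrive positionally or as a keyword argument.
        name = args[0] if args else kwargs.get("name")
        log.append(name)
        return original_function(*args, **kwargs)
    return new_function


# Hypothetical stand-in for a library function; NOT a TensorFlow API.
def get_variable(name, shape=None):
    return {"name": name, "shape": shape}


accessed = []
# Rebind the name to the wrapped version, as the answer does with vs.get_variable.
get_variable = log_calls(get_variable, accessed)

get_variable("weights", shape=(4, 8))
get_variable(name="biases", shape=(8,))
print(accessed)
```

Rebinding the name only affects callers that look the function up through that name at call time, which is why the answer patches the attribute on the variable_scope module rather than a local alias.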
Now we can run the following modified code:
# rnn, nstates, n_layers and x are assumed to be defined as in the question's code
def create_lstm_multicell(name):
    def lstm_cell(i, s):
        print('creating cell %i in %s' % (i, s))
        return rnn.LSTMCell(nstates, reuse=tf.get_variable_scope().reuse)
    lstm_multi_cell = rnn.MultiRNNCell([lstm_cell(i, name) for i in range(n_layers)])
    return lstm_multi_cell

with tf.variable_scope('lstm') as scope:
    lstm1 = create_lstm_multicell('lstm1')
    layer1, _ = tf.nn.dynamic_rnn(lstm1, x, dtype=tf.float32)
    val_1 = tf.reduce_sum(layer1)

with tf.variable_scope('lstm') as scope:
    # Reuse the variables created in the first block instead of creating new ones
    scope.reuse_variables()
    lstm2 = create_lstm_multicell('lstm2')
    layer2, _ = tf.nn.dynamic_rnn(lstm2, x, dtype=tf.float32)
    val_2 = tf.reduce_sum(layer2)
The output looks like this (I removed the repeated lines):
creating cell 0 in lstm1
creating cell 1 in lstm1
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
creating cell 0 in lstm2
creating cell 1 in lstm2
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
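This output indicates that lstm1 and lstm2 use the same weights and biases, and that within each of them the first and second cells of the MultiRNNCell have separate weights and biases.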
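To make that reading of the log explicit, here is a small sketch that parses the printed output and compares the variable names requested for each multicell. The helper variables_per_multicell is a hypothetical name introduced for illustration. Note that the log only shows names; that equal names mean shared variables relies on the second block running with reuse enabled.

```python
LOG = """\
creating cell 0 in lstm1
creating cell 1 in lstm1
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
creating cell 0 in lstm2
creating cell 1 in lstm2
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_0/lstm_cell/biases
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/weights
get variable: lstm/rnn/multi_rnn_cell/cell_1/lstm_cell/biases
"""


def variables_per_multicell(log_text):
    """Map each multicell name to the set of variable names requested after its creation."""
    result = {}
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("creating cell"):
            # The multicell name is the last word of the line.
            current = line.rsplit(" ", 1)[-1]
            result.setdefault(current, set())
        elif line.startswith("get variable:") and current is not None:
            result[current].add(line.split(": ", 1)[1])
    return result


per_multicell = variables_per_multicell(LOG)
# Both multicells request exactly the same variable names.
same_names = per_multicell["lstm1"] == per_multicell["lstm2"]
# Within one multicell, cell_0 and cell_1 use different names.
n_distinct = len(per_multicell["lstm1"])
print(same_names, n_distinct)
```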
In addition, val_1 and val_2, the outputs of lstm1 and lstm2, stay identical during optimization.
I think MultiRNNCell creates the namespaces cell_0, cell_1, etc. inside itself. As a result, the weights are reused between lstm1 and lstm2.
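The effect of per-cell namespaces combined with reuse can be illustrated with a toy, name-keyed variable store. This is only a sketch of the idea under simplifying assumptions, not TensorFlow's implementation; ToyVariableStore and build_multicell are hypothetical names, and plain Python objects stand in for weight tensors.

```python
class ToyVariableStore:
    """Toy name-keyed store; an illustration only, not TensorFlow's implementation."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, full_name, reuse):
        if reuse:
            # With reuse on, the variable must already exist and is returned as-is.
            if full_name not in self._vars:
                raise ValueError("variable %s does not exist" % full_name)
            return self._vars[full_name]
        # With reuse off, creating the same name twice is an error.
        if full_name in self._vars:
            raise ValueError("variable %s already exists" % full_name)
        self._vars[full_name] = object()  # placeholder for a weight tensor
        return self._vars[full_name]


def build_multicell(store, scope, n_layers, reuse):
    """Return one 'weights' placeholder per layer, named <scope>/cell_<i>/weights."""
    return [
        store.get_variable("%s/cell_%d/weights" % (scope, i), reuse)
        for i in range(n_layers)
    ]


store = ToyVariableStore()
lstm1 = build_multicell(store, "lstm/rnn/multi_rnn_cell", 2, reuse=False)
lstm2 = build_multicell(store, "lstm/rnn/multi_rnn_cell", 2, reuse=True)

# Same scope plus reuse: layer i of lstm1 and lstm2 is the very same object.
shared_across_multicells = all(a is b for a, b in zip(lstm1, lstm2))
# Different cell_<i> namespaces: layers within one multicell stay distinct.
distinct_within_multicell = lstm1[0] is not lstm1[1]
print(shared_across_multicells, distinct_within_multicell)
```

Because the layer index is part of the name, sharing happens layer by layer between the two multicells, while the layers inside one multicell never collide.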
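What do you mean by reusing weights? Do you want to build a stateful process? – dv3
@dv3 No, I don't need a stateful LSTM. I just want lstm1 and lstm2 to behave the same, i.e. the weights of each cell in the multicell should be identical between lstm1 and lstm2. –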