2017-05-10 24 views
4

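I am using a TensorFlow LSTM for a language model (I have a sequence of words and want to predict the next word). While running the language model, I would like to print the values of the forget, input, transform, and output gates at every step. How can I print the LSTM gate values in TensorFlow?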

Looking at the code in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/rnn/python/ops/rnn_cell.py, I see that the call method of the LayerNormBasicLSTMCell class contains the i, j, f, o variables that I want to print.

def call(self, inputs, state):
    """LSTM cell with layer normalization and recurrent dropout."""
    c, h = state
    args = array_ops.concat([inputs, h], 1)
    concat = self._linear(args)

    i, j, f, o = array_ops.split(value=concat, num_or_size_splits=4, axis=1)
    if self._layer_norm:
        i = self._norm(i, "input")
        j = self._norm(j, "transform")
        f = self._norm(f, "forget")
        o = self._norm(o, "output")

    g = self._activation(j)
    if (not isinstance(self._keep_prob, float)) or self._keep_prob < 1:
        g = nn_ops.dropout(g, self._keep_prob, seed=self._seed)

    new_c = (c * math_ops.sigmoid(f + self._forget_bias)
             + math_ops.sigmoid(i) * g)
    if self._layer_norm:
        new_c = self._norm(new_c, "state")
    new_h = self._activation(new_c) * math_ops.sigmoid(o)

    new_state = core_rnn_cell.LSTMStateTuple(new_c, new_h)
    return new_h, new_state

Is there a simple way to output these variables? Or do I have to essentially recreate the relevant lines of this method in the script where I run the LSTM?

Answers

1

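I once asked a similar question in a GitHub issue. The answer was that the stock cell only returns c and h (h is also the output y at each step). If you want the internal variables, you have to expose them yourself.

Here is the link: https://github.com/tensorflow/tensorflow/issues/5731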
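As a rough illustration of "doing it yourself", the gate math from the call method quoted in the question can be recreated outside the graph. The NumPy sketch below assumes no layer normalization, no dropout, and tanh as the activation. The `kernel` and `bias` arrays are hypothetical stand-ins for the cell's trained weights, and `lstm_step_with_gates` is an illustrative name, not a TensorFlow API. Note that i, j, f, o in the quoted method are pre-activations: the gate values themselves are sigmoid(i), sigmoid(f + forget_bias), and sigmoid(o), with the activation of j as the candidate update.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step_with_gates(inputs, state, kernel, bias, forget_bias=1.0):
    """One LSTM step mirroring the quoted call(), minus layer norm and dropout.

    inputs: (batch, input_dim); state: (c, h), each (batch, num_units);
    kernel: (input_dim + num_units, 4 * num_units); bias: (4 * num_units,).
    """
    c, h = state
    concat = np.concatenate([inputs, h], axis=1) @ kernel + bias
    i, j, f, o = np.split(concat, 4, axis=1)  # pre-activations

    gates = {
        "input": sigmoid(i),
        "transform": np.tanh(j),  # assumes tanh as the cell activation
        "forget": sigmoid(f + forget_bias),
        "output": sigmoid(o),
    }
    new_c = c * gates["forget"] + gates["input"] * gates["transform"]
    new_h = np.tanh(new_c) * gates["output"]
    return new_h, (new_c, new_h), gates


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
batch, input_dim, num_units = 2, 3, 4
kernel = rng.normal(size=(input_dim + num_units, 4 * num_units))
bias = np.zeros(4 * num_units)
state = (np.zeros((batch, num_units)), np.zeros((batch, num_units)))

for step in range(3):
    x_t = rng.normal(size=(batch, input_dim))
    output, state, gates = lstm_step_with_gates(x_t, state, kernel, bias)
    print(step, {name: value.shape for name, value in gates.items()})
```

With a real model, the weights would have to be fetched from the trained cell's variables first; how they are named and laid out depends on the TensorFlow version, so that step is left out here.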

+0

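Were you able to do this? I also need to log all of the LSTM cell gates. However, when I changed the return value of call, it broke too many things. Do you have an example? – dsalaj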

0

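Basically, here is one way to do it:

First, return whatever internal values you need from the cell, e.g. return new_h, new_state, i, j, f, o. To make that change, copy the source file from TensorFlow and import it into your project as your own module.
Then, in your code, build the to_return argument of session.run(to_return, feed_dict) like this: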

# lstm_cell is the copied, modified cell whose call() also returns i, j, f, o.
output, state, i, j, f, o = lstm_cell(inputs, state)
to_return = {
    "new_h": output,
    "new_state": state,
    "i": i,
    "j": j,
    "f": f,
    "o": o,
}

# Fetch the tensors from the graph; results is a dictionary whose values
# are NumPy arrays.
results = session.run(to_return, feed_dict)

print(results["i"])  # NumPy array holding the values of i for the input gate
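If changing the return signature of call breaks callers that expect exactly an (output, state) pair, as the comment above reports, one possible workaround is to pack the gate tensors into the output and split them apart afterwards. The NumPy sketch below only illustrates that packing idea on a toy cell without layer normalization or dropout. `packed_step` and the random weights are hypothetical, and a real TensorFlow cell would presumably also have to report the larger output size, which is not shown here.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def packed_step(inputs, state, kernel, forget_bias=1.0):
    """Toy LSTM step that keeps the two-value (output, state) contract.

    The output is [new_h, i, j, f, o] concatenated along axis 1, so code that
    unpacks exactly two return values keeps working.
    """
    c, h = state
    concat = np.concatenate([inputs, h], axis=1) @ kernel
    i, j, f, o = np.split(concat, 4, axis=1)  # pre-activations
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
    new_h = np.tanh(new_c) * sigmoid(o)
    packed = np.concatenate([new_h, i, j, f, o], axis=1)
    return packed, (new_c, new_h)


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
batch, input_dim, num_units, num_steps = 2, 3, 4, 5
kernel = rng.normal(size=(input_dim + num_units, 4 * num_units))
state = (np.zeros((batch, num_units)), np.zeros((batch, num_units)))

# Unroll over time the way a generic RNN loop would, seeing only two values.
outputs = []
for _ in range(num_steps):
    x_t = rng.normal(size=(batch, input_dim))
    packed, state = packed_step(x_t, state, kernel)
    outputs.append(packed)
outputs = np.stack(outputs, axis=1)  # (batch, num_steps, 5 * num_units)

# Recover the hidden output and the per-step gate pre-activations.
new_h, i, j, f, o = np.split(outputs, 5, axis=2)
print(i.shape)  # one (batch, num_steps, num_units) array per gate
```

The trade-off is that anything consuming the output downstream, such as a softmax layer, must slice off the first num_units columns instead of using the packed tensor directly.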