2016-08-03 36 views
0

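Building multiple models in the same graph

I am trying to build two similar models that predict different output types. One predicts between two classes, and the other has six output classes. Their inputs are the same, and both are LSTM RNNs.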

I have split training and prediction into separate functions in their respective files, model1.py and model2.py.

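I made the mistake of giving the variables the same names in each model, so when I call predict1 from model1 and then predict2 from model2, I run into the following namespace error: ValueError: Variable W already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at: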

where W is the name of the weight matrix.

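Is there a good way to run these predictions from the same place? I tried renaming the variables involved, but I still get the following error. It does not seem possible to name the lstm_cell when it is created, is it?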

ValueError: Variable RNN/BasicLSTMCell/Linear/Matrix already exists 

Edit: After adding variable scopes around model1pred and model2pred in the prediction file, I get the following error when calling model1pred() and then model2pred():

tensorflow.python.framework.errors.NotFoundError: Tensor name model1/model1/BasicLSTMCell/Linear/Matrix" not found in checkpoint files './variables/model1.chk 

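Edit: The code is included here. The code in model2.py is omitted, but it is equivalent to model1.py except that n_classes = 2 and, inside the dynamicRNN function and pred, the scope is set to 'model2'.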

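Solution: The problem was that the saver tried to restore a graph that still contained the variables from the first pred() execution. I was able to fix this by wrapping each call to pred in a different graph, which removed the need for variable scopes.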

In the file that collects the predictions:

import tensorflow as tf

def model1pred(test_x, test_seqlen):
    from model1 import pred
    # Build and run model1 in its own graph so its variable names cannot clash.
    with tf.Graph().as_default():
        return pred(test_x, test_seqlen)

def model2pred(test_x, test_seqlen):
    from model2 import pred
    # A second, separate graph for model2.
    with tf.Graph().as_default():
        return pred(test_x, test_seqlen)

# Load test_x and test_seq here

probs1, preds1 = model1pred(test_x, test_seq)
probs2, cpreds2 = model2pred(test_x, test_seq)
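
Why separate graphs help can be illustrated with a toy, pure-Python model. This is only a sketch: `ToyGraph`, `create_variable`, and `build_model` are hypothetical names invented for illustration and are not TensorFlow APIs. The only assumption is the behaviour reported in the question, namely that a variable name can be registered once per graph.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph: it only tracks variable names."""

    def __init__(self):
        self.variables = {}

    def create_variable(self, name, value):
        # Mimic the reported failure: a name may be registered once per graph.
        if name in self.variables:
            raise ValueError("Variable %s already exists" % name)
        self.variables[name] = value
        return value


def build_model(graph, n_classes):
    # Both models use the same variable name, as in the question.
    return graph.create_variable("RNN/BasicLSTMCell/Linear/Matrix", n_classes)


# One shared graph: the second model collides with the first.
shared = ToyGraph()
build_model(shared, 6)
try:
    build_model(shared, 2)
    collided = False
except ValueError:
    collided = True

# One graph per model: the same name is registered twice without conflict.
graph1, graph2 = ToyGraph(), ToyGraph()
sizes = (build_model(graph1, 6), build_model(graph2, 2))
```

In the toy, each graph owns its own name table, so two models can reuse identical names as long as they never share a graph.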

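In model1.py:

# Imports assumed; the original post does not show them.
import numpy as np
import tensorflow as tf
from tensorflow.python.ops import rnn_cell

def dynamicRNN(x, seqlen, weights, biases):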
    n_steps = 10 
    n_input = 14 
    n_classes = 6 
    n_hidden = 100 

    # Prepare data shape to match `rnn` function requirements 
    # Current data input shape: (batch_size, n_steps, n_input) 
    # Required shape: 'n_steps' tensors list of shape (batch_size, n_input) 

    # Permuting batch_size and n_steps 
    x = tf.transpose(x, [1, 0, 2]) 
    # Reshaping to (n_steps*batch_size, n_input) 
    x = tf.reshape(x, [-1,n_input]) 
    # Split to get a list of 'n_steps' tensors of shape (batch_size, n_input) 
    x = tf.split(0, n_steps, x) 

    # Define a lstm cell with tensorflow 
    lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0) 

    # Get lstm cell output, providing 'sequence_length' will perform dynamic calculation. 
    outputs, states = tf.nn.rnn(lstm_cell, x, dtype=tf.float32, sequence_length=seqlen) 

    # When performing dynamic calculation, we must retrieve the last 
    # dynamically computed output, i.e, if a sequence length is 10, we need 
    # to retrieve the 10th output. 
    # However TensorFlow doesn't support advanced indexing yet, so we build 
    # a custom op that for each sample in batch size, get its length and 
    # get the corresponding relevant output. 

    # 'outputs' is a list of outputs, one per timestep; we pack them into a Tensor
    # and change the dimensions back to [batch_size, n_steps, n_hidden]
    outputs = tf.pack(outputs) 
    outputs = tf.transpose(outputs, [1, 0, 2]) 

    # Hack to build the indexing and retrieve the right output. 
    batch_size = tf.shape(outputs)[0] 
    # Start indices for each sample 
    index = tf.range(0, batch_size) * n_steps + (seqlen - 1) 
    # Indexing 
    outputs = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index) 

    # Linear activation, using outputs computed above 
    return tf.matmul(outputs, weights['out']) + biases['out'] 

def pred(test_x, test_seqlen):
    with tf.Session() as sess:
        n_steps = 10
        n_input = 14
        n_classes = 6
        n_hidden = 100
        weights = {'out': tf.Variable(tf.random_normal([n_hidden, n_classes]), name='W1')}
        biases = {'out': tf.Variable(tf.random_normal([n_classes]), name='b1')}
        x = tf.placeholder("float", [None, n_steps, n_input])
        y = tf.placeholder("float", [None, n_classes])
        seqlen = tf.placeholder(tf.int32, [None])

        pred = dynamicRNN(x, seqlen, weights, biases)
        saver = tf.train.Saver(tf.all_variables())
        y_p = tf.argmax(pred, 1)

        init = tf.initialize_all_variables()
        sess.run(init)

        # Restore the trained weights; this overwrites the random initial values.
        saver.restore(sess, './variables/model1.chk')
        y_prob, y_pred = sess.run([pred, y_p], feed_dict={x: test_x, seqlen: test_seqlen})
        # softmax is a helper that is not shown in the post.
        y_prob = np.array([softmax(row) for row in y_prob])
        return y_prob, y_pred
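
The index arithmetic in dynamicRNN (flatten the outputs, then pick position `i * n_steps + (seqlen - 1)` for sample `i`) may be easier to follow on plain lists. The following is a sketch only: `last_relevant` is a hypothetical helper written for illustration, and it uses nested Python lists in place of tensors.

```python
def last_relevant(outputs, seqlen):
    """Pick the last valid timestep of each sample.

    outputs: nested list shaped [batch_size][n_steps][n_hidden]
    seqlen:  list with the valid length of each sample
    """
    batch_size = len(outputs)
    n_steps = len(outputs[0])
    # Equivalent of reshaping to [batch_size * n_steps, n_hidden].
    flat = [step for sample in outputs for step in sample]
    # Start offset of each sample plus the position of its last valid step.
    index = [i * n_steps + (seqlen[i] - 1) for i in range(batch_size)]
    # Equivalent of the gather call.
    return [flat[j] for j in index]


outputs = [
    [[1, 1], [2, 2], [3, 3]],  # sample 0, valid length 2
    [[4, 4], [5, 5], [6, 6]],  # sample 1, valid length 3
]
picked = last_relevant(outputs, [2, 3])
```

Sample 0 contributes its second step and sample 1 its third, so padded timesteps past each sequence's length are never used.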

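
pred() calls a softmax helper that the post does not define. A minimal pure-Python sketch of such a helper is given here, assuming it takes one row of logits and returns one row of probabilities; the author's actual implementation is unknown and may differ.

```python
import math


def softmax(logits):
    """Return softmax probabilities for one row of logits as a list of floats."""
    logits = [float(v) for v in logits]
    # Subtracting the maximum keeps exp() from overflowing on large logits.
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 1.0, 1.0, 1.0])
```

Because it accepts any iterable of numbers, this version can be applied row by row to the array of logits returned by sess.run.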

+0

Maybe create each model inside its own [variable_scope](https://www.tensorflow.org/versions/r0.10/how_tos/variable_scope/index.html) block? –

+0

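Do you really need that much prose to explain your problem? Consider cutting the question down to the part that makes the core issue easy to see, rather than posting large amounts of code or explaining the motivation behind the problem. This site is more about coding, so try to focus on that. –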

+0

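The title of your question looks rather broad, while the details look quite specific. Could you change the title to better reflect your problem? –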

Answer

0

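You can do this by wrapping the two pieces of model-building code in with tf.variable_scope(): blocks. This gives the variable names in each model a different prefix, which avoids the conflict.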

For example (using the model1pred() and model2pred() functions defined in your question):

with tf.variable_scope('model1'): 
    # Variables created in here will be named 'model1/W', etc. 
    probs1, preds1 = model1pred(test_x, test_seq) 

with tf.variable_scope('model2'): 
    # Variables created in here will be named 'model2/W', etc. 
    probs2, cpreds2 = model2pred(test_x, test_seq)

For more details, see the in-depth HOWTO on variable sharing in TensorFlow.

+0

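I should note that the models are in separate files, in case that changes anything. I use variable_scopes around each method for both training and prediction in each model. In the separate method that creates the LSTM cell, I also set tf.nn.rnn(...., scope='model1'). As noted, each model runs on its own, but if they are run consecutively, the second model fails. – John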

+0

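Does it work if you call the code in different outermost variable scopes? (The file should have no effect on variable scoping.) If not, could you update the question with the top-level code of your program? – mrry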

+0

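By calling the code in different outermost variable scopes, I assume you mean wrapping the call to the pred1 function in variable scopes inside model1pred and model2pred in the prediction file? That did not resolve the error. I have added the details to the original post. – John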
