Stateful LSTM with Embedding layer (shapes don't match)

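I am trying to build a stateful LSTM with Keras, and I don't understand how to add an Embedding layer in front of the LSTM. The problem seems to be the stateful flag. If my network is not stateful, adding the Embedding layer is straightforward and works.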

A working stateful LSTM without an Embedding layer currently looks like this:

model = Sequential() 
model.add(LSTM(EMBEDDING_DIM, 
       batch_input_shape=(batchSize, longest_sequence, 1), 
       return_sequences=True, 
       stateful=True)) 
model.add(TimeDistributed(Dense(maximal_value))) 
model.add(Activation('softmax')) 
model.compile(...)

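When adding the Embedding layer, I moved the batch_input_shape parameter to the Embedding layer, since only the first layer needs to know the shape. Like this: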

model = Sequential() 
model.add(Embedding(vocabSize+1, EMBEDDING_DIM,batch_input_shape=(batchSize, longest_sequence, 1),)) 
model.add(LSTM(EMBEDDING_DIM, 
       return_sequences=True, 
       stateful=True)) 
model.add(TimeDistributed(Dense(maximal_value))) 
model.add(Activation('softmax')) 
model.compile(...)

The only exception I get is: Exception: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4

So I am stuck here at the moment. What is the trick to combining word embeddings with a stateful LSTM?


2016-11-19 toobee

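The batch_input_shape parameter of the Embedding layer should be (batch_size, time_steps), where time_steps is the length of the unrolled LSTM (the number of cells) and batch_size is the number of examples in a batch.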

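from keras.models import Sequential
from keras.layers import Embedding, LSTM

# batch_size, time_steps, input_dim and embedding_size must be set beforehand
model = Sequential()
model.add(Embedding(
    input_dim=input_dim,  # e.g. 10 if you have 10 words in your vocabulary
    output_dim=embedding_size,  # size of the embedded vectors
    input_length=time_steps,
    batch_input_shape=(batch_size, time_steps)  # 2D: one integer index per time step
))
model.add(LSTM(
    10,
    batch_input_shape=(batch_size, time_steps, embedding_size),
    return_sequences=False,
    stateful=True
))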
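To see why the version in the question fails, it may help to trace the shapes by hand. The sketch below is plain Python, not Keras code: `embedding_output_shape` is a hypothetical helper that only mimics the fact that an Embedding layer appends one axis of size output_dim to its integer input, while an LSTM expects a 3D input (batch, time steps, features). The sizes 32, 20 and 64 are made up for illustration.

```python
def embedding_output_shape(input_shape, output_dim):
    """Hypothetical helper: an Embedding layer maps every integer index to a
    vector, so its output is the input shape plus one trailing axis."""
    return tuple(input_shape) + (output_dim,)


LSTM_EXPECTED_NDIM = 3  # (batch, time_steps, features)

batch_size, time_steps, embedding_size = 32, 20, 64  # illustrative values only

# Shape used in the question: a trailing axis of size 1 is carried along.
question_shape = embedding_output_shape((batch_size, time_steps, 1), embedding_size)

# Shape suggested in this answer: one integer index per time step.
answer_shape = embedding_output_shape((batch_size, time_steps), embedding_size)

print(question_shape, len(question_shape))  # (32, 20, 1, 64) 4
print(answer_shape, len(answer_shape))      # (32, 20, 64) 3
```

With the extra trailing axis the LSTM receives a 4D tensor, which matches the "expected ndim=3, found ndim=4" message in the question.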

There is an excellent blog post that explains stateful LSTMs in Keras. I have also uploaded a gist containing a simple example of a stateful LSTM with an Embedding layer.
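If the training data still has the shape (batchSize, longest_sequence, 1) from the question, the trailing axis probably has to be dropped so that the data matches (batch_size, time_steps). A minimal sketch with nested lists and made-up values follows; `drop_trailing_axis` is a hypothetical helper, and for NumPy arrays `np.squeeze(x, axis=-1)` should give the same result.

```python
def drop_trailing_axis(batch):
    """Turn [[[i], [j], ...], ...] into [[i, j, ...], ...]."""
    squeezed = []
    for sequence in batch:
        for step in sequence:
            if len(step) != 1:
                raise ValueError("expected a trailing axis of size 1")
        squeezed.append([step[0] for step in sequence])
    return squeezed


# Two sequences of three word indices each, shape (2, 3, 1); values are made up.
x_3d = [[[4], [7], [1]],
        [[2], [2], [9]]]

x_2d = drop_trailing_axis(x_3d)
print(x_2d)  # [[4, 7, 1], [2, 2, 9]]
```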


2016-12-01 14:12:05 bruThaler

How do you decide on embedding_size, or find out the size of the embedded vectors? – naisanza

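@naisanza embedding_size is a hyperparameter. That means it depends on your problem and you are free to choose it. Unfortunately, I can't give you a general answer on how to choose good hyperparameters, but https://arxiv.org/pdf/1206.5533.pdf is a good starting point on the topic. – bruThaler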
