Embedding feature vectors in TensorFlow

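In text processing, embeddings are used to show (if I understand correctly) the words of a vocabulary as vectors (after dimensionality reduction). Now I would like to know: is there any similar way to show the features extracted by a CNN?

For example: suppose we have a CNN, a training set and a test set. We want to train the CNN on the training set and, at the same time, see the features extracted by the CNN (from the dense layer) together with their class labels in the Embeddings section of TensorBoard.

The purpose of this is to look at the features of the input data in each batch and understand how close together or far apart they are. Finally, with the trained model, we can find out the accuracy of the classifier (e.g. softmax).

Thank you very much for your help.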

Source

2017-09-06 Hajbabaei_M_R

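I took help from the TensorFlow documentation.

For how to run TensorBoard and make sure you are logging all the necessary information, see: TensorBoard: Visualizing Learning.

To visualize your embeddings, there are three things you need to do:

1) Set up a 2D tensor that holds your embedding(s).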

embedding_var = tf.get_variable(....)

2) Periodically save your model variables in a checkpoint in LOG_DIR.

saver = tf.train.Saver() 
saver.save(session, os.path.join(LOG_DIR, "model.ckpt"), step)

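3) (Optional) Associate metadata with your embedding.

If you have any metadata (labels, images) associated with your embedding, you can tell TensorBoard about it either by storing a projector_config.pbtxt directly in LOG_DIR, or by using the Python API.

For instance, the following projector_config.pbtxt associates the word_embedding tensor with the metadata stored in $LOG_DIR/metadata.tsv: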

embeddings { 
    tensor_name: 'word_embedding' 
    metadata_path: '$LOG_DIR/metadata.tsv' 
}
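The metadata file itself is just a tab-separated text file with one row per row of the embedding tensor, in the same order. Below is a minimal sketch of writing one using only the standard library. The helper name write_metadata, the column names and the temporary directory standing in for LOG_DIR are illustrative assumptions, not part of the TensorFlow API. As far as I know, the projector expects no header row when there is a single column and a header row when there are several.

```python
import csv
import os
import tempfile


def write_metadata(path, labels, names=None):
    """Write a projector metadata file with one row per embedding row.

    Hypothetical helper: not part of TensorFlow or TensorBoard.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        if names is None:
            # Single column: write the labels only, without a header row.
            for label in labels:
                writer.writerow([label])
        else:
            # Several columns: start with a header row naming each column.
            writer.writerow(["label", "name"])
            for label, name in zip(labels, names):
                writer.writerow([label, name])


# A temporary directory stands in for LOG_DIR in this sketch.
log_dir = tempfile.mkdtemp()
metadata_path = os.path.join(log_dir, "metadata.tsv")

# One class label per embedding row, in the same order as the rows.
write_metadata(metadata_path, labels=[3, 0, 7])
```

In the CNN case from the question, the labels would be the class labels of the examples whose features are stored in the embedding tensor.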

The same config can be produced programmatically using the following code snippet:

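import os

import tensorflow as tf
# tf.contrib exists only in TensorFlow 1.x.
from tensorflow.contrib.tensorboard.plugins import projector

# LOG_DIR is assumed to be defined already (the directory holding the checkpoint).

# Create randomly initialized embedding weights which will be trained.
vocabulary_size = 10000
embedding_size = 200
embedding_var = tf.get_variable('word_embedding',
                                [vocabulary_size, embedding_size])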

# Format: tensorflow/tensorboard/plugins/projector/projector_config.proto 
config = projector.ProjectorConfig() 

# You can add multiple embeddings. Here we add only one. 
embedding = config.embeddings.add() 
embedding.tensor_name = embedding_var.name 
# Link this tensor to its metadata file (e.g. labels). 
embedding.metadata_path = os.path.join(LOG_DIR, 'metadata.tsv') 

# Use the same LOG_DIR where you stored your checkpoint.
summary_writer = tf.summary.FileWriter(LOG_DIR) 

# The next line writes a projector_config.pbtxt in the LOG_DIR. TensorBoard will 
# read this file during startup. 
projector.visualize_embeddings(summary_writer, config)
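For the CNN case in the question, one possible route (a swapped-in alternative to the checkpoint approach above, not something the quoted documentation describes) is to collect the dense-layer outputs batch by batch, write them to a TSV file, and point the projector config at that file through the tensor_path field of projector_config.proto. The sketch below uses only the standard library. The helper names, the tensor name cnn_dense_features, the file names and the toy feature values are all assumptions for illustration; in a real run the features would come from evaluating the dense layer on each batch. Relative paths in the config are assumed to be resolved against the directory that holds it.

```python
import os
import tempfile


def write_vectors_tsv(path, vectors):
    """Write one feature vector per line, values separated by tabs."""
    with open(path, "w", encoding="utf-8") as f:
        for vec in vectors:
            f.write("\t".join(repr(float(x)) for x in vec) + "\n")


def write_projector_config(log_dir, tensor_name, tensor_file, metadata_file):
    """Write a projector_config.pbtxt that reads vectors from a TSV file."""
    config = (
        "embeddings {\n"
        f'  tensor_name: "{tensor_name}"\n'
        f'  tensor_path: "{tensor_file}"\n'
        f'  metadata_path: "{metadata_file}"\n'
        "}\n"
    )
    path = os.path.join(log_dir, "projector_config.pbtxt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(config)
    return path


# Hypothetical dense-layer outputs for two batches, with their class labels.
batches = [
    ([[0.1, 0.9, 0.0], [0.8, 0.1, 0.1]], [1, 0]),
    ([[0.2, 0.7, 0.1]], [1]),
]

# Accumulate features and labels in the same order across batches.
features, labels = [], []
for batch_features, batch_labels in batches:
    features.extend(batch_features)
    labels.extend(batch_labels)

# A temporary directory stands in for LOG_DIR in this sketch.
log_dir = tempfile.mkdtemp()
write_vectors_tsv(os.path.join(log_dir, "features.tsv"), features)
with open(os.path.join(log_dir, "metadata.tsv"), "w", encoding="utf-8") as f:
    f.writelines(f"{label}\n" for label in labels)
config_path = write_projector_config(
    log_dir, "cnn_dense_features", "features.tsv", "metadata.tsv")
```

Keeping features and labels in two lists that grow together guarantees that row i of the vectors file matches row i of the metadata file, which is what lets the projector colour each point by its class.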

Source

2017-09-06 07:36:12
