How do I print out the inverted index created by Elasticsearch?

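If I want to get all the tokens of the index that Elasticsearch creates (I'm using the Rails elasticsearch gem), how would I go about doing that? Doing something like this only returns the tokens for one specific search term: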

curl -XGET 'http://localhost:9200/development_test/_analyze?text=John%20Smith'
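Since the question mentions Rails, the same request can be sketched in plain Ruby. This is a minimal sketch rather than the gem's own API: the method names `analyze_uri`, `tokens_from_response` and `analyze` are made up for illustration, and it assumes the query-string form of `_analyze` shown in the question (newer Elasticsearch versions may expect the text in a JSON request body instead).

```ruby
require "erb"
require "json"
require "net/http"
require "uri"

# Build the _analyze URL with the text percent-encoded (space becomes %20).
def analyze_uri(base_url, index, text)
  uri = URI.join(base_url, "/#{index}/_analyze")
  uri.query = "text=#{ERB::Util.url_encode(text)}"
  uri
end

# Pull the token strings out of an _analyze response body,
# which is assumed to look like {"tokens":[{"token":"john", ...}, ...]}.
def tokens_from_response(body)
  JSON.parse(body).fetch("tokens", []).map { |entry| entry["token"] }
end

# Performs the actual HTTP request; needs a reachable Elasticsearch node.
def analyze(base_url, index, text)
  tokens_from_response(Net::HTTP.get(analyze_uri(base_url, index, text)))
end
```

Calling `analyze("http://localhost:9200", "development_test", "John Smith")` would return only the tokens produced for that one input string, which is exactly the limitation described above.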


2014-10-09 Nona

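Elasticsearch does not expose an API for viewing the Lucene index directly. However, there are tools that let you inspect a Lucene index, such as Luke. This [blog post][1] on setting it up for Elasticsearch may help. [1]: http://rosssimpson.com/blog/2014/05/06/using-luke-with-elasticsearch/ – keety 2014-10-09 15:45:30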

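Thanks, I managed to get Luke up and running. Any idea where the index that Elasticsearch creates is stored on Linux? I checked /etc/init.d and didn't see any .idx files. – Nona 2014-10-09 17:19:16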

The index path should be given by the path.data setting in the Elasticsearch config. The index should be at a path like /<cluster name>/// /index/ – keety 2014-10-09 20:14:16
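To find out what path.data is set to, one option is to read it out of the config file. The following is only a sketch: the helper name `data_path_from_config` is hypothetical, the location of `elasticsearch.yml` depends on how Elasticsearch was installed, and it assumes the setting is written either as a flat `path.data` key or as `data` nested under `path`.

```ruby
require "yaml"

# Return the configured data path from the text of elasticsearch.yml,
# or nil when the setting is absent (Elasticsearch then uses its default).
def data_path_from_config(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  return nil unless config.is_a?(Hash)

  # Flat form:   path.data: /some/dir
  return config["path.data"] if config.key?("path.data")

  # Nested form: path:
  #                data: /some/dir
  nested = config["path"]
  nested.is_a?(Hash) ? nested["data"] : nil
end
```

Usage would be along the lines of `data_path_from_config(File.read(config_file))`, where `config_file` is wherever your installation keeps `elasticsearch.yml`.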

You can combine the Term Vectors API with the Scroll API to enumerate the terms in the inverted index:

require "elastomer/client"
require "set"

client = Elastomer::Client.new({ :url => "http://localhost:9200" })
index = "someindex"
type = "sometype"
field = "somefield"

# Terms already printed, so each term is emitted only once.
terms = Set.new

# Scroll over every document in the index and fetch its term vector.
client.scan(nil, :index => index, :type => type).each_document do |document|
  term_vectors = client.index(index).docs(type).termvector({ :fields => field, :id => document["_id"] })["term_vectors"]
  # Documents without this field have no entry for it.
  next unless term_vectors.key?(field)

  term_vectors[field]["terms"].keys.each do |term|
    # Set#add? returns nil when the term is already present.
    puts(term) if terms.add?(term)
  end
end

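This is fairly slow and wasteful: it issues one _termvectors HTTP request for every single document in the index, keeps all terms in RAM, and holds the scroll context open for the whole enumeration. However, it doesn't require another tool like Luke, and the terms can be streamed out of the index.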
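The script prints only the distinct terms. If you also want to know which documents contain each term, which is closer to an actual inverted index, the per-document step could be extended roughly as follows. This is a sketch under assumptions: `add_postings` and `print_postings` are hypothetical helpers, and the response is assumed to have the usual `_termvectors` shape (`"term_vectors" => { field => { "terms" => { term => {...} } } }`). It holds every term and every document ID in RAM, so the memory cost is even higher than with the script above.

```ruby
require "set"

# Record every term of one _termvectors response under the given document ID.
# `postings` maps term => Set of document IDs.
def add_postings(postings, doc_id, response, field)
  field_vector = response.fetch("term_vectors", {})[field]
  return postings if field_vector.nil? # document has no terms for this field

  field_vector.fetch("terms", {}).each_key do |term|
    (postings[term] ||= Set.new) << doc_id
  end
  postings
end

# Print one line per term: the term, a tab, then the IDs of the
# documents that contain it, comma-separated.
def print_postings(postings, io = $stdout)
  postings.keys.sort.each do |term|
    io.puts("#{term}\t#{postings[term].to_a.sort.join(',')}")
  end
end
```

Inside the `each_document` loop you would call `add_postings(postings, document["_id"], response, field)` with the full term vectors response, then `print_postings(postings)` once the scroll has finished.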


2017-04-08 06:42:15
