Get aggregation

Here is my Elasticsearch setup and query:

===Create the index===

PUT /sample

===Insert data===

PUT /sample/docs/1 
{"data": "And the world said, 'Disarm, disclose, or face serious consequences'—and therefore, we worked with the world, we worked to make sure that Saddam Hussein heard the message of the world."} 
PUT /sample/docs/2 
{"data": "Never give in — never, never, never, never, in nothing great or small, large or petty, never give in except to convictions of honour and good sense. Never yield to force; never yield to the apparently overwhelming might of the enemy"}

===Query===

POST sample/docs/_search 
{ 
    "query": { 
    "match": { 
     "data": "never" 
    } 
    }, 
    "highlight": { 
    "fields": { 
     "data":{} 
    } 
    } 
}

===Search result===

... 
     "highlight": { 
      "data": [ 
      "<em>Never</em> give in — <em>never</em>, <em>never</em>, <em>never</em>, <em>never</em>, in nothing great or small, large or petty, <em>never</em> give", 
      " in except to convictions of honour and good sense. <em>Never</em> yield to force; <em>never</em> yield to the apparently overwhelming might of the enemy" 
      ] 
     }

===Desired result===

What I need is the frequency of the search term within each matching document, as in the following example:

Doc Id: 2 
Term Frequency :{ 
    "never": 8 
}

I have tried bucket aggregations, terms aggregations, and other aggregations, but I could not get this result.

Thanks for your help!

Source

2017-09-23 Callisto

You should use the Term Vectors API, which supports filtering specific terms by their frequency.

https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-termvectors.html

In this case, your query would be:

GET /sample/docs/_termvectors 
{ 
    "doc": { 
     "data": "never" 
    }, 
    "term_statistics" : true, 
    "field_statistics" : true, 
    "positions": false, 
    "offsets": false, 
    "filter" : { 
     "min_term_freq" : 8 
    } 
}

Source

2017-09-23 20:40:38

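I get the following error when I run the query you suggested: `{ "error": { "root_cause": [ { "type": "illegal_state_exception", "reason": "Something is wrong with the field statistics of the term vector request: Values are \nsum_doc_freq 0\ndoc_count 0\nsum_ttf 0" } ], "type": "illegal_state_exception", "reason": "Something is wrong with the field statistics of the term vector request: Values are \nsum_doc_freq 0\ndoc_count 0\nsum_ttf 0" }, "status": 500 }` – Callisto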

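Also, my requirement is different: the query you suggested would return terms whose term frequency is 8, but what I want is the term frequency count itself. – Callisto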
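
One way to get the count itself might be to request the term vectors of the stored document by its ID, rather than passing an artificial `doc`, and to leave out the `filter`. The request below is only a sketch: it assumes the index, type, and document ID from the question (`sample`, `docs`, `2`) and an Elasticsearch version that still accepts mapping types in the URL.

```
GET /sample/docs/2/_termvectors
{
    "fields": ["data"],
    "term_statistics": false,
    "field_statistics": false,
    "positions": false,
    "offsets": false
}
```

In the response, each term of the field is listed under `term_vectors.data.terms` with a `term_freq` value. Assuming the field uses the default standard analyzer, which lowercases "Never" and "never" to the same term, the entry for `never` in document 2 should report a `term_freq` of 8.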
