2017-06-12 29 views

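Elasticsearch: top 10 most frequent values in an array across all records

I have an index named "test". The document structure is shown below. Each document has an array of "tags". I can't figure out how to query this index to get the ten most frequently occurring tags.

Also, what is the best practice if this index holds more than 2 million documents?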

{
    "_index" : "test",
    "_type" : "data",
    "_id" : "1412879673545024927_1373991666",
    "_score" : 1.0,
    "_source" : {
        "instagramuserid" : "1373991666",
        "likes_count" : 163,
        "@timestamp" : "2017-06-08T08:52:41.803Z",
        "post" : {
            "created_time" : "1482648403",
            "comments" : {
                "count" : 9
            },
            "user_has_liked" : true,
            "link" : "https://www.instagram.com/p/BObjpPMBWWf/",
            "caption" : {
                "created_time" : "1482648403",
                "from" : {
                    "full_name" : "PARAMSahib ™",
                    "profile_picture" : "https://scontent.cdninstagram.com/t51.2885-19/s150x150/12750236_1692144537739696_350427084_a.jpg",
                    "id" : "1373991666",
                    "username" : "parambanana"
                },
                "id" : "17845953787172829",
                "text" : "This feature talks about how to work pastels .\n\nDull gold pullover + saffron khadi kurta + baby pink pants + Deep purple patka and white sneakers - Perfect colours for a Happy sunday christmas morning . \n#paramsahib #men #menswear #mensfashion #mensfashionblog #mensfashionblogger #menswearofficial #menstyle #fashion #fashionfashion #fashionblog #blog #blogger #designer #fashiondesigner #streetstyle #streetfashion #sikh #sikhfashion #singhstreetstyle #sikhdesigner #bearded #indian #indianfashionblog #indiandesigner #international #ootd #lookbook #delhistyleblog #delhifashionblog"
            },
            "type" : "image",
            "tags" : [
                "men",
                "delhifashionblog",
                "menswearofficial",
                "fashiondesigner",
                "singhstreetstyle",
                "fashionblog",
                "mensfashion",
                "fashion",
                "sikhfashion",
                "delhistyleblog",
                "sikhdesigner",
                "indianfashionblog",
                "lookbook",
                "fashionfashion",
                "designer",
                "streetfashion",
                "international",
                "paramsahib",
                "mensfashionblogger",
                "indian",
                "blog",
                "mensfashionblog",
                "menstyle",
                "ootd",
                "indiandesigner",
                "menswear",
                "blogger",
                "sikh",
                "streetstyle",
                "bearded"
            ],
            "filter" : "Normal",
            "attribution" : null,
            "location" : null,
            "id" : "1412879673545024927_1373991666",
            "likes" : {
                "count" : 163
            }
        }
    }
}

Answer


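If the mapping type of your tags is object (which is the default), you can use a terms aggregation, which returns the ten most frequent values by default, like this: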

{
    "size": 0,
    "aggs": {
        "frequent_tags": {
            "terms": {"field": "post.tags"}
        }
    }
}
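
For illustration, here is a minimal Python sketch of the same request body and of reading its result. The terms aggregation already returns 10 buckets by default, ordered by descending document count, so the `size` of 10 below only makes that explicit. The helper names `build_top_tags_query` and `top_tags` and the sample response are hypothetical; no client library is assumed, and the body would simply be sent to the index's `_search` endpoint.

```python
import json


def build_top_tags_query(field="post.tags", top_n=10):
    """Build a search body with a terms aggregation over `field`."""
    return {
        "size": 0,  # skip the hits, only the aggregation is needed
        "aggs": {
            "frequent_tags": {
                "terms": {"field": field, "size": top_n}
            }
        },
    }


def top_tags(response):
    """Return (tag, doc_count) pairs from a search response dict."""
    buckets = response["aggregations"]["frequent_tags"]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]


# Hypothetical response, shaped like the output of a terms aggregation.
sample_response = {
    "aggregations": {
        "frequent_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "fashion", "doc_count": 3},
                {"key": "ootd", "doc_count": 2},
            ],
        }
    }
}

print(json.dumps(build_top_tags_query(), indent=2))
print(top_tags(sample_response))
```

If the string fields were mapped dynamically as `text` with a `keyword` sub-field, the field name would probably need to be `post.tags.keyword` instead. That depends on the actual mapping, which the question does not show.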

I'm getting this error: "Expected [START_OBJECT] under [terms], but got a [VALUE_STRING] in [frequent_tags]".


Sorry. I have updated the query.


{ "took": 54, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 99912, "max_score": 0, "hits": [] }, "aggregations": { "frequent_tags": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [] } } }
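
An empty `buckets` array alongside 99912 hits suggests that the aggregation found no indexed terms under the field name it was given. One way to check is to look the field up in the index mapping. The Python sketch below does that under assumptions: `sample_mapping` is a hypothetical example of what `GET test/_mapping` might return on a version that still has mapping types, and `resolve_field` is an illustrative helper, not part of any client library.

```python
def resolve_field(mapping, index, doc_type, path):
    """Walk a mapping dict and return the definition of a dotted field
    path, or None if the path is not mapped."""
    node = mapping[index]["mappings"][doc_type]
    for part in path.split("."):
        # Object fields keep their children under "properties";
        # multi-fields (such as a keyword sub-field) live under "fields".
        children = {**node.get("fields", {}), **node.get("properties", {})}
        if part not in children:
            return None
        node = children[part]
    return node


# Hypothetical mapping; the real one for the "test" index is not shown
# in the thread.
sample_mapping = {
    "test": {"mappings": {"data": {"properties": {
        "post": {"properties": {
            "tags": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        }}
    }}}}
}

for candidate in ("tags", "post.tags", "post.tags.keyword"):
    definition = resolve_field(sample_mapping, "test", "data", candidate)
    print(candidate, "->", definition["type"] if definition else "not mapped")
```

A path that resolves to nothing, or to a type that is not aggregatable as written, would point to the field name in the query as the thing to change.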