Approach for a multi-field terms aggregation

I have an index with documents like the following:

[
  {
    "name": "Marco",
    "city_id": 45,
    "city": "Rome"
  },
  {
    "name": "John",
    "city_id": 46,
    "city": "London"
  },
  {
    "name": "Ann",
    "city_id": 47,
    "city": "New York"
  },
  ...
]

and this aggregation:

"aggs": {
  "city": {
    "terms": {
      "field": "city"
    }
  }
}

which gives me a response like this:

{
  "aggregations": {
    "city": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 694,
      "buckets": [
        {
          "key": "Rome",
          "doc_count": 15126
        },
        {
          "key": "London",
          "doc_count": 11395
        },
        {
          "key": "New York",
          "doc_count": 14836
        },
        ...
      ]
    },
    ...
  }
}

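My problem is that I also need the city_id in my aggregation results. I have been reading here that I can't use a multi-field terms aggregation, but I don't need to aggregate by two fields. I only need to return another field whose value is always the same for each term (these are basically city/city_id pairs). What is the best way to achieve this without losing performance?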

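I could create a field called city_with_id with values like "Rome;45", "London;46", and so on, and aggregate by that field. That would work for me, because I can simply split the result in my backend and get the ID I need, but is it the best approach?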
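The backend split described above could look roughly like the sketch below. It assumes the terms aggregation runs on the proposed city_with_id field and that ";" is the separator; the function name `split_city_buckets` and the sample buckets are hypothetical and only reuse the counts from the response shown earlier.

```python
def split_city_buckets(buckets, sep=";"):
    """Turn terms buckets keyed by "name;id" into city / city_id / doc_count rows."""
    rows = []
    for bucket in buckets:
        # rpartition splits on the LAST separator, so a city name that
        # itself contains the separator still parses correctly.
        city, _, raw_id = bucket["key"].rpartition(sep)
        rows.append({
            "city": city,
            "city_id": int(raw_id),
            "doc_count": bucket["doc_count"],
        })
    return rows


# Hypothetical buckets, shaped like a terms aggregation on city_with_id.
buckets = [
    {"key": "Rome;45", "doc_count": 15126},
    {"key": "London;46", "doc_count": 11395},
    {"key": "New York;47", "doc_count": 14836},
]
rows = split_city_buckets(buckets)
```

Splitting on the last separator is a deliberate choice here: the numeric ID never contains ";", while a city name might.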

2016-06-24 stefanobaldo

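One approach is to use top_hits with source filtering so that only city_id is returned, as in the example below. I don't think this will be less performant. You can try it on your index to see the impact before trying the approach with the city_with_id field described in the OP.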

Example:

POST <index>/_search
{
  "size": 0,
  "aggs": {
    "city": {
      "terms": {
        "field": "city"
      },
      "aggs": {
        "id": {
          "top_hits": {
            "_source": {
              "include": [
                "city_id"
              ]
            },
            "size": 1
          }
        }
      }
    }
  }
}

Result:

{
  "key": "London",
  "doc_count": 2,
  "id": {
    "hits": {
      "total": 2,
      "max_score": 1,
      "hits": [
        {
          "_index": "country",
          "_type": "city",
          "_id": "2",
          "_score": 1,
          "_source": {
            "city_id": 46
          }
        }
      ]
    }
  }
},
{
  "key": "New York",
  "doc_count": 1,
  "id": {
    "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
        {
          "_index": "country",
          "_type": "city",
          "_id": "3",
          "_score": 1,
          "_source": {
            "city_id": 47
          }
        }
      ]
    }
  }
},
{
  "key": "Rome",
  "doc_count": 1,
  "id": {
    "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
        {
          "_index": "country",
          "_type": "city",
          "_id": "1",
          "_score": 1,
          "_source": {
            "city_id": 45
          }
        }
      ]
    }
  }
}
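To get from buckets like these to a plain city-to-ID mapping, something along the following lines might do. It is a minimal sketch that assumes the sub-aggregation is named "id" as in the request above and that every document in a bucket carries the same city_id; the helper name `city_ids_from_buckets` and the trimmed sample data are hypothetical.

```python
def city_ids_from_buckets(buckets, agg_name="id"):
    """Map each terms bucket key to the city_id of its single top hit."""
    result = {}
    for bucket in buckets:
        hits = bucket[agg_name]["hits"]["hits"]
        if hits:  # skip a bucket whose top_hits came back empty
            result[bucket["key"]] = hits[0]["_source"]["city_id"]
    return result


# Sample buckets trimmed to the fields the helper actually reads.
sample_buckets = [
    {"key": "London", "doc_count": 2,
     "id": {"hits": {"total": 2, "hits": [{"_source": {"city_id": 46}}]}}},
    {"key": "New York", "doc_count": 1,
     "id": {"hits": {"total": 1, "hits": [{"_source": {"city_id": 47}}]}}},
    {"key": "Rome", "doc_count": 1,
     "id": {"hits": {"total": 1, "hits": [{"_source": {"city_id": 45}}]}}},
]
city_ids = city_ids_from_buckets(sample_buckets)
```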

2016-06-24 03:18:10 keety

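It works! Actually, I lose quite a lot of time with this approach, because my example was only illustrative: in the real scenario I have many fields that need the nested aggregation, and the result is not acceptable. Anyway, it works, and I will accept your answer. Thank you very much! – stefanobaldo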
