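Elasticsearch aggregation by arrays of strings

I have an Elasticsearch index that stores phone transactions (SMS, MMS, calls, etc.) together with their associated costs.

The key of these documents is the MSISDN (MSISDN = phone number). In my application, I know there are groups of users. Each user can have one or more MSISDNs.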

Here is the mapping for this kind of document:

"mappings" : {
  "cdr" : {
    "properties" : {
      "callDatetime" : {
        "type" : "long"
      },
      "callSource" : {
        "type" : "string"
      },
      "callType" : {
        "type" : "string"
      },
      "callZone" : {
        "type" : "string"
      },
      "calledNumber" : {
        "type" : "string"
      },
      "companyKey" : {
        "type" : "string"
      },
      "consumption" : {
        "properties" : {
          "data" : {
            "type" : "long"
          },
          "voice" : {
            "type" : "long"
          }
        }
      },
      "cost" : {
        "type" : "double"
      },
      "country" : {
        "type" : "string"
      },
      "included" : {
        "type" : "boolean"
      },
      "msisdn" : {
        "type" : "string"
      },
      "network" : {
        "type" : "string"
      }
    }
  }
}

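My goal and problem:

My goal is to write a query that retrieves the cost per callType per group. However, groups are not represented in Elasticsearch, only in my PostgreSQL database.

So I will write a method that retrieves all the MSISDNs of every existing group, and get something like a list of string arrays, each containing every MSISDN of one group.

Let's say I have something like this: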

"msisdn_by_group" : [ 
    { 
     "group1" : ["01111111111", "02222222222", "033333333333", "044444444444"] 
    }, 
    { 
     "group2" : ["05555555555","06666666666"] 
    } 
]

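Now I will use this to generate an Elasticsearch query. I want to use an aggregation to sum the cost of all these terms into separate buckets (one per group), and then split each bucket again by callType (to draw a stacked bar chart).

I have tried several things but did not manage to make it work (histogram, buckets, terms and sum were the main keywords I played with).

If someone here can help me with the order and the keywords I can use to achieve this, that would be great :) Thanks

EDIT: Here is my latest attempt. Query: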

{ 
    "aggs" : { 
     "cost_histogram": { 
      "terms": { 
       "field": "callType" 
      }, 
      "aggs": { 
       "cost_histogram_sum" : { 
        "sum": { 
         "field": "cost" 
        } 
       } 
      } 
     } 
    } 
}

I get the expected result, but it is missing the "group" split, because I don't know how to pass the MSISDN arrays as a criterion:

Result:

"aggregations": {
  "cost_histogram": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "data",
        "doc_count": 5925,
        "cost_histogram_sum": {
          "value": 0
        }
      },
      {
        "key": "sms_mms",
        "doc_count": 5804,
        "cost_histogram_sum": {
          "value": 91.76999999999995
        }
      },
      {
        "key": "voice",
        "doc_count": 5299,
        "cost_histogram_sum": {
          "value": 194.1196
        }
      },
      {
        "key": "sms_mms_plus",
        "doc_count": 35,
        "cost_histogram_sum": {
          "value": 7.2976
        }
      }
    ]
  }
}

Source

2016-11-14 Alex

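Maybe show the query you have now and explain what is still missing? – Val

@Val Of course, I forgot, my bad! Please see my edit. – Alex

You need another "terms" aggregation to wrap your current one. – Val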

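OK, I found out how to do this with a single query. The query is damn long because the same block is repeated for every group, but I have no better option. I am using the "filter" aggregation.

Here is a working example based on the array I wrote in my question above:

POST localhost:9200/cdr/_search?size=0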

{ 
    "query": { 
     "term" : { 
      "companyKey" : 1 
     } 
    }, 
    "aggs" : { 
     "group_1_split_cost": { 
      "filter": { 
       "bool": { 
        "should": [{ 
         "bool": { 
          "must": { 
           "match": { 
            "msisdn": "01111111111" 
           } 
          } 
         } 
        },{ 
         "bool": { 
          "must": { 
           "match": { 
            "msisdn": "02222222222" 
           } 
          } 
         } 
        },{ 
         "bool": { 
          "must": { 
           "match": { 
            "msisdn": "03333333333" 
           } 
          } 
         } 
        },{ 
         "bool": { 
          "must": { 
           "match": { 
            "msisdn": "04444444444" 
           } 
          } 
         } 
        }] 
       } 
      }, 
      "aggs": { 
       "cost_histogram": { 
        "terms": { 
         "field": "callType" 
        }, 
        "aggs": { 
         "cost_histogram_sum" : { 
          "sum": { 
           "field": "cost" 
          } 
         } 
        } 
       } 
      } 
     }, 
     "group_2_split_cost": { 
      "filter": { 
       "bool": { 
        "should": [{ 
         "bool": { 
          "must": { 
           "match": { 
            "msisdn": "05555555555" 
           } 
          } 
         } 
        },{ 
         "bool": { 
          "must": { 
           "match": { 
            "msisdn": "06666666666" 
           } 
          } 
         } 
        }] 
       } 
      }, 
      "aggs": { 
       "cost_histogram": { 
        "terms": { 
         "field": "callType" 
        }, 
        "aggs": { 
         "cost_histogram_sum" : { 
          "sum": { 
           "field": "cost" 
          } 
         } 
        } 
       } 
      } 
     } 
    } 
}

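Thanks to the newer versions of Elasticsearch, we can now nest aggregations very deeply, but it is still a bit of a pity that we cannot pass an array of values to an "OR" operator or something similar. I think that could reduce the size of these queries, even if they are a bit special and only used for niche cases like mine.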
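Since the query repeats the same block for every group, one way to keep it manageable is to generate the request body from the group list instead of writing it by hand. The following is only a sketch under a few assumptions: it swaps the per-group "filter" aggregations for a single "filters" aggregation with one named filter per group, and swaps the nested bool/should/match clauses for a "terms" query, which as far as I know accepts an array of values and matches any of them. The function name build_cost_query, the aggregation name cost_by_group and the sample dictionary are made up for illustration; the MSISDN values are the ones from the query above. Note that a "terms" query is not analyzed, so this assumes the msisdn values are indexed as single tokens.

```python
import json

# Hypothetical sample data (values copied from the query above).
# In the real application this would be loaded from PostgreSQL.
MSISDN_BY_GROUP = {
    "group1": ["01111111111", "02222222222", "03333333333", "04444444444"],
    "group2": ["05555555555", "06666666666"],
}


def build_cost_query(msisdn_by_group, company_key):
    """Build a search body with one bucket per group, split by callType."""
    # One named filter per group; each "terms" query matches any listed MSISDN.
    group_filters = {
        name: {"terms": {"msisdn": list(numbers)}}
        for name, numbers in msisdn_by_group.items()
    }
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "query": {"term": {"companyKey": company_key}},
        "aggs": {
            "cost_by_group": {
                "filters": {"filters": group_filters},
                "aggs": {
                    # Same sub-aggregations as in the hand-written query.
                    "cost_histogram": {
                        "terms": {"field": "callType"},
                        "aggs": {
                            "cost_histogram_sum": {"sum": {"field": "cost"}}
                        },
                    }
                },
            }
        },
    }


QUERY_BODY = build_cost_query(MSISDN_BY_GROUP, 1)

if __name__ == "__main__":
    # Print the JSON body so it can be sent to the _search endpoint.
    print(json.dumps(QUERY_BODY, indent=2))
```

With named filters, the response should contain one bucket per group name under the cost_by_group aggregation, each holding the usual callType buckets with their cost sums, which maps fairly directly onto a stacked bar chart. I have not run this against the index above, so treat it as a starting point rather than a verified replacement.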

Source

2016-11-14 16:11:58 Alex
