Here's a fairly convoluted solution that isn't really performant, using the following scripted_metric aggregation:
```json
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "docs_per_month": {
      "date_histogram": {
        "field": "created_date",
        "interval": "month",
        "min_doc_count": 0
      },
      "aggs": {
        "avg_doc_per_biz_day": {
          "scripted_metric": {
            "init_script": "_agg.bizdays = []; _agg.allbizdays = [:]; start = new DateTime(1970, 1, 1, 0, 0); now = new DateTime(); while (start < now) { def end = start.plusMonths(1); _agg.allbizdays[start.year + '_' + start.monthOfYear] = (start.toDate()..<end.toDate()).sum { (it.day != 6 && it.day != 0) ? 1 : 0 }; start = end; }",
            "map_script": "_agg.bizdays << _agg.allbizdays[doc.created_date.date.year + '_' + doc.created_date.date.monthOfYear]",
            "combine_script": "_agg.allbizdays = null; doc_count = 0; for (d in _agg.bizdays) { doc_count++ }; return doc_count / _agg.bizdays[0]",
            "reduce_script": "res = 0; for (a in _aggs) { res += a }; return res"
          }
        }
      }
    }
  }
}
```
Let me detail each of the scripts below. What I'm doing in the init_script is creating a map of the number of business days in each month since 1970, and storing it in the _agg.allbizdays map:
```groovy
_agg.bizdays = [];
_agg.allbizdays = [:];
start = new DateTime(1970, 1, 1, 0, 0);
now = new DateTime();
while (start < now) {
    def end = start.plusMonths(1);
    _agg.allbizdays[start.year + '_' + start.monthOfYear] = (start.toDate()..<end.toDate()).sum { (it.day != 6 && it.day != 0) ? 1 : 0 };
    start = end;
}
```
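For readers unfamiliar with Groovy, here is a minimal Python sketch of the same precomputation (the name `business_days` is mine, not part of the aggregation): it counts the Monday-to-Friday days in each month since 1970, ignoring holidays just like the script above.

```python
from datetime import date, timedelta

def business_days(year, month):
    # Count Mon-Fri days in the given month; like the init_script,
    # only Saturdays and Sundays are excluded (no holiday handling).
    nxt = date(year + (month == 12), month % 12 + 1, 1)
    d = date(year, month, 1)
    count = 0
    while d < nxt:
        if d.weekday() < 5:  # 0 = Monday ... 4 = Friday
            count += 1
        d += timedelta(days=1)
    return count

# keys mirror the script's "year_month" format
allbizdays = {"%d_%d" % (y, m): business_days(y, m)
              for y in range(1970, 2016) for m in range(1, 13)}
```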
In the map_script, I simply retrieve the number of business days for each document's month:

```groovy
_agg.bizdays << _agg.allbizdays[doc.created_date.date.year + '_' + doc.created_date.date.monthOfYear];
```
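Emulated in Python (a sketch under my own names; a `(year, month)` tuple stands in for `doc.created_date`), the map step just appends that precomputed count once per document seen on the shard:

```python
def map_doc(agg, allbizdays, created_date):
    # created_date is a (year, month) tuple standing in for
    # doc.created_date; one append per document on this shard
    year, month = created_date
    agg["bizdays"].append(allbizdays["%d_%d" % (year, month)])

agg = {"bizdays": []}
allbizdays = {"2015_6": 22}  # June 2015 has 22 business days
for doc in [(2015, 6), (2015, 6), (2015, 6)]:
    map_doc(agg, allbizdays, doc)
# agg["bizdays"] is now [22, 22, 22]
```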
In the combine_script, I compute the average number of documents per business day for each shard:

```groovy
_agg.allbizdays = null;
doc_count = 0;
for (d in _agg.bizdays) { doc_count++ };
return doc_count / _agg.bizdays[0];
```
Finally, in the reduce_script, I sum the average document counts coming from each node:

```groovy
res = 0;
for (a in _aggs) { res += a };
return res
```
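Putting the last two steps together in Python (function names are mine): each shard turns its list of per-document business-day counts into doc_count / bizdays, and the reduce phase sums those fractions. Because every shard of a given month bucket divides by the same business-day count, the sum works out to total_docs / bizdays for the bucket.

```python
def combine(bizdays):
    # per-shard: number of mapped docs divided by the month's
    # business-day count (every entry in bizdays is identical)
    if not bizdays:
        return 0
    return len(bizdays) / bizdays[0]

def reduce_shards(shard_results):
    # sum the per-shard fractions returned by combine()
    return sum(shard_results)

# 44 documents in June 2015 (22 business days), split over two shards
shard_a = [22] * 30
shard_b = [22] * 14
avg = reduce_shards([combine(shard_a), combine(shard_b)])
# avg is (30 + 14) / 22 = 2 docs per business day (up to float rounding)
```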
Once again, I believe this is pretty convoluted, and as Andrei said, it's probably better to wait for 2.0 and have it work the way it should, but in the meantime this is there if you need it.
Source
2015-06-15 09:39:56
Val