2013-08-20

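I'm fairly new to mapReduce and aggregation in MongoDB. Should I use a MapReduce function or the aggregation framework to get multiple unique values and their counts?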

Here is an example of the data set:

{ "_id" : ObjectId("521002161e0787522098d110"), "userId" : 4545454, "pickId" : 1, "answerArray" : [ "yes" ], "city" : "New York", "state" : "New York" } 
{ "_id" : ObjectId("521002481e0787522098d111"), "userId" : 64545454, "pickId" : 1, "answerArray" : [ "no" ], "city" : "New York", "state" : "New York" } 
{ "_id" : ObjectId("521002871e0787522098d112"), "userId" : 78263636, "pickId" : 1, "answerArray" : [ "yes" ], "city" : "Albany", "state" : "New York" } 
{ "_id" : ObjectId("5211507c1e0787522098d113"), "userId" : 78263636, "pickId" : 2, "answerArray" : [ "yes" ], "city" : "New York", "state" : "New York" } 
{ "_id" : ObjectId("5211507c1e0787522098d113"), "userId" : 78263636, "pickId" : 1, "answerArray" : [ "yes" ], "city" : "Wichita", "state" : "Kansas" } 

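I want to get a list of the unique combinations of state, city, pickId, and answerArray, along with a count for each combination. The result would need to look like this: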

{"pickId": 1, "city": "New York", "state": "New York", "answerArray": ["yes"], "count":2} 
{"pickId": 1, "city": "Albany", "state": "New York", "answerArray": ["no"], "count":1} 
{"pickId": 1, "city": "New York", "state": "New York", "answerArray": ["no"], "count":1} 
{"pickId": 1, "city": "Wichita", "state": "Kansas", "answerArray": ["yes"], "count":1} 

The problem I'm running into is that emit in MapReduce only takes two arguments:

Error: fast_emit takes 2 args near... 

But I'm looking to map multiple unique values to a single pickId.

Here is the MapReduce code I'm working with:

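// Map: emit 1 for a "yes" answer and 0 otherwise, keyed by pickId.
var mapFunct = function() {
    if (this.answerArray == "yes") {
        emit(this.pickId, 1);
    } else {
        emit(this.pickId, 0);
    }
};

// Reduce: sum the emitted values for each pickId.
var mapReduce2 = function(keyPickId, answerVals) {
    return Array.sum(answerVals);
};

db.answers.mapReduce(mapFunct, mapReduce2, { out: "mapReduceAnswers" });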

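Any help or further suggestions would be greatly appreciated. I have also looked into the aggregation framework, but it doesn't seem to give me the kind of output I need.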

Answers

I think you can get the format you want with aggregation, specifically the $group and $project operators. Take a look at this aggregate call:

var agg_output = db.answers.aggregate([ 
    { $group: { _id: { 
       city: "$city", 
       state: "$state", 
       answerArray: "$answerArray", 
       pickId: "$pickId" 
      }, count: { $sum: 1 }} 
    }, 
    { $project: { city: "$_id.city", 
       state: "$_id.state", 
       answerArray: "$_id.answerArray", 
       pickId: "$_id.pickId", 
       count: "$count", 
       _id: 0} 
    } 
]); 

db.answer_counts.insert(agg_output.result); 

The $group stage counts the occurrences of each unique city/state/answerArray/pickId combination, while the $project stage reshapes the data into the form you want.
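To make the two stages concrete, here is a minimal plain-JavaScript sketch of the same computation done in memory, with no database involved. The helper name countByCombination is hypothetical, and the documents are the sample data from the question, trimmed to the relevant fields. It only illustrates the idea and is not how MongoDB implements $group.

```javascript
// Sample documents from the question, without _id.
const answers = [
    { userId: 4545454,  pickId: 1, answerArray: ["yes"], city: "New York", state: "New York" },
    { userId: 64545454, pickId: 1, answerArray: ["no"],  city: "New York", state: "New York" },
    { userId: 78263636, pickId: 1, answerArray: ["yes"], city: "Albany",   state: "New York" },
    { userId: 78263636, pickId: 2, answerArray: ["yes"], city: "New York", state: "New York" },
    { userId: 78263636, pickId: 1, answerArray: ["yes"], city: "Wichita",  state: "Kansas" }
];

// Hypothetical helper: count documents per unique combination of four fields.
function countByCombination(docs) {
    const groups = new Map();
    for (const doc of docs) {
        // Like $group: the four fields together form the group key.
        const key = {
            city: doc.city,
            state: doc.state,
            answerArray: doc.answerArray,
            pickId: doc.pickId
        };
        // Serialize the key so equal combinations land in the same bucket.
        const id = JSON.stringify(key);
        if (!groups.has(id)) {
            groups.set(id, { key: key, count: 0 });
        }
        groups.get(id).count += 1; // like count: { $sum: 1 }
    }
    // Like $project: lift the key fields to the top level next to the count.
    return Array.from(groups.values()).map(function (g) {
        return Object.assign({}, g.key, { count: g.count });
    });
}

const counts = countByCombination(answers);
console.log(counts);
```

Each element of counts has the fields city, state, answerArray, pickId, and count, which is the shape the $project stage produces.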

The insert call saves the resulting output to a new collection. Does that make sense?
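One caveat for newer servers: from MongoDB 2.6 onward the shell's aggregate returns a cursor rather than a document with a result field, so agg_output.result would be undefined there. A minimal sketch of an alternative, assuming MongoDB 2.6 or later, is to let the pipeline write the collection itself with a $out stage. The wrapper function saveAnswerCounts is a hypothetical name added only so the pipeline is easy to reuse.

```javascript
// Hypothetical wrapper; `db` is the database handle available in the mongo shell.
function saveAnswerCounts(db) {
    return db.answers.aggregate([
        // Count documents per unique city/state/answerArray/pickId combination.
        { $group: {
            _id: {
                city: "$city",
                state: "$state",
                answerArray: "$answerArray",
                pickId: "$pickId"
            },
            count: { $sum: 1 }
        } },
        // Flatten the group key into top-level fields and keep the count.
        { $project: {
            _id: 0,
            city: "$_id.city",
            state: "$_id.state",
            answerArray: "$_id.answerArray",
            pickId: "$_id.pickId",
            count: 1
        } },
        // Assumes MongoDB 2.6+: write the results to answer_counts,
        // replacing that collection's previous contents.
        { $out: "answer_counts" }
    ]);
}
```

In the shell this would be run as saveAnswerCounts(db). If you would rather keep the separate insert, db.answer_counts.insert(agg_output.toArray()) should work with the cursor-returning form.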

It does! Thanks! I'm going to run it and follow up... – user2694845

Great, let me know how it goes. :) – Amalia