How to group documents by matching array elements with MapReduce in MongoDB?

I have a database with a column that contains an array of strings. Example table:

name | words       | ... 
Ash | ["Apple", "Pear", "Plum"]  | ... 
Joe | ["Walnut", "Peanut"]   | ... 
Max | ["Pineapple", "Apple", "Plum"] | ...

Now I want to match this table against a given array of words and group the documents by their match ratio.

Example input with the expected result:

// matched for input = ["Walnut", "Peanut", "Apple"] 
{ 
    "1.00": [{name:"Joe", match:"1.00"}], 
    "0.33": [{name:"Ash", match:"0.33"}, {name:"Max", match:"0.33"}] 
}

I use the following map function to emit the documents with the match ratio as the key:

function map() { 
    var matches = 0.0; 
    for(var i in input) 
     if(this.words.indexOf(input[i]) !== -1) matches+=1; 
    matches /= input.length; 
    var key = ""+matches.toFixed(2); 
    emit(key, {name: this.name, match: key}); 
}
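As a quick check of the arithmetic, here is a minimal sketch of the same ratio computed outside MongoDB. The helper name matchRatio is an assumption made for illustration. Because the count is divided by input.length, Joe scores 0.67 with this formula; dividing by the length of the document's own words array instead would give Joe 1.00.

```javascript
// Fraction of the input words that appear in a document's words array,
// formatted like the key in map(): a string with two decimals.
function matchRatio(words, input) {
    var matches = 0;
    for (var i = 0; i < input.length; i++) {
        if (words.indexOf(input[i]) !== -1) matches += 1;
    }
    return (matches / input.length).toFixed(2);
}

var input = ["Walnut", "Peanut", "Apple"];
var ash = matchRatio(["Apple", "Pear", "Plum"], input); // 1 of 3 input words found
var joe = matchRatio(["Walnut", "Peanut"], input);      // 2 of 3 input words found
console.log(ash, joe);
```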

What is missing now is a matching reduce function that combines the emitted key-value pairs into the result object.

I have tried something like this:

function reduce(key, values) { 
    var res = {}; 
    res[key] = values; 
    return res; 
}

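But I am having trouble with this part of the specification:

"MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key."

...which produces nested result objects. What is the correct way to group the documents by their match ratio?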

2016-09-20 Appleshell

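MongoDB may call the reduce function several times for the same key, feeding earlier outputs back in as input values.

This requirement is idempotence, and the reduce function must respect it: reducing its own output again has to give the same result.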

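To keep things simple, though, you only need to make sure that the values emitted by map have the same format as the value returned by reduce.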
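To see why the format matters, here is a small sketch in plain JavaScript that imitates a re-reduce without MongoDB. The helper reduceInTwoPasses and the third document "Eve" are assumptions made for illustration; the real server decides on its own when to re-reduce.

```javascript
// Reduce from the question: wraps the values in a new object, so its
// output no longer looks like the values that map emits.
function naiveReduce(key, values) {
    var res = {};
    res[key] = values;
    return res;
}

// Shape-preserving reduce: every input value and the output are {users: [...]}.
function mergeReduce(key, values) {
    var users = [];
    for (var i = 0; i < values.length; i++) {
        users = users.concat(values[i].users);
    }
    return {users: users};
}

// Hypothetical helper: reduce the first two values, then pass that partial
// result back in with the remaining values, as a re-reduce would.
function reduceInTwoPasses(reduceFn, key, values) {
    var partial = reduceFn(key, values.slice(0, 2));
    return reduceFn(key, [partial].concat(values.slice(2)));
}

var key = "0.33";
// "Eve" is a made-up third document so that a second pass is needed.
var plain = [
    {name: "Ash", match: key},
    {name: "Max", match: key},
    {name: "Eve", match: key}
];
var wrapped = [];
for (var i = 0; i < plain.length; i++) {
    wrapped.push({users: [plain[i]]});
}

var nested = reduceInTwoPasses(naiveReduce, key, plain);
var onePass = mergeReduce(key, wrapped);
var twoPass = reduceInTwoPasses(mergeReduce, key, wrapped);

// The naive result contains an earlier result object as its first element.
console.log(JSON.stringify(nested));
// The shape-preserving results match however the calls are split.
console.log(JSON.stringify(onePass) === JSON.stringify(twoPass));
```

With the naive reduce, the partial result ends up nested inside the final one. With the shape-preserving reduce, a single pass and two passes give the same users array.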

For your case, something like this would work:

db.col.insert({"name": "Ash", "words": ["Apple", "Pear", "Plum"]}) 
db.col.insert({"name": "Joe", "words": ["Walnut", "Peanut"]}) 
db.col.insert({"name": "Max", "words": ["Pineapple", "Apple", "Plum"]}) 

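function map() { 
    // Words to match against; hard-coded here for simplicity. 
    var input = ["Walnut", "Peanut", "Apple"]; 

    var matches = 0.0; 
    for (var i = 0; i < input.length; i++) 
        if (this.words.indexOf(input[i]) !== -1) matches += 1; 
    matches /= input.length; 
    var key = "" + matches.toFixed(2); 

    // Emit the same shape that reduce returns, so that a re-reduce works. 
    emit(key, {users: [{name: this.name, match: key}]}); 
} 

function reduce(key, values) { 
    // Merge the users arrays of all values for this key. 
    var ret = {users: []}; 
    for (var i = 0; i < values.length; i++) { 
        ret.users = ret.users.concat(values[i].users); 
    } 
    return ret; 
} 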

db.col.mapReduce(map, reduce, {"out": {inline:1}})
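Hard-coding input inside map is fine for a demo. If the words should be supplied at call time, the mapReduce options document also accepts a scope field, whose entries become global variables inside map, reduce and finalize. A minimal sketch, where buildOptions is a hypothetical helper and not part of the MongoDB API:

```javascript
// Hypothetical helper: builds the options document for mapReduce.
// The scope entry exposes `input` as a global variable inside map().
function buildOptions(input) {
    return {out: {inline: 1}, scope: {input: input}};
}

var options = buildOptions(["Walnut", "Peanut", "Apple"]);
console.log(JSON.stringify(options));

// In the mongo shell, with the hard-coded input line removed from map():
// db.col.mapReduce(map, reduce, options)
```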

Output:

{ 
    "results" : [ 
     { 
      "_id" : "0.33", 
      "value" : { 
       "users" : [ 
        { 
         "name" : "Ash", 
         "match" : "0.33" 
        }, 
        { 
         "name" : "Max", 
         "match" : "0.33" 
        } 
       ] 
      } 
     }, 
     { 
      "_id" : "0.67", 
      "value" : { 
       "users" : [ 
        { 
         "name" : "Joe", 
         "match" : "0.67" 
        } 
       ] 
      } 
     } 
    ], 
    "timeMillis" : 22, 
    "counts" : { 
     "input" : 3, 
     "emit" : 3, 
     "reduce" : 1, 
     "output" : 2 
    }, 
    "ok" : 1 
}

2016-09-20 13:57:46 joao

Thanks, this is exactly what I was after. Very useful answer! – Appleshell
