MongoDB Count() vs. Aggregation

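I have used aggregation in Mongo a lot, and I know its performance benefits for grouped counts and the like. But is there any difference in performance between these two ways of counting all the documents in a collection?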

collection.aggregate([{ $match: {} }, { $group: { _id: null, count: { $sum: 1 } } }]);

and

collection.find({}).count()

Update: a second case. Say we have sample data like this:

{_id: 1, type: 'one', value: true} 
{_id: 2, type: 'two', value: false} 
{_id: 4, type: 'five', value: false}

With aggregate():

var _ids = ['id1', 'id2', 'id3']; 
var counted = Collections.mail.aggregate([ 
    { 
    '$match': { 
     _id: { 
     '$in': _ids 
     }, 
     value: false 
    } 
    }, { 
    '$group': { 
     _id: "$type", 
     count: { 
     '$sum': 1 
     } 
    } 
    } 
]);

With count():

var counted = {}; 
var type = 'two'; 
for (i = 0, len = _ids.length; i < len; i++) { 
    counted[_ids[i]] = Collections.mail.find({ 
    _id: _ids[i], value: false, type: type 
    }).count(); 
}
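
Note that the two snippets do not compute the same thing: the pipeline returns one count per type, while the loop returns one count per _id for a single fixed type. A minimal plain-JavaScript sketch of both, with no MongoDB involved. The docs array stands in for the collection, the numeric ids are an assumption chosen to match the sample data, and countByType / countPerId are hypothetical helper names:

```javascript
// Sample documents from the question, held in a plain array (assumption:
// this stands in for the collection; nothing here talks to MongoDB).
const docs = [
  { _id: 1, type: 'one', value: true },
  { _id: 2, type: 'two', value: false },
  { _id: 4, type: 'five', value: false },
];

// Assumption: ids that match the numeric _id values of the sample data.
const ids = [1, 2, 4];

// Mirrors $match { _id: { $in: ids }, value: false } followed by
// $group { _id: "$type", count: { $sum: 1 } }: one pass, one count per type.
function countByType(documents, wantedIds) {
  const counts = {};
  for (const doc of documents) {
    if (wantedIds.includes(doc._id) && doc.value === false) {
      counts[doc.type] = (counts[doc.type] || 0) + 1;
    }
  }
  return counts;
}

// Mirrors the count() loop: one separate lookup per id, for a single type.
function countPerId(documents, wantedIds, type) {
  const counts = {};
  for (const id of wantedIds) {
    counts[id] = documents.filter(
      (doc) => doc._id === id && doc.value === false && doc.type === type
    ).length;
  }
  return counts;
}

console.log(countByType(docs, ids));       // { two: 1, five: 1 }
console.log(countPerId(docs, ids, 'two')); // { '1': 0, '2': 1, '4': 0 }
```

Against a real server the pipeline is a single round trip, while the loop issues one query per id.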

2015-10-17 dr.dimitru

Why not just try it and see? – JohnnyHK

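@JohnnyHK I tried it. collection.aggregate() seems slightly faster, but I'm not sure; in a speed test on 100K documents the two were almost the same. I'd like to hear about the community's experience. –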

.count() is faster by far. You can see how it is implemented by calling

// Note the missing parentheses at the end 
db.collection.count

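which returns the length of the cursor. For the default query (that is, when count() is called without a query document), this is in turn implemented as returning the length of the _id_ index, iirc.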

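An aggregation, however, reads each and every document and processes it. It can only get within roughly the same order of magnitude as .count() when processing no more than about 100K documents (give or take, depending on your RAM).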
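
As a rough illustration of why the two differ, here is a toy model in plain JavaScript. It is not MongoDB's actual implementation, and ToyCollection with its methods is made up for this sketch: a count without a query can be answered from a number that is already being maintained, whereas the aggregation has to visit every document.

```javascript
// Toy model only: ToyCollection is a hypothetical class, not a MongoDB API.
class ToyCollection {
  constructor() {
    this.docs = [];
    this.size = 0;     // stands in for the bookkeeping that count() can read
    this.docsRead = 0; // how many documents the read operations have touched
  }

  insert(doc) {
    this.docs.push(doc);
    this.size += 1;
  }

  // Like count() without a query document: no documents are read.
  count() {
    return this.size;
  }

  // Like [{ $group: { _id: null, count: { $sum: 1 } } }]: every document is read.
  aggregateCount() {
    let count = 0;
    for (const _doc of this.docs) {
      this.docsRead += 1;
      count += 1;
    }
    return count;
  }
}

const col = new ToyCollection();
for (let i = 0; i < 1000; i++) {
  col.insert({ _id: i });
}

const fast = col.count();
const readsAfterCount = col.docsRead;       // still 0: nothing was scanned
const slow = col.aggregateCount();
const readsAfterAggregate = col.docsRead;   // 1000: one read per document

console.log(fast, readsAfterCount, slow, readsAfterAggregate);
```

Both calls return the same number; only the amount of work behind it differs, and that work grows with the collection in the second case.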

The function below was run against a collection with some 12M entries:

function checkSpeed(col,iterations){ 

    // Get the collection 
    var collectionUnderTest = db[col]; 

    // The collection we are writing our stats to 
    var stats = db[col+'STATS'] 

    // remove old stats 
    stats.remove({}) 

    // Prevent allocation in loop 
    var start = new Date().getTime() 
    var duration = new Date().getTime() 

    print("Counting with count()") 
    for (var i = 1; i <= iterations; i++){ 
        start = new Date().getTime(); 
        var result = collectionUnderTest.count() 
        duration = new Date().getTime() - start 
        stats.insert({"type":"count","pass":i,"duration":duration,"count":result}) 
    } 

    print("Counting with aggregation") 
    for(var j = 1; j <= iterations; j++){ 
        start = new Date().getTime() 
        var doc = collectionUnderTest.aggregate([{ $group:{_id: null, count:{ $sum: 1 } } }]) 
        duration = new Date().getTime() - start 
        // aggregate() returns a cursor, so read the count from its single result document 
        var rows = doc.toArray() 
        stats.insert({"type":"aggregation", "pass":j, "duration": duration,"count": rows.length ? rows[0].count : 0}) 
    } 

    var averages = stats.aggregate([ 
    {$group:{_id:"$type","average":{"$avg":"$duration"}}} 
    ]) 

    return averages 
}

and returned:

{ "_id" : "aggregation", "average" : 43828.8 } 
{ "_id" : "count", "average" : 0.6 }

The averages are in milliseconds.

hth

2015-10-17 13:03:55

Thank you for your reply. Could you give your opinion on my update (the second case)? –

@itsme No. First, because of the lack of basic courtesy, and second, because I won't support the abuse of MongoDB as an RDBMS. –
