MongoDB query is really slow when wrapped in the $query operator

I have a collection of about 350K documents in MongoDB. I have a compound index on Updated (descending) and SecondaryCategories (ascending).

db.Content.ensureIndex({ "Updated" : -1, "SecondaryCategories" : 1 },{ "name" : "Updated_SecondaryCategories", "background" : true });

I use the MongoDB C# driver with lambda syntax to build the query:

IQueryable<Content> query = repo.GetAll() 
      .Where(
       x => 
        x.SecondaryCategories.ContainsAny(sel) && 
        (x.Type == ContentType.News || x.Type == ContentType.FotoNews || x.Type == ContentType.LinkedNews || x.Type == ContentType.MediaNews) && 
        x.Texts.Any(y => (int)y.Language == cultureId)) 
      .OrderByDescending(x => x.Updated) 
      .Skip(skipItems) 
      .Take(count);

The driver generates the following query:

db.Content.find({ "$query" : { "SecondaryCategories" : { "$in" : [524, 615, 550, 546, 552, 617, 547, 549, 548, 613, 614, 551, 618, 545] }, "$or" : [{ "Type" : 4 }, { "Type" : 8 }, { "Type" : 32 }, { "Type" : 16 }], "Texts" : { "$elemMatch" : { "Language" : 0 } } }, $orderby: { Updated: -1 }}).limit(20);

The query takes about 1300 ms to run, which is slow. Now, when I remove the $query operator, I get the following:

db.Content.find({ "SecondaryCategories" : { "$in" : [524, 615, 550, 546, 552, 617, 547, 549, 548, 613, 614, 551, 618, 545] }, "$or" : [{ "Type" : 4 }, { "Type" : 8 }, { "Type" : 32 }, { "Type" : 16 }], "Texts" : { "$elemMatch" : { "Language" : 0 } }}).sort({"Updated" : -1}).limit(20);

This query runs in only 1 ms.

Explain output for the first query (with the $query operator):

db.Content.find({ "$query" : { "SecondaryCategories" : { "$in" : [524, 615, 550, 546, 552, 617, 547, 549, 548, 613, 614, 551, 618, 545] }, "$or" : [{ "Type" : 4 }, { "Type" : 8 }, { "Type" : 32 }, { "Type" : 16 }], "Texts" : { "$elemMatch" : { "Language" : 0 } } }, $orderby: { Updated: -1 }, $explain: 1 }).limit(20).pretty(); 
{ 
    "cursor" : "BtreeCursor Updated_SecondaryCategories multi", 
    "isMultiKey" : true, 
    "n" : 188173, 
    "nscannedObjects" : 188668, 
    "nscanned" : 337619, 
    "nscannedObjectsAllPlans" : 189056, 
    "nscannedAllPlans" : 338007, 
    "scanAndOrder" : false, 
    "indexOnly" : false, 
    "nYields" : 1, 
    "nChunkSkips" : 0, 
    "millis" : 1304, 
    "indexBounds" : { 
      "Updated" : [ 
        [ 
          { 
            "$maxElement" : 1 
          }, 
          { 
            "$minElement" : 1 
          } 
        ] 
      ], 
      "SecondaryCategories" : [ 
        [ 
          524, 
          524 
        ], 
        [ 
          545, 
          545 
        ], 
        [ 
          546, 
          546 
        ], 
        [ 
          547, 
          547 
        ], 
        [ 
          548, 
          548 
        ], 
        [ 
          549, 
          549 
        ], 
        [ 
          550, 
          550 
        ], 
        [ 
          551, 
          551 
        ], 
        [ 
          552, 
          552 
        ], 
        [ 
          613, 
          613 
        ], 
        [ 
          614, 
          614 
        ], 
        [ 
          615, 
          615 
        ], 
        [ 
          617, 
          617 
        ], 
        [ 
          618, 
          618 
        ] 
      ] 
    }

And for the second (without the $query operator):

db.Content.find({ "SecondaryCategories" : { "$in" : [524, 615, 550, 546, 552, 617, 547, 549, 548, 613, 614, 551, 618, 545] }, "$or" : [{ "Type" : 4 }, { "Type" : 8 }, { "Type" : 32 }, { "Type" : 16 }], "Texts" : { "$elemMatch" : { "Language" : 0 } }}).sort({"Updated" : -1}).limit(20).explain(); 
{ 
    "cursor" : "BtreeCursor Updated_SecondaryCategories multi", 
    "isMultiKey" : true, 
    "n" : 20, 
    "nscannedObjects" : 29, 
    "nscanned" : 69, 
    "nscannedObjectsAllPlans" : 94, 
    "nscannedAllPlans" : 134, 
    "scanAndOrder" : false, 
    "indexOnly" : false, 
    "nYields" : 0, 
    "nChunkSkips" : 0, 
    "millis" : 0, 
    "indexBounds" : { 
      "Updated" : [ 
        [ 
          { 
            "$maxElement" : 1 
          }, 
          { 
            "$minElement" : 1 
          } 
        ] 
      ], 
      "SecondaryCategories" : [ 
        [ 
          524, 
          524 
        ], 
        [ 
          545, 
          545 
        ], 
        [ 
          546, 
          546 
        ], 
        [ 
          547, 
          547 
        ], 
        [ 
          548, 
          548 
        ], 
        [ 
          549, 
          549 
        ], 
        [ 
          550, 
          550 
        ], 
        [ 
          551, 
          551 
        ], 
        [ 
          552, 
          552 
        ], 
        [ 
          613, 
          613 
        ], 
        [ 
          614, 
          614 
        ], 
        [ 
          615, 
          615 
        ], 
        [ 
          617, 
          617 
        ], 
        [ 
          618, 
          618 
        ] 
      ] 
    }

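I really can't understand this difference in behavior. It seems that the first query scans through about 338K index entries (nscanned) and returns 188173 documents, which are only then limited, while the second scans only 69.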
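One way to read the two outputs side by side is to compare `n` against the requested limit and to divide `nscanned` by `n`. The snippet below is only a sketch in plain JavaScript: the numbers are copied from the explain outputs above, and `summarizeExplain` is a hypothetical helper, not part of MongoDB or its drivers.

```javascript
// Values copied from the two explain outputs above.
const withQueryWrapper = { n: 188173, nscanned: 337619, nscannedObjects: 188668, millis: 1304 };
const plainFind = { n: 20, nscanned: 69, nscannedObjects: 29, millis: 0 };

// Hypothetical helper: reports whether the limit was honored and how many
// index entries were scanned per returned document.
function summarizeExplain(explain, limit) {
  return {
    limitHonored: explain.n <= limit,
    entriesPerResult: explain.nscanned / explain.n,
  };
}

const wrapped = summarizeExplain(withQueryWrapper, 20);
const plain = summarizeExplain(plainFind, 20);
console.log(wrapped.limitHonored, wrapped.entriesPerResult.toFixed(2));
console.log(plain.limitHonored, plain.entriesPerResult.toFixed(2));
```

Read this way, the first query does not scan more entries per returned document than the second; the difference is that `n` is far above 20, which suggests the limit was not applied when that explain was produced.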

Is my index wrong, or do I have to rewrite the query? Is there any way to write the query with the C# MongoDB driver without using the $query operator?

Asked 2014-01-28 by Villu89

Have you tried explaining the query? http://docs.mongodb.org/manual/reference/operator/meta/query/#op._S_query - I think you may need to add the limit clause to the query document itself rather than using '.limit' –

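Yes, I used '$explain: 1' to explain the first query. I couldn't find any other way to apply a limit. There is '$maxScan', but it only limits the number of documents scanned. – Villu89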

The MongoDB docs state:

Do not mix query forms. If you use the $query format, do not append cursor methods to the find(). To modify the query use the meta-query operators, such as $explain.

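So in your case I think it is the .limit(20) call that gets ignored, which causes your query to keep going through all the documents in the collection.

Maybe this is a bug in the C# driver.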
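To illustrate why a dropped limit matters so much when the index already delivers entries in sort order, here is a toy model in plain JavaScript. It is not how MongoDB is implemented: `scanInIndexOrder`, the made-up entries, and the category values are all assumptions chosen for illustration.

```javascript
// Toy model only: an array standing in for index entries that are already
// ordered by Updated descending.
function scanInIndexOrder(entries, matches, limit) {
  const results = [];
  let scanned = 0;
  for (const entry of entries) {
    scanned += 1;
    if (matches(entry)) {
      results.push(entry);
      // With a limit, the scan can stop as soon as enough matches are found.
      if (results.length === limit) break;
    }
  }
  return { results, scanned };
}

// 1000 made-up entries, newest first; every second one matches the filter.
const entries = Array.from({ length: 1000 }, (_, i) => ({
  updated: 1000 - i,
  category: i % 2 === 0 ? 524 : 1,
}));
const inCategory = (e) => e.category === 524;

const limited = scanInIndexOrder(entries, inCategory, 20);
const unlimited = scanInIndexOrder(entries, inCategory, Infinity);
console.log(limited.scanned, limited.results.length);
console.log(unlimited.scanned, unlimited.results.length);
```

In this model the limited scan stops after the 20th match, while the unlimited one walks every entry, which mirrors the gap between nscanned: 69 and nscanned: 337619 in the question.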

Answered 2014-01-28 23:32:38 by xlembouras

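I can't find any meta-query operator for 'limit()'. There is the '$maxScan' operator, but according to the manual it 'constrains the query to only scan the specified number of documents when fulfilling the query.' So it only prevents Mongo from scanning more than the specified number of documents while walking the index. I could never be sure that it would return the 20 documents I need. – Villu89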

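I hadn't seen that... I just wanted to point out that it is the use (or not) of 'limit' that affects performance. Try removing it from both queries and compare the difference. – xlembouras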

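OK, since there is no meta-query operator for limit() and $maxScan has a different meaning, all I can do is put a condition on the Updated field, e.g. newer than one month ago: "Updated" : {$gte : new ISODate("2013-12-29T00:00:01Z")}. This way my query returns in 50 ms. If fewer documents are returned than the limit, I can widen the date filter and run the query again.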
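The retry idea could be sketched roughly as follows, in plain JavaScript. Everything here is an assumption made for illustration: `findWithWideningWindow` and `fetchSince` are hypothetical names, the window sizes are arbitrary, and the in-memory array only stands in for the real collection. In real use, `fetchSince` would run the actual query with the added Updated condition.

```javascript
// Hypothetical helper: `fetchSince(date, limit)` is assumed to run the query with
// an added { Updated: { $gte: date } } condition and to return at most `limit`
// documents, newest first.
function findWithWideningWindow(fetchSince, limit, now, windowsInDays) {
  let docs = [];
  for (const days of windowsInDays) {
    const since = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
    docs = fetchSince(since, limit);
    if (docs.length >= limit) break; // enough results, stop widening
  }
  return docs; // may hold fewer than `limit` if even the widest window is short
}

// In-memory stand-in for the collection, used only to exercise the helper.
const now = new Date("2014-01-29T00:00:00Z");
const sample = Array.from({ length: 50 }, (_, i) => ({
  id: i,
  Updated: new Date(now.getTime() - i * 3 * 24 * 60 * 60 * 1000), // one document every 3 days
}));
const fetchSince = (since, limit) =>
  sample
    .filter((d) => d.Updated >= since)
    .sort((a, b) => b.Updated - a.Updated)
    .slice(0, limit);

const page = findWithWideningWindow(fetchSince, 20, now, [30, 90, 365]);
console.log(page.length);
```

The trade-off is one extra round trip whenever the first window is too narrow, in exchange for keeping the common case bounded by the date filter.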

Answered 2014-01-29 11:05:14 by Villu89
