MongoDB query doesn't use the prefix of a compound index with a text field

I've created the following index on my collection:

db.myCollection.createIndex({ 
    user_id: 1, 
    name: 'text' 
})

If I look at the execution plan of a query that contains both fields, like this:

db.getCollection('campaigns').find({ 
    user_id: ObjectId('xxx') 
    ,$text: { $search: 'bla' } 
}).explain('executionStats')

I get the following result:

... 
"winningPlan" : { 
    "stage" : "TEXT", 
    "indexPrefix" : { 
     "user_id" : ObjectId("xxx") 
    }, 
    "indexName" : "user_id_1_name_text", 
    "parsedTextQuery" : { 
     "terms" : [ 
      "e" 
     ], 
     "negatedTerms" : [], 
     "phrases" : [], 
     "negatedPhrases" : [] 
    }, 
    "inputStage" : { 
     "stage" : "TEXT_MATCH", 
     "inputStage" : { 
      "stage" : "TEXT_OR", 
      "inputStage" : { 
       "stage" : "IXSCAN", 
       "keyPattern" : { 
        "user_id" : 1.0, 
        "_fts" : "text", 
        "_ftsx" : 1 
       }, 
       "indexName" : "user_id_1_name_text", 
       "isMultiKey" : true, 
       "isUnique" : false, 
       "isSparse" : false, 
       "isPartial" : false, 
       "indexVersion" : 1, 
       "direction" : "backward", 
       "indexBounds" : {} 
      } 
     } 
    } 
} 
...

As the documentation states, MongoDB can use index prefixes to perform indexed queries.

Since user_id is a prefix of the index above, I'd expect a query on user_id alone to use the index, but if I try the following:

db.myCollection.find({ 
    user_id: ObjectId('xxx') 
}).explain('executionStats')

I get:

... 
"winningPlan" : { 
    "stage" : "COLLSCAN", 
    "filter" : { 
     "user_id" : { 
      "$eq" : ObjectId("xxx") 
     } 
    }, 
    "direction" : "forward" 
}, 
...

So it doesn't use the index at all and performs a full collection scan.
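One way to compare the two plans without reading the whole explain output is to flatten the stage tree into a list of stage names. The following is a minimal sketch in plain JavaScript. `planStages` is a hypothetical helper, not part of MongoDB, and it assumes child stages are nested under `inputStage` or `inputStages`, as in the output shown in this question.

```javascript
// Hypothetical helper: collect the stage names of an explain() plan tree,
// outermost stage first. Not part of the MongoDB API.
function planStages(plan) {
  if (!plan || typeof plan !== 'object') return [];
  const children = [];
  if (plan.inputStage) children.push(plan.inputStage);
  if (Array.isArray(plan.inputStages)) children.push(...plan.inputStages);
  return [plan.stage].concat(...children.map((child) => planStages(child)));
}

// Trimmed-down copies of the two winning plans from the question.
const textPlan = {
  stage: 'TEXT',
  inputStage: {
    stage: 'TEXT_MATCH',
    inputStage: { stage: 'TEXT_OR', inputStage: { stage: 'IXSCAN' } }
  }
};
const prefixOnlyPlan = { stage: 'COLLSCAN', direction: 'forward' };

console.log(planStages(textPlan).join(' > '));
console.log(planStages(prefixOnlyPlan).join(' > '));
```

In the shell, the same helper could presumably be fed the `queryPlanner.winningPlan` field of an explain result. Newer server versions may nest the plan differently, so treat the field names as an assumption tied to the output format shown here.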


2017-07-03 Henrique Barcelos

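In general, MongoDB can use index prefixes to support queries. However, compound indexes that include geospatial or text fields are a special case of sparse compound indexes. If a document does not include a value for any of the text index fields in a compound index, it will not be included in the index.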

To ensure correct results for a prefix search, an alternative query plan is chosen over the sparse compound index:

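If a sparse index would result in an incomplete result set for queries and sort operations, MongoDB will not use that index unless a hint() explicitly specifies the index.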

Setting up some test data in MongoDB 3.4.5 to demonstrate the potential problem:

db.myCollection.createIndex({ user_id:1, name: 'text' }, { name: 'myIndex'}) 

// `name` is a string; this document will be included in a text index 
db.myCollection.insert({ user_id:123, name:'Banana' }) 

// `name` is a number; this document will NOT be included in a text index 
db.myCollection.insert({ user_id:123, name: 456 }) 

// `name` is missing; this document will NOT be included in a text index 
db.myCollection.insert({ user_id:123 })

Then, forcing the compound text index to be used:

db.myCollection.find({user_id:123}).hint('myIndex')

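The result includes only the single document whose text field name was indexed, rather than the three documents that would be expected: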

{ 
    "_id": ObjectId("595ab19e799060aee88cb035"), 
    "user_id": 123, 
    "name": "Banana" 
}

This exception should be highlighted more clearly in the MongoDB documentation; watch/upvote DOCS-10322 in the MongoDB issue tracker for updates.


2017-07-03 21:30:05 Stennie

So, basically, since I need both queries, the solution here would be to have 2 indexes: one containing only 'user_id' and another containing '{user_id, name}'? –

@HenriqueBarcelos Yes, you need a separate index on 'user_id'. This could be the prefix of another, non-sparse compound index. – Stennie
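The two-index approach from these comments can be sketched as follows. `ensureIndexes` and the index names are hypothetical. The only MongoDB call used is `createIndex(keys, options)`, and the helper is written against any object exposing that method, so the sketch can be run without a server.

```javascript
// Hypothetical helper: create both indexes discussed in the comments.
// `collection` is anything exposing createIndex(keys, options), for example
// db.myCollection in the mongo shell.
function ensureIndexes(collection) {
  // Plain index on user_id: not sparse, so it can serve { user_id: ... } queries.
  collection.createIndex({ user_id: 1 }, { name: 'user_id_only' });
  // Compound text index: considered when the query also contains $text.
  collection.createIndex({ user_id: 1, name: 'text' }, { name: 'user_id_name_text' });
}

// Stand-in collection that records calls, so the sketch runs without MongoDB.
const recorded = [];
const fakeCollection = {
  createIndex(keys, options) {
    recorded.push({ keys, options });
  }
};

ensureIndexes(fakeCollection);
console.log(recorded.map((call) => call.options.name));
```

Against a real deployment, the stand-in would be replaced by the actual collection object. Re-running explain on the user_id-only query afterwards should then show whether the planner picks the new index.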

This behavior is due to text indexes being sparse by default:

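For a compound index that includes a text index key along with keys of other types, only the text index field determines whether the index references a document. The other keys do not determine whether or not the index references the documents.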

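If the query filter does not reference the text index field, the query planner won't consider this index, because it can't be certain that the index would return the full result set of documents.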
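To make that reasoning concrete, here is a toy model in plain JavaScript. It is an illustration only, not how MongoDB implements text indexes: `buildSparseTextIndex` and `findViaIndex` are made-up names, and the rule "a document is indexed only if `name` is a string" is a deliberate simplification of the sparse behavior described in this thread.

```javascript
// Toy model (not MongoDB internals): a compound { user_id, name: 'text' } index
// that only references documents whose `name` is a string.
function buildSparseTextIndex(docs) {
  return docs
    .filter((doc) => typeof doc.name === 'string')
    .map((doc) => ({ user_id: doc.user_id, doc }));
}

// Answer a prefix-only query by reading the index instead of the collection.
function findViaIndex(index, userId) {
  return index
    .filter((entry) => entry.user_id === userId)
    .map((entry) => entry.doc);
}

// Three documents shaped like the test data used elsewhere in this thread.
const docs = [
  { user_id: 123, name: 'Banana' },
  { user_id: 123, name: 456 },
  { user_id: 123 }
];

const viaIndex = findViaIndex(buildSparseTextIndex(docs), 123);
const viaScan = docs.filter((doc) => doc.user_id === 123);

console.log('via sparse index:', viaIndex.length, 'via collection scan:', viaScan.length);
```

The index-based lookup misses the documents that were never indexed, which is the incomplete result the planner avoids by falling back to a collection scan.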


2017-07-03 21:33:18
