My goal is to have my map-reduce jobs always run on the secondary nodes of my sharded MongoDB cluster. Does MongoDB ignore the readPreference when running MapReduce on a sharded cluster?

To achieve this, I set readPreference to secondary and the out parameter of the MapReduce command to inline. This works fine on a non-sharded replica set: the job runs on the secondary. On a sharded cluster, however, the job runs on the primary.

Can someone explain why this happens, or point me to any relevant documentation? I could not find anything about this in the documentation.
// Imports assumed for the snippets below (MongoDB Java driver 3.x):
import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.MongoClientOptions.Builder;
import com.mongodb.MongoClientURI;
import com.mongodb.ReadPreference;
import com.mongodb.client.MapReduceIterable;
import org.bson.Document;

// Map and reduce functions are passed to the driver as JavaScript strings.
public static final String mapfunction = "function() { emit(this.custid, this.txnval); }";
public static final String reducefunction = "function(key, values) { return Array.sum(values); }";
...
private void mapReduce() {
    ...
    MapReduceIterable<Document> iterable = collection.mapReduce(mapfunction, reducefunction);
    ...
}
...
// Client is built with a secondary read preference.
Builder options = MongoClientOptions.builder().readPreference(ReadPreference.secondary());
MongoClientURI uri = new MongoClientURI(MONGO_END_POINT, options);
MongoClient client = new MongoClient(uri);
...
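For reference, the two JavaScript functions above just compute a per-customer sum of txnval. The same logic in plain Java, with made-up sample data and a hypothetical Txn record, as an illustrative sketch only (no MongoDB required):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapReduceSketch {
    // Hypothetical record standing in for a document in the txns collection.
    record Txn(String custid, double txnval) {}

    public static void main(String[] args) {
        List<Txn> txns = List.of(
                new Txn("c1", 10.0), new Txn("c1", 5.0), new Txn("c2", 7.5));

        // map: emit(this.custid, this.txnval); reduce: Array.sum(values)
        Map<String, Double> totals = new HashMap<>();
        for (Txn t : txns) {
            totals.merge(t.custid(), t.txnval(), Double::sum);
        }
        System.out.println(totals.get("c1")); // 15.0
        System.out.println(totals.get("c2")); // 7.5
    }
}
```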
Logs from the secondary, when this is executed on a (non-sharded) replica set:
2016-11-23T15:05:26.735+0000 I COMMAND [conn671] command test.txns command: mapReduce { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true } planSummary: COUNT keyUpdates:0 writeConflicts:0 numYields:7 reslen:4331 locks:{ Global: { acquireCount: { r: 44 } }, Database: { acquireCount: { r: 3, R: 19 } }, Collection: { acquireCount: { r: 3 } } } protocol:op_query 124ms
Sharded collection:
mongos> db.txns.getShardDistribution()
Shard Shard-0 at Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017
data : 498KiB docs : 9474 chunks : 3
estimated data per chunk : 166KiB
estimated docs per chunk : 3158
Shard Shard-1 at Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017
data : 80KiB docs : 1526 chunks : 3
estimated data per chunk : 26KiB
estimated docs per chunk : 508
Totals
data : 579KiB docs : 11000 chunks : 6
Shard Shard-0 contains 86.12% data, 86.12% docs in cluster, avg obj size on shard : 53B
Shard Shard-1 contains 13.87% data, 13.87% docs in cluster, avg obj size on shard : 53B
Logs from the Shard-0 primary:
2016-11-24T08:46:30.828+0000 I COMMAND [conn357] command test.$cmd command: mapreduce.shardedfinish { mapreduce.shardedfinish: { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true, $queryOptions: { $readPreference: { mode: "secondary" } } }, inputDB: "test", shardedOutputCollection: "tmp.mrs.txns_1479977190_0", shards: { Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 123, timing: { mapTime: 51, emitLoop: 116, reduceTime: 9, mode: "mixed", total: 123 }, counts: { input: 9474, emit: 9474, reduce: 909, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } }, Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 71, timing: { mapTime: 8, emitLoop: 63, reduceTime: 4, mode: "mixed", total: 71 }, counts: { input: 1526, emit: 1526, reduce: 197, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } } }, shardCounts: { Shard-0/primary.shard0.example.com:27017,secondary.shard0.example.com:27017: { input: 9474, emit: 9474, reduce: 909, output: 101 }, Shard-1/primary.shard1.example.com:27017,secondary.shard1.example.com:27017: { input: 1526, emit: 1526, reduce: 197, output: 101 } }, counts: { emit: 11000, input: 11000, output: 202, reduce: 1106 } } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4368 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 115ms
2016-11-24T08:46:30.830+0000 I COMMAND [conn46] CMD: drop test.tmp.mrs.txns_1479977190_0
Any pointers on the expected behavior would be very helpful. Thanks.
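Not part of the original question, but for comparison: the same per-customer sum can be expressed as an aggregation pipeline, and aggregation reads do honor a secondary read preference through mongos. A minimal sketch, assuming the same `collection` (a `MongoCollection<Document>`) as in the code above and a live cluster to run against:

```java
import com.mongodb.ReadPreference;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.Arrays;

// Sketch: group-and-sum via the aggregation pipeline instead of mapReduce.
// `collection` is assumed to be the txns MongoCollection<Document> from above.
MongoCollection<Document> secondaryReads =
        collection.withReadPreference(ReadPreference.secondary());

// Equivalent of the map/reduce pair: sum txnval per custid.
for (Document doc : secondaryReads.aggregate(Arrays.asList(
        new Document("$group", new Document("_id", "$custid")
                .append("total", new Document("$sum", "$txnval")))))) {
    System.out.println(doc.toJson());
}
```

Whether this fits depends on how complex the real reduce logic is; a plain sum maps cleanly onto `$group`/`$sum`.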
Wrote a blog post about the reason for this, which is an important limitation for anyone hoping to run their MR on MongoDB secondaries: https://scalegrid.io/blog/mongodb-performance-running-mongodb-map-reduce-operations-on-secondaries/ – Vaibhaw