您可以使用$substr
和$cond
运算符进行小字符串处理,以获得所需的结果(不需要正则表达式)。这将需要MongoDB的2.6+:
db.User.aggregate([
{ $unwind : "$tags"},
{ $project : {
tagType : {
$cond : {
if : { $eq : [ { $substr : [ "$tags", 0, 6] }, "active" ]},
then: "active",
else: "inactive"}
},
tag: {
$cond : {
if : { $eq : [ { $substr : [ "$tags", 0, 6] }, "active" ]},
then: { $substr : ["$tags", 7, -1]},
else: { $substr : ["$tags", 9, -1]}}
}
}},
{ $group : { _id : {tagType : "$tagType", tag: "$tag"} ,
total: { $sum: 1}}},
{ $group : { _id : "$_id.tagType",
subtags: { $push : {tag : "$_id.tag", total: "$total"}},
total: { $sum : "$total"}}}
]);
此查询的结果将是这样的:
{
"_id" : "inactive",
"subtags" : [
{
"tag" : "adult",
"total" : 1
}
],
"total" : 1
}
{
"_id" : "active",
"subtags" : [
{
"tag" : "junk-tag",
"total" : 1
},
{
"tag" : "child",
"total" : 1
},
{
"tag" : "tradeout",
"total" : 2
},
{
"tag" : "adult",
"total" : 1
}
],
"total" : 5
}
编辑:
我刚刚注意到,在结果总被计数标签总数,而不是具有至少一个活动标签的文档数量。这个查询会给你想要的确切输出,虽然稍微复杂一点:
db.User.aggregate([
/* unwind so we can process each tag from the array */
{ $unwind : "$tags"},
/* Remove the active/inactive strings from the tag values
and create a new value tagType */
{ $project : {
tagType : {
$cond : {
if : { $eq : [ { $substr : [ "$tags", 0, 6] }, "active" ]},
then: "active",
else: "inactive"}
},
tag: {
$cond : {
if : { $eq : [ { $substr : [ "$tags", 0, 6] }, "active" ]},
then: { $substr : ["$tags", 7, -1]},
else: { $substr : ["$tags", 9, -1]}}
}
}},
/* Group the documents by tag type, so we can
find num. of docs by tag type (total) */
{ $group : { _id : "$tagType",
tags :{ $push : "$tag"},
docId :{ $addToSet : "$_id"}}},
/* project the values so we can get the 'total' for tag type */
{ $project : { tagType : "$_id",
tags : 1,
"docTotal": { $size : "$docId" }}},
/* we must unwind to get total count for each tag */
{ $unwind : "$tags"},
/* sum the tags by type and tag value */
{ $group : { _id : {tagType : "$tagType", tag: "$tags"} ,
total: { $sum: 1}, docTotal: {$first : "$docTotal"}}},
/* finally group by tagType so we can get subtags */
{ $group : { _id : "$_id.tagType",
subtags: { $push : {tag : "$_id.tag", total: "$total"}},
total: { $first : "$docTotal"}}}
]);
我不会使用正则表达式嵌入结构,如stylsheets或xml文件。改为使用算法,或者至少一个可以在更多步骤中以文本形式运行的GREP程序 –
是的,我是这么做的,但是好奇的是,是否有通过聚合框架来实现这一点的方法。 –