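Solr shingles not visible in debug query

I want to use Solr to find exact matches on categories within a user search (e.g. "skinny jeans" in "blue skinny jeans"). I'm using the following type definition: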
<fieldType name="subphrase" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
<analyzer type="index">
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="\ "
replacement="_"/>
<tokenizer class="solr.KeywordTokenizerFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.ShingleFilterFactory"
outputUnigrams="true"
outputUnigramsIfNoShingles="true"
tokenSeparator="_"
minShingleSize="2"
maxShingleSize="99"/>
</analyzer>
</fieldType>
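This type indexes categories without tokenizing them, only replacing whitespace with underscores. On the query side, however, it tokenizes the input and shingles the tokens (joined with underscores).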
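To make the intended behavior concrete, here is a minimal Python sketch of what the two analyzers are meant to produce. It only imitates the field type above and is not Solr code: the function names `index_analyze` and `query_analyze` are made up for illustration, and the shingle ordering is an assumption.

```python
def index_analyze(category: str) -> list[str]:
    """Imitate the index analyzer: replace each space with '_' and keep a single token."""
    return [category.replace(" ", "_")]


def query_analyze(text: str, min_size: int = 2, max_size: int = 99) -> list[str]:
    """Imitate the query analyzer: whitespace tokenization, then unigrams plus shingles."""
    tokens = text.split()
    out = []
    for start in range(len(tokens)):
        out.append(tokens[start])  # corresponds to outputUnigrams="true"
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break
            # corresponds to tokenSeparator="_"
            out.append("_".join(tokens[start:start + size]))
    return out


indexed = index_analyze("skinny jeans")
shingles = query_analyze("blue skinny jeans")
print(indexed)   # ['skinny_jeans']
print(shingles)  # ['blue', 'blue_skinny', 'blue_skinny_jeans', 'skinny', 'skinny_jeans', 'jeans']
print([s for s in shingles if s in indexed])  # ['skinny_jeans']
```

If the whole query string reaches the query analyzer in one piece, the shingle `skinny_jeans` lines up with the indexed token `skinny_jeans`, which is the match I am after.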
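What I'm trying to do is match the query shingles against the indexed categories. On the Solr analysis page I can see that the whitespace/underscore replacement works for both index and query, and that the query is shingled correctly (picture below):
My problem is that on the Solr query page I cannot see the shingles being generated, and I presume that this is why the category "skinny jeans" is not matched while the category "jeans" is matched :(
This is the debug output: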
{
"responseHeader": {
"status": 0,
"QTime": 1,
"params": {
"q": "name:(skinny jeans)",
"indent": "true",
"wt": "json",
"debugQuery": "true",
"_": "1464170217438"
}
},
"response": {
"numFound": 1,
"start": 0,
"docs": [
{
"id": 33,
"name": "jeans"
}
]
},
"debug": {
"rawquerystring": "name:(skinny jeans)",
"querystring": "name:(skinny jeans)",
"parsedquery": "name:skinny name:jeans",
"parsedquery_toString": "name:skinny name:jeans",
"explain": {
"33": "\n2.2143755 = product of:\n 4.428751 = sum of:\n 4.428751 = weight(name:jeans in 54) [DefaultSimilarity], result of:\n 4.428751 = score(doc=54,freq=1.0), product of:\n 0.6709952 = queryWeight, product of:\n 6.600272 = idf(docFreq=1, maxDocs=541)\n 0.10166174 = queryNorm\n 6.600272 = fieldWeight in 54, product of:\n 1.0 = tf(freq=1.0), with freq of:\n 1.0 = termFreq=1.0\n 6.600272 = idf(docFreq=1, maxDocs=541)\n 1.0 = fieldNorm(doc=54)\n 0.5 = coord(1/2)\n"
},
"QParser": "LuceneQParser"
}
}
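Clearly, the parsedquery parameter does not show the shingled query. What do I need to do to complete the process of matching query shingles against indexed values? I feel like I'm very close to cracking this problem. Any advice is appreciated!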
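A possible explanation, not confirmed anywhere in this thread: the query parser may split the query string on whitespace before it invokes the field's query analyzer, so the shingle filter only ever sees one token at a time. The Python sketch below imitates both orderings under that assumption. The helper names (`shingle`, `parse_split_first`, `parse_whole_string`) are hypothetical, and the second output is only an illustration of which terms would be present, not real Solr output.

```python
def shingle(tokens, min_size=2, max_size=99):
    """Unigrams plus '_'-joined shingles, imitating the query analyzer's filter."""
    out = []
    for start in range(len(tokens)):
        out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break
            out.append("_".join(tokens[start:start + size]))
    return out


def parse_split_first(field, query_text):
    """Assumed parser behavior: split on whitespace, then analyze each piece alone."""
    clauses = []
    for piece in query_text.split():
        # The filter receives a single token, so no shingle can be formed.
        clauses.extend(f"{field}:{term}" for term in shingle([piece]))
    return " ".join(clauses)


def parse_whole_string(field, query_text):
    """Hypothetical alternative: analyze the whole string in one pass."""
    return " ".join(f"{field}:{term}" for term in shingle(query_text.split()))


print(parse_split_first("name", "skinny jeans"))   # name:skinny name:jeans
print(parse_whole_string("name", "skinny jeans"))  # name:skinny name:skinny_jeans name:jeans
```

The first line reproduces the `parsedquery` value in the debug output, which is consistent with the assumption. The analysis page would not show this effect, since it runs the analyzer over the full input string.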
Have you tried name:"skinny jeans"? – MatsLindh
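Yes, it returns nothing, not even "jeans". This may be related to another question I asked @ [link](https://stackoverflow.com/questions/37425263/solr-keywordtokenizerfactory-exact-match-for-multiple-words-not-working). As @Abhijit Bashetti mentioned, the tokens don't work like that; they don't contain anything. Also, I don't actually want it to work that way: I don't want to use quotes, because I'm looking for a substring and quoting would defeat the purpose. – mils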