Searching multi-word indexed terms using dismax

My Solr schema is as follows (only the important parts):
<fieldType name="bagofwords_expertfinding" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- strip HTML markup before tokenizing -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <!-- blank out tokens made only of digits and punctuation -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
    <!-- blank out tokens in which a letter is repeated more than two times in a row;
         the backreference is written \2 because XML attribute values are not Java string literals -->
    <filter class="solr.PatternReplaceFilterFactory" pattern="^.*(([a-zA-Z])\2)\2+.*$" replacement=""/>
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- drops the blanked-out tokens along with anything shorter than 3 characters -->
    <filter class="solr.LengthFilterFactory" min="3" max="100"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="3" max="100"/>
  </analyzer>
</fieldType>
<fieldType name="namedentities_expertfinding" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- strip whitespace around commas, then split on commas:
         each comma-separated entity is indexed as a single multi-word token -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\s," replacement=","/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern=",\s" replacement=","/>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="," />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- the query side splits on whitespace, so it produces single-word tokens -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords_en.txt"
            enablePositionIncrements="true"
    />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPossessiveFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[0-9-/_,\.]+$" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="3" max="100"/>
  </analyzer>
</fieldType>
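In namedentities I index multi-word terms, such as "diego alberto milito" and "diego armando maradona". I am trying to search across both fields with a dismax query so that I can boost them.
But when I try this query:
localhost:8080/solr/select/?q="maradona"&defType=dismax&qf=namedentities^100 bagofwords^1&fl=*,score&debugQuery=true&mm=0
Solr finds nothing. Maybe I don't understand the correct use of the " symbol.
I also don't understand this passage from the Solr wiki:
"In Solr 1.4 and prior, you should basically set mm=0 if you want the equivalent of q.op=OR, and mm=100% if you want the equivalent of q.op=AND. In 3.x and trunk, the default value of mm is dictated by the q.op parameter (q.op=AND => mm=100%; q.op=OR => mm=0%). Keep in mind that the default operator is affected by your schema.xml entry. In older versions of Solr the default value is 100% (all clauses must match)."
Given that defaultOperator is OR in my schema, why do I get a default mm value of 100% when I don't set mm=0?
Thanks in advance!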
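On the mm question, one way to avoid depending on the version-specific default described in the wiki passage is to pin mm explicitly in the request handler defaults. What follows is only a minimal sketch for solrconfig.xml: the handler name /select and the decision to move qf into the defaults are assumptions made for illustration, and the field names are taken from the query above.

```xml
<!-- Hypothetical solrconfig.xml fragment; handler name is an assumption -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- use the dismax parser unless the request overrides defType -->
    <str name="defType">dismax</str>
    <!-- same boosts as in the query from the question -->
    <str name="qf">namedentities^100 bagofwords^1</str>
    <!-- no minimum number of optional clauses required, i.e. OR-like matching -->
    <str name="mm">0</str>
  </lst>
</requestHandler>
```

A request can still override any of these values by passing the parameter in the URL, so this only changes what happens when mm is omitted.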
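The debug output of the parsed query would also be useful. I suspect that, because of the way you tokenize the field, your exact search will not match, since neither entry is the string you are searching for when you enclose it in quotes. – MatsLindh 2012-02-13 21:46:17
Thanks. I finally found out that quotes do not mean an exact match but a phrase search, that is, a run of consecutive tokens, so I changed my schema analyzers. But there is no way to handle multi-word tokens... so I now search for phrases in a single-word index. – Tywnil 2012-02-13 21:56:15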
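The changed schema is not shown in the thread. A rough guess at what "searching for phrases in a single-word index" could look like is a namedentities field type that splits on words at both index and query time, so that "maradona" and "armando maradona" both match the indexed entity. This is a sketch under that assumption, not the asker's actual configuration:

```xml
<!-- Hypothetical revision of the field type, not taken from the thread -->
<fieldType name="namedentities_expertfinding" class="solr.TextField" positionIncrementGap="100">
  <!-- an analyzer without a type attribute is used for both indexing and querying -->
  <analyzer>
    <!-- each word of an entity becomes its own token with its own position -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

One caveat with this layout: if several entities are stored in a single comma-separated value, a phrase can match across the comma (for example "milito diego"). positionIncrementGap="100" only separates the values of a multiValued field, so indexing each entity as its own value would be needed to keep phrases inside one entity.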