2013-05-06 54 views

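Solr: higher priority for one field

I'm trying to search in Solr, but I want matches in one field (i.e. Title) to rank higher than matches in other fields (such as Directors). This is part of my schema.xml: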

<fields> 
    <field name="Id" type="string" indexed="true" stored="true" required="true"/> 
    <field name="Title" type="text_general" indexed="true" stored="true"/> 
    <field name="OriginalTitle" type="text_general" indexed="true" stored="true"/> 
    <field name="Directors" type="text_general" indexed="true" stored="true" multiValued="true" required="false"/> 
    <field name="Language" type="text_general" indexed="false" stored="true" required="false"/> 
    <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/> 
</fields> 

<uniqueKey>Id</uniqueKey> 

<defaultSearchField>text</defaultSearchField> 

<solrQueryParser defaultOperator="OR"/> 

<copyField source="Title" dest="text"/> 
<copyField source="OriginalTitle" dest="text"/> 
<copyField source="Directors" dest="text"/> 
<copyField source="Keywords" dest="text"/> 

These are my request parameters:

<lst name="responseHeader"> 
    <int name="status">0</int> 
    <int name="QTime">2</int> 
    <lst name="params"> 
    <str name="lowercaseOperators">true</str> 
    <str name="pf">Title^100 Directors^10</str> 
    <str name="indent">true</str> 
    <str name="q">fo*</str> 
    <str name="qf">Title Directors</str> 
    <str name="stopwords">true</str> 
    <str name="wt">xml</str> 
    <str name="defType">edismax</str> 
    </lst> 
</lst> 

My results are:

<result name="response" numFound="4" start="0"> 
    <doc> 
    <str name="Language">Ingles subtítulos español</str> 
    <str name="Title">Footloose</str> 
    <arr name="Directors"> 
     <str>Herbert Ross</str> 
    </arr> 
    <str name="OriginalTitle">Footloose (1984)</str> 
</doc> 
    <doc> 
    <str name="Language">Ingles subtítulos español</str> 
    <str name="Title">Amadeus</str> 
    <arr name="Directors"> 
     <str>Milos Forman</str> 
    </arr> 
    <str name="OriginalTitle">Amadeus</str> 
</doc> 
    <doc> 
    <str name="Language">Ingles subtítulos español</str> 
    <str name="Title">Forrest Gump</str> 
    <arr name="Directors"> 
     <str>Robert Zemeckis</str> 
    </arr> 
    <str name="OriginalTitle">Forrest Gump</str> 
</doc> 
    <doc> 
    <str name="Language">Doblado al español</str> 
    <str name="Title">Chimpancés</str> 
    <arr name="Directors"> 
     <str>Alastair Fothergill</str> 
     <str> Mark Linfield</str> 
    </arr> 
    <str name="OriginalTitle">Chimpanzee Esp</str> 
</doc> 
</result> 

But I want results like this:

<result name="response" numFound="4" start="0"> 
    <doc> 
    <str name="Language">Ingles subtítulos español</str> 
    <str name="Title">Footloose</str> 
    <arr name="Directors"> 
     <str>Herbert Ross</str> 
    </arr> 
    <str name="OriginalTitle">Footloose (1984)</str> 
</doc> 
    <doc> 
    <str name="Language">Ingles subtítulos español</str> 
    <str name="Title">Forrest Gump</str> 
    <arr name="Directors"> 
     <str>Robert Zemeckis</str> 
    </arr> 
    <str name="OriginalTitle">Forrest Gump</str> 
</doc> 
    <doc> 
    <str name="Language">Ingles subtítulos español</str> 
    <str name="Title">Amadeus</str> 
    <arr name="Directors"> 
     <str>Milos Forman</str> 
    </arr> 
    <str name="OriginalTitle">Amadeus</str> 
</doc> 
    <doc> 
    <str name="Language">Doblado al español</str> 
    <str name="Title">Chimpancés</str> 
    <arr name="Directors"> 
     <str>Alastair Fothergill</str> 
     <str> Mark Linfield</str> 
    </arr> 
    <str name="OriginalTitle">Chimpanzee Esp</str> 
</doc> 
</result> 

How should I write my query to get the response I want?

UPDATE: With debug=true, I get this result:

<lst name="debug"> 
<str name="rawquerystring">fo*</str> 
<str name="querystring">fo*</str> 
<str name="parsedquery"> 
(+DisjunctionMaxQuery((Directors:fo* | Title:fo*))()())/no_coord 
</str> 
<str name="parsedquery_toString">+(Directors:fo* | Title:fo*)()()</str> 
<lst name="explain"> 
<str name="10"> 
1.0 = (MATCH) sum of: 1.0 = (MATCH) max of: 1.0 = (MATCH) ConstantScore(Title:fo*), product of: 1.0 = boost 1.0 = queryNorm 
</str> 
<str name="2"> 
1.0 = (MATCH) sum of: 1.0 = (MATCH) max of: 1.0 = (MATCH) ConstantScore(Directors:fo*), product of: 1.0 = boost 1.0 = queryNorm 
</str> 
<str name="12"> 
1.0 = (MATCH) sum of: 1.0 = (MATCH) max of: 1.0 = (MATCH) ConstantScore(Title:fo*), product of: 1.0 = boost 1.0 = queryNorm 
</str> 
<str name="711"> 
1.0 = (MATCH) sum of: 1.0 = (MATCH) max of: 1.0 = (MATCH) ConstantScore(Directors:fo*), product of: 1.0 = boost 1.0 = queryNorm 
</str> 
</lst> 
<str name="QParser">ExtendedDismaxQParser</str> 
<null name="altquerystring"/> 
<null name="boost_queries"/> 
<arr name="parsed_boost_queries"/> 
<null name="boostfuncs"/> 
<lst name="timing"> 
<double name="time">4.0</double> 
<lst name="prepare"> 
<double name="time">1.0</double> 
<lst name="query"> 
<double name="time">1.0</double> 
</lst> 
<lst name="facet"> 
<double name="time">0.0</double> 
</lst> 
<lst name="mlt"> 
<double name="time">0.0</double> 
</lst> 
<lst name="highlight"> 
<double name="time">0.0</double> 
</lst> 
<lst name="stats"> 
<double name="time">0.0</double> 
</lst> 
<lst name="debug"> 
<double name="time">0.0</double> 
</lst> 
</lst> 
<lst name="process"> 
<double name="time">3.0</double> 
<lst name="query"> 
<double name="time">0.0</double> 
</lst> 
<lst name="facet"> 
<double name="time">0.0</double> 
</lst> 
<lst name="mlt"> 
<double name="time">0.0</double> 
</lst> 
<lst name="highlight"> 
<double name="time">0.0</double> 
</lst> 
<lst name="stats"> 
<double name="time">0.0</double> 
</lst> 
<lst name="debug"> 
<double name="time">3.0</double> 
</lst> 
</lst> 
</lst> 
</lst> 

Answers

3

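You are boosting your phrase field (pf) matches, but not your query field (qf) matches. You probably want to boost both, especially since your search isn't actually a phrase: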

<str name="pf">Title^100 Directors^10</str> 
<str name="qf">Title Directors</str> 

Try putting the same weights on the qf fields.
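As a rough sketch of that suggestion, the echoed request parameters might look like the following once the boosts are copied from pf to qf. The weights 100 and 10 are simply reused from the question's pf value; they are not tuned, and other values may suit your data better.

```xml
<!-- qf now carries the same per-field boosts as pf,
     so a Title match outweighs a Directors match -->
<str name="qf">Title^100 Directors^10</str>
<str name="pf">Title^100 Directors^10</str>
<str name="defType">edismax</str>
```

With a single-term prefix query such as fo*, pf has nothing to match as a phrase, so the qf boosts are what differentiate the fields.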

Thanks for the reply, this worked for me. Now I need to look into how to lower the priority of words like "to", "a", etc. – shinjidev 2013-05-07 13:22:54

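That's what StopFilterFactory is for. I think your default configuration already uses it; you may just need to add more of the words you want ignored. But yes, that would be a separate question. – 2013-05-07 14:01:37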
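For illustration, a text_general definition with a stop filter might look roughly like this. It is a sketch that assumes the field type resembles the one in the Solr example schema and that a stopwords.txt file sits in the core's conf directory; the actual schema in the question may differ.

```xml
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- removes every token listed in stopwords.txt (one word per line) -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- apply the same stop list at query time so index and query agree -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that a stop filter drops the listed words entirely rather than lowering their priority, and documents need to be reindexed after the index-time analyzer or the word list changes.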

0

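You should try an explain (add debug=true or debugQuery=true to the query string) to see which elements contribute to the query score. It looks like the difference could be term frequency or something similar. Since you don't have much content, it may also produce ties between documents.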

I've updated my question with the debug=true output. – shinjidev 2013-05-06 19:13:38