2011-07-09 37 views

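Solr boolean queries combined with index-time boosts

I have a site that uses Solr 1.4.1 for relevance/recommendations. I use boolean queries in a few places, for example +(+type:aoh_company +aoh_dictionary_tids:623). These return the expected results, but the order of the results seems arbitrary.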

I would like to control the ranking of documents by setting index-time boosts, but they seem to be ignored for these queries.

An example

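  • The query URL is http://localhost:4930/solr/prod/select?rows=5&start=0&q.alt=(type%3Aaoh_company)+(aoh_dictionary_tids%3A623)&q=
  • The results come back in this order (index-time boost value in parentheses):
    1. 17132 (1.22)
    2. 17179 (1.02)
    3. 17131 (1.10)
    4. 17133 (1.10)
    5. 17184 (1.10)
  • Clearly, result #2 should not rank above #3-5 on boost alone.
  • Given that this is a boolean query, the ranking should not differ much otherwise.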

Debug output

I tried to debug the query above by appending debugQuery=true, so it becomes http://localhost:4930/solr/prod/select?rows=5&start=0&q.alt=(type%3Aaoh_company)+(aoh_dictionary_tids%3A623)&q=&debugQuery=true

It is very verbose, but here it is:

<lst name="debug"> 
    <null name="rawquerystring"/> 
    <null name="querystring"/> 
    <str name="parsedquery">+(+type:aoh_company +aoh_dictionary_tids:623)</str> 
    <str name="parsedquery_toString">+(+type:aoh_company +aoh_dictionary_tids:623)</str> 
    <lst name="explain"> 
    <str name="50hves/node/17132"> 
    1.7819747 = (MATCH) sum of: 
     0.9014403 = (MATCH) weight(type:aoh_company in 1805), product of: 
     0.37135038 = queryWeight(type:aoh_company), product of: 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      0.15297863 = queryNorm 
     2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1805), product of: 
      1.0 = tf(termFreq(type:aoh_company)=1) 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      1.0 = fieldNorm(field=type, doc=1805) 
     0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1805), product of: 
     0.9284928 = queryWeight(aoh_dictionary_tids:623), product of: 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15297863 = queryNorm 
     0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1805), product of: 
      1.0 = tf(termFreq(aoh_dictionary_tids:623)=1) 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1805) 
    </str> 
    <str name="50hves/node/17179"> 
    1.7819747 = (MATCH) sum of: 
     0.9014403 = (MATCH) weight(type:aoh_company in 1896), product of: 
     0.37135038 = queryWeight(type:aoh_company), product of: 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      0.15297863 = queryNorm 
     2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1896), product of: 
      1.0 = tf(termFreq(type:aoh_company)=1) 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      1.0 = fieldNorm(field=type, doc=1896) 
     0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1896), product of: 
     0.9284928 = queryWeight(aoh_dictionary_tids:623), product of: 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15297863 = queryNorm 
     0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1896), product of: 
      1.0 = tf(termFreq(aoh_dictionary_tids:623)=1) 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1896) 
    </str> 
    <str name="50hves/node/17131"> 
    1.7819747 = (MATCH) sum of: 
     0.9014403 = (MATCH) weight(type:aoh_company in 1905), product of: 
     0.37135038 = queryWeight(type:aoh_company), product of: 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      0.15297863 = queryNorm 
     2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1905), product of: 
      1.0 = tf(termFreq(type:aoh_company)=1) 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      1.0 = fieldNorm(field=type, doc=1905) 
     0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1905), product of: 
     0.9284928 = queryWeight(aoh_dictionary_tids:623), product of: 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15297863 = queryNorm 
     0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1905), product of: 
      1.0 = tf(termFreq(aoh_dictionary_tids:623)=1) 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1905) 
    </str> 
    <str name="50hves/node/17133"> 
    1.7819747 = (MATCH) sum of: 
     0.9014403 = (MATCH) weight(type:aoh_company in 1906), product of: 
     0.37135038 = queryWeight(type:aoh_company), product of: 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      0.15297863 = queryNorm 
     2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1906), product of: 
      1.0 = tf(termFreq(type:aoh_company)=1) 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      1.0 = fieldNorm(field=type, doc=1906) 
     0.88053435 = (MATCH) weight(aoh_dictionary_tids:623 in 1906), product of: 
     0.9284928 = queryWeight(aoh_dictionary_tids:623), product of: 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15297863 = queryNorm 
     0.9483481 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1906), product of: 
      1.0 = tf(termFreq(aoh_dictionary_tids:623)=1) 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15625 = fieldNorm(field=aoh_dictionary_tids, doc=1906) 
    </str> 
    <str name="50hves/node/17184"> 
    1.6058679 = (MATCH) sum of: 
     0.9014403 = (MATCH) weight(type:aoh_company in 1892), product of: 
     0.37135038 = queryWeight(type:aoh_company), product of: 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      0.15297863 = queryNorm 
     2.4274657 = (MATCH) fieldWeight(type:aoh_company in 1892), product of: 
      1.0 = tf(termFreq(type:aoh_company)=1) 
      2.4274657 = idf(docFreq=457, maxDocs=1909) 
      1.0 = fieldNorm(field=type, doc=1892) 
     0.7044275 = (MATCH) weight(aoh_dictionary_tids:623 in 1892), product of: 
     0.9284928 = queryWeight(aoh_dictionary_tids:623), product of: 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.15297863 = queryNorm 
     0.7586785 = (MATCH) fieldWeight(aoh_dictionary_tids:623 in 1892), product of: 
      1.0 = tf(termFreq(aoh_dictionary_tids:623)=1) 
      6.069428 = idf(docFreq=11, maxDocs=1909) 
      0.125 = fieldNorm(field=aoh_dictionary_tids, doc=1892) 
    </str> 
    </lst> 
    <str name="QParser">DisMaxQParser</str> 
    <str name="altquerystring">org.apache.lucene.search.BooleanQuery:+type:aoh_company +aoh_dictionary_tids:623</str> 
    <null name="boostfuncs"/> 
    <lst name="timing"> 
    <double name="time">7.0</double> 
    <lst name="prepare"> 
     <double name="time">1.0</double> 
     <lst name="org.apache.solr.handler.component.QueryComponent"> 
     <double name="time">0.0</double> 
     </lst> 
     <lst name="org.apache.solr.handler.component.FacetComponent"> 
     <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"> 
    <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.HighlightComponent"> 
    <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.StatsComponent"> 
    <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.SpellCheckComponent"> 
    <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.DebugComponent"> 
    <double name="time">0.0</double> 
    </lst> 
    </lst> 
    <lst name="process"> 
    <double name="time">6.0</double> 
    <lst name="org.apache.solr.handler.component.QueryComponent"> 
     <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.FacetComponent"> 
     <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent"> 
     <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.HighlightComponent"> 
     <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.StatsComponent"> 
     <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.SpellCheckComponent"> 
     <double name="time">0.0</double> 
    </lst> 
    <lst name="org.apache.solr.handler.component.DebugComponent"> 
     <double name="time">6.0</double> 
    </lst> 
    </lst> 
    </lst> 
</lst> 

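As I read it, the first four results score 1.7819747 and the fifth scores 1.6058679. I cannot see the boost values anywhere in there, so it seems they are not a factor in the ranking equation.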

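So what am I doing wrong? Is there something I need to do to make Solr take the boosts into account?
Is there a way to inspect the boost values stored in Solr? The documents I send to it look correct, but I cannot find a way to view the stored values.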

Also, here are the relevant parts of my schema.xml:

<types> 
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/> 
    <fieldType name="integer" class="solr.IntField" omitNorms="true"/> 
</types> 
<fields> 
    <field name="type" type="string" indexed="true" stored="true"/> 
    <field name="aoh_dictionary_tids" type="integer" indexed="true" stored="true" multiValued="true" omitNorms="false"/> 
</fields> 

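In his answer below, fyr mentions that norms need to be enabled on the field for the boost value to apply. So I would like to amend my question slightly: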

  • Is it enough to enable norms on one of the queried fields for the boost to be applied?
  • Does my omitNorms="false" on the field override the omitNorms="true" on the fieldType?

Any help would be much appreciated.

Answer

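You will not see the boost itself in the explain output. An index-time boost is folded into the norm of a field in a given document, acting as a multiplier.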

If norms are enabled, your boost value is used at index time. If you use DefaultSimilarity and norms are enabled, the norm is always part of the similarity function.
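To see where the norm enters the score, the numbers from the explain output in the question can be recombined by hand. This is only a sketch: the helper name `term_weight` is made up for illustration, and the formula is simply what the explain tree shows (queryWeight times fieldWeight, summed over both clauses), not Solr's actual code path.

```python
def term_weight(idf, query_norm, tf, field_norm):
    """Recombine one clause's contribution the way the explain tree lays it out."""
    query_weight = idf * query_norm        # the "queryWeight" line
    field_weight = tf * idf * field_norm   # the "fieldWeight" line
    return query_weight * field_weight


# Values copied from the explain output in the question.
QUERY_NORM = 0.15297863

type_part = term_weight(2.4274657, QUERY_NORM, tf=1.0, field_norm=1.0)
tids_high = term_weight(6.069428, QUERY_NORM, tf=1.0, field_norm=0.15625)
tids_low = term_weight(6.069428, QUERY_NORM, tf=1.0, field_norm=0.125)

score_first_four = type_part + tids_high  # should land close to 1.7819747
score_fifth = type_part + tids_low        # should land close to 1.6058679

print(score_first_four, score_fifth)
```

The only input that differs between the two totals is fieldNorm on aoh_dictionary_tids, which is where an index-time boost would end up.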

Edit for the follow-up questions:

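  1. Enabling norms is enough for the boost to be applied, because norms are the structure that holds per-field weighting data in the index. The index-time boost is multiplied by the norm value and stored in the norm for that field.

  2. omitNorms on the field declaration overrides the type definition; you can also see this in the explain output. aoh_dictionary_tids has fieldNorm values that are not equal to 1. If norms were disabled, the default value of 1 would apply.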
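One likely reason that boosts as close together as 1.02, 1.10 and 1.22 can produce identical fieldNorm values is precision: Lucene versions of this era store each norm in a single byte, which leaves only about three bits of mantissa. The sketch below mimics that encoding in Python, modelled on Lucene's SmallFloat.floatToByte315. The function names and the `length_norm=1.0` default are assumptions for illustration; the real length norm depends on how many terms each document has in the field, which the question does not show.

```python
import struct


def float_to_byte315(value):
    """Squeeze a float into one byte (3 mantissa bits, 5 exponent bits)."""
    # Reinterpret the 32-bit float as a signed int, then keep only the top bits.
    bits = struct.unpack(">i", struct.pack(">f", value))[0]
    small = bits >> (24 - 3)
    floor = (63 - 15) << 3
    if small <= floor:
        return 0 if bits <= 0 else 1   # zero/negative -> 0, tiny positive -> 1
    if small >= floor + 0x100:
        return 255                     # overflow saturates at the largest byte
    return small - floor


def byte315_to_float(b):
    """Expand the one-byte form back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(boost, length_norm=1.0):
    """Norm as it would read back after the lossy round trip (assumed model)."""
    return byte315_to_float(float_to_byte315(boost * length_norm))


for boost in (1.02, 1.10, 1.22, 1.25):
    print(boost, "->", stored_norm(boost))
```

With a length norm of 1.0, the three boosts from the question all read back as the same value, and only 1.25 reaches the next representable step. The 0.15625 and 0.125 values in the explain output are both exactly representable steps in this encoding, which is consistent with this kind of rounding having taken place.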

The boost applies to the whole document. Norms are field-specific, right? – mikl

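Indeed, the index-time boost is document-specific. But it only applies to fields in the document that have norms enabled. So if you have two fields, A (with norms) and B (without norms), and you only query B, you will not notice any difference. – fyr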

Also, whether you are able to set field-level boosts at index time depends on your Solr client library. – fyr
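For reference, Solr's XML update format accepts an optional boost attribute on both the doc element and individual field elements, so a client that cannot set boosts can be bypassed by building the message directly. A minimal sketch follows; the helper `build_add_message` is hypothetical, and the unique key field name `id` is an assumption, since the question's schema excerpt does not show it.

```python
import xml.etree.ElementTree as ET


def build_add_message(doc_boost, fields, field_boosts=None):
    """Build an <add> update message with document- and field-level boosts."""
    field_boosts = field_boosts or {}
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc", {"boost": str(doc_boost)})
    for name, value in fields:
        attrs = {"name": name}
        if name in field_boosts:
            attrs["boost"] = str(field_boosts[name])  # optional per-field boost
        field = ET.SubElement(doc, "field", attrs)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


message = build_add_message(
    1.22,
    [
        ("id", "50hves/node/17132"),      # "id" as the key field is assumed
        ("type", "aoh_company"),
        ("aoh_dictionary_tids", 623),
    ],
)
print(message)
```

Posting the resulting string to the update handler and then committing is left out here, since the endpoint and any authentication depend on the installation.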