2014-01-19 51 views

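Why is queryWeight included in some result scores, but not others, in the same query?

I'm running a query_string query for a single term on multiple fields, _all and tags.name, and trying to understand the scoring. Query: {"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}. Here are the documents returned by the query: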

  • Document 1 has an exact match on tags.name, but not on _all.
  • Document 8 has an exact match on both tags.name and _all.

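Document 8 should win, and it does, but I'm confused by how the scores are computed. It looks as if document 1 is penalized by having its tags.name score multiplied by the IDF twice, whereas document 8's tags.name score is multiplied by the IDF only once. In short: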

  • Both have a component weight(tags.name:animal in 0) [PerFieldSimilarity]
  • In document 1, weight = score = queryWeight x fieldWeight
  • In document 8, weight = fieldWeight

Since queryWeight contains idf, and fieldWeight contains it as well, document 1 ends up being penalized by its idf twice.
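To make the discrepancy concrete, here is a small arithmetic sketch that multiplies together the factors reported in the two explain outputs. The variable names are my own and nothing here calls Elasticsearch; it only reproduces the numbers.

```python
# Factors copied from the explain outputs for the tags.name clause.
idf = 0.30685282          # idf(docFreq=1, maxDocs=1), identical in both documents
query_norm = 1.0          # queryNorm as reported for document 1

# Document 1: weight = queryWeight * fieldWeight, so idf appears twice.
field_norm_doc1 = 0.625
query_weight = idf * query_norm
field_weight_doc1 = 1.0 * idf * field_norm_doc1   # tf * idf * fieldNorm
score_doc1 = query_weight * field_weight_doc1

# Document 8: weight = fieldWeight only, so idf appears once.
field_norm_doc8 = 0.5
score_doc8 = 1.0 * idf * field_norm_doc8

print(f"doc 1 (idf twice):        {score_doc1:.6f}")
print(f"doc 8 (idf once):         {score_doc8:.6f}")
# What document 1 would score if it were treated like document 8:
print(f"doc 1 without queryWeight: {field_weight_doc1:.6f}")
```

With queryWeight left out, document 1's tags.name clause would score about 0.19 rather than about 0.06, which is the gap the question is about.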

Can anyone make sense of this?

Additional information

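  • If I remove _all from the query's fields, queryWeight disappears from the explain output entirely.
  • Adding "use_dis_max":true as an option has no effect.
    • However, additionally adding "tie_breaker":0.7 (or any value) does affect document 8, giving it the more complicated formula we see in document 1.
    • Thoughts: it's plausible that a boolean query (which this is) might do this to give more weight to documents that match more than one subquery. However, this doesn't make any sense for a dis_max query, which is supposed to return just the maximum of the subqueries.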
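For reference, this is how I understand a dis_max query to combine its subquery scores: the best score, plus tie_breaker times the sum of the remaining ones. The sketch below is my own paraphrase of that documented behaviour, not Lucene code, and `dis_max_score` is a made-up helper name. The sub-scores are the two clause values from document 8's explain output without tie_breaker; since adding tie_breaker also changes those clause values, the second result is illustrative only.

```python
def dis_max_score(sub_scores, tie_breaker=0.0):
    """Best sub-score plus tie_breaker times the sum of the other sub-scores."""
    if not sub_scores:
        return 0.0
    best = max(sub_scores)
    return best + tie_breaker * (sum(sub_scores) - best)


# Clause scores for document 8: _all and tags.name.
doc8_clauses = [0.033902764, 0.15342641]

print(dis_max_score(doc8_clauses))        # tie_breaker = 0: plain maximum
print(dis_max_score(doc8_clauses, 0.7))   # maximum plus 0.7 * the other clause
```

Nothing in this combination step touches queryWeight, which is why the change in the per-clause formula is surprising.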

Here are the relevant explain requests. Look for the embedded comments.

Document 1 (match only on tags.name):

curl -XGET 'http://localhost:9200/questions/question/1/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "1",
  "matched" : true,
  "explanation" : {
    "value" : 0.058849156,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.058849156,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = score = queryWeight x fieldWeight
      "details" : [ {
        // score and queryWeight are NOT a part of the other explain!
        "value" : 0.058849156,
        "description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details" : [ {
          "value" : 0.30685282,
          "description" : "queryWeight, product of:",
          "details" : [ {
            // This idf is NOT a part of the other explain!
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 1.0,
            "description" : "queryNorm"
          } ]
        }, {
          "value" : 0.19178301,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "tf(freq=1.0), with freq of:",
            "details" : [ {
              "value" : 1.0,
              "description" : "termFreq=1.0"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      } ]
    } ]
  }
}
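As a sanity check on the individual factors, the tf and idf values above can be reproduced by hand. This sketch assumes the index uses Lucene's classic TF-IDF similarity, where tf is the square root of the frequency and idf is 1 + ln(maxDocs / (docFreq + 1)); that assumption matches the reported numbers, but I have not confirmed it from the index settings. The function names are mine.

```python
import math


def tf(freq):
    """Term-frequency factor, assuming classic TF-IDF: sqrt(freq)."""
    return math.sqrt(freq)


def idf(doc_freq, max_docs):
    """Inverse document frequency, assuming classic TF-IDF."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# Compare against the explain output for document 1.
print(f"tf(freq=1.0)               = {tf(1.0):.8f}")
print(f"idf(docFreq=1, maxDocs=1)  = {idf(1, 1):.8f}")
print(f"fieldWeight (norm 0.625)   = {tf(1.0) * idf(1, 1) * 0.625:.8f}")
```

Each shard reports docFreq=1 and maxDocs=1 here, so the idf is below 1 and multiplying by it a second time lowers the score noticeably.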

Document 8 (match on both _all and tags.name):

curl -XGET 'http://localhost:9200/questions/question/8/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{
  "ok" : true,
  "_index" : "questions_1390104463",
  "_type" : "question",
  "_id" : "8",
  "matched" : true,
  "explanation" : {
    "value" : 0.15342641,
    "description" : "max of:",
    "details" : [ {
      "value" : 0.033902764,
      "description" : "btq, product of:",
      "details" : [ {
        "value" : 0.033902764,
        "description" : "weight(_all:anim in 0) [PerFieldSimilarity], result of:",
        "details" : [ {
          "value" : 0.033902764,
          "description" : "fieldWeight in 0, product of:",
          "details" : [ {
            "value" : 0.70710677,
            "description" : "tf(freq=0.5), with freq of:",
            "details" : [ {
              "value" : 0.5,
              "description" : "phraseFreq=0.5"
            } ]
          }, {
            "value" : 0.30685282,
            "description" : "idf(docFreq=1, maxDocs=1)"
          }, {
            "value" : 0.15625,
            "description" : "fieldNorm(doc=0)"
          } ]
        } ]
      }, {
        "value" : 1.0,
        "description" : "allPayload(...)"
      } ]
    }, {
      "value" : 0.15342641,
      "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
      // weight = fieldWeight
      // No score or queryWeight in sight!
      "details" : [ {
        "value" : 0.15342641,
        "description" : "fieldWeight in 0, product of:",
        "details" : [ {
          "value" : 1.0,
          "description" : "tf(freq=1.0), with freq of:",
          "details" : [ {
            "value" : 1.0,
            "description" : "termFreq=1.0"
          } ]
        }, {
          "value" : 0.30685282,
          "description" : "idf(docFreq=1, maxDocs=1)"
        }, {
          "value" : 0.5,
          "description" : "fieldNorm(doc=0)"
        } ]
      } ]
    } ]
  }
}
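When comparing explain trees like these, it can help to check mechanically that every node's value agrees with what its description says about its children. The following is a minimal sketch of such a checker; `check_explain` is a hypothetical helper, it only understands the "product of:", "max of:" and "result of:" descriptions seen above, and it expects the explanation as a Python dict (with the // comments removed, since they are not valid JSON). It needs Python 3.8 or later for `math.prod`.

```python
import math


def check_explain(node, rel_tol=1e-5, path="root"):
    """Return (path, reported, recomputed) for every node whose value
    disagrees with the combination its description announces."""
    problems = []
    children = node.get("details", [])
    values = [child["value"] for child in children]
    desc = node["description"]

    expected = None
    if values:
        if desc.endswith("product of:"):
            expected = math.prod(values)
        elif desc.endswith("max of:"):
            expected = max(values)
        elif desc.endswith("result of:") and len(values) == 1:
            expected = values[0]

    if expected is not None and not math.isclose(
        node["value"], expected, rel_tol=rel_tol
    ):
        problems.append((path, node["value"], expected))

    # Recurse so that nested nodes are checked as well.
    for i, child in enumerate(children):
        problems.extend(check_explain(child, rel_tol, f"{path}.details[{i}]"))
    return problems


# The tags.name clause of document 8, transcribed from the output above.
tags_name_doc8 = {
    "value": 0.15342641,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [{
        "value": 0.15342641,
        "description": "fieldWeight in 0, product of:",
        "details": [
            {"value": 1.0, "description": "tf(freq=1.0), with freq of:",
             "details": [{"value": 1.0, "description": "termFreq=1.0"}]},
            {"value": 0.30685282, "description": "idf(docFreq=1, maxDocs=1)"},
            {"value": 0.5, "description": "fieldNorm(doc=0)"},
        ],
    }],
}

print(check_explain(tags_name_doc8))  # an empty list means the tree is self-consistent
```

Both trees in this question are internally consistent, so the difference is in which factors Lucene chose to include, not in the arithmetic.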

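Hi, did you ever find the answer yourself? Or do you have any sources I could learn from? I'm struggling with the same lack of understanding. In our case this adversely affects some hits, and I need to understand why, and how to adjust our query. – Jakub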


No, I never found an answer, unfortunately. I'd be curious to hear back if you find one. – tmandry
