Why is queryWeight included in some result scores, but not others, in the same query?

I'm executing a single-term query_string query on multiple fields, _all and tags.name, and trying to understand the scoring. Query: {"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}. Here are the documents returned by the query:

Document 1 has an exact match on tags.name, but not on _all.
Document 8 has an exact match on both tags.name and _all.

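Document 8 should win, and it does, but I'm confused by how the scoring works out. It looks like Document 1 is being penalized by having its tags.name score multiplied by the IDF twice, whereas Document 8's tags.name score is only multiplied by the IDF once. In short: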

They both have a component weight(tags.name:animal in 0) [PerFieldSimilarity].
In Document 1, we have weight = score = queryWeight x fieldWeight.
In Document 8, we have weight = fieldWeight!

Since queryWeight contains idf, this results in Document 1 being penalized by its idf twice.
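
The arithmetic can be checked directly against the explain outputs further down. This is only a sketch that multiplies the reported factors back together; `field_weight` is a hypothetical helper, and the idf formula `1 + ln(maxDocs / (docFreq + 1))` is an assumption that Lucene's classic TF-IDF similarity is in use (it does reproduce the reported value).

```python
import math

# Values copied from the explain outputs in this post.
IDF = 0.30685282     # idf(docFreq=1, maxDocs=1)
QUERY_NORM = 1.0     # queryNorm reported for Document 1


def classic_idf(doc_freq, max_docs):
    """Assumed classic Lucene TF-IDF idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(tf, idf, field_norm):
    """fieldWeight as shown in the explain output: tf * idf * fieldNorm."""
    return tf * idf * field_norm


# Document 1: weight = queryWeight * fieldWeight, so idf is applied twice.
doc1_query_weight = IDF * QUERY_NORM
doc1_field_weight = field_weight(1.0, IDF, 0.625)
doc1_score = doc1_query_weight * doc1_field_weight

# Document 8 (tags.name clause): weight = fieldWeight, so idf is applied once.
doc8_score = field_weight(1.0, IDF, 0.5)

print(f"idf check:  {classic_idf(1, 1):.8f}")
print(f"Document 1: {doc1_score:.9f}")
print(f"Document 8: {doc8_score:.8f}")
```

The results agree with the reported 0.058849156 and 0.15342641 up to single-precision rounding, which confirms that the only difference between the two tags.name weights, apart from fieldNorm, is the extra queryWeight factor.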

Can anyone make sense of this?

More info

If I remove _all from the fields of the query, queryWeight disappears from the explain output entirely.
Adding "use_dis_max":true as an option has no effect.
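- However, additionally adding "tie_breaker":0.7 (or any value) does affect Document 8, by giving it the more complicated formula we see in Document 1.
- Thoughts: It is plausible that a boolean query (which this is) might do this to give more weight to documents that match more than one sub-query. However, this doesn't make any sense for a dis_max query, which is supposed to return only the maximum of the sub-queries.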
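
For reference, the documented dis_max combination is the best sub-query score plus tie_breaker times the sum of the remaining sub-query scores. A minimal sketch of that rule follows; `dis_max_score` is a hypothetical helper, and the sub-scores are copied from the Document 8 explain output below, which was produced without tie_breaker. As noted above, setting tie_breaker also changes the per-clause formula, so the second figure is illustrative rather than the score Elasticsearch would actually return.

```python
def dis_max_score(sub_scores, tie_breaker=0.0):
    """Best sub-score plus tie_breaker times the sum of the other sub-scores."""
    if not sub_scores:
        return 0.0
    best = max(sub_scores)
    return best + tie_breaker * (sum(sub_scores) - best)


# Sub-scores for Document 8: the _all clause and the tags.name clause.
doc8_sub_scores = [0.033902764, 0.15342641]

no_tie = dis_max_score(doc8_sub_scores)                     # plain "max of:"
with_tie = dis_max_score(doc8_sub_scores, tie_breaker=0.7)  # illustrative only

print(f"tie_breaker=0.0: {no_tie:.8f}")
print(f"tie_breaker=0.7: {with_tie:.8f}")
```

With tie_breaker at 0 this reduces to the "max of:" node seen in both explain outputs.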

Here are the relevant explain requests. Look for the embedded comments.

Document 1 (match only on tags.name):

curl -XGET 'http://localhost:9200/questions/question/1/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{ 
    "ok" : true, 
    "_index" : "questions_1390104463", 
    "_type" : "question", 
    "_id" : "1", 
    "matched" : true, 
    "explanation" : { 
    "value" : 0.058849156, 
    "description" : "max of:", 
    "details" : [ { 
     "value" : 0.058849156, 
     "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:", 
     // weight = score = queryWeight x fieldWeight 
     "details" : [ { 
     // score and queryWeight are NOT a part of the other explain! 
     "value" : 0.058849156, 
     "description" : "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:", 
     "details" : [ { 
      "value" : 0.30685282, 
      "description" : "queryWeight, product of:", 
      "details" : [ { 
      // This idf is NOT a part of the other explain! 
      "value" : 0.30685282, 
      "description" : "idf(docFreq=1, maxDocs=1)" 
      }, { 
      "value" : 1.0, 
      "description" : "queryNorm" 
      } ] 
     }, { 
      "value" : 0.19178301, 
      "description" : "fieldWeight in 0, product of:", 
      "details" : [ { 
      "value" : 1.0, 
      "description" : "tf(freq=1.0), with freq of:", 
      "details" : [ { 
       "value" : 1.0, 
       "description" : "termFreq=1.0" 
      } ] 
      }, { 
      "value" : 0.30685282, 
      "description" : "idf(docFreq=1, maxDocs=1)" 
      }, { 
      "value" : 0.625, 
      "description" : "fieldNorm(doc=0)" 
      } ] 
     } ] 
     } ] 
    } ] 
    }
}

Document 8 (match on both _all and tags.name):

curl -XGET 'http://localhost:9200/questions/question/8/_explain?pretty' -d '{"query":{"query_string":{"query":"animal","fields":["_all","tags.name"]}}}'

{ 
    "ok" : true, 
    "_index" : "questions_1390104463", 
    "_type" : "question", 
    "_id" : "8", 
    "matched" : true, 
    "explanation" : { 
    "value" : 0.15342641, 
    "description" : "max of:", 
    "details" : [ { 
     "value" : 0.033902764, 
     "description" : "btq, product of:", 
     "details" : [ { 
     "value" : 0.033902764, 
     "description" : "weight(_all:anim in 0) [PerFieldSimilarity], result of:", 
     "details" : [ { 
      "value" : 0.033902764, 
      "description" : "fieldWeight in 0, product of:", 
      "details" : [ { 
      "value" : 0.70710677, 
      "description" : "tf(freq=0.5), with freq of:", 
      "details" : [ { 
       "value" : 0.5, 
       "description" : "phraseFreq=0.5" 
      } ] 
      }, { 
      "value" : 0.30685282, 
      "description" : "idf(docFreq=1, maxDocs=1)" 
      }, { 
      "value" : 0.15625, 
      "description" : "fieldNorm(doc=0)" 
      } ] 
     } ] 
     }, { 
     "value" : 1.0, 
     "description" : "allPayload(...)" 
     } ] 
    }, { 
     "value" : 0.15342641, 
     "description" : "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:", 
     // weight = fieldWeight 
     // No score or queryWeight in sight! 
     "details" : [ { 
     "value" : 0.15342641, 
     "description" : "fieldWeight in 0, product of:", 
     "details" : [ { 
      "value" : 1.0, 
      "description" : "tf(freq=1.0), with freq of:", 
      "details" : [ { 
      "value" : 1.0, 
      "description" : "termFreq=1.0" 
      } ] 
     }, { 
      "value" : 0.30685282, 
      "description" : "idf(docFreq=1, maxDocs=1)" 
     }, { 
      "value" : 0.5, 
      "description" : "fieldNorm(doc=0)" 
     } ] 
     } ] 
    } ] 
    } 
}
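
To compare many documents without reading each explain tree by hand, the tree can be searched for a queryWeight node. This is a minimal sketch, assuming the explain response has been parsed into a Python dict (the inline // comments above are annotations and are not valid JSON); `has_node` is a hypothetical helper, and the two sample trees are trimmed versions of the outputs above.

```python
def has_node(explanation, needle):
    """Return True if any node's description in the explain tree contains needle."""
    if needle in explanation.get("description", ""):
        return True
    return any(has_node(child, needle) for child in explanation.get("details", []))


# Trimmed tags.name branch of the Document 1 explanation.
doc1_explanation = {
    "value": 0.058849156,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [{
        "value": 0.058849156,
        "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details": [
            {"value": 0.30685282, "description": "queryWeight, product of:"},
            {"value": 0.19178301, "description": "fieldWeight in 0, product of:"},
        ],
    }],
}

# Trimmed tags.name branch of the Document 8 explanation.
doc8_explanation = {
    "value": 0.15342641,
    "description": "weight(tags.name:animal in 0) [PerFieldSimilarity], result of:",
    "details": [
        {"value": 0.15342641, "description": "fieldWeight in 0, product of:"},
    ],
}

print("Document 1 has queryWeight:", has_node(doc1_explanation, "queryWeight"))
print("Document 8 has queryWeight:", has_node(doc8_explanation, "queryWeight"))
```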

Source

2014-01-19 tmandry

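Hi, did you find the answer yourself? Or do you have any sources to learn from? I'm suffering from the same lack of understanding. In our case this adversely affects some hits, and I need to understand why, and how to tune our query. – Jakub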

No, I never found an answer, unfortunately. I'd be curious to see what you hear back. – tmandry

I don't have an answer. I just wanted to mention that I posted the question to the Elasticsearch forum: https://groups.google.com/forum/#!topic/elasticsearch/xBKlFkq0SP0 I will post here when I get an answer.

Source

2015-04-17 13:12:19 Jakub
