2017-05-23

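Elasticsearch phrase suggester and phonetic differences

I would like to know whether there is any way to make the phrase suggester correct misspellings in the prefix of a word that sound the same as the correct spelling.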

Elasticsearch 5.1.2

Tested in Kibana 5.1.2

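For example:

Instead of "circus" someone writes "sircus", or instead of "coding" someone writes "koding". Interestingly, a misspelling of "phrase" does return a suggestion.

Here is my setup.

Settings: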

PUT test_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "suggests_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding",
            "shingle_filter"
          ],
          "type": "custom"
        },
        "reverse": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["standard", "reverse"]
        }
      },
      "filter": {
        "shingle_filter": {
          "min_shingle_size": 2,
          "max_shingle_size": 5,
          "type": "shingle"
        }
      }
    }
  },
  "mappings": {
    "test_type": {
      "properties": {
        "suggest_field": {
          "type": "text",
          "analyzer": "suggests_analyzer",
          "fields": {
            "reverse": {
              "type": "text",
              "analyzer": "reverse"
            }
          }
        }
      }
    }
  }
}

Some documents:

POST test_index/test_type/_bulk 
{"index":{}} 
{ "suggest_field": "phrase"} 
{"index":{}} 
{ "suggest_field": "Circus"} 
{"index":{}} 
{ "suggest_field": "Coding"} 

Query:

POST /test_index/_search
{
  "suggest": {
    "text": "sircus",
    "simple_phrase": {
      "phrase": {
        "field": "suggest_field",
        "max_errors": 0.9,
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        },
        "direct_generator": [
          {
            "field": "suggest_field",
            "suggest_mode": "always"
          },
          {
            "field": "suggest_field.reverse",
            "suggest_mode": "always",
            "pre_filter": "reverse",
            "post_filter": "reverse"
          }
        ]
      }
    }
  }
}

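Also, I repeated the following steps several times (between 5 and 10) without changing anything:

  • Delete the index
  • Put the index, with settings and mappings
  • Add the documents
  • Run the query (with "codign")

Sometimes I get a suggestion and sometimes I don't. Is there an explanation for this?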


This can be corrected using the term suggester: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-term.html
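
A rough sketch of what such a term suggester request body might look like, sent to the `_search` endpoint of the index holding the documents. The suggestion name `simple_term` is made up for illustration, and `prefix_length` is set to 0 on the assumption that the wrong first letter is the problem, since the default of 1 requires the first character to match:

```json
{
  "suggest": {
    "text": "sircus",
    "simple_term": {
      "term": {
        "field": "suggest_field",
        "suggest_mode": "always",
        "prefix_length": 0
      }
    }
  }
}
```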

Answer


Try setting "prefix_length": 0 in the direct_generator.
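
As a minimal sketch of this suggestion, the `direct_generator` part of the phrase suggester in the question could look like the fragment below. `prefix_length` is the number of leading characters that must match for a term to be considered a candidate, and it defaults to 1, so a candidate generator with the default setting will not propose "circus" for "sircus". The field name is taken from the question; whether the reverse generator is still needed alongside this one is left open here.

```json
{
  "direct_generator": [
    {
      "field": "suggest_field",
      "suggest_mode": "always",
      "prefix_length": 0
    }
  ]
}
```

Allowing every prefix increases the number of candidate terms that have to be examined, so this may be slower on a large index.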