Here is my field mapping in Elasticsearch:
"keywordName": {
  "type": "text",
  "analyzer": "custom_stop"
}
Here is my analyzer:
"custom_stop": {
  "type": "custom",
  "tokenizer": "standard",
  "filter": [
    "my_stop",
    "my_snow",
    "asciifolding"
  ]
}
Here are my filters:
"my_stop": {
  "type": "stop",
  "stopwords": "_french_"
},
"my_snow": {
  "type": "snowball",
  "language": "French"
}
Here are the documents in my index (showing only my one field, keywordName):
"canne a peche", "canne", "canne a peche telescopique", "iphone 8", "iphone 8 case", "iphone 8 cover", "iphone 8 charger", "iphone 8 new"
When I search for "CANNE", it gives me the "canne" document, which is what I want:
GET ads/_search
{
  "query": {
    "match": {
      "keywordName": {
        "query": "canne",
        "operator": "and"
      }
    }
  },
  "size": 1
}
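When I search for "CANNE à PÊCHE", it gives me "canne a peche", which is also fine. "Cannes à Pêche" -> "canne a peche" -> OK.
Here is the tricky part: when I search for "iphone 8", it gives me "iphone 8 cover" instead of "iphone 8". If I change size to 5 (because 5 documents contain "iphone 8"), I see that "iphone 8" is only the fourth hit. First comes "iphone 8 cover", then "iphone 8 case", then "iphone 8 new", and only then "iphone 8"...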
Here is the result of that query:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1.4009607,
    "hits": [
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 cover",
        "_score": 1.4009607,
        "_source": {
          "keywordName": "iphone 8 cover"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 case",
        "_score": 1.4009607,
        "_source": {
          "keywordName": "iphone 8 case"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 new",
        "_score": 0.70293105,
        "_source": {
          "keywordName": "iphone 8 new"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8",
        "_score": 0.5804671,
        "_source": {
          "keywordName": "iphone 8"
        }
      },
      {
        "_index": "ads",
        "_type": "keyword",
        "_id": "iphone 8 charge",
        "_score": 0.46705723,
        "_source": {
          "keywordName": "iphone 8 charge"
        }
      }
    ]
  }
}
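One way to see why the longer documents outrank "iphone 8" is to ask Elasticsearch for the scoring breakdown. The body below is a minimal sketch: it is the same match query with "explain": true added, so each hit carries an _explanation showing the term statistics used. A possible cause (an assumption, not something the output above confirms) is that with 5 shards and only a handful of documents, each shard scores with its own local term statistics. Sending the same body to GET ads/_search?search_type=dfs_query_then_fetch would compute scores from index-wide statistics instead, which makes the two cases easy to compare.

```json
{
  "query": {
    "match": {
      "keywordName": {
        "query": "iphone 8",
        "operator": "and"
      }
    }
  },
  "explain": true,
  "size": 5
}
```

If the ordering changes under dfs_query_then_fetch, the ranking is an artifact of the small, sharded index rather than of the analyzer.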
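How can I keep the flexibility I have for the keyword "canne a peche" (accents, capital letters, plural terms), but also tell it that if there is an exact match ("iphone 8" = "iphone 8"), it should give me exactly that keywordName?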
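One possible shape for such a query, as a sketch only: keep the analyzed match as the required clause and add an optional clause that rewards an exact hit. This assumes a hypothetical sub-field keywordName.exact, which is not in the mapping shown above; it would be a keyword field with a normalizer that applies lowercase and asciifolding, so that accents and capitals are still ignored. The boost of 10 is an arbitrary illustrative value.

```json
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "keywordName": {
            "query": "iphone 8",
            "operator": "and"
          }
        }
      },
      "should": {
        "match": {
          "keywordName.exact": {
            "query": "iphone 8",
            "boost": 10
          }
        }
      }
    }
  },
  "size": 1
}
```

The must clause keeps the stemming and stop-word behavior, while the should clause only adds score when the whole normalized string matches.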
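This is the behavior I was looking for! Thx – Gun
Is it possible to boost the "best match" result? I mean: if I search for "samsung", there is 1 token: "samsung". But the best score goes to "samsung galaxy" (1.11), then "samsung charger" (0.94), and then "samsung" (0.84). How can I tell it to boost "samsung", since it is the closest to "sâmsung", rather than "samsung galaxy" or "samsung charger"? – Gun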
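For the "samsung" case, one option to try is indexing the same value a second time as a whole, normalized string, so that "sâmsung" and "samsung" compare equal while "samsung galaxy" does not. The fragment below is a sketch under several assumptions: the names exact_fold and exact are made up for illustration, normalizers require Elasticsearch 5.2 or later, and the typed mapping ("keyword" is the _type seen in the results) matches 5.x/6.x syntax. The existing custom_stop analyzer and its filters would have to be merged into the same analysis section.

```json
{
  "settings": {
    "analysis": {
      "normalizer": {
        "exact_fold": {
          "type": "custom",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "keyword": {
      "properties": {
        "keywordName": {
          "type": "text",
          "analyzer": "custom_stop",
          "fields": {
            "exact": {
              "type": "keyword",
              "normalizer": "exact_fold"
            }
          }
        }
      }
    }
  }
}
```

A search can then add an optional, boosted clause on keywordName.exact so that a document whose entire value equals the normalized query is ranked first.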