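I actually wrote a blog post about this a while back for Qbox, which you can find here: http://blog.qbox.io/multi-field-partial-word-autocomplete-in-elasticsearch-using-ngrams. (Unfortunately, some of the links in the post are broken and can't easily be fixed at this point, but hopefully you'll get the idea.)
I'll refer you to the post for the details, but here is some code you can use to test it quickly. Note that I'm using edge ngrams (prefixes of each word) rather than full ngrams.
Also pay particular attention to the use of the _all field and to the operator of the match query: _all is indexed with the edge ngram analyzer but searched with the standard analyzer, and "operator": "and" requires every search term to match.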
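If the difference between edge ngrams and full ngrams is not obvious, a small Python sketch may help. It only imitates what the analyzer does at index time (split on whitespace, lowercase, fold accents, then emit prefixes of 2 to 20 characters); the function names are made up for illustration and are not part of any Elasticsearch API, and the accent folding is only a rough stand-in for the asciifolding filter.

```python
import unicodedata


def edge_ngrams(token, min_gram=2, max_gram=20):
    """Prefixes of the token whose length is between min_gram and max_gram."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def all_ngrams(token, min_gram=2, max_gram=20):
    """Every substring of that length range (full ngrams), for comparison."""
    return [
        token[i:i + n]
        for n in range(min_gram, min(len(token), max_gram) + 1)
        for i in range(len(token) - n + 1)
    ]


def fold(text):
    """Rough approximation of the lowercase + asciifolding filters."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_terms(text):
    """Whitespace tokenizer followed by folding and edge ngrams."""
    terms = []
    for token in fold(text).split():
        terms.extend(edge_ngrams(token))
    return terms


print(edge_ngrams("purple"))  # ['pu', 'pur', 'purp', 'purpl', 'purple']
print(all_ngrams("duck"))     # ['du', 'uc', 'ck', 'duc', 'uck', 'duck']
print(index_terms("Slow Purple"))
```

Edge ngrams keep the index smaller and only match from the start of a word, which is usually what you want for autocomplete.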
Okay, so here is the mapping:
PUT /test_index
{
  "settings": {
    "analysis": {
      "filter": {
        "edgeNGram_filter": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 20
        }
      },
      "analyzer": {
        "edgeNGram_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "asciifolding",
            "edgeNGram_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "_all": {
        "enabled": true,
        "index_analyzer": "edgeNGram_analyzer",
        "search_analyzer": "standard"
      },
      "properties": {
        "field1": {
          "type": "string",
          "include_in_all": true
        },
        "field2": {
          "type": "string",
          "include_in_all": true
        }
      }
    }
  }
}
Now add a few documents:
POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"field1":"purple duck","field2":"brown fox"}
{"index":{"_id":2}}
{"field1":"slow purple duck","field2":"quick brown fox"}
{"index":{"_id":3}}
{"field1":"red turtle","field2":"quick rabbit"}
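If you would rather build that bulk request from a script than paste it into Sense, here is a minimal sketch in Python. The bulk body is newline-delimited JSON: one action line, then one source line per document, with a trailing newline at the end. The helper name bulk_body is made up for illustration, and actually sending the body to your cluster is left out.

```python
import json


def bulk_body(docs):
    """Build a newline-delimited bulk body from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))  # action line
        lines.append(json.dumps(source))                      # document line
    # The bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    (1, {"field1": "purple duck", "field2": "brown fox"}),
    (2, {"field1": "slow purple duck", "field2": "quick brown fox"}),
    (3, {"field1": "red turtle", "field2": "quick rabbit"}),
]

print(bulk_body(docs), end="")
```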
And this query seems to illustrate what you want:
POST /test_index/_search
{
  "query": {
    "match": {
      "_all": {
        "query": "purp fo slo",
        "operator": "and"
      }
    }
  }
}
It returns:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.19930676,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "2",
        "_score": 0.19930676,
        "_source": {
          "field1": "slow purple duck",
          "field2": "quick brown fox"
        }
      }
    ]
  }
}
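To see why only document 2 comes back, here is a rough Python simulation of the same idea. It is a sketch, not how Elasticsearch is implemented: it assumes _all is just the edge ngrams of every word in both fields, that the search side simply lowercases and splits the query (a simplification of the standard analyzer), and it ignores scoring entirely. All function names are hypothetical.

```python
def edge_ngrams(token, min_gram=2, max_gram=20):
    """Set of prefixes of the token, between min_gram and max_gram long."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


docs = {
    1: {"field1": "purple duck", "field2": "brown fox"},
    2: {"field1": "slow purple duck", "field2": "quick brown fox"},
    3: {"field1": "red turtle", "field2": "quick rabbit"},
}


def all_field_terms(doc):
    """Edge ngrams of every word in every field, like an ngrammed _all."""
    terms = set()
    for value in doc.values():
        for token in value.lower().split():
            terms |= edge_ngrams(token)
    return terms


def match_all_and(query, docs):
    """Ids of documents that contain every query term (operator: and)."""
    query_terms = query.lower().split()  # query terms are NOT ngrammed
    return [
        doc_id
        for doc_id, doc in docs.items()
        if all(term in all_field_terms(doc) for term in query_terms)
    ]


print(match_all_and("purp fo slo", docs))  # [2]
print(match_all_and("purp fo", docs))      # [1, 2]
```

Document 1 has "purp" and "fo" but nothing starting with "slo", so with the "and" operator it drops out, and that is why the search returns a single hit.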
Here is the code I used to test it:
http://sense.qbox.io/gist/b87e426062f453d946d643c7fa3d5480cd8e26ec
Also see this post: http://blog.qbox.io/an-introduction-to-ngrams-in-elasticsearch
This is exactly what I needed. Awesome – Fab
Yes, me too! Great stuff – user3125823