WordDelimiterFilterFactory: how do I search by tokens that contain digits?
I have the following configuration:
@AnalyzerDef(name = "autocompleteNGramAnalyzer",
    // Split input into tokens according to tokenizer
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        // Split each token into sub-words (by default also at
        // letter/digit boundaries)
        @TokenFilterDef(factory = WordDelimiterFilterFactory.class),
        // Normalize token text to lowercase, as the user is unlikely to
        // care about casing when searching for matches
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        // Index the leading 2 to 5 characters of each token
        @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = {
            @Parameter(name = "minGramSize", value = "2"),
            @Parameter(name = "maxGramSize", value = "5") }) })
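This works almost as expected, but it has a problem with words that contain digits.
For example:
when I search for the token ab,
Lucene returns abcdefg,
but if I need to find a1
and the index contains a1b1c1d1,
it returns nothing.
How can I change this configuration?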
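A likely explanation, as far as I understand the default settings: WordDelimiterFilter splits at letter/digit boundaries, so a1b1c1d1 becomes the one-character parts a, 1, b, 1, ..., and EdgeNGram with minGramSize = 2 then has nothing long enough to emit. The sketch below is plain Java that only imitates this chain; it does not call Lucene, and splitParts, edgeNGrams and analyze are hypothetical helpers written for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Simplified model of the analyzer chain above; NOT Lucene code.
class DigitTokenSketch {

    // Stand-in for WordDelimiterFilter: optionally split at letter/digit boundaries.
    static List<String> splitParts(String token, boolean splitOnNumerics) {
        List<String> parts = new ArrayList<>();
        if (token.isEmpty()) {
            return parts;
        }
        if (!splitOnNumerics) {
            parts.add(token);
            return parts;
        }
        StringBuilder current = new StringBuilder();
        for (char c : token.toCharArray()) {
            boolean boundary = current.length() > 0
                    && Character.isDigit(c) != Character.isDigit(current.charAt(current.length() - 1));
            if (boundary) {
                parts.add(current.toString());
                current.setLength(0);
            }
            current.append(c);
        }
        parts.add(current.toString());
        return parts;
    }

    // Stand-in for EdgeNGramFilter: prefixes of length min..max; shorter tokens yield nothing.
    static List<String> edgeNGrams(String token, int min, int max) {
        List<String> grams = new ArrayList<>();
        for (int len = min; len <= Math.min(max, token.length()); len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Split, lowercase, then build edge n-grams with the sizes from the question (2 to 5).
    static List<String> analyze(String input, boolean splitOnNumerics) {
        List<String> indexed = new ArrayList<>();
        for (String part : splitParts(input, splitOnNumerics)) {
            indexed.addAll(edgeNGrams(part.toLowerCase(Locale.ROOT), 2, 5));
        }
        return indexed;
    }

    public static void main(String[] args) {
        System.out.println(analyze("abcdefg", true));   // [ab, abc, abcd, abcde]
        System.out.println(analyze("a1b1c1d1", true));  // []
        System.out.println(analyze("a1b1c1d1", false)); // [a1, a1b, a1b1, a1b1c]
    }
}
```

If that is what is happening here, the WordDelimiterFilterFactory parameters splitOnNumerics (set to "0") or preserveOriginal (set to "1") would be the first things to try, passed as @Parameter entries on that filter. Check the javadoc for your Lucene version before relying on either.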
Where can I read more details about these factories? – gstackoverflow
@gstackoverflow 1. Take a look at their javadoc, or see this unofficial wiki: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters –