2017-04-17

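How to use Lucene/Hibernate Search to find a name containing the keyword "With"?

The person I am searching for is named "Suleman Kumar With", where "With" is the last name. The search works fine for all other names, but not for this one, whose last name is an English keyword.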

This is how I create the Lucene index:

@Fields({ @Field(index = Index.YES, store = Store.NO), 
@Field(name = "LastName_Sort", index = Index.YES, analyzer = @Analyzer(definition = "sortAnalyzer")) }) 
@Column(name = "LASTNAME", length = 50) 
public String getLastName() { 
    return lastName; 
} 

sortAnalyzer has the following configuration:

@AnalyzerDef(name = "sortAnalyzer",
    tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
            @Parameter(name = "pattern", value = "('-&\\.,\\(\\))"),
            @Parameter(name = "replacement", value = " "),
            @Parameter(name = "replace", value = "all")
        }),
        @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
            @Parameter(name = "pattern", value = "([^0-9\\p{L} ])"),
            @Parameter(name = "replacement", value = ""),
            @Parameter(name = "replace", value = "all")
        })
    }
)

The search runs on the last name as well as on the primary key, ID, and that is where I get a token mismatch error.

Answer

I solved it by implementing my own custom analyzer:

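import java.io.Reader;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.ReusableAnalyzerBase;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.StopwordAnalyzerBase;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;

// Tokenizes and lowercases like the standard chain, but with an empty stop-word set.
public class IgnoreStopWordsAnalyzer extends StopwordAnalyzerBase {

    public IgnoreStopWordsAnalyzer() {
        // Passing null leaves the inherited stop-word set empty.
        super(Version.LUCENE_36, null);
    }

    @Override
    protected ReusableAnalyzerBase.TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
        final StandardTokenizer src = new StandardTokenizer(Version.LUCENE_36, reader);
        TokenStream tok = new StandardFilter(Version.LUCENE_36, src);
        tok = new LowerCaseFilter(Version.LUCENE_36, tok);
        // The stop set is empty, so no token (including "with") is removed here.
        tok = new StopFilter(Version.LUCENE_36, tok, this.stopwords);
        return new ReusableAnalyzerBase.TokenStreamComponents(src, tok);
    }
}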
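Why this helps: Lucene's default English stop-word list (StopAnalyzer.ENGLISH_STOP_WORDS_SET) contains "with", so an analyzer that applies that list, presumably the default one used for the unnamed lastName field, discards the token entirely. The sketch below is a plain-Java stand-in for the lowercase plus stop-filter steps, not Lucene itself; the class StopWordDemo and the method analyze are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordDemo {

    // Copy of Lucene's default English stop words; note that "with" is in the list.
    static final Set<String> ENGLISH_STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
        "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the",
        "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"));

    // Simplified stand-in for the analysis chain (assumption: whitespace splitting
    // replaces StandardTokenizer): split, lowercase, then drop stop words.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // With the default stop list the last name vanishes: prints []
        System.out.println(analyze("With", ENGLISH_STOP_WORDS));
        // With an empty stop list the token survives: prints [with]
        System.out.println(analyze("With", Collections.<String>emptySet()));
    }
}
```

With the default list, the query for the last name produces no tokens at all, which would explain why nothing matches; with the empty set, "with" is indexed and searched like any other term.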

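Apply IgnoreStopWordsAnalyzer to the field. Because its stop-word set is empty, stop words such as "with" are no longer filtered out.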
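A minimal sketch of how the analyzer might be attached to the mapping from the question, assuming a Hibernate Search version built on Lucene 3.6 where @Analyzer accepts an impl class. Only the analyzer attribute on the first @Field is new; everything else is copied from the question.

```java
// Mapping fragment (not a complete class): the default lastName field now uses
// the custom analyzer; the sort field keeps the sortAnalyzer definition.
@Fields({
    @Field(index = Index.YES, store = Store.NO,
           analyzer = @Analyzer(impl = IgnoreStopWordsAnalyzer.class)),
    @Field(name = "LastName_Sort", index = Index.YES,
           analyzer = @Analyzer(definition = "sortAnalyzer"))
})
@Column(name = "LASTNAME", length = 50)
public String getLastName() {
    return lastName;
}
```

The query side has to analyze the search term the same way. The Hibernate Search query DSL normally picks up the analyzer mapped to the field, whereas a hand-built Lucene QueryParser would need the same analyzer passed in explicitly. Existing entries would also need to be reindexed before the change takes effect.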