2016-09-23 60 views

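NullPointerException in a Lucene custom analyzer

Why am I getting a NullPointerException at the ts.reset() line in the InputFile class? If I use any built-in analyzer, such as WhitespaceAnalyzer, I don't get an exception. What is the problem here?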

public class CourtesyTitleFilter extends TokenFilter 
{ 
    TokenStream input; 
    Map<String,String> courtesyTitleMap = new HashMap<String,String>(); 
    private CharTermAttribute termAttr; 
    public CourtesyTitleFilter(TokenStream input) throws IOException 
    { 
     super(input); 
     termAttr = input.addAttribute(CharTermAttribute.class); 
     courtesyTitleMap.put("Dr", "doctor"); 
     courtesyTitleMap.put("Mr", "mister"); 
     courtesyTitleMap.put("Mrs", "miss"); 
    } 
    @Override 
    public boolean incrementToken() throws IOException 
    { 
     if (!input.incrementToken()) 
      return false; 
     String small = termAttr.toString(); 
     if(courtesyTitleMap.containsKey(small)) { 
      termAttr.setEmpty().append(courtesyTitleMap.get(small)); 
      System.out.print(courtesyTitleMap.get(small)); 
     } 
     return true; 
    } 
} 
public class CourtesyTitleAnalyzer extends Analyzer 
{ 
    @Override 
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) 
    { 
     TokenStream filter = null; 
     Tokenizer whitespaceTokenizer = new WhitespaceTokenizer(reader); 
     try 
     { 
      filter = new CourtesyTitleFilter (whitespaceTokenizer); 
     } 
     catch(IOException e) 
     { 
      e.printStackTrace(); 
     } 
     return new TokenStreamComponents(whitespaceTokenizer,filter); 
    } 
} 
public class InputFile 
{ 
    public static void main(String[] args) throws IOException, ParseException 
    { 
     TokenStream ts=null; 
     CourtesyTitleAnalyzer cta=new CourtesyTitleAnalyzer(); 
     try 
     { 
      StringReader sb=new StringReader("Hello Mr Hari. Meet Dr Kalam and Mrs xyz"); 
      ts = cta.tokenStream("field",sb); 
      OffsetAttribute offsetAtt = ts.addAttribute(OffsetAttribute.class); 
      CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class); 
      ts.reset(); 
      while (ts.incrementToken()) 
      { 
       String token = termAtt.toString(); 
       System.out.println("[" + token + "]"); 
       System.out.println("Token starting offset: " + offsetAtt.startOffset()); 
       System.out.println(" Token ending offset: " + offsetAtt.endOffset()); 
       System.out.println(""); 
      } 
      ts.end(); 
     } 
     catch (IOException e) 
     { 
      e.printStackTrace(); 
     } 
     finally 
     { 
      ts.close(); 
      cta.close(); 
     } 
    } 
} 

What does tokenStream("field", sb); do? – raven


It should analyze the string sb ("Hello Mr Hari ...") and return the tokens. It uses WhitespaceAnalyzer. But when I print 'ts', I don't get any tokens. – hariii


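I guess it is trying to search for the text "field" within the other text, "Hello Mr Hari. ...". "field" does not exist in that text; have you tried passing a different text, such as "xyz"? – raven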

Answer


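input is already defined in the abstract class TokenFilter. By declaring it again in your implementation, you are hiding it. The field you declared is never assigned, so it stays null, and the call input.incrementToken() in your filter goes through that null field instead of the one set by super(input).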

So simply remove the line TokenStream input; from your CourtesyTitleFilter.
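
To make the field-hiding effect concrete, here is a minimal sketch that reproduces it without Lucene. The classes Source, BaseFilter, ShadowingFilter and FixedFilter are hypothetical stand-ins (BaseFilter plays the role of TokenFilter, Source the role of TokenStream); they are not part of any library and only illustrate the Java behaviour, assuming the parent class stores its argument in a protected field named input.

```java
// Minimal sketch of Java field hiding. Source, BaseFilter, ShadowingFilter and
// FixedFilter are hypothetical stand-ins for illustration, not Lucene classes.
class ShadowingDemo {
    // Returns true if calling step() on the filter throws a NullPointerException.
    static boolean failsWithNpe(ShadowingFilter filter) {
        try {
            filter.step();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Source source = new Source();
        System.out.println("shadowing filter throws NPE: "
                + failsWithNpe(new ShadowingFilter(source)));
        System.out.println("fixed filter step(): " + new FixedFilter(source).step());
    }
}

// Stand-in for the upstream token stream.
class Source {
    boolean next() { return true; }
}

// Stand-in for the parent filter class: it stores the wrapped stream itself.
class BaseFilter {
    protected final Source input;
    BaseFilter(Source input) { this.input = input; }
}

// Reproduces the problem: the redeclared field hides BaseFilter.input.
class ShadowingFilter extends BaseFilter {
    Source input; // never assigned, so it stays null

    ShadowingFilter(Source input) { super(input); } // only the parent's field is set

    boolean step() { return input.next(); } // resolves to the null field declared above
}

// The fix: no redeclaration, so "input" refers to the inherited field.
class FixedFilter extends BaseFilter {
    FixedFilter(Source input) { super(input); }

    boolean step() { return input.next(); }
}
```

Running main shows that the first filter fails with a NullPointerException while the second delegates normally. The only difference between the two subclasses is the redeclared field, which mirrors removing TokenStream input; from CourtesyTitleFilter.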