2014-12-24 56 views

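How to do a full-text search in Lucene 4.10

I want to search PDF text for the exact phrase "Labor Law", but the search returns every file that contains the words "Labor" and "Law". Could anyone please check my code below?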

EnglishAnalyzer analyzer = new EnglishAnalyzer(); 
analyzer.setVersion(Version.LATEST);   

QueryParser parser = new QueryParser("content", analyzer); 
Query query = parser.parse("Labor Law"); 

Directory indexDirectory = FSDirectory.open(new File(indexLucencePath)); 
DirectoryReader dirReader = DirectoryReader.open(indexDirectory); 
indexSearcher = new IndexSearcher(dirReader); 

ScoreDoc[] queryResults = indexSearcher.search(query, numOfResults).scoreDocs; 

List<IndexItem> results = new ArrayList<IndexItem>(); 
for (ScoreDoc scoreDoc : queryResults) { 
    Document doc = indexSearcher.doc(scoreDoc.doc); 
    results.add(new IndexItem(doc.get(IndexItem.TITLE), doc.get(IndexItem.CONTENT))); 
}

Answers

Try one of the following.

Phrase query (the terms must appear next to each other, in order):

Query query = parser.parse("\"Labor Law\""); 

All terms must be present (anywhere in the field):

Query query = parser.parse("+Labor +Law"); 

You can also build the query yourself, like this:

// Terms given to TermQuery are not analyzed, so they must match the indexed
// form; EnglishAnalyzer lowercases tokens at index time.
BooleanQuery query = new BooleanQuery();
TermQuery clause1 = new TermQuery(new Term("content", "labor"));
TermQuery clause2 = new TermQuery(new Term("content", "law"));
query.add(new BooleanClause(clause1, BooleanClause.Occur.MUST));
query.add(new BooleanClause(clause2, BooleanClause.Occur.MUST));
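To make the difference between the phrase form and the all-terms form concrete, here is a minimal sketch in plain Java that does not use Lucene at all. The class `PhraseMatchSketch` and its methods are hypothetical names invented for illustration, and the tokenizer is a crude stand-in for an analyzer (it only lowercases and splits on non-letters, with no stemming or stop-word removal). It is meant to show the matching semantics only, not how Lucene implements them.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class PhraseMatchSketch {

    // Hypothetical stand-in for an analyzer: lowercase, then split on non-letters.
    public static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("[^a-z]+"));
    }

    // Semantics of "+Labor +Law": every term occurs somewhere in the document.
    public static boolean containsAllTerms(String doc, String query) {
        return tokenize(doc).containsAll(tokenize(query));
    }

    // Semantics of "\"Labor Law\"": the terms occur adjacently and in order.
    public static boolean containsPhrase(String doc, String phrase) {
        return Collections.indexOfSubList(tokenize(doc), tokenize(phrase)) >= 0;
    }

    public static void main(String[] args) {
        String adjacent = "An introduction to labor law in practice.";
        String scattered = "The law protects organized labor.";

        System.out.println(containsAllTerms(adjacent, "Labor Law"));  // true
        System.out.println(containsPhrase(adjacent, "Labor Law"));    // true
        System.out.println(containsAllTerms(scattered, "Labor Law")); // true
        System.out.println(containsPhrase(scattered, "Labor Law"));   // false
    }
}
```

The second document contains both words but not the phrase, which is the case the all-terms and BooleanQuery variants would still match and the phrase variant would not.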

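Your solution searches for files whose content contains the two words. But I want to search for the phrase "Labor Law", not for "Labor" and "Law" separately. – Mankeomorakort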