2012-01-11 66 views

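Solr schema for prefix search, howto?

I have read many Stack Overflow questions but could not find an answer on how to do prefix search in Solr. For example, I have the text "solr documentation is unreadable", and I need to find it with queries such as "solr docu*", "documentation unread*", and "unreadable is so*", but not with "un* so*". I am doing something like this: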

<fieldType name="prefix_search" class="solr.TextField"> 
    <analyzer> 
    <tokenizer class="solr.LowerCaseTokenizerFactory"/> 
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="30" side="front"/> 
    </analyzer> 
</fieldType> 

But sometimes it returns unexpected results, and it also matches the "un* so*" query. Maybe the problem is with the PHP SolrClient? Thanks for your replies!
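A rough Python model of the field type above may help show why "un* so*" still matches. It is only a sketch, not Solr's actual implementation: the helper names (`lowercase_tokenize`, `edge_ngrams`, `indexed_terms`, `matches_all_prefixes`) are made up for illustration, and it assumes each wildcard clause is checked against the indexed terms on its own, without regard to word order or position.

```python
import re

def lowercase_tokenize(text):
    # Stand-in for LowerCaseTokenizerFactory: split at non-letters, then lowercase.
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def edge_ngrams(token, min_gram=1, max_gram=30):
    # Front-side edge n-grams of one token, mirroring side="front".
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

def indexed_terms(text):
    # Every token is n-grammed independently; the result is one flat set of terms.
    terms = set()
    for token in lowercase_tokenize(text):
        terms.update(edge_ngrams(token))
    return terms

def matches_all_prefixes(terms, prefixes):
    # Assumption: a clause like un* matches if ANY indexed term starts with "un",
    # and clauses are combined with AND, ignoring where the words occur.
    return all(any(term.startswith(p) for term in terms) for p in prefixes)

terms = indexed_terms("solr documentation is unreadable")
print(matches_all_prefixes(terms, ["solr", "docu"]))  # True
print(matches_all_prefixes(terms, ["un", "so"]))      # True
```

In this model "un" comes from "unreadable" and "so" from "solr", so the query matches even though the two words are neither adjacent nor in that order.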

Answers


ReversedWildcardFilterFactory is exactly what you want. You can then easily test it with curl as follows:

curl 'http://example.com:8080/solr/select?q=prefix_search:un*+AND+prefix_search:so*'

<!-- Just like text_general except it reverses the characters of 
    each token, to enable more efficient leading wildcard queries. --> 
<fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100"> 
    <analyzer type="index"> 
    <tokenizer class="solr.StandardTokenizerFactory"/> 
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /> 
    <filter class="solr.LowerCaseFilterFactory"/> 
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true" 
     maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/> 
    </analyzer> 
    <analyzer type="query"> 
    <tokenizer class="solr.StandardTokenizerFactory"/> 
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> 
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" /> 
    <filter class="solr.LowerCaseFilterFactory"/> 
    </analyzer> 
</fieldType>
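As the comment in the schema says, this field type reverses each token to make leading wildcard queries cheaper. The idea can be sketched in Python as follows. This is a simplified model, not Solr's code: the function names are hypothetical, the marker character is an assumption used here to keep reversed terms apart from originals, and the `maxPosAsterisk`, `maxPosQuestion` and `maxFractionAsterisk` thresholds are ignored.

```python
MARKER = "\u0001"  # assumption: prefix that tags reversed terms in the index

def index_terms(tokens, with_original=True):
    # Index each token reversed (tagged with MARKER), plus the original if requested.
    terms = set()
    for tok in tokens:
        terms.add(MARKER + tok[::-1])
        if with_original:
            terms.add(tok)
    return terms

def wildcard_to_prefix(pattern):
    # "*able" becomes a prefix lookup on reversed terms; "un*" stays a plain prefix.
    if pattern.startswith("*") and "*" not in pattern[1:]:
        return MARKER + pattern[1:][::-1]
    if pattern.endswith("*") and "*" not in pattern[:-1]:
        return pattern[:-1]
    raise ValueError("this sketch handles a single leading or trailing * only")

def matches(terms, pattern):
    prefix = wildcard_to_prefix(pattern)
    return any(term.startswith(prefix) for term in terms)

terms = index_terms(["solr", "documentation", "unreadable"])
print(matches(terms, "*able"))  # True
print(matches(terms, "un*"))    # True
print(matches(terms, "*xyz"))   # False
```

Note that in this model the reversal only helps patterns that start with a wildcard. A trailing wildcard such as "un*" is served by the original terms kept through `withOriginal="true"`.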