Searching for 'A*' is not returning me same result as 'a*'

Manu
Hi,

I am using the following analyzer for indexing and querying:
------------------------------------------------------------------------------------------------------
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
-----------------------------------------------------------------------------------------------------

I search using the Solr admin console. When I search for institutionName:a*, I get 93 matching records, but when I search for institutionName:A*, I do not get any matching records.

I ran Field Analysis on both a* and A* with this analyzer configuration.

For a*
------
 


For A*
------
 

As far as I can tell, the analyzer works correctly in both cases. I cannot understand why the query returns no results for A*.

I feel that I am missing something. Can anyone help me with this?

Regards,
Manu
Re: Searching for 'A*' is not returning me same result as 'a*'

Manu
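I found the answer to my problem. It happens because I am using a wildcard: wildcard queries are not passed through the analyzer. The query term A* is therefore never lowercased, while the indexed terms were lowercased by LowerCaseFilterFactory at index time, so an uppercase prefix matches nothing.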

http://wiki.apache.org/lucene-java/LuceneFAQ#head-4d62118417eaef0dcb87f4370583f809848ea695
http://markmail.org/message/25wm4mrdhs6yqnck#query:upper%20case%20solr+page:1+mid:7c6bf6e7p755eu67+state:results
http://www.mail-archive.com/solr-user@lucene.apache.org/msg08542.html
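
For anyone who finds this thread later, here is a small sketch of why the two queries differ. It is a toy model in Python, not Solr or Lucene code: the analyze, build_index and prefix_search functions and the sample institution names are all made up for illustration, and the analyzer is reduced to whitespace splitting plus lowercasing.

```python
# Toy model of an inverted index; none of this is Solr or Lucene API.

def analyze(text):
    """Stand-in for the index-time analyzer: whitespace tokenizer + lowercase filter."""
    return [token.lower() for token in text.split()]


def build_index(docs):
    """Map each analyzed term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def prefix_search(index, prefix):
    """Compare the raw prefix with the indexed terms, the way an unanalyzed wildcard query does."""
    hits = set()
    for term, doc_ids in index.items():
        if term.startswith(prefix):
            hits |= doc_ids
    return hits


# Hypothetical sample data, not taken from the real index.
docs = {1: "Acme University", 2: "Alpha Institute", 3: "Beta College"}
index = build_index(docs)

lower_hits = prefix_search(index, "a")          # matches "acme" and "alpha"
upper_hits = prefix_search(index, "A")          # no indexed term starts with "A"
fixed_hits = prefix_search(index, "A".lower())  # lowercase the prefix before searching
```

So one workaround, assuming the field is lowercased at index time as in my configuration, is to lowercase the wildcard term in the client before sending the query.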

Thanks,
Manu

