Solr Text Vs String

Solr Text Vs String

RaviWhy
Hi,

I have an incoming field that is indexed in Solr as both a text field and a string field. When I run the searches below from the Solr client, the string field returns documents but the text field does not.

NAME:T - no results
Name_Str:T - returns documents

The same happens for the following terms: CPN*, DPS*, S, IF, AND, ARE, etc.

AAA,AN,AND,ARE,BE,BUT,CCC,CPN*,DPS*,FOR,HRC*,IF,IN,INTO,IT,NOT,S,SID*,T,THE,THIS,TO

Is anything with keywords for writing queries or stopwords or synonyms.

Below are my field type definitions.

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
</fieldType>

<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true" /> 

Re: Solr Text Vs String

hossman

: data. When I search the following cases, string field returns documents(from
: Solr client) and not text fields.
:
: NAME:T - no results
: Name_Str:T - returns documents
:
: Similarly for the following cases - CPN*, DPS*, S, IF,AND, ARE, etc.
:
: AAA,AN,AND,ARE,BE,BUT,CCC,CPN*,DPS*,FOR,HRC*,IF,IN,INTO,IT,NOT,S,SID*,T,THE,THIS,TO

T is a stop word in the example stopwords.txt file, as are many of the
other examples you included.  take a look at your stopwords.txt, you can
change anything you want.
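
To see why a stop word never matches in the text field, here is a rough Python imitation of the first steps of the analyzer chain above (whitespace tokenizing, case-insensitive stop filtering, lowercasing). It is only a sketch: the STOPWORDS set is a hypothetical subset built from the terms in this thread, not the contents of any real stopwords.txt, and the word delimiter, stemming and synonym filters are left out.

```python
# Hypothetical stop word subset (assumption); check the stopwords.txt
# in your own Solr conf directory for the real list.
STOPWORDS = {
    "an", "and", "are", "be", "but", "for", "if", "in",
    "into", "it", "not", "s", "t", "the", "this", "to",
}


def analyze(text, stopwords=STOPWORDS):
    """Imitate whitespace tokenizing, ignoreCase stop filtering, then lowercasing."""
    tokens = text.split()  # whitespace tokenizer
    kept = [tok for tok in tokens if tok.lower() not in stopwords]  # stop filter
    return [tok.lower() for tok in kept]  # lowercase filter


print(analyze("T"))                  # no tokens survive, so nothing can match
print(analyze("Vitamin T Complex"))  # the stop word is dropped, the rest remain
```

Because the same stop filter runs at index and query time, a query for NAME:T analyzes to nothing at all, while the string field is not analyzed and still holds the literal value.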

Wildcard and prefix queries (e.g. CPN*) do not get "analyzed" by the
query parser, so if you use the LowerCaseFilter when indexing, you can only
do lowercase wildcard/prefix queries (e.g. cpn*)
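
One way to work around this on the client side is to lowercase wildcard terms before the query is sent. The helper below is a minimal sketch under that assumption; lowercase_wildcards is a hypothetical name, and the parsing is deliberately naive (whitespace-separated clauses only, no quoted phrases or escaping).

```python
def lowercase_wildcards(query):
    """Lowercase the term part of any clause containing * or ?, leaving field names alone."""
    clauses = []
    for clause in query.split():
        if "*" in clause or "?" in clause:
            # Split off an optional "field:" prefix so the field name keeps its case.
            field, sep, term = clause.rpartition(":")
            clause = field + sep + term.lower()
        clauses.append(clause)
    return " ".join(clauses)


print(lowercase_wildcards("NAME:CPN*"))
print(lowercase_wildcards("NAME:DPS* AND Name_Str:T"))
```

This only makes sense for fields whose index analyzer lowercases tokens. A string field such as Name_Str stores the value verbatim, so lowercasing a wildcard term against it could stop it from matching.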

: Is anything with keywords for writing queries or stopwords or synonyms.

i'm not sure what that line means ... but it seems like a question about
seeing how your queries and stopwords and synonyms are getting used.  the
"Analysis" screen can help you see how your various analyzers get used at
index and query time, and adding "debugQuery=true" to your searches
will cause lots of good info about how your query is getting interpreted
to be included at the bottom of the response.
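
As a small illustration of the debugQuery suggestion, the Python sketch below builds a search URL with the parameter added. The base URL is an assumption (a local Solr at the default example address); substitute your own host, port and path.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance; adjust for your install.
SOLR_SELECT = "http://localhost:8983/solr/select"


def debug_url(query, base=SOLR_SELECT):
    """Return a select URL for the query with debugQuery=true appended."""
    params = {"q": query, "debugQuery": "true"}
    return base + "?" + urlencode(params)


print(debug_url("NAME:T"))
```

Opening that URL in a browser should show the debug section after the results; the parsedquery entry there shows what the query looked like after analysis.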




-Hoss