solr 1.3 analyzers

revas
Hi,

In Solr 1.3, under src/classes/java/analyzers, I see only the following
language-specific tokenizers:
chinesetokenizer
cjktokenizer
russiantokenizer

I do see filter factories for other languages such as Dutch, French, and
Brazilian, but no tokenizers for them. In this scenario, are we supposed to
use the standard tokenizer together with the corresponding language filters?
Lucene has analyzers for these languages. How do we incorporate them into
Solr?

Will this be available in future versions?

What is the difference between a normal filter factory and a stem filter
factory?

Regards

Re: solr 1.3 analyzers

iorixxx
> i see filterfactories for other languages like dutch, french, brazilian
> etc but no tokenizer. in this scenario are we supposed to use the
> standard tokenizer and the corresponding language filters.

Yes. Exactly the same as what Lucene Analyzers do.
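
As a rough sketch of what that looks like in schema.xml, a Dutch field type could chain the standard tokenizer with the Dutch stem filter. The field type name text_dutch and the stopword file name stopwords_nl.txt are made up for illustration, and the chain only approximates what Lucene's DutchAnalyzer does internally:

```xml
<!-- Hypothetical field type: standard tokenizer + language-specific filters -->
<fieldType name="text_dutch" class="solr.TextField">
  <analyzer>
    <!-- No Dutch-specific tokenizer exists, so use the standard one -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- stopwords_nl.txt is an assumed file name; supply your own list -->
    <filter class="solr.StopFilterFactory" words="stopwords_nl.txt" ignoreCase="true"/>
    <!-- The language-specific part: the Dutch stemmer -->
    <filter class="solr.DutchStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The same pattern should apply to French or Brazilian by swapping in the corresponding stem filter factory and stopword list.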

> Lucene has the analyzers for the same. how do we incorporate the same
> to solr. Will this be available in future versions?

One can also specify an existing Lucene Analyzer class that has a default constructor via the class attribute on the analyzer element:
<fieldType name="text_greek" class="solr.TextField">
   <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/>
</fieldType>

> what is the difference between normal filter factory and stem filter
> factory?

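A TokenFilter can delete tokens (StopFilter), inject tokens (SynonymFilter), or modify tokens (StemFilter), depending on its purpose. There is no distinction between a "normal" filter factory and a stem filter factory: a stem filter factory is a filter factory like any other, one whose filter happens to modify tokens.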
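
To illustrate, here is a sketch of a single analyzer chain with one filter of each kind, all declared with the same filter element. The field type name text_example is hypothetical, and stopwords.txt and synonyms.txt are assumed to exist in the conf directory:

```xml
<!-- Hypothetical field type showing delete / inject / modify filters side by side -->
<fieldType name="text_example" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Deletes tokens: drops words listed in stopwords.txt -->
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- Injects tokens: adds synonyms listed in synonyms.txt -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <!-- Modifies tokens: reduces each word to its stem -->
    <filter class="solr.EnglishPorterFilterFactory"/>
  </analyzer>
</fieldType>
```

From the schema's point of view, the stem filter is configured exactly like the other two; only what it does to the token stream differs.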