How to decide to use Analyzer, AnalyzerFactory, or TokenizerFactory

T. Kuro Kurosaka
Hello,
I am new to Solr and trying to understand how things work.
If I want to use my own tokenizer, there seem to be three choices:
1. Write a TokenizerFactory whose create() returns my Tokenizer, and specify the factory in schema.xml.
2. Write an Analyzer that uses my Tokenizer, and specify that Analyzer in schema.xml.
3. Write an Analyzer that uses my Tokenizer and an AnalyzerFactory whose create() returns that Analyzer, and specify that factory in schema.xml.

Is there a document that describes the differences between these approaches and gives guidance on when to use which?

-Kuro

Re: How to decide to use Analyzer, AnalyzerFactory, or TokenizerFactory

Chris Hostetter-3

: Is there a document that describes the differences between these approaches
: and gives guidance on when to use which?

there is some guidance in these wiki pages...

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
http://wiki.apache.org/solr/SolrPlugins

In general, do whatever is easiest for you ... if you already have an
Analyzer, just use it.  If you have a Tokenizer (or you are writing one)
then it's just as easy to write a TokenizerFactory as it is to write an
Analyzer - and the TokenizerFactory lets you mix and match your Tokenizer
with all sorts of TokenFilter configurations in schema.xml.
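
For illustration, a minimal sketch of how the first two options might look in
schema.xml. The class names com.example.MyTokenizerFactory and
com.example.MyAnalyzer and the field type names are hypothetical placeholders,
and the lowercase filter is only one example of a filter you could chain after
your tokenizer:

```xml
<!-- Option 1: custom TokenizerFactory, combined with any TokenFilter factories -->
<fieldType name="text_mytokenizer" class="solr.TextField">
  <analyzer>
    <!-- hypothetical factory that creates the custom Tokenizer -->
    <tokenizer class="com.example.MyTokenizerFactory"/>
    <!-- filters can be added, removed, or reordered without recompiling -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Option 2: a complete custom Analyzer; the tokenizer and filter chain are fixed in code -->
<fieldType name="text_myanalyzer" class="solr.TextField">
  <!-- hypothetical Analyzer subclass that wraps the custom Tokenizer -->
  <analyzer class="com.example.MyAnalyzer"/>
</fieldType>
```

With the first form, changing the filter chain is an edit to schema.xml; with
the second, it means changing and redeploying the Analyzer class.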

: 3. Write an Analyzer that uses my Tokenizer and an AnalyzerFactory whose
: create() returns that Analyzer, and specify that factory in schema.xml.

there's no such thing as an AnalyzerFactory (as far as i remember...) ..
did you see that mentioned in the docs somewhere?



-Hoss