Lemmatizer for indexing

shamik
Hi,
  I'm trying to use a lemmatizer in my analysis chain and am wondering what
the recommended way of achieving this is. I've come across a few different
implementations, listed below:

OpenNLP -->
https://lucene.apache.org/solr/guide/7_5/language-analysis.html#opennlp-lemmatizer-filter

https://opennlp.apache.org/docs/1.8.0/manual/opennlp.html#tools.lemmatizer

KStem Filter -->
https://lucene.apache.org/solr/guide/7_5/filter-descriptions.html#kstem-filter

There are a couple of third-party libraries, but I'm not sure whether they
are still maintained or compatible with the Solr version I'm using (7.5).

https://github.com/nicholasding/solr-lemmatizer
https://github.com/bejean/solr-lemmatizer

Currently, I'm looking for English-only lemmatization. I also need the
ability to update the lemma dictionary to add custom terms specific to
our organization (I'm not sure whether the KStem filter can do that).
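
To make the requirement concrete, here is a minimal sketch of the behaviour
I'm after, written in plain Java and independent of Solr. The class name
LemmaDictionary and the tab-separated "word<TAB>lemma" format are my own
assumptions for illustration only; they are not an existing Solr or OpenNLP
API or file format.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Solr or OpenNLP.
class LemmaDictionary {
    private final Map<String, String> lemmas = new HashMap<>();

    // Each line is "word<TAB>lemma". Entries loaded later override earlier
    // ones, so organization-specific terms can be layered over a base list.
    void load(List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and comment lines starting with '#'.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split("\t");
            if (parts.length < 2) {
                continue; // ignore malformed lines
            }
            lemmas.put(parts[0].toLowerCase(Locale.ROOT), parts[1]);
        }
    }

    // Returns the lemma, or the token unchanged if it is not in the dictionary.
    String lemmatize(String token) {
        return lemmas.getOrDefault(token.toLowerCase(Locale.ROOT), token);
    }
}
```

The idea is that a base English dictionary is loaded first and a small
custom file second, so that our own terms win whenever the two disagree.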

Any pointers will be appreciated.

Regards,
Shamik