WordDelimiterFilter and acronyms normalization

Andrew Klochkov
The author has deleted this message.

Re: WordDelimiterFilter and acronyms normalization

iorixxx

> Is there any ready-for-use filter which performs acronyms normalization such as "I.N.C."->"INC"?
>
> I see that Lucene's StandardFilter can do this but we can't use it as we're using WhitespaceTokenizer instead of StandardTokenizer.

I am bad at regular expressions, but if you can write a regex for that replacement, solr.PatternReplaceFilterFactory can do it.

<filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all" />
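
Note that the pattern above removes every character that is not a lowercase a-z letter, so it only gives a useful result if an earlier filter has already lowercased the tokens. A minimal sketch of a narrower alternative follows, assuming acronyms always consist of single ASCII letters each followed by a period. The regex is an illustrative assumption, not something taken from the Solr documentation. It deletes a period only when it follows a single letter that sits at the start of the token or directly after another period. The `<` of the lookbehind has to be written as `&lt;` inside the XML attribute.

```xml
<!-- Assumed pattern (illustrative): drop a "." that follows a single ASCII letter
     located at the token start or directly after another "." -->
<filter class="solr.PatternReplaceFilterFactory"
        pattern="(?&lt;=(?:^|\.)[A-Za-z])\." replacement="" replace="all" />
```

Under that assumption, "I.N.C." becomes "INC", while tokens such as "example.com" or "3.14" pass through unchanged. Short dotted tokens like "a.b" are collapsed as well, so the pattern may need tightening for real data.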