help need on words with special characters

Doss
Hi,

I am new to Solr (and have no Lucene experience). How can I protect words with special characters, say A+, A1+, etc., from being split during tokenization? When I search for "group A", I get results containing A+ as well as A1+ and so on. Is there a special way to index these kinds of words?

Thanks,
Doss.

Re: help need on words with special characters

Mike Klaas
On 4/18/07, Doss <[hidden email]> wrote:

> I am new to Solr (and have no Lucene experience). How can I protect words with special characters, say A+, A1+, etc., from being split during tokenization? When I search for "group A", I get results containing A+ as well as A1+ and so on. Is there a special way to index these kinds of words?

You need to change your analyzer to recognize "A+", "A1+" as tokens.
Normally, special characters like + would not be recognized as parts
of words.
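
To make the effect concrete, here is a toy Python model of why a search for "group A" also hits "A+" and "A1+". This is not Solr or Lucene code, and the function names are made up for illustration. It assumes the analyzer splits on punctuation and at letter/digit boundaries, which is roughly what WordDelimiterFilter does in the example schema's text field, and contrasts that with splitting on whitespace only.

```python
import re

def delimiter_splitting_tokens(text):
    # Toy stand-in for an analyzer that drops "+" and also splits at
    # letter/digit boundaries, so "A1+" becomes "a" and "1".
    return [t.lower() for t in re.findall(r"[A-Za-z]+|[0-9]+", text)]

def whitespace_tokens(text):
    # Toy stand-in for whitespace-only tokenization: "+" stays in the token.
    return [t.lower() for t in text.split()]

def matches(query, document, tokenize):
    # A document matches when every query token appears among its tokens.
    doc_tokens = set(tokenize(document))
    return all(tok in doc_tokens for tok in tokenize(query))

docs = ["group A", "group A+", "group A1+"]
split_hits = [d for d in docs if matches("group A", d, delimiter_splitting_tokens)]
whitespace_hits = [d for d in docs if matches("group A", d, whitespace_tokens)]

print(split_hits)
print(whitespace_hits)
```

With the splitting tokenizer all three documents contain the token "a", so all three match; with whitespace-only tokens, "a+" and "a1+" are distinct terms and only "group A" matches.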

If you have a small number of special terms, you could add some code to
your existing analyzer to recognize them (WordDelimiterFilter if you are
using the standard text field in the Solr example).  If it is more
complicated, you should look into creating your own analyzer.
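
As a rough sketch of the "small number of special terms" idea, the Python below keeps a short list of protected terms whole and splits everything else. It only illustrates the logic and is not Lucene analyzer code; the term list, the function name, and the splitting rule are all assumptions made for the example.

```python
import re

# Hypothetical list of terms that must survive tokenization intact.
PROTECTED_TERMS = {"A+", "A1+"}

def protect_special_terms(text, protected=PROTECTED_TERMS):
    tokens = []
    for chunk in text.split():
        # Trim surrounding punctuation, but never "+", before the lookup.
        core = chunk.strip(",.;:!?()\"'")
        if core in protected:
            # Protected term: emit it as a single lowercased token.
            tokens.append(core.lower())
        else:
            # Everything else: split into letter runs and digit runs.
            tokens.extend(p.lower() for p in re.findall(r"[A-Za-z]+|[0-9]+", core))
    return tokens

print(protect_special_terms("group A1+, urgent"))
```

The same function would have to be applied at both index time and query time, otherwise a query for "A+" would be tokenized differently from the indexed text.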

-Mike

Re: help need on words with special characters

Chris Hostetter-3
In reply to this post by Doss

: with special characters from tokenizing, say for example A+, A1+ etc.
: because when i searched for "group A" i am getting results with A+
: as well as A1+ and so on, is there any special way to index these types of
: words?

all of the tokenization is controlled via the analyzers you configure in
your schema.xml -- you don't have to use any of the stuff in the example
schema, you can change it as much as you want.

if you are starting with an existing schema, and you want to understand
why/how certain things are happening during analysis, the "Analysis"
tool (linked to from the admin screen) makes it easy to decide
which tokenizer/tokenfilter changes to make...

http://localhost:8983/solr/admin/analysis.jsp?highlight=on
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters


-Hoss