Need idea to standardize keywords - ring tone vs ringtone

Need idea to standardize keywords - ring tone vs ringtone

bbarani
I am currently using a separate core for indexing the autosuggest keywords. Everything works fine except for one issue as below.

The index has two entries (these keywords are stored in a database automatically, based on popularity):

ring tone
ringtone

When users type 'r', I display both 'ring tone' and 'ringtone' in the auto-suggest list, since both keywords are indexed.

Currently I manually add the non-standard keywords to the stopwords.txt file so that they don't get indexed.

I am trying to figure out a way to standardize common keywords (known standardized keywords) automatically.

Is there a way I can automate this?  

Re: Need idea to standardize keywords - ring tone vs ringtone

Erick Erickson
What would automation look like? How would an automated process "know" what
to do in these cases?

But I'm somewhat confused. On the one hand you say:
bq: type in 'r' I display both ring tone and ringtone in auto suggest list

Then mention stopwords "so that it doesn't get indexed". How do those
relate?

You could always use a copyField to move things into a field that you use
for special purposes.

Best,
Erick


On Fri, Oct 25, 2013 at 1:14 PM, Developer <[hidden email]> wrote:

> I am currently using a separate core for indexing the autosuggest keywords.
> Everything works fine except for one issue as below.
>
> In index I have 2 entries
>
> ring tone
> ringtone
>
> When users type in 'r' I display both ring tone and ringtone in auto
> suggest list. I am trying to figure out a way to standardize common
> keywords (known standardized keywords) automatically.
>
> Currently I manually add the non standard keywords to the stopwords.txt
> file so that it doesn't get indexed. Is there a way I can automate this?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Need-idea-to-standardize-keywords-ring-tone-vs-ringtone-tp4097794.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: Need idea to standardize keywords - ring tone vs ringtone

bbarani
Thanks for your response, Erick. Sorry for the confusion.

I currently display both 'ring tone' and 'ringtone' when the user types 'r', but I want to display just 'ringtone'. For now I have added 'ring tone' to the stopwords list so that it doesn't get indexed, but I am looking for a way to standardize the keywords. (I am even considering standardizing the data in the database layer before passing it to Solr, but wanted to check first whether there is a way to do this in Solr itself.)

I have a list of known keywords (more like synonyms) that I am trying to map the user-entered keywords against:

ring tone, ringer tone => ringtone
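
For the database-layer option mentioned above, here is a minimal Python sketch of applying such a rule before the keywords are sent to Solr. It is only an illustration, not something from this thread's setup: the function names `parse_rules` and `standardize` are made up, and lowercasing plus whitespace collapsing before lookup is an assumption about how variants should be matched.

```python
def parse_rules(lines):
    """Parse 'variant, variant => canonical' lines into a lookup dict."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        variants, sep, canonical = line.partition("=>")
        if not sep:
            continue  # not a mapping rule, ignore it
        canonical = canonical.strip()
        for variant in variants.split(","):
            # Assumption: match case-insensitively, ignoring extra spaces.
            key = " ".join(variant.lower().split())
            if key:
                mapping[key] = canonical
    return mapping


def standardize(keyword, mapping):
    """Return the canonical keyword, or the cleaned keyword if no rule applies."""
    cleaned = " ".join(keyword.lower().split())
    return mapping.get(cleaned, cleaned)


rules = parse_rules(["ring tone, ringer tone => ringtone"])
print(standardize("Ring  Tone", rules))
print(standardize("ringer tone", rules))
print(standardize("wallpaper", rules))
```

Keywords with no matching rule pass through unchanged (apart from the cleanup), so only the known variants are rewritten.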


Re: Need idea to standardize keywords - ring tone vs ringtone

Jonathan Rochkind-2
Do you know about the Solr synonym feature? That seems more applicable
to what you're describing than stopwords. I'd stay away from stopwords
entirely here, and try to do what you want with synonyms.

Multi-word synonyms can be tricky, and I'm not entirely sure of the right
way to handle them for this use case. But I think the synonym feature is
what you want, not the stopwords feature.



On 10/28/13 12:24 PM, Developer wrote:

> Thanks for your response, Erick. Sorry for the confusion.
>
> I currently display both 'ring tone' as well as 'ringtone' when the user
> types in 'r' but I am trying to figure out a way to display just 'ringtone'
> hence I added 'ring tone' to stopwords list so that it doesn't get indexed.
>
> I have a list of known keywords (more like synonyms) that I am trying to
> map the user-entered keywords against.
>
> ring tone, ringer tone => ringtone

Re: Need idea to standardize keywords - ring tone vs ringtone

bbarani
I tried using synonyms, but they change only the indexed value, not the stored text.

I need a way to change the raw value stored in Solr. Maybe I should use a custom update processor to standardize the data.
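
Since the stored value has to be changed before the document is written, the same normalization can also be done client-side, before the documents are posted to Solr. The Python sketch below shows that idea only and does not use any Solr API. The field names `keyword` and `popularity`, the `CANONICAL` mapping, and the choice to sum popularity when two variants collapse into one entry are all assumptions made for illustration.

```python
# Hypothetical mapping of known variants to their standard form.
CANONICAL = {"ring tone": "ringtone", "ringer tone": "ringtone"}


def normalize_docs(docs, mapping=CANONICAL):
    """Rewrite each doc's keyword to its standard form and merge duplicates."""
    merged = {}  # standard keyword -> output doc (insertion order is kept)
    for doc in docs:
        raw = " ".join(str(doc["keyword"]).lower().split())
        keyword = mapping.get(raw, raw)
        popularity = doc.get("popularity", 0)
        if keyword in merged:
            # Assumption: variants of one keyword pool their popularity.
            merged[keyword]["popularity"] += popularity
        else:
            merged[keyword] = {"keyword": keyword, "popularity": popularity}
    return list(merged.values())


docs = [
    {"keyword": "ring tone", "popularity": 40},
    {"keyword": "ringtone", "popularity": 75},
    {"keyword": "radio", "popularity": 10},
]
for doc in normalize_docs(docs):
    print(doc)
```

With this in place, 'ring tone' and 'ringtone' reach the index as a single 'ringtone' entry, so typing 'r' would suggest it only once. A custom update processor inside Solr would apply the same rewrite at the point where documents are added.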