Synonyms with multiple alternatives

I am using Lucene.NET 4.8 and cannot find a decent working example that addresses my problem.

In our source data we have lots of similar items that can be described in different ways, for example "lawnmower", "lawn mower" and "grass cutter".

Obviously we have no control over how people choose to search for such items as they will just enter their most familiar term.

What we need to do is return all items that contain any of those terms or phrases whenever any one of them is used to search, so searching for "lawnmower" could return:

XYZ Electric Lawnmower
ABC Rotary Lawn mower
123 Hover Grass Cutter

Likewise, searching for either of the other terms ("lawn mower" or "grass cutter") should return the same matches as above.
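
To make the requirement concrete, here is a minimal sketch in plain Java with no Lucene dependency. It only illustrates the behaviour I am after and is not how SynonymFilter works internally; the class name, method name, and the naive substring matching are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SynonymGroupSearch {
    // One group of interchangeable terms (taken from the example above).
    static final List<String> GROUP =
        Arrays.asList("lawnmower", "lawn mower", "grass cutter");

    // The three sample items from the example above.
    static final List<String> ITEMS = Arrays.asList(
        "XYZ Electric Lawnmower",
        "ABC Rotary Lawn mower",
        "123 Hover Grass Cutter");

    // Returns every item that contains the query or, when the query
    // belongs to the group, any other member of that group.
    static List<String> search(String query) {
        String q = query.toLowerCase(Locale.ROOT);
        List<String> terms = GROUP.contains(q) ? GROUP : Arrays.asList(q);
        List<String> hits = new ArrayList<>();
        for (String item : ITEMS) {
            String text = item.toLowerCase(Locale.ROOT);
            for (String term : terms) {
                if (text.contains(term)) {
                    hits.add(item);
                    break; // one matching term is enough for this item
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Any member of the group should return all three items.
        System.out.println(search("grass cutter"));
    }
}
```

In this sketch only the query is expanded; the stored item text is left untouched.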

I am looking to implement the SynonymFilter, but I can't grasp how to use it to achieve what we want. I have had some success mapping one term to another, but I can't work out how to extend this to three or more terms in a "group" of similar terms.

So will I always have to add the following combinations to my SynonymMap:
a > b, b > a, a > c, c > a, b > c and c > b?
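
For reference, the combinations above are just every ordered pair within a group, so a group of n terms needs n * (n - 1) mappings if each one has to be added by hand. A minimal sketch in plain Java (no Lucene calls; the class and method names are hypothetical) that enumerates them:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SynonymPairs {
    // Builds every ordered (from, to) pair of distinct terms in a group.
    static List<String[]> orderedPairs(List<String> group) {
        List<String[]> pairs = new ArrayList<>();
        for (String from : group) {
            for (String to : group) {
                if (!from.equals(to)) {
                    pairs.add(new String[] {from, to});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> group =
            Arrays.asList("lawnmower", "lawn mower", "grass cutter");
        // Prints one "from > to" line per mapping (6 lines for 3 terms).
        for (String[] pair : orderedPairs(group)) {
            System.out.println(pair[0] + " > " + pair[1]);
        }
    }
}
```

Presumably each printed pair would correspond to one entry added to the SynonymMap, which is the part I am unsure about.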

Should I apply this both when building the index and when processing the incoming query? My source data can contain different variations of the term, and obviously I cannot predict how people will search for it. Or is it good enough to process only the query so that it looks for all the alternate terms?

Do I retain the original value in the Map when adding the synonym? I can't "see" what is being created to know what is going on under the hood so I can work out the best approach.