How can you perform a fuzzy search on a phrase without it turning into a word distance search?

Daniel Einspanjer
I'd like to be able to search for multi-word titles in a fuzzy manner
so that small typos are compensated for, but a query term like:
title:"The increadable machine"~
performs a word distance (proximity) search instead of a fuzzy search.

Is it possible to do this without manually splitting up the title
string I'm searching for into terms and then making a compound query
with each of the terms as a fuzzy?

Thanks,
Daniel

Re: How can you perform a fuzzy search on a phrase without it turning into a word distance search?

Chris Hostetter-3

: Is it possible to do this without manually splitting up the title
: string I'm searching for into terms and then making a compound query
: with each of the terms as a fuzzy?

not out of the box ... Lucene has no native concept of a "fuzzy phrase
query" ... you would either need to implement one, or come up with a
custom QueryParser or Analyzer to do the bulk of the work.

writing that QueryParser might not be that hard, Lucene already has a
FuzzyTermEnum class for getting the list of all Terms similar to a
specific term, and a MultiPhraseQuery for making phrase queries where
each position in the phrase can match any one of several terms you
specify ... it would just be a matter of subclassing QueryParser,
overriding the appropriate method for dealing with PhraseQueries, and
taking each "raw" term and running it through FuzzyTermEnum to get all
the variations to put in your MultiPhraseQuery.
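
to illustrate the idea only, here is a rough, self-contained sketch that
does not use Lucene at all. Everything in it is an assumption made for
illustration: the class name FuzzyPhraseSketch, the hard-coded vocabulary
list (standing in for the terms in the index), a plain Levenshtein edit
distance with a cutoff of 2 (standing in for what FuzzyTermEnum would
return), and a list of term sets, one per phrase position (standing in
for a MultiPhraseQuery).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names are Lucene APIs.
class FuzzyPhraseSketch {

    // Plain Levenshtein edit distance between two strings.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Stand-in for fuzzy term enumeration: every vocabulary term
    // within maxEdits of the raw query term.
    static Set<String> similarTerms(String raw, List<String> vocabulary, int maxEdits) {
        Set<String> out = new LinkedHashSet<>();
        for (String term : vocabulary) {
            if (editDistance(raw, term) <= maxEdits) out.add(term);
        }
        return out;
    }

    // Stand-in for building a multi-phrase query: one set of
    // acceptable terms per position in the phrase.
    static List<Set<String>> expandPhrase(String phrase, List<String> vocabulary, int maxEdits) {
        List<Set<String>> positions = new ArrayList<>();
        for (String raw : phrase.toLowerCase().split("\\s+")) {
            positions.add(similarTerms(raw, vocabulary, maxEdits));
        }
        return positions;
    }

    // Strict phrase match (no slop): consecutive tokens of the text must
    // each belong to the term set of the corresponding position.
    static boolean matches(List<Set<String>> positions, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int start = 0; start + positions.size() <= tokens.length; start++) {
            boolean ok = true;
            for (int p = 0; p < positions.size() && ok; p++) {
                ok = positions.get(p).contains(tokens[start + p]);
            }
            if (ok) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed vocabulary; in a real index these would be the field's terms.
        List<String> vocabulary = List.of("the", "incredible", "machine", "machines", "inedible");
        List<Set<String>> query = expandPhrase("The increadable machine", vocabulary, 2);
        System.out.println(matches(query, "The Incredible Machine")); // true
        System.out.println(matches(query, "the machine incredible")); // false
    }
}
```

the typo is absorbed per term while word order and adjacency are still
enforced, which is the difference from the word distance search that the
trailing ~ on a quoted phrase gives you. a real implementation would pull
the variants from the index instead of a fixed list.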



-Hoss