Rewrite one phrase to another in search query

Rewrite one phrase to another in search query

Aliaksandr Radzivanovich
What if I need to search for synonyms, but synonyms can be expanded to
phrases of several words?
For example, user enters query "tcp", then my application should also
find documents containing phrase "Transmission Control Protocol". And
conversely, user enters "Transmission Control Protocol", then my
application should also find documents with word "tcp".

It seems like Lucene does not support this scenario out of the box.
Then where to look for the solution? What Lucene
extensions/classes/interfaces should I investigate?

Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Rewrite one phrase to another in search query

Erick Erickson
The synonym analyzer shown in Lucene In Action is a good place
to start. You need to change *all* occurrences of one form into
another, both at index time and at search time, to get consistent results.
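
To make the "change all occurrences of one form into another" idea concrete, here is a minimal plain-Java sketch with no Lucene dependency. The class name PhraseCanonicalizer, its methods, and the naive whitespace tokenizer are all assumptions invented for this illustration; in a real application the same rewrite would more likely live in a token filter inside the analyzer that is shared by indexing and searching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: rewrites every known multi-word
// phrase to one canonical token. Running the same rewrite over document text
// and over query text keeps both sides consistent.
public class PhraseCanonicalizer {
    // phrase (as lowercase tokens) -> canonical token
    private final Map<List<String>, String> rules = new LinkedHashMap<>();

    public void addRule(String phrase, String canonical) {
        List<String> tokens = tokenize(phrase);
        if (tokens.isEmpty()) {
            throw new IllegalArgumentException("phrase must contain at least one token");
        }
        rules.put(tokens, canonical.toLowerCase());
    }

    // Naive tokenizer for the sketch: lowercase, then split on whitespace.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Scans left to right; the first rule whose phrase matches at the current
    // position is replaced by its canonical token.
    public List<String> canonicalize(String text) {
        List<String> in = tokenize(text);
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < in.size()) {
            boolean matched = false;
            for (Map.Entry<List<String>, String> rule : rules.entrySet()) {
                int end = i + rule.getKey().size();
                if (end <= in.size() && in.subList(i, end).equals(rule.getKey())) {
                    out.add(rule.getValue());
                    i = end;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.add(in.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PhraseCanonicalizer c = new PhraseCanonicalizer();
        c.addRule("transmission control protocol", "tcp");
        // Both lines print [a, tcp, interaction]
        System.out.println(c.canonicalize("a Transmission Control Protocol interaction"));
        System.out.println(c.canonicalize("a tcp interaction"));
    }
}
```

Because both forms collapse to the same token, a query for either form finds documents containing either form, at the cost of no longer being able to tell the two apart in the index.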

There are some "interesting" implications for this, though, but they
only really need to be considered if you need either phrase or
span queries. For instance, let's say you have the following doc
fragments:
doc1: "this is a tcp interaction that I want to deal with"
doc2: "this is a transmission control protocol interaction that I want to
deal with"

Is "this" within 4 positions of "interaction" in both documents? Do you care?
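
A small plain-Java sketch of why that question has different answers for the two fragments. PositionGap and gap are hypothetical names made up for this example, tokens are split on single spaces, and the number printed is a plain difference in token positions; how "within 4" maps onto Lucene's slop parameter is not modeled here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Lucene code: compares how far apart two
// terms sit when a synonym is one token versus three tokens.
public class PositionGap {
    // Difference in token positions between the first occurrences of a and b,
    // or -1 if either term is absent.
    static int gap(List<String> tokens, String a, String b) {
        int pa = tokens.indexOf(a);
        int pb = tokens.indexOf(b);
        return (pa < 0 || pb < 0) ? -1 : Math.abs(pb - pa);
    }

    public static void main(String[] args) {
        List<String> doc1 = Arrays.asList(
                "this is a tcp interaction".split(" "));
        List<String> doc2 = Arrays.asList(
                "this is a transmission control protocol interaction".split(" "));
        System.out.println(gap(doc1, "this", "interaction")); // prints 4
        System.out.println(gap(doc2, "this", "interaction")); // prints 6
    }
}
```

If the three-word form is rewritten to the single token "tcp" before indexing, both fragments give 4; if "tcp" is expanded to three tokens instead, both give 6. Either way, the choice changes which proximity queries match.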

Also, is the phrase "transmission control protocol" a match for the
first document? Would the user be confused by matching a document
with "tcp" in it for that phrase?

For that matter, does searching on "transmission" match doc1?
Mostly, these are issues that may or may not be relevant depending
on the intent of the application...

Highlighting also becomes interesting.

Best
Erick



Re: Rewrite one phrase to another in search query

steve_rowe
In reply to this post by Aliaksandr Radzivanovich
Hi Aliaksandr,

Aliaksandr Radzivanovich wrote:
> What if I need to search for synonyms, but synonyms can be expanded to
> phrases of several words?
> For example, user enters query "tcp", then my application should also
> find documents containing phrase "Transmission Control Protocol". And
> conversely, user enters "Transmission Control Protocol", then my
> application should also find documents with word "tcp".

Section 4.6 of Gospodnetić & Hatcher's excellent _Lucene_in_Action_[1]
describes a SynonymAnalyzer class, intended for use at indexing time
(AFAICT, however, their approach does not address multi-word synonyms).
Although a query-time analyzer is not directly discussed, they do say
(on p. 134):

   The awkwardly named PhrasePrefixQuery (see section 5.2)
   is one option to consider, perhaps created through an
   overridden QueryParser.getFieldQuery method; this is a
   possible option to explore if you wish to implement
   synonym injection at query time.
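
As a simpler stand-in for the PhrasePrefixQuery route (and explicitly not that technique), query-time injection can also be sketched as a rewrite of the user's input into an OR of all equivalent forms, written in Lucene's classic query syntax and then handed to the query parser. QuerySynonymExpander and its methods are hypothetical names invented for this example. The sketch assumes the whole input is a single term or phrase, and it does not escape query-syntax special characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: expands a user's term or phrase
// into a query string that ORs together all of its known equivalent forms.
public class QuerySynonymExpander {
    // Each group lists equivalent forms; multi-word forms are treated as phrases.
    private final List<List<String>> groups = new ArrayList<>();

    public void addGroup(String... forms) {
        List<String> group = new ArrayList<>();
        for (String form : forms) {
            group.add(normalize(form));
        }
        groups.add(group);
    }

    // Lowercase and collapse runs of whitespace so lookups are consistent.
    private static String normalize(String s) {
        return s.trim().toLowerCase().replaceAll("\\s+", " ");
    }

    // Multi-word forms are wrapped in double quotes so they parse as phrases.
    private static String quoteIfPhrase(String form) {
        return form.contains(" ") ? "\"" + form + "\"" : form;
    }

    // Returns the input unchanged when it matches no synonym group.
    public String expand(String userInput) {
        String key = normalize(userInput);
        for (List<String> group : groups) {
            if (group.contains(key)) {
                List<String> parts = new ArrayList<>();
                for (String form : group) {
                    parts.add(quoteIfPhrase(form));
                }
                return "(" + String.join(" OR ", parts) + ")";
            }
        }
        return userInput;
    }

    public static void main(String[] args) {
        QuerySynonymExpander expander = new QuerySynonymExpander();
        expander.addGroup("tcp", "transmission control protocol");
        // Both lines print (tcp OR "transmission control protocol")
        System.out.println(expander.expand("tcp"));
        System.out.println(expander.expand("Transmission Control Protocol"));
    }
}
```

Unlike rewriting at index time, this leaves the index untouched, so the synonym list can change without reindexing; the trade-off is a larger query for every expanded term.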

Steve

[1] http://lucenebook.com/

--
Steve Rowe
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp


Re: Rewrite one phrase to another in search query

Chris Hostetter-3

: (AFACT, however, their approach does not address multi-word synonyms).
: Although a query-time analyzer is not directly discussed, they do say

Solr has a SynonymFilter that does handle multi-word synonyms, and it
can handle query-time synonyms, but there are some caveats to both of
those use cases (mainly that you can have one or the other but not both)
that you need to consider carefully. They are well documented in the Solr
wiki...

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#SynonymFilter




-Hoss



Re: Rewrite one phrase to another in search query

Mark Miller-3
In reply to this post by Aliaksandr Radzivanovich
You might try my Query Parser, Qsol. http://myhardshadow.com/qsol.php
There is a find/replace feature that will do what you want. FindReplace
takes the find string, the replace string, a boolean for case sensitivity,
and a boolean indicating whether the replacement acts as an operator
(which allows for correct default space operator functionality).

- Mark
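Example Code:

        // Assumes the Qsol classes (QsolParser, ParserFactory,
        // QsolConfiguration, FindReplace) are imported and that JUnit's
        // assertEquals is available.
        QsolParser parser = ParserFactory.getInstance(new QsolConfiguration())
                                         .getParser(false);

        // Arguments: find string, replace string, case sensitive,
        // replacement acts as an operator.
        parser.addFindReplace(new FindReplace("\"the old fast razor\"", "tofr",
                true, false));

        parser.addFindReplace(new FindReplace("tofr", "\"the old fast razor\"",
                true, false));

        String example = "test(\"the old fast razor\" & mark)";
        String expected = "+test:tofr +test:mark";
        assertEquals(expected, parse(example));

Parse Method:

        // Body of parse(query): parses against the default field "field" and
        // returns the resulting query as a string ("" for an empty query).
        Query result = null;
        try {
            result = parser.parse("field", query, analyzer);
        } catch (EmptyQueryException e) {
            return "";
        } catch (QsolSyntaxException e) {
            throw new RuntimeException(e);
        }

        return result.toString();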


Re: Rewrite one phrase to another in search query

Tom Emerson-3
Hi Mark,

Thanks for the pointer, but for my application I already have a custom query
parser, and I think the use of a functional query will do what I want.

    -tree

--
Tom Emerson
[hidden email]
http://www.dreamersrealm.net/~tree

Re: Rewrite one phrase to another in search query

Tom Emerson-3
Bleh, never mind this, I've replied to the wrong thread... mea culpa.

--
Tom Emerson
[hidden email]
http://www.dreamersrealm.net/~tree