AND query in SHOULD

Rapthor
Hi,

I want to implement a search that finds only the exact phrase I provide. If I search for "green tree", I do NOT want results for "green" or "tree" alone, but only results for "green tree" within the given field.

This doesn't work for me so far. When the search string contains whitespace, Lucene returns no results at all, even though there are documents containing those terms in that order. This is what my source looks like:

BooleanQuery bq = new BooleanQuery();
for (String word : words) {
        for (String field : fields) {
                bq.add(new TermQuery(new Term(field, word)), Occur.SHOULD);
        }
}
Hits hits = indexSearcher.search(bq);


The Query String would be: "contents:green tree description:green tree filename:green tree"

What's wrong? How can I achieve this?

Thanks in advance.

Re: AND query in SHOULD

Daniel Naber-10
On Thursday, 22 November 2007, Rapthor wrote:

> I want to realize a search that finds the exact phrase I provide.

You simply need to create a PhraseQuery.

See
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/PhraseQuery.html
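
A plain-Java sketch (no Lucene involved) of why a single term lookup for "green tree" finds nothing while a phrase lookup does. It assumes the field text was lowercased and split on whitespace, which only approximates what a real analyzer does; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class PhraseVsTerm {

    // Assumption: lowercase + whitespace split stands in for a real analyzer.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // A term lookup compares the whole query string with each single token.
    static boolean termMatches(String fieldText, String term) {
        return tokenize(fieldText).contains(term);
    }

    // A phrase lookup requires the phrase tokens at consecutive positions.
    static boolean phraseMatches(String fieldText, String phrase) {
        List<String> tokens = tokenize(fieldText);
        List<String> wanted = tokenize(phrase);
        for (int start = 0; start + wanted.size() <= tokens.size(); start++) {
            if (tokens.subList(start, start + wanted.size()).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String text = "a tall green tree by the road";
        System.out.println(termMatches(text, "green tree"));   // false: no single token is "green tree"
        System.out.println(phraseMatches(text, "green tree")); // true: "green" is directly followed by "tree"
        System.out.println(phraseMatches(text, "tree green")); // false: wrong order
    }
}
```

In this model the original TermQuery corresponds to termMatches: the indexed tokens are "green" and "tree", so a term with the value "green tree" never occurs.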

Regards
 Daniel

--
http://www.danielnaber.de

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: AND query in SHOULD

Rapthor
There is no option to provide an Occur.SHOULD to the PhraseQuery. So where does it go? I changed the source to look like this:

PhraseQuery pq = new PhraseQuery();
for (String word : words) {
        for (String field : fields) {
                pq.add(new Term(field, word));
        }
}
Hits hits = indexSearcher.search(pq);


However, I get an exception:
java.lang.IllegalArgumentException: All phrase terms must be in the same field: description:green tree

I don't understand how to a) search for combinations of words like "green tree", b) search in multiple fields (description, text, ...) and c) search by a SHOULD restriction.

Re: AND query in SHOULD

Erick Erickson
The semantics of the phrase query you're constructing probably aren't
what you think. As best I can infer, you are trying to do something
like

"green tree" in field1
or
"green tree" in field2

but that's not even close to what you're constructing.

It would help to write out the query you want in a form
like the one above before trying to code it, because the actual
query you're making is something like asking for the phrase
"word1 in field1 word1 in field2 word2 in field1 word2 in field2"

Actually, I can't render the semantics of what you're doing in English.
And Lucene can't parse it either.

I suspect you want something like
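import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.PhraseQuery;

// One PhraseQuery per field; each holds all the words, in order.
// Assumes "words" holds the individual tokens of the phrase, e.g. "green", "tree".
PhraseQuery pq1 = new PhraseQuery();
PhraseQuery pq2 = new PhraseQuery();
for (String word : words) {
    pq1.add(new Term("field1", word));
    pq2.add(new Term("field2", word));
}
// SHOULD on both clauses: a match in either field is enough.
BooleanQuery bq = new BooleanQuery();
bq.add(pq1, Occur.SHOULD);
bq.add(pq2, Occur.SHOULD);

Hits hits = indexSearcher.search(bq);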
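
To make the intended semantics concrete, here is a small plain-Java model (not Lucene code) of "the phrase in field1 OR the phrase in field2". Documents are modeled as maps from field name to text, and the whitespace tokenizer is an assumption standing in for a real analyzer; all names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AnyFieldPhrase {

    // Assumption: lowercase + whitespace split stands in for a real analyzer.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // The phrase is checked inside one field at a time (positions never span fields),
    // and the per-field results are OR-ed, like a query made only of SHOULD clauses.
    static boolean matchesAnyField(Map<String, String> doc, List<String> fields, String phrase) {
        List<String> wanted = tokenize(phrase);
        for (String field : fields) {
            String text = doc.get(field);
            if (text != null && Collections.indexOfSubList(tokenize(text), wanted) >= 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> doc1 = new HashMap<>();
        doc1.put("field1", "green tree");
        Map<String, String> doc2 = new HashMap<>();
        doc2.put("field1", "green");
        doc2.put("field2", "tree");
        List<String> fields = Arrays.asList("field1", "field2");
        System.out.println(matchesAnyField(doc1, fields, "green tree")); // true
        System.out.println(matchesAnyField(doc2, fields, "green tree")); // false: words split across fields
    }
}
```

The second document shows why a single phrase cannot mix fields: "green" and "tree" each occur, but never next to each other within one field.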

Best
Erick


Re: AND query in SHOULD

Shai Erera
How about using MultiFieldQueryParser? Here is a short main I wrote:

        Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriter writer = new IndexWriter(dir, analyzer);
        Document doc = new Document();
        doc.add(new Field("field1", "green tree", Store.NO, Index.TOKENIZED));
        writer.addDocument(doc);
        doc = new Document();
        doc.add(new Field("field2", "green tree", Store.NO, Index.TOKENIZED));
        writer.addDocument(doc);
        writer.close();

        IndexSearcher searcher = new IndexSearcher(dir);
        MultiFieldQueryParser qp = new MultiFieldQueryParser(
                new String[] { "field1", "field2" }, analyzer);
        Query q = qp.parse("\"green tree\"");
        /* Basically this should almost always be true */
        if (q instanceof BooleanQuery) {
            BooleanClause[] clauses = ((BooleanQuery) q).getClauses();
            for (int i = 0; i < clauses.length; i++) {
                /* This is their setting by default though */
                clauses[i].setOccur(Occur.SHOULD);
            }
        }
        Hits hits = searcher.search(q);
        System.out.println(hits.length()); /* Should print 2. */
        searcher.close();


--
Regards,

Shai Erera

Re: AND query in SHOULD

Rapthor
Thanks for this example. I am uncertain about one detail: how do I search for multiple keywords? Not just "green tree" but also "short road", "sky", "bird"? Is there a way to add those keywords to the "Query q = qp.parse("\"green tree\"");" call?

EDIT: I have now tried the example in my application. Unfortunately, the parser converts my "\"green tree\"" to just "green", without any quotes. So the second word is ignored by the searcher later.


Re: AND query in SHOULD

Shai Erera
Hi,

I'm not sure I understand the question. You can add as many keywords as you want
to the query (like \"green tree\" \"short road\" sky bird) and it should
behave the same (i.e., search in each field).
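
As a rough sketch of assembling such a query string from a list of keywords: multi-word keywords are wrapped in double quotes so that the parser can treat them as phrases, and single words are left bare. The helper name is hypothetical, and it does not escape query-parser special characters inside the keywords, which real input would probably need.

```java
import java.util.Arrays;
import java.util.List;

public class QueryStringBuilder {

    // Hypothetical helper: quote keywords that contain whitespace, join with spaces.
    static String build(List<String> keywords) {
        StringBuilder sb = new StringBuilder();
        for (String keyword : keywords) {
            String k = keyword.trim();
            if (k.isEmpty()) {
                continue; // skip blank entries
            }
            if (sb.length() > 0) {
                sb.append(' ');
            }
            if (k.split("\\s+").length > 1) {
                sb.append('"').append(k).append('"'); // multi-word keyword becomes a quoted phrase
            } else {
                sb.append(k);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String q = build(Arrays.asList("green tree", "short road", "sky", "bird"));
        System.out.println(q); // "green tree" "short road" sky bird
    }
}
```

The resulting string would then be handed to qp.parse(...) in place of the single quoted phrase.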

Shai

--
Regards,

Shai Erera