MultiPhraseQuery

MultiPhraseQuery

baris.kazar
Hi,

How does MultiPhraseQuery treat synonyms?

Is the following possible?

... (the index was created with synonyms, and indexReader is open on it)

IndexSearcher is = new IndexSearcher(indexReader);

MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
// Builder.add(Term) appends at the next position; there is no add(Term, int)
builder.add(new Term("body", "one")); // position 0
builder.add(new Term("body", "two")); // position 1
MultiPhraseQuery mpq = builder.build();
TopDocs hits = is.search(mpq, 20); // top 20 hits

Best regards


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: MultiPhraseQuery

baris.kazar
Trying to implement the example on
https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/search/MultiPhraseQuery.html

> A generalized version of PhraseQuery, with the possibility of adding
> more than one term at the same position that are treated as a
> disjunction (OR). To use this class to search for the phrase "Microsoft
> app*" first create a Builder and use MultiPhraseQuery.Builder.add(Term)
> on the term "microsoft" (assuming lowercase analysis), then find all
> terms that have "app" as prefix using LeafReader.terms(String), seeking
> to "app" then iterating and collecting terms until there is no longer
> that prefix, and finally use MultiPhraseQuery.Builder.add(Term[]) to add
> them. MultiPhraseQuery.Builder.build() returns the fully constructed
> (and immutable) MultiPhraseQuery.


IndexSearcher is = new IndexSearcher(indexReader);

MultiPhraseQuery.Builder builder = new MultiPhraseQuery.Builder();
builder.add(new Term("body", "one")); // position 0

// terms(String) is an instance method. Do I need to obtain a LeafReader
// object first (leafReader below)? Will this be slow? And how do I
// incorporate the prefix "app" here?
Terms terms = leafReader.terms("body");

// I still don't see how to get individual Term objects out of the Terms
// object.
Term[] termArr = new Term[k]; // k terms, to be filled via terms.iterator()
builder.add(termArr); // position 1: any of these terms may match
MultiPhraseQuery mpq = builder.build();
TopDocs hits = is.search(mpq, 20); // top 20 hits
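
For what it's worth, the "seek to the prefix, then collect until the prefix no longer matches" step from the Javadoc can be modeled in plain Java. The sketch below is not Lucene code: a sorted TreeSet stands in for the field's term dictionary, tailSet plays the role of the seek, and the loop stops at the first term that no longer carries the prefix. In Lucene the corresponding pieces would be the TermsEnum returned by Terms.iterator() and its seekCeil(BytesRef) method. PrefixExpansion and expand are names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Plain-Java model of prefix expansion over a sorted term dictionary.
// PrefixExpansion and expand are illustrative names, not Lucene API.
class PrefixExpansion {

    // Returns every term that starts with the given prefix, in sorted order.
    static List<String> expand(NavigableSet<String> sortedTerms, String prefix) {
        List<String> matches = new ArrayList<String>();
        // "Seek" to the first term >= prefix, then scan forward.
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can carry the prefix
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<String>(Arrays.asList(
                "apache", "app", "apple", "application", "apt", "microsoft"));
        System.out.println(expand(terms, "app")); // prints [app, apple, application]
    }
}
```

The collected strings are what would end up, wrapped in Term objects, in the array passed to MultiPhraseQuery.Builder.add(Term[]). Because the dictionary is sorted, the scan only touches the matching terms plus one, so the cost grows with the number of expansions rather than with the size of the dictionary.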


Best regards


On 9/18/18 4:16 PM, [hidden email] wrote:

> Hi,-
>
>  how does MultiPhraseQuery treat synonyms?

Re: MultiPhraseQuery

baris.kazar
Any suggestions, please?
Two main questions:
- How does MultiPhraseQuery make use of synonyms?
- How do we apply the second token, "app", in the example on the
MultiPhraseQuery Javadoc page? (And how do we get a Term[] array from a
Terms object?)

Now three questions :)

I wish the Javadoc had examples like the ones in the PhraseQuery Javadoc.

Best

On 9/18/18 4:45 PM, [hidden email] wrote:

> Trying to implement the example on
> https://lucene.apache.org/core/6_6_1/core/org/apache/lucene/search/MultiPhraseQuery.html

Re: MultiPhraseQuery

Erick Erickson
bq. I wish the Javadoc had examples like the ones in the PhraseQuery Javadoc.

This is where someone coming to the examples for the first time is
invaluable; Javadoc patches are most welcome! It can be hard to step
back far enough to remember what the confusing bits are when you wrote
the code ;)
On Tue, Sep 18, 2018 at 1:56 PM <[hidden email]> wrote:

>
> Any suggestions please?

Re: MultiPhraseQuery

baris.kazar
Erick,

As far as I understand, MultiPhraseQuery was created to handle synonyms.
Am I right?

I want to use either a BooleanQuery or a MultiPhraseQuery (I can't decide
between the two) against an index that already contains synonyms.
One disadvantage of MultiPhraseQuery is that every position of the phrase
has to match.
Should I then go for a BooleanQuery with multiple PhraseQueries? But as
far as I know, PhraseQuery cannot handle synonyms.
I know TermQuery is for exact matches, so I can't use that in this case
either.

I have multiple tokens and I want to be able to do a cheap fuzzy search.
Best regards


On 9/18/18 4:58 PM, Erick Erickson wrote:

> bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.

Re: MultiPhraseQuery

baris.kazar
FuzzyQuery also doesn't seem suitable for me.

PrefixQuery works on a single token only, right?

Best


On 9/18/18 5:23 PM, [hidden email] wrote:

> Erick,-
>  i think the reason why MultiPhraseQuery was created was synonyms as
> far as i understood. am i right?

Re: MultiPhraseQuery

Michael McCandless-2
In reply to this post by Erick Erickson
Yes, +1 for a patch to improve the docs!

MultiPhraseQuery only works for single-term synonyms, and is usually
produced by query parsers when the incoming query text matched
single-term synonyms, I think?  The query parser will use other (span?)
queries for multi-token synonyms.
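
A rough way to picture the single-term case: the analyzer emits a synonym with a position increment of 0, so it lands on the same position as the original token, and the tokens stacked at each position become the alternatives for that position. The plain-Java sketch below models only that grouping step. It is not Lucene code; StackedTokens, Token, and the fast/quick synonym rule are assumptions made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of grouping analyzed tokens by position.
class StackedTokens {

    // Minimal stand-in for an analyzed token: its text plus a position
    // increment (0 = same position as the previous token, 1 = next position).
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Tokens that share a position become alternatives for that position.
    // Increments larger than 1 (gaps) are treated like 1 in this sketch.
    static List<Set<String>> groupByPosition(List<Token> tokens) {
        List<Set<String>> positions = new ArrayList<Set<String>>();
        for (Token t : tokens) {
            if (positions.isEmpty() || t.positionIncrement > 0) {
                positions.add(new LinkedHashSet<String>());
            }
            positions.get(positions.size() - 1).add(t.term);
        }
        return positions;
    }

    public static void main(String[] args) {
        // "fast car" analyzed with a hypothetical synonym rule fast -> quick.
        List<Token> analyzed = Arrays.asList(
                new Token("fast", 1), new Token("quick", 0), new Token("car", 1));
        System.out.println(groupByPosition(analyzed)); // prints [[fast, quick], [car]]
    }
}
```

Each inner set would map to one MultiPhraseQuery.Builder.add(Term[]) call. A multi-token synonym does not fit this shape, because it spans several positions instead of stacking on one.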

I think the example in the javadoc should be simplified to not use "app*",
e.g. maybe just matching "Microsoft Excel|Word"?
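
To make that simpler example concrete, here is a small plain-Java model (not Lucene code) of what a query for "microsoft excel|word" checks: each phrase position carries a set of acceptable terms, and a document matches when consecutive tokens satisfy consecutive positions. It assumes slop 0 and one token per document position; MultiPhraseModel, anyOf, and matches are illustrative names. With the real builder this would correspond to add(Term) for "microsoft" followed by add(Term[]) for the two alternatives.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of exact multi-phrase matching (slop 0).
class MultiPhraseModel {

    // Builds the set of alternatives accepted at one phrase position.
    static Set<String> anyOf(String... terms) {
        return new HashSet<String>(Arrays.asList(terms));
    }

    // True if some window of consecutive tokens satisfies every position:
    // the token at offset i must be one of the alternatives for position i.
    static boolean matches(List<String> tokens, List<Set<String>> positions) {
        for (int start = 0; start + positions.size() <= tokens.size(); start++) {
            boolean all = true;
            for (int i = 0; i < positions.size(); i++) {
                if (!positions.get(i).contains(tokens.get(start + i))) {
                    all = false;
                    break;
                }
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Set<String>> query = Arrays.asList(anyOf("microsoft"), anyOf("excel", "word"));
        System.out.println(matches(Arrays.asList("i", "use", "microsoft", "word"), query)); // true
        System.out.println(matches(Arrays.asList("microsoft", "windows"), query));          // false
        System.out.println(matches(Arrays.asList("word", "microsoft"), query));             // false
    }
}
```

The alternatives are ORed within a position, but every position still has to match, and in order.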

Mike McCandless

http://blog.mikemccandless.com


On Wed, Sep 19, 2018 at 5:59 AM Erick Erickson <[hidden email]> wrote:

> bq. i wish the Javadocs has examples like PhraseQuery Javadocs gave.

Re: MultiPhraseQuery

baris.kazar
OK, Mike, that was very helpful.

Now I think I should use a BooleanQuery with PhraseQueries, but will
PhraseQuery be able to handle all synonyms, multi-term or single-term?

What is the best way to do this?

I have multiple tokens and I want to be able to do a cheap fuzzy search.

Best regards


On 9/18/18 5:28 PM, Michael McCandless wrote:

> Yes, +1 for a patch to improve the docs!
>
> MultiPhraseQuery only works for single term synonyms, and is usually
> produced by query parsers when the incoming query text had single term
> synonyms matching, I think?  The query parser will use other (span?)
> queries for multi token synonyms.