SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene


baris.kazar
Hi,

I hope everyone is doing great.

I want to run the following search with PhraseWildcardQuery (thanks to
this forum, especially David and Bruno, for letting me know about this
class):

term1 term2FirstChar*

I see two ways to do it. (I found the source code at
https://fossies.org/linux/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java)

/*

maxMultiTermExpansions - The maximum number of expansions across all
multi-terms and across all segments. It counts expansions for each
segments individually, that allows optimizations per segment and unused
expansions are credited to next segments. This is different from
MultiPhraseQuery and SpanMultiTermQueryWrapper which have an expansion
limit per multi-term.

segmentOptimizationEnabled - Whether to enable the segment optimization
which consists in ignoring a segment for further analysis as soon as a
term is not present inside it. This optimizes the query execution
performance but changes the scoring. The result ranking is preserved.

*/


1st way:

PhraseWildcardQuery.Builder pwcqBuilder = new PhraseWildcardQuery.Builder(field,
    2 /* I don't know what number to use here for maxMultiTermExpansions */,
    true /* boolean segmentOptimizationEnabled */);

pwcqBuilder.addTerm(new Term(field, "term1"));

pwcqBuilder.addTerm(new Term(field, "term2FirstChar"));

PhraseWildcardQuery pwcq = pwcqBuilder.build();

or

2nd way:

pwcqBuilder.addMultiTerm(/* a MultiTermQuery object here containing
    {field, "term1"} and {field, "term2FirstChar"} */);

PhraseWildcardQuery pwcq = pwcqBuilder.build();


Then this pwcq object will be passed to IndexSearcher as the query
parameter.


Now, it looks like the first way will not consider expansions, or in
other words the wildcard. Am I right?

I also need to understand the maxMultiTermExpansions parameter better.
For instance, if the first way is used, will maxMultiTermExpansions be
meaningful?


Thanks


Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

Michael Froh
Hi Baris,

The idea with PhraseWildcardQuery is that you can mix literal "exact" terms
with "MultiTerms" (i.e. any subclass of MultiTermQuery). Using addTerm is
for exact terms, while addMultiTerm is for things that may match a number
of possible terms in the given position.
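
As a rough illustration of that distinction, here is a plain-Java sketch that does not use Lucene at all. It models each phrase position as a predicate over the token at that position. The names PhraseSlots, exact, prefix and matches are hypothetical and exist only for this example; Lucene itself works on postings and term positions rather than token lists.

```java
import java.util.List;
import java.util.function.Predicate;

public class PhraseSlots {
    // Hypothetical helper: a slot that accepts exactly one term (the addTerm case).
    static Predicate<String> exact(String term) {
        return term::equals;
    }

    // Hypothetical helper: a slot that accepts any term starting with the prefix
    // (the addMultiTerm case with a prefix-style query).
    static Predicate<String> prefix(String p) {
        return token -> token.startsWith(p);
    }

    // Returns true if some run of consecutive tokens satisfies the slots in order.
    static boolean matches(List<String> tokens, List<Predicate<String>> slots) {
        for (int start = 0; start + slots.size() <= tokens.size(); start++) {
            boolean ok = true;
            for (int i = 0; i < slots.size() && ok; i++) {
                ok = slots.get(i).test(tokens.get(start + i));
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = List.of("new", "york", "city");
        // Two exact slots: "yo" would have to be a whole term, so this prints false.
        System.out.println(matches(doc, List.of(exact("new"), exact("yo"))));
        // Exact slot followed by a prefix slot: "york" starts with "yo", so this prints true.
        System.out.println(matches(doc, List.of(exact("new"), prefix("yo"))));
    }
}
```

In this toy model, adding "term2FirstChar" as an exact slot only matches that literal term, which is why the prefix has to go in through a multi-term slot.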

If you want to search for term1 followed by any term that starts with a
given character, I would suggest using:

int maxMultiTermExpansions = ...; // Discussed below
PhraseWildcardQuery.Builder builder =
    new PhraseWildcardQuery.Builder("field", maxMultiTermExpansions);
builder.addTerm(new BytesRef("term1")); // Add fixed term in position 0
// Add multiterm in position 1
builder.addMultiTerm(new PrefixQuery(new Term("field", "term2FirstChar")));
Query q = builder.build();

The PrefixQuery effectively gets expanded into a bunch of possible terms,
based on the term dictionary on each index segment. To avoid expanding to
cover too many terms (say, if you added a bunch of WildcardQuery),
maxMultiTermExpansions serves as a guard rail, to put a rough bound on
memory consumption and query execution time. If you're interested in
details of how the maxMultiTermExpansions budget is distributed across
MultiTerms, check out PhraseWildcardQuery.createWeight. If you're just
running an experiment in your IDE, you could probably set
maxMultiTermExpansions to Integer.MAX_VALUE. (If you're running in a
production environment, it's likely a good idea to tune it down based on
your memory/latency constraints.)
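
To give a feel for what a shared expansion budget means, here is a simplified plain-Java sketch. It is not the logic in PhraseWildcardQuery.createWeight, which distributes the budget in a more elaborate way. It only assumes a single counter that is consumed segment by segment while a prefix is expanded against each segment's sorted term dictionary. ExpansionBudget and expand are hypothetical names, and each segment is modelled as a TreeSet of terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class ExpansionBudget {
    // Expands a prefix against each segment's sorted term dictionary, stopping
    // once the shared budget is used up. Returns the expanded terms per segment.
    static List<List<String>> expand(List<TreeSet<String>> segments, String prefix, int maxExpansions) {
        int remaining = maxExpansions; // one budget shared by all segments (assumption)
        List<List<String>> result = new ArrayList<>();
        for (TreeSet<String> terms : segments) {
            List<String> expanded = new ArrayList<>();
            // tailSet(prefix) starts at the first term >= prefix in sorted order,
            // so all terms sharing the prefix come first and are contiguous.
            for (String term : terms.tailSet(prefix)) {
                if (!term.startsWith(prefix) || remaining == 0) {
                    break;
                }
                expanded.add(term);
                remaining--;
            }
            result.add(expanded);
        }
        return result;
    }

    public static void main(String[] args) {
        List<TreeSet<String>> segments = List.of(
            new TreeSet<>(List.of("apple", "apply", "banana")),
            new TreeSet<>(List.of("ape", "apricot", "apt")));
        // With a budget of 4, the second segment runs out before reaching "apt".
        System.out.println(expand(segments, "ap", 4)); // [[apple, apply], [ape, apricot]]
    }
}
```

Passing Integer.MAX_VALUE as the budget in this sketch expands every matching term, which mirrors the suggestion above for quick experiments.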

Incidentally, for tracking down the source code for anything in Lucene,
it's probably better to go to GitHub for the most up-to-date source:
https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java

Hope that helps,
Michael

On Thu, 13 Feb 2020 at 12:29, <[hidden email]> wrote:

> [...]

Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

baris.kazar
Michael and Forum,
Thanks for the great explanations.

One question please:

Why is PrefixQuery used instead of WildcardQuery in the snippet below?

Best regards

> On Feb 17, 2020, at 3:01 PM, Michael Froh <[hidden email]> wrote:
>
> int maxMultiTermExpansions = ...; // Discussed below
> PhraseWildcardQuery.Builder builder =
>     new PhraseWildcardQuery.Builder("field", maxMultiTermExpansions);
> builder.addTerm(new BytesRef("term1")); // Add fixed term in position 0
> // Add multiterm in position 1
> builder.addMultiTerm(new PrefixQuery(new Term("field", "term2FirstChar")));
> Query q = builder.build();

Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

Michael Froh
In your example, it looks like you wanted the second term to match based on
the first character, or prefix, of the term.

While you could use a WildcardQuery with a term value of "term2FirstChar*",
PrefixQuery seemed like the simpler approach. WildcardQuery can handle more
general cases, like if you want to match on something like "a*b*c".
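
As a small illustration of why the two are interchangeable for this case, here is a plain-Java sketch with no Lucene involved. It assumes the usual wildcard meaning of '*' (any sequence of characters) and '?' (any single character) and translates the pattern to a java.util.regex pattern. WildcardVsPrefix and its methods are hypothetical names for this example only.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class WildcardVsPrefix {
    // Translates a wildcard pattern ('*' = any sequence, '?' = any single character)
    // into a regular expression; every other character is matched literally.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    // Keeps the terms that match the whole wildcard pattern.
    static List<String> matchWildcard(List<String> terms, String wildcard) {
        Pattern p = toRegex(wildcard);
        return terms.stream().filter(t -> p.matcher(t).matches()).collect(Collectors.toList());
    }

    // Keeps the terms that start with the given prefix.
    static List<String> matchPrefix(List<String> terms, String prefix) {
        return terms.stream().filter(t -> t.startsWith(prefix)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> terms = List.of("abc", "axbyc", "ab", "bac", "a");
        System.out.println(matchPrefix(terms, "a"));        // [abc, axbyc, ab, a]
        System.out.println(matchWildcard(terms, "a*"));     // [abc, axbyc, ab, a]
        System.out.println(matchWildcard(terms, "a*b*c"));  // [abc, axbyc]
    }
}
```

The prefix "a" and the wildcard "a*" select the same terms here, while a pattern like "a*b*c" cannot be expressed as a prefix.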

Technically, the PrefixQuery compiles down to a slightly simpler automaton,
but I only figured that out by writing a simple unit test:

    public void testAutomata() {
        Automaton prefixAutomaton = PrefixQuery.toAutomaton(new BytesRef("a"));
        Automaton wildcardAutomaton = WildcardQuery.toAutomaton(new Term("foo", "a*"));

        System.out.println("PrefixQuery(\"a\")");
        System.out.println(prefixAutomaton.toDot());
        System.out.println("WildcardQuery(\"a*\")");
        System.out.println(wildcardAutomaton.toDot());
    }

That produces the following output:

PrefixQuery("a")
digraph Automaton {
  rankdir = LR
  node [width=0.2, height=0.2, fontsize=8]
  initial [shape=plaintext,label=""]
  initial -> 0
  0 [shape=circle,label="0"]
  0 -> 1 [label="a"]
  1 [shape=doublecircle,label="1"]
  1 -> 1 [label="\\U00000000-\\U000000ff"]
}
WildcardQuery("a*")
digraph Automaton {
  rankdir = LR
  node [width=0.2, height=0.2, fontsize=8]
  initial [shape=plaintext,label=""]
  initial -> 0
  0 [shape=circle,label="0"]
  0 -> 1 [label="a"]
  1 [shape=doublecircle,label="1"]
  1 -> 2 [label="\\U00000000-\\U0010ffff"]
  2 [shape=doublecircle,label="2"]
  2 -> 2 [label="\\U00000000-\\U0010ffff"]
}



On Tue, 18 Feb 2020 at 13:52, <[hidden email]> wrote:

> [...]

Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

baris.kazar
Michael and Forum,
This is amazing, thanks.

I will try both cases.

I can also have "term1 term2Char1term2Char2*",
and so on with term2's next characters.

I hope the latest version of this class on GitHub
works with Lucene version 7.7.2.

Best regards

> On Feb 18, 2020, at 8:33 PM, Michael Froh <[hidden email]> wrote:
> [...]

Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

baris.kazar
Hi,

Thanks again to Michael, David, Bruno and the forum for letting me know
about this repository.

The version of PhraseWildcardQuery at
https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java
uses some classes that are not available in Lucene version 7.7.2.

PhraseWildcardQuery relies on a number of new and modified classes,
such as QueryVisitor and ScoreMode.

I will try to add these classes from
https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene

and I hope it will work with Lucene 7.7.2.

Best regards



On 2/18/20 8:49 PM, [hidden email] wrote:

> Michael and Forum,-
> This is amazing, thanks.
>
> i will try both cases.
>
> i can also have "term1 term2Char1term2Char2*"
> and so on with term2's next chars.
>
> I hope the latest version on github for this
> class works with Lucene Version 7.7.2.
>
> Best regards
>
>> On Feb 18, 2020, at 8:33 PM, Michael Froh <[hidden email]> wrote:
>>
>> 
>> In your example, it looks like you wanted the second term to match
>> based on the first character, or prefix, of the term.
>>
>> While you could use a WildcardQuery with a term value of
>> "term2FirstChar*", PrefixQuery seemed like the simpler approach.
>> WildcardQuery can handle more general cases, like if you want to
>> match on something like "a*b*c".
>>
>> Technically, the PrefixQuery compiles down to a slightly simpler
>> automaton, but I only figured that out by writing a simple unit test:
>>
>>     public void testAutomata() {
>>         Automaton prefixAutomaton = PrefixQuery.toAutomaton(new
>> BytesRef("a"));
>>         Automaton wildcardAutomaton = WildcardQuery.toAutomaton(new
>> Term("foo", "a*"));
>>
>>         System.out.println("PrefixQuery(\"a\")");
>>         System.out.println(prefixAutomaton.toDot());
>>         System.out.println("WildcardQuery(\"a*\")");
>>         System.out.println(wildcardAutomaton.toDot());
>>     }
>>
>> That produces the following output:
>>
>> PrefixQuery("a")
>> digraph Automaton {
>>   rankdir = LR
>>   node [width=0.2, height=0.2, fontsize=8]
>>   initial [shape=plaintext,label=""]
>>   initial -> 0
>>   0 [shape=circle,label="0"]
>>   0 -> 1 [label="a"]
>>   1 [shape=doublecircle,label="1"]
>>   1 -> 1 [label="\\U00000000-\\U000000ff"]
>> }
>> WildcardQuery("a*")
>> digraph Automaton {
>>   rankdir = LR
>>   node [width=0.2, height=0.2, fontsize=8]
>>   initial [shape=plaintext,label=""]
>>   initial -> 0
>>   0 [shape=circle,label="0"]
>>   0 -> 1 [label="a"]
>>   1 [shape=doublecircle,label="1"]
>>   1 -> 2 [label="\\U00000000-\\U0010ffff"]
>>   2 [shape=doublecircle,label="2"]
>>   2 -> 2 [label="\\U00000000-\\U0010ffff"]
>> }
>>
>>
>>
>> On Tue, 18 Feb 2020 at 13:52, <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>>     Michael and Forum,-
>>     Thanks for thegreat explanations.
>>
>>     one question please:
>>
>>     why is PrefixQuery used instead of WildCardQuery in the below
>>     snippet?
>>
>>     Best regards
>>
>>     > On Feb 17, 2020, at 3:01 PM, Michael Froh <[hidden email]
>>     <mailto:[hidden email]>> wrote:
>>     >
>>     > Hi Baris,
>>     >
>>     > The idea with PhraseWildcardQuery is that you can mix literal
>>     "exact" terms
>>     > with "MultiTerms" (i.e. any subclass of MultiTermQuery). Using
>>     addTerm is
>>     > for exact terms, while addMultiTerm is for things that may
>>     match a number
>>     > of possible terms in the given position.
>>     >
>>     > If you want to search for term1 followed by any term that
>>     starts with a
>>     > given character, I would suggest using:
>>     >
>>     > int maxMultiTermExpansions = ...; // Discussed below
>>     > PhraseWildCardQuery.Builder builder = new
>>     PhraseWildcardQuery("field",
>>     > maxMultiTermExpansions);
>>     > builder.addTerm(new BytesRef("term1")); // Add fixed term in
>>     position 0
>>     > builder.addMultiTerm(new PrefixQuery(new Term("field",
>>     "term2FirstChar")));
>>     > // Add multiterm in position 1
>>     > Query q = builder.build();
>>     >
>>     > The PrefixQuery effectively gets expanded into a bunch of
>>     possible terms,
>>     > based on the term dictionary on each index segment. To avoid
>>     expanding to
>>     > cover too many terms (say, if you added a bunch of WildcardQuery),
>>     > maxMultiTermExpansions serves as a guard rail, to put a rough
>>     bound on
>>     > memory consumption and query execution time. If you're
>>     interested in
>>     > details of how the maxMultiTermExpansions budget is distributed
>>     across
>>     > MultiTerms, check out PhraseWildcardQuery.createWeight. If
>>     you're just
>>     > running an experiment in your IDE, you could probably set
>>     > maxMultiTermExpansions to Integer.MAX_VALUE. (If you're running
>>     in a
>>     > production environment, it's likely a good idea to tune it down
>>     based on
>>     > your memory/latency constraints.)
>>     >
>>     > Incidentally, for tracking down the source code for anything in
>>     Lucene,
>>     > it's probably better to go to GitHub for the most up-to-date
>>     source:
>>     >
>>     https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java__;!!GqivPVa7Brio!ONqQgLIltNBUuSo5Cn_Fz7-wuR1LQv68YS_z-6g7X-S86PHQtT9tKl7VbIq9tVLYyw$
>>
>>     > .
>>     >
>>     > Hope that helps,
>>     > Michael
>>     >
>>     >> On Thu, 13 Feb 2020 at 12:29, <[hidden email]
>>     <mailto:[hidden email]>> wrote:
>>     >>
>>     >> Hi,-
>>     >>
>>     >> i hope everyone is doing great.
>>     >>
>>     >>  if i want to do the following search with PhraseWildCardQuery and
>>     >> thanks to this forum for letting me know about this class
>>     (Especially to
>>     >> David and Bruno)
>>     >>
>>     >> term1 term2FirstChar*
>>     >>
>>     >> i need to do two ways: (i found the source code at
>>     >>
>>     >>
>>     https://urldefense.com/v3/__https://fossies.org/linux/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java__;!!GqivPVa7Brio!ONqQgLIltNBUuSo5Cn_Fz7-wuR1LQv68YS_z-6g7X-S86PHQtT9tKl7VbIpV8n29nQ$
>>
>>     >> )
>>     >>
>>     >> /*
>>     >>
>>     >> maxMultiTermExpansions - The maximum number of expansions
>>     across all
>>     >> multi-terms and across all segments. It counts expansions for each
>>     >> segments individually, that allows optimizations per segment
>>     and unused
>>     >> expansions are credited to next segments. This is different from
>>     >> MultiPhraseQuery and SpanMultiTermQueryWrapper which have an
>>     expansion
>>     >> limit per multi-term.
>>     >>
>>     >> segmentOptimizationEnabled - Whether to enable the segment
>>     optimization
>>     >> which consists in ignoring a segment for further analysis as
>>     soon as a
>>     >> term is not present inside it. This optimizes the query execution
>>     >> performance but changes the scoring. The result ranking is
>>     preserved.
>>     >>
>>     >> */
>>     >>
>>     >>
>>     >> 1st way:
>>     >>
>>     >> PhraseWildcardQuery.Builder pwcqBuilder =
>>     >>     new PhraseWildcardQuery.Builder(field,
>>     >>         2 /* i dont know what number to use here for maxMultiTermExpansions */,
>>     >>         true /* boolean segmentOptimizationEnabled */);
>>     >>
>>     >> pwcqBuilder.addTerm(new Term(field, "term1"));
>>     >>
>>     >> pwcqBuilder.addTerm(new Term(field, "term2FirstChar"));
>>     >>
>>     >> PhraseWildcardQuery pwcq = pwcqBuilder.build();
>>     >>
>>     >> or
>>     >>
>>     >> 2nd way:
>>     >>
>>     >> pwcqBuilder.addMultiTerm(MultiTermQuery object here contaning
>>     {field,
>>     >> "term1"} and {field ,"term2FirstChar"});
>>     >>
>>     >> PhraseWildCardQuery pwcq = pwcqBuilder.build();
>>     >>
>>     >>
>>     >> Then this pwcq object will be fed into IndexSearcher's as the
>>     query
>>     >> parameter.
>>     >>
>>     >>
>>     >> Now, it looks like the first way will not consider expansions
>>     or in
>>     >> other words wildcard? Am i right?
>>     >>
>>     >> i also need to understand this maxMultiTermExpansions
>>     parameter better.
>>     >> For instance if first way is used, will maxMultiTermExpansions be
>>     >> meaningful?
>>     >>
>>     >>
>>     >> Thanks
>>     >>
>>     >>
>>
>>
>>     ---------------------------------------------------------------------
>>     To unsubscribe, e-mail: [hidden email]
>>     <mailto:[hidden email]>
>>     For additional commands, e-mail: [hidden email]
>>     <mailto:[hidden email]>
>>

Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

baris.kazar
Hi,-

  Is there a JAR file for the classes in the
https://github.com/apache/lucene-solr/tree/master/lucene/core/src/java/org/apache/lucene/search
directory, and for the index and analysis directories?

https://github.com/apache/lucene-solr/tree/master/lucene/core/src/java/org/apache/lucene/search
does not contain the PhraseWildcardQuery class, though.

As Michael mentioned, I pulled it from

https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java


It seems that many classes in these directories are incompatible with
Lucene 7.7.2; they are probably from the Lucene 8.x series.

It would be very nice to have a JAR file that makes these classes
usable with Lucene 7.x versions.
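
One quick way to see which of these classes a given set of Lucene JARs actually provides is to probe the classpath by name. The sketch below is only an illustration: the class name LuceneApiProbe is made up, and the three class names it checks are assumptions taken from this thread (QueryVisitor and ScoreMode are mentioned elsewhere in the thread as missing from 7.7.2, and the package of PhraseWildcardQuery is read off the sandbox source path above). It needs nothing beyond the JDK, so it simply reports "missing" when no Lucene JAR is on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Reports which of a list of classes can be loaded from the current classpath. */
final class LuceneApiProbe {

    /** Returns true if the named class can be found, without initializing it. */
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, LuceneApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Probes each name in order and records whether it was found. */
    static Map<String, Boolean> probe(String... classNames) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : classNames) {
            result.put(name, isPresent(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Class names are assumptions based on the classes named in this thread.
        probe("org.apache.lucene.search.QueryVisitor",
              "org.apache.lucene.search.ScoreMode",
              "org.apache.lucene.search.PhraseWildcardQuery")
            .forEach((name, present) ->
                System.out.println(name + " -> " + (present ? "found" : "missing")));
    }
}
```

Running it once with the 7.7.2 JARs and once with the 8.x JARs on the classpath would show which of the required classes each version supplies.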


Best regards


On 2/19/20 3:42 PM, [hidden email] wrote:

> Hi,-
>
> Thanks again Michael, David and Bruno and the Forum for letting me
> know this repository.
>
> The version of PhraseWildCardQuery on
> https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java__;!!GqivPVa7Brio!M9dm0zfCQgHUNDsJMygJ5_Im1XhQeqAc-0gAWg-a0Cpt4AqkJB0Bb85olDByacVbfA$ 
> uses some classes that are not available in Lucene version 7.7.2.
>
> There are a number of new and modified classes used by the
> PhraseWildcardQuery class, such as QueryVisitor, ScoreMode etc.
>
> I will try to add these classes from
> https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene__;!!GqivPVa7Brio!M9dm0zfCQgHUNDsJMygJ5_Im1XhQeqAc-0gAWg-a0Cpt4AqkJB0Bb85olDDFbpGxRQ$ 
>
> and i hope it will work with Lucene 7.7.2.
>
> Best regards
>
>
>
> On 2/18/20 8:49 PM, [hidden email] wrote:
>> Michael and Forum,-
>> This is amazing, thanks.
>>
>> i will try both cases.
>>
>> i can also have "term1 term2Char1term2Char2*"
>> and so on with term2's next chars.
>>
>> I hope the latest version on github for this
>> class works with Lucene Version 7.7.2.
>>
>> Best regards
>>
>>> On Feb 18, 2020, at 8:33 PM, Michael Froh <[hidden email]> wrote:
>>>
>>> 
>>> In your example, it looks like you wanted the second term to match
>>> based on the first character, or prefix, of the term.
>>>
>>> While you could use a WildcardQuery with a term value of
>>> "term2FirstChar*", PrefixQuery seemed like the simpler approach.
>>> WildcardQuery can handle more general cases, like if you want to
>>> match on something like "a*b*c".
>>>
>>> Technically, the PrefixQuery compiles down to a slightly simpler
>>> automaton, but I only figured that out by writing a simple unit test:
>>>
>>>     public void testAutomata() {
>>>         Automaton prefixAutomaton =
>>>             PrefixQuery.toAutomaton(new BytesRef("a"));
>>>         Automaton wildcardAutomaton =
>>>             WildcardQuery.toAutomaton(new Term("foo", "a*"));
>>>
>>>         System.out.println("PrefixQuery(\"a\")");
>>>         System.out.println(prefixAutomaton.toDot());
>>>         System.out.println("WildcardQuery(\"a*\")");
>>>         System.out.println(wildcardAutomaton.toDot());
>>>     }
>>>
>>> That produces the following output:
>>>
>>> PrefixQuery("a")
>>> digraph Automaton {
>>>   rankdir = LR
>>>   node [width=0.2, height=0.2, fontsize=8]
>>>   initial [shape=plaintext,label=""]
>>>   initial -> 0
>>>   0 [shape=circle,label="0"]
>>>   0 -> 1 [label="a"]
>>>   1 [shape=doublecircle,label="1"]
>>>   1 -> 1 [label="\\U00000000-\\U000000ff"]
>>> }
>>> WildcardQuery("a*")
>>> digraph Automaton {
>>>   rankdir = LR
>>>   node [width=0.2, height=0.2, fontsize=8]
>>>   initial [shape=plaintext,label=""]
>>>   initial -> 0
>>>   0 [shape=circle,label="0"]
>>>   0 -> 1 [label="a"]
>>>   1 [shape=doublecircle,label="1"]
>>>   1 -> 2 [label="\\U00000000-\\U0010ffff"]
>>>   2 [shape=doublecircle,label="2"]
>>>   2 -> 2 [label="\\U00000000-\\U0010ffff"]
>>> }
>>>
>>>
>>>
>>> On Tue, 18 Feb 2020 at 13:52, <[hidden email]
>>> <mailto:[hidden email]>> wrote:
>>>
>>>     Michael and Forum,-
>>>     Thanks for the great explanations.
>>>
>>>     One question, please:
>>>
>>>     Why is PrefixQuery used instead of WildcardQuery in the snippet
>>>     below?
>>>
>>>     Best regards
>>>
>>>     > On Feb 17, 2020, at 3:01 PM, Michael Froh <[hidden email]
>>>     <mailto:[hidden email]>> wrote:
>>>     >
>>>     > Hi Baris,
>>>     >
>>>     > The idea with PhraseWildcardQuery is that you can mix literal
>>>     "exact" terms
>>>     > with "MultiTerms" (i.e. any subclass of MultiTermQuery). Using
>>>     addTerm is
>>>     > for exact terms, while addMultiTerm is for things that may
>>>     match a number
>>>     > of possible terms in the given position.
>>>     >
>>>     > If you want to search for term1 followed by any term that
>>>     starts with a
>>>     > given character, I would suggest using:
>>>     >
>>>     > int maxMultiTermExpansions = ...; // Discussed below
>>>     > PhraseWildcardQuery.Builder builder =
>>>     >     new PhraseWildcardQuery.Builder("field", maxMultiTermExpansions);
>>>     > // Add fixed term in position 0
>>>     > builder.addTerm(new BytesRef("term1"));
>>>     > // Add multiterm in position 1
>>>     > builder.addMultiTerm(new PrefixQuery(new Term("field", "term2FirstChar")));
>>>     > Query q = builder.build();
>>>     >
>>>     > The PrefixQuery effectively gets expanded into a bunch of
>>>     possible terms,
>>>     > based on the term dictionary on each index segment. To avoid
>>>     expanding to
>>>     > cover too many terms (say, if you added a bunch of
>>> WildcardQuery),
>>>     > maxMultiTermExpansions serves as a guard rail, to put a rough
>>>     bound on
>>>     > memory consumption and query execution time. If you're
>>>     interested in
>>>     > details of how the maxMultiTermExpansions budget is distributed
>>>     across
>>>     > MultiTerms, check out PhraseWildcardQuery.createWeight. If
>>>     you're just
>>>     > running an experiment in your IDE, you could probably set
>>>     > maxMultiTermExpansions to Integer.MAX_VALUE. (If you're running
>>>     in a
>>>     > production environment, it's likely a good idea to tune it down
>>>     based on
>>>     > your memory/latency constraints.)
>>>     >
>>>     > Incidentally, for tracking down the source code for anything in
>>>     Lucene,
>>>     > it's probably better to go to GitHub for the most up-to-date
>>>     source:
>>>     >
>>> https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java__;!!GqivPVa7Brio!ONqQgLIltNBUuSo5Cn_Fz7-wuR1LQv68YS_z-6g7X-S86PHQtT9tKl7VbIq9tVLYyw$
>>>
>>>     > .
>>>     >
>>>     > Hope that helps,
>>>     > Michael
>>>     >
>>>     >> On Thu, 13 Feb 2020 at 12:29, <[hidden email]
>>>     <mailto:[hidden email]>> wrote:
>>>     >>
>>>     >> Hi,-
>>>     >>
>>>     >> i hope everyone is doing great.
>>>     >>
>>>     >>  if i want to do the following search with
>>> PhraseWildCardQuery and
>>>     >> thanks to this forum for letting me know about this class
>>>     (Especially to
>>>     >> David and Bruno)
>>>     >>
>>>     >> term1 term2FirstChar*
>>>     >>
>>>     >> i need to do two ways: (i found the source code at
>>>     >>
>>>     >>
>>> https://urldefense.com/v3/__https://fossies.org/linux/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java__;!!GqivPVa7Brio!ONqQgLIltNBUuSo5Cn_Fz7-wuR1LQv68YS_z-6g7X-S86PHQtT9tKl7VbIpV8n29nQ$
>>>
>>>     >> )
>>>     >>
>>>     >> /*
>>>     >>
>>>     >> maxMultiTermExpansions - The maximum number of expansions
>>>     across all
>>>     >> multi-terms and across all segments. It counts expansions for
>>> each
>>>     >> segments individually, that allows optimizations per segment
>>>     and unused
>>>     >> expansions are credited to next segments. This is different from
>>>     >> MultiPhraseQuery and SpanMultiTermQueryWrapper which have an
>>>     expansion
>>>     >> limit per multi-term.
>>>     >>
>>>     >> segmentOptimizationEnabled - Whether to enable the segment
>>>     optimization
>>>     >> which consists in ignoring a segment for further analysis as
>>>     soon as a
>>>     >> term is not present inside it. This optimizes the query
>>> execution
>>>     >> performance but changes the scoring. The result ranking is
>>>     preserved.
>>>     >>
>>>     >> */
>>>     >>
>>>     >>
>>>     >> 1st way:
>>>     >>
>>>     >> PhraseWildcardQuery.Builder pwcqBuilder =
>>>     >>     new PhraseWildcardQuery.Builder(field,
>>>     >>         2 /* i dont know what number to use here for maxMultiTermExpansions */,
>>>     >>         true /* boolean segmentOptimizationEnabled */);
>>>     >>
>>>     >> pwcqBuilder.addTerm(new Term(field, "term1"));
>>>     >>
>>>     >> pwcqBuilder.addTerm(new Term(field, "term2FirstChar"));
>>>     >>
>>>     >> PhraseWildcardQuery pwcq = pwcqBuilder.build();
>>>     >>
>>>     >> or
>>>     >>
>>>     >> 2nd way:
>>>     >>
>>>     >> pwcqBuilder.addMultiTerm(MultiTermQuery object here contaning
>>>     {field,
>>>     >> "term1"} and {field ,"term2FirstChar"});
>>>     >>
>>>     >> PhraseWildCardQuery pwcq = pwcqBuilder.build();
>>>     >>
>>>     >>
>>>     >> Then this pwcq object will be fed into IndexSearcher's as the
>>>     query
>>>     >> parameter.
>>>     >>
>>>     >>
>>>     >> Now, it looks like the first way will not consider expansions
>>>     or in
>>>     >> other words wildcard? Am i right?
>>>     >>
>>>     >> i also need to understand this maxMultiTermExpansions
>>>     parameter better.
>>>     >> For instance if first way is used, will
>>> maxMultiTermExpansions be
>>>     >> meaningful?
>>>     >>
>>>     >>
>>>     >> Thanks
>>>     >>
>>>     >>
>>>
>>>
>>> ---------------------------------------------------------------------
>>>     To unsubscribe, e-mail: [hidden email]
>>>     <mailto:[hidden email]>
>>>     For additional commands, e-mail: [hidden email]
>>>     <mailto:[hidden email]>
>>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

baris.kazar
Hi,-

  It looks like the only way to use and test the new PhraseWildcardQuery
class from the Lucene 8.4.0 sandbox is to switch from Lucene 7.7.2 to
Lucene 8.4.0.

I thought I could adapt it to Lucene 7.7.2, but so far I have found that I
would need to heavily change 20+ classes, and the final count would be
well above that.

So, if anybody wants to use this amazing new class, you need to be on
Lucene 8.4.0.

http://lucene.apache.org/core/8_4_0/sandbox/index.html

Best regards




Re: SingleTerm vs MultiTerm in PhraseWildCardQuery class in the sandbox Lucene

baris.kazar
Followup on this thread:

I ended up using a WildcardQuery with "*" appended to the last token,
added to the PhraseWildcardQuery class from the sandbox.
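
As a rough sketch of that approach (not the actual code used here): split the phrase on whitespace, keep every token but the last as an exact term, and append "*" to the last one. In Lucene the exact parts would go to addTerm and the starred part to addMultiTerm wrapped in a WildcardQuery, as discussed earlier in the thread; the snippet below leaves Lucene out so that it runs on a plain JDK. The PhrasePlan and Part names are hypothetical, and whitespace splitting is an assumption; a real index would need the same analyzer that was used at indexing time.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a phrase into exact tokens plus a trailing-wildcard last token. */
final class PhrasePlan {

    /** One phrase position: either an exact token or a trailing-wildcard pattern. */
    static final class Part {
        final String text;
        final boolean wildcard;

        Part(String text, boolean wildcard) {
            this.text = text;
            this.wildcard = wildcard;
        }

        @Override
        public String toString() {
            return (wildcard ? "multi-term " : "exact term ") + text;
        }
    }

    /** Every token except the last stays exact; the last one gets a trailing "*". */
    static List<Part> plan(String phrase) {
        List<Part> parts = new ArrayList<>();
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return parts;
        }
        String[] tokens = trimmed.split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            boolean last = (i == tokens.length - 1);
            String text = tokens[i];
            if (last && !text.endsWith("*")) {
                text = text + "*"; // avoid doubling a "*" the caller already typed
            }
            parts.add(new Part(text, last));
        }
        return parts;
    }

    public static void main(String[] args) {
        for (Part part : plan("term1 term2FirstChar")) {
            System.out.println(part);
        }
    }
}
```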


I tested this class rigorously, and I think it is ready to be moved from
the sandbox jar to the appropriate release jar.

Is there a plan for that?


PhraseWildcardQuery is on average 3-4 times faster than the
ComplexPhraseQueryParser class and gives the same results.
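
For anyone who wants to repeat this kind of comparison, a minimal timing sketch follows. It is an assumption about how such an average could be measured, not the harness behind the numbers above: QueryTimer is a hypothetical helper, and the two Runnables in main are placeholders where the PhraseWildcardQuery search and the ComplexPhraseQueryParser search would go.

```java
/** Averages the wall-clock time of a task over several runs, after a warm-up. */
final class QueryTimer {

    /**
     * Runs the task warmupRuns times without measuring (to let the JIT settle),
     * then measuredRuns times, and returns the mean time per run in milliseconds.
     */
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Placeholders: replace the bodies with the two searches being compared.
        Runnable phraseWildcardSearch = () -> { /* PhraseWildcardQuery search here */ };
        Runnable complexPhraseSearch = () -> { /* ComplexPhraseQueryParser search here */ };

        System.out.printf("PhraseWildcardQuery:      %.3f ms%n",
            averageMillis(phraseWildcardSearch, 5, 20));
        System.out.printf("ComplexPhraseQueryParser: %.3f ms%n",
            averageMillis(complexPhraseSearch, 5, 20));
    }
}
```

Both searches should run against the same index and the same set of input phrases, otherwise the ratio says little.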


I made some more naive enhancements on top of the ComplexPhraseQueryParser
results, and I plan to apply them with this new class as well, which
should bring the execution time down by another factor of 2 to 3.


Best regards


On 2/21/20 12:34 PM, [hidden email] wrote:

> Hi,-
>
>  It looks like the only way to use and test the new PhraseWildcardQuery
> class from the Lucene 8.4.0 sandbox is to switch from Lucene 7.7.2 to
> Lucene 8.4.0.
>
> I thought I could adapt it to Lucene 7.7.2, but so far I have found that
> I would need to heavily change 20+ classes, and the final count would be
> well above that.
>
> So, if anybody wants to use this amazing new class, you need to be on
> Lucene 8.4.0.
>
> http://lucene.apache.org/core/8_4_0/sandbox/index.html
>
> Best regards
>
>
> On 2/19/20 5:41 PM, [hidden email] wrote:
>> Hi,-
>>
>>  is there a JAR file for the classes in the
>> https://github.com/apache/lucene-solr/tree/master/lucene/core/src/java/org/apache/lucene/search 
>> and index and analysis directories?
>>
>> https://github.com/apache/lucene-solr/tree/master/lucene/core/src/java/org/apache/lucene/search 
>> does not have PhraseWildcardQuery class, though.
>>
>> As Michael mentioned, i pulled it from
>>
>> https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java 
>>
>>
>>
>> It seems that many classes in these directories are incompatible with
>> Lucene 7.7.2; they are probably from the Lucene 8.x series.
>>
>> It would be very nice to have a JAR file that makes these classes
>> usable with Lucene 7.x versions.
>>
>>
>> Best regards
>>
>>
>> On 2/19/20 3:42 PM, [hidden email] wrote:
>>> Hi,-
>>>
>>> Thanks again Michael, David and Bruno and the Forum for letting me
>>> know this repository.
>>>
>>> The version of PhraseWildCardQuery on
>>> https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java__;!!GqivPVa7Brio!M9dm0zfCQgHUNDsJMygJ5_Im1XhQeqAc-0gAWg-a0Cpt4AqkJB0Bb85olDByacVbfA$ 
>>> uses some classes that are not available in Lucene version 7.7.2.
>>>
>>> There are a number of new and modified classes used by the
>>> PhraseWildcardQuery class, such as QueryVisitor, ScoreMode etc.
>>>
>>> I will try to add these classes from
>>> https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene__;!!GqivPVa7Brio!M9dm0zfCQgHUNDsJMygJ5_Im1XhQeqAc-0gAWg-a0Cpt4AqkJB0Bb85olDDFbpGxRQ$ 
>>>
>>> and i hope it will work with Lucene 7.7.2.
>>>
>>> Best regards
>>>
>>>
>>>
>>> On 2/18/20 8:49 PM, [hidden email] wrote:
>>>> Michael and Forum,-
>>>> This is amazing, thanks.
>>>>
>>>> i will try both cases.
>>>>
>>>> i can also have "term1 term2Char1term2Char2*"
>>>> and so on with term2's next chars.
>>>>
>>>> I hope the latest version on github for this
>>>> class works with Lucene Version 7.7.2.
>>>>
>>>> Best regards
>>>>
>>>>> On Feb 18, 2020, at 8:33 PM, Michael Froh <[hidden email]> wrote:
>>>>>
>>>>> 
>>>>> In your example, it looks like you wanted the second term to match
>>>>> based on the first character, or prefix, of the term.
>>>>>
>>>>> While you could use a WildcardQuery with a term value of
>>>>> "term2FirstChar*", PrefixQuery seemed like the simpler approach.
>>>>> WildcardQuery can handle more general cases, like if you want to
>>>>> match on something like "a*b*c".
>>>>>
>>>>> Technically, the PrefixQuery compiles down to a slightly simpler
>>>>> automaton, but I only figured that out by writing a simple unit test:
>>>>>
>>>>>     public void testAutomata() {
>>>>>         Automaton prefixAutomaton =
>>>>>             PrefixQuery.toAutomaton(new BytesRef("a"));
>>>>>         Automaton wildcardAutomaton =
>>>>>             WildcardQuery.toAutomaton(new Term("foo", "a*"));
>>>>>
>>>>>         System.out.println("PrefixQuery(\"a\")");
>>>>>         System.out.println(prefixAutomaton.toDot());
>>>>>         System.out.println("WildcardQuery(\"a*\")");
>>>>>         System.out.println(wildcardAutomaton.toDot());
>>>>>     }
>>>>>
>>>>> That produces the following output:
>>>>>
>>>>> PrefixQuery("a")
>>>>> digraph Automaton {
>>>>>   rankdir = LR
>>>>>   node [width=0.2, height=0.2, fontsize=8]
>>>>>   initial [shape=plaintext,label=""]
>>>>>   initial -> 0
>>>>>   0 [shape=circle,label="0"]
>>>>>   0 -> 1 [label="a"]
>>>>>   1 [shape=doublecircle,label="1"]
>>>>>   1 -> 1 [label="\\U00000000-\\U000000ff"]
>>>>> }
>>>>> WildcardQuery("a*")
>>>>> digraph Automaton {
>>>>>   rankdir = LR
>>>>>   node [width=0.2, height=0.2, fontsize=8]
>>>>>   initial [shape=plaintext,label=""]
>>>>>   initial -> 0
>>>>>   0 [shape=circle,label="0"]
>>>>>   0 -> 1 [label="a"]
>>>>>   1 [shape=doublecircle,label="1"]
>>>>>   1 -> 2 [label="\\U00000000-\\U0010ffff"]
>>>>>   2 [shape=doublecircle,label="2"]
>>>>>   2 -> 2 [label="\\U00000000-\\U0010ffff"]
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>> On Tue, 18 Feb 2020 at 13:52, <[hidden email]
>>>>> <mailto:[hidden email]>> wrote:
>>>>>
>>>>>     Michael and Forum,-
>>>>>     Thanks for the great explanations.
>>>>>
>>>>>     One question, please:
>>>>>
>>>>>     Why is PrefixQuery used instead of WildcardQuery in the snippet
>>>>>     below?
>>>>>
>>>>>     Best regards
>>>>>
>>>>>     > On Feb 17, 2020, at 3:01 PM, Michael Froh <[hidden email]
>>>>>     <mailto:[hidden email]>> wrote:
>>>>>     >
>>>>>     > Hi Baris,
>>>>>     >
>>>>>     > The idea with PhraseWildcardQuery is that you can mix literal
>>>>>     "exact" terms
>>>>>     > with "MultiTerms" (i.e. any subclass of MultiTermQuery). Using
>>>>>     addTerm is
>>>>>     > for exact terms, while addMultiTerm is for things that may
>>>>>     match a number
>>>>>     > of possible terms in the given position.
>>>>>     >
>>>>>     > If you want to search for term1 followed by any term that
>>>>>     starts with a
>>>>>     > given character, I would suggest using:
>>>>>     >
>>>>>     > int maxMultiTermExpansions = ...; // Discussed below
>>>>>     > PhraseWildcardQuery.Builder builder =
>>>>>     >     new PhraseWildcardQuery.Builder("field", maxMultiTermExpansions);
>>>>>     > // Add fixed term in position 0
>>>>>     > builder.addTerm(new BytesRef("term1"));
>>>>>     > // Add multiterm in position 1
>>>>>     > builder.addMultiTerm(new PrefixQuery(new Term("field", "term2FirstChar")));
>>>>>     > Query q = builder.build();
>>>>>     >
>>>>>     > The PrefixQuery effectively gets expanded into a bunch of
>>>>>     possible terms,
>>>>>     > based on the term dictionary on each index segment. To avoid
>>>>>     expanding to
>>>>>     > cover too many terms (say, if you added a bunch of
>>>>> WildcardQuery),
>>>>>     > maxMultiTermExpansions serves as a guard rail, to put a rough
>>>>>     bound on
>>>>>     > memory consumption and query execution time. If you're
>>>>>     interested in
>>>>>     > details of how the maxMultiTermExpansions budget is distributed
>>>>>     across
>>>>>     > MultiTerms, check out PhraseWildcardQuery.createWeight. If
>>>>>     you're just
>>>>>     > running an experiment in your IDE, you could probably set
>>>>>     > maxMultiTermExpansions to Integer.MAX_VALUE. (If you're running
>>>>>     in a
>>>>>     > production environment, it's likely a good idea to tune it down
>>>>>     based on
>>>>>     > your memory/latency constraints.)
>>>>>     >
>>>>>     > Incidentally, for tracking down the source code for anything in
>>>>>     Lucene,
>>>>>     > it's probably better to go to GitHub for the most up-to-date
>>>>>     source:
>>>>>     >
>>>>> https://urldefense.com/v3/__https://github.com/apache/lucene-solr/blob/master/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java__;!!GqivPVa7Brio!ONqQgLIltNBUuSo5Cn_Fz7-wuR1LQv68YS_z-6g7X-S86PHQtT9tKl7VbIq9tVLYyw$ 
>>>>>
>>>>>
>>>>>     > .
>>>>>     >
>>>>>     > Hope that helps,
>>>>>     > Michael
>>>>>     >
>>>>>     >> On Thu, 13 Feb 2020 at 12:29, <[hidden email]
>>>>>     <mailto:[hidden email]>> wrote:
>>>>>     >>
>>>>>     >> Hi,
>>>>>     >>
>>>>>     >> I hope everyone is doing great.
>>>>>     >>
>>>>>     >> I want to run the following search with PhraseWildcardQuery (thanks to
>>>>>     >> this forum, especially David and Bruno, for letting me know about this
>>>>>     >> class):
>>>>>     >>
>>>>>     >> term1 term2FirstChar*
>>>>>     >>
>>>>>     >> I see two ways to do it. (I found the source code at
>>>>>     >> https://fossies.org/linux/lucene/sandbox/src/java/org/apache/lucene/search/PhraseWildcardQuery.java)
>>>>>     >>
>>>>>     >> /*
>>>>>     >>
>>>>>     >> maxMultiTermExpansions - The maximum number of expansions across all
>>>>>     >> multi-terms and across all segments. It counts expansions for each
>>>>>     >> segments individually, that allows optimizations per segment and unused
>>>>>     >> expansions are credited to next segments. This is different from
>>>>>     >> MultiPhraseQuery and SpanMultiTermQueryWrapper which have an expansion
>>>>>     >> limit per multi-term.
>>>>>     >>
>>>>>     >> segmentOptimizationEnabled - Whether to enable the segment optimization
>>>>>     >> which consists in ignoring a segment for further analysis as soon as a
>>>>>     >> term is not present inside it. This optimizes the query execution
>>>>>     >> performance but changes the scoring. The result ranking is preserved.
>>>>>     >>
>>>>>     >> */
>>>>>     >>
>>>>>     >>
>>>>>     >> 1st way:
>>>>>     >>
>>>>>     >> PhraseWildcardQuery.Builder pwcqBuilder = new PhraseWildcardQuery.Builder(field,
>>>>>     >>     2 /* I don't know what number to use here for maxMultiTermExpansions */,
>>>>>     >>     true /* boolean segmentOptimizationEnabled */);
>>>>>     >>
>>>>>     >> pwcqBuilder.addTerm(new Term(field, "term1"));
>>>>>     >>
>>>>>     >> pwcqBuilder.addTerm(new Term(field, "term2FirstChar"));
>>>>>     >>
>>>>>     >> PhraseWildcardQuery pwcq = pwcqBuilder.build();
>>>>>     >>
>>>>>     >> or
>>>>>     >>
>>>>>     >> 2nd way:
>>>>>     >>
>>>>>     >> pwcqBuilder.addMultiTerm(MultiTermQuery object here containing {field,
>>>>>     >> "term1"} and {field, "term2FirstChar"});
>>>>>     >>
>>>>>     >> PhraseWildcardQuery pwcq = pwcqBuilder.build();
>>>>>     >>
>>>>>     >>
>>>>>     >> This pwcq object will then be passed to IndexSearcher as the query
>>>>>     >> parameter.
>>>>>     >>
>>>>>     >>
>>>>>     >> Now, it looks like the first way will not consider expansions, or in
>>>>>     >> other words the wildcard. Am I right?
>>>>>     >>
>>>>>     >> I also need to understand the maxMultiTermExpansions parameter better.
>>>>>     >> For instance, if the first way is used, will maxMultiTermExpansions be
>>>>>     >> meaningful?
>>>>>     >>
>>>>>     >>
>>>>>     >> Thanks
>>>>>     >>
>>>>>     >>
>>>>>
>>>>>
>>>>>
>>>
