Tracking that all query terms are matched in one document

Vadim Gindin
Hi all.

I need to detect when all query terms match within a single document. When
they do, I need to multiply that document's score by a constant
coefficient.

Re: Tracking that all query terms are matched in one document

Vadim Gindin
Sorry, I accidentally sent an unfinished message.

Could somebody advise me on how to implement the following?

Regards
Vadim Gindin

On Mon, Dec 4, 2017 at 3:12 PM, Vadim Gindin <[hidden email]> wrote:

> Hi all.
>
> I need to track that all query terms are matched in one document. When all
> terms are matched I need to multiply the score of such document to some
> constant coefficient.

Re: Tracking that all query terms are matched in one document

Michael Sokolov-4
In reply to this post by Vadim Gindin
You could add a Boolean AND query over the same terms as an optional
clause. Are you sure about the requirement to multiply the score in that
case?

On Dec 4, 2017 5:13 AM, "Vadim Gindin" <[hidden email]> wrote:

> Hi all.
>
> I need to track that all query terms are matched in one document. When all
> terms are matched I need to multiply the score of such document to some
> constant coefficient.

Re: Tracking that all query terms are matched in one document

Vadim Gindin
Thanks, Michael!

Yes, I'm sure. Could you explain your proposal in more detail?

Regards,
Vadim Gindin

On Mon, Dec 4, 2017 at 3:18 PM, Michael Sokolov <[hidden email]> wrote:

> You could combine a Boolean and query with the same terms, as an optional
> clause. Are you sure about the requirement to multiply the score in that
> case?

Re: Tracking that all query terms are matched in one document

Michael Sokolov-4
I'm just saying that when you form your query, you could also create
another extra query that requires all the terms in the original query, and
then combine the two in a BooleanQuery where the original query is required
(MUST) and the extra query is optional (SHOULD). That will give a boost
when all the terms are found, although I think the scores will be added,
not multiplied.
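
To make the "added, not multiplied" point concrete, here is a minimal sketch in plain Java. It is a toy model and not Lucene code: the class name, the method names, and the numbers are all assumptions chosen for illustration. It assumes the combined score is simply the sum of the clause scores, with a clause that did not match contributing 0.

```java
/** Toy model (not Lucene code) of an additive all-terms bonus versus a multiplier. */
final class AllTermsBoostModel {

    /** Sum of the clause scores; a clause that did not match contributes 0. */
    static double baseScore(double[] clauseScores) {
        double sum = 0.0;
        for (double s : clauseScores) {
            sum += s;
        }
        return sum;
    }

    /** True when there is at least one clause and every clause has a positive score. */
    static boolean allMatched(double[] clauseScores) {
        if (clauseScores.length == 0) {
            return false;
        }
        for (double s : clauseScores) {
            if (s <= 0.0) {
                return false;
            }
        }
        return true;
    }

    /** Required original query plus an optional all-terms clause: the bonus is added. */
    static double additive(double[] clauseScores, double bonus) {
        double base = baseScore(clauseScores);
        return allMatched(clauseScores) ? base + bonus : base;
    }

    /** What the original question asks for: the score is multiplied. */
    static double multiplicative(double[] clauseScores, double coefficient) {
        double base = baseScore(clauseScores);
        return allMatched(clauseScores) ? base * coefficient : base;
    }

    public static void main(String[] args) {
        double[] all = {5.0, 5.0, 3.0, 4.0};   // every clause matched
        double[] some = {5.0, 0.0, 3.0, 0.0};  // two clauses did not match
        System.out.println(additive(all, 2.0));        // prints 19.0
        System.out.println(multiplicative(all, 2.0));  // prints 34.0
        System.out.println(additive(some, 2.0));       // prints 8.0
        System.out.println(multiplicative(some, 2.0)); // prints 8.0
    }
}
```

With an additive bonus the gap between full and partial matches is fixed, while a multiplier scales with the base score. Which of the two is wanted is exactly the question raised above.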

On Dec 4, 2017 5:22 AM, "Vadim Gindin" <[hidden email]> wrote:

> Thanks, Michael!
>
> Yes, I'm sure. Could you explain your proposal in more detail?
>
> Regards,
> Vadim Gindin

Re: Tracking that all query terms are matched in one document

Vadim Gindin
Yes, thanks. My question is exactly about how to create "another extra
query that requires all the terms in the original query"

On Mon, Dec 4, 2017 at 6:50 PM, Michael Sokolov <[hidden email]> wrote:

> I'm just saying, that when you form your query, you could also create
> another extra query that requires all the terms in the original query, and
> then combine it with the original query in a boolean where the original
> query is required and the extra query is optional. That will give a boost
> when all the terms are found, although I think the scores will be added,
> not multiplied.

Re: Tracking that all query terms are matched in one document

Michael Sokolov-4
Well how did you make the original query?

On Dec 4, 2017 12:05 PM, "Vadim Gindin" <[hidden email]> wrote:

> Yes, thanks. My question is exactly about how to create "another extra
> query that requires all the terms in the original query"

Re: Tracking that all query terms are matched in one document

Vadim Gindin
For example, like this:

// Each clause matches the same text against a different field and scores a
// constant value equal to its boost.
BooleanQuery.Builder expected = new BooleanQuery.Builder();

Query param_vendor = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_vendor", queryStr))), 5f);
Query param_model = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_model", queryStr))), 5f);
Query param_value = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_value", queryStr))), 3f);
Query param_name = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_name", queryStr))), 4f);

// At least one of the optional (SHOULD) clauses has to match.
BooleanQuery bq = expected
        .add(param_vendor, BooleanClause.Occur.SHOULD)
        .add(param_model, BooleanClause.Occur.SHOULD)
        .add(param_value, BooleanClause.Occur.SHOULD)
        .add(param_name, BooleanClause.Occur.SHOULD)
        .setMinimumNumberShouldMatch(1)
        .build();

return new BoostQuery(bq, queryBoost);


Vadim

On Tue, Dec 5, 2017 at 9:24 AM, Michael Sokolov <[hidden email]> wrote:

> Well how did you make the original query?

Re: Tracking that all query terms are matched in one document

Vadim Gindin
I'm not sure that this would let me track that different terms matched the
same document...

I'm thinking of a slightly different approach: when a query scores a
document, save the query term for that document somewhere. It could
probably be a map in some SearchContext class. I could write something like
this:

SearchContext sc = getSearchContext();  // does such a search context exist in Lucene? Maybe QueryContext?
sc.getDocTerms().get(docID).add(query.getTerm());
// docTerms is a Map<Integer, List<String>>: the key is a document ID and the
// value is the list of terms that matched that document.

I need to store the document ID and the terms that matched that document
somewhere. Could somebody suggest an appropriate place?

Regards,
Vadim Gindin
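
The bookkeeping described in this message can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `MatchedTermsTracker`, `record`, `matchedAll` and `adjust` are hypothetical names, no Lucene types are involved, and the sketch does not answer where in Lucene such a structure could be populated.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Plain-Java sketch of a docID -> matched-terms map; no Lucene types are used. */
final class MatchedTermsTracker {
    private final Map<Integer, Set<String>> docTerms = new HashMap<>();

    /** Remember that the given term matched the given document. */
    void record(int docId, String term) {
        // computeIfAbsent creates the set on first use, so there is no null lookup.
        docTerms.computeIfAbsent(docId, id -> new HashSet<>()).add(term);
    }

    /** True when every query term has been recorded for this document. */
    boolean matchedAll(int docId, Set<String> queryTerms) {
        Set<String> seen = docTerms.get(docId);
        return seen != null && seen.containsAll(queryTerms);
    }

    /** Multiply the score by the coefficient only for documents that matched all terms. */
    double adjust(int docId, double score, Set<String> queryTerms, double coefficient) {
        return matchedAll(docId, queryTerms) ? score * coefficient : score;
    }

    public static void main(String[] args) {
        Set<String> queryTerms = new HashSet<>(Arrays.asList("acme", "x100"));
        MatchedTermsTracker tracker = new MatchedTermsTracker();
        tracker.record(1, "acme");
        tracker.record(1, "x100");
        tracker.record(2, "acme");
        System.out.println(tracker.adjust(1, 2.0, queryTerms, 1.5)); // prints 3.0
        System.out.println(tracker.adjust(2, 2.0, queryTerms, 1.5)); // prints 2.0
    }
}
```

Using a set rather than a list keeps a term from being counted twice. Note that a map like this grows with the number of matching documents, which may be a concern for large result sets.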


On Tue, Dec 5, 2017 at 12:04 PM, Vadim Gindin <[hidden email]> wrote:

> For example like this:
>
> BooleanQuery.Builder expected = new BooleanQuery.Builder();
>
> Query param_vendor = new BoostQuery(new ConstantScoreQuery(new TermQuery(new Term("param_vendor", queryStr))), 5f);
> Query param_model = new BoostQuery(new ConstantScoreQuery(new TermQuery(new Term("param_model", queryStr))), 5f);
> Query param_value = new BoostQuery(new ConstantScoreQuery(new TermQuery(new Term("param_value", queryStr))), 3f);
> Query param_name = new BoostQuery(new ConstantScoreQuery(new TermQuery(new Term("param_name", queryStr))), 4f);
>
> BooleanQuery bq = expected
>         .add(param_vendor, BooleanClause.Occur.SHOULD)
>         .add(param_model, BooleanClause.Occur.SHOULD)
>         .add(param_value, BooleanClause.Occur.SHOULD)
>         .add(param_name, BooleanClause.Occur.SHOULD)
>         .setMinimumNumberShouldMatch(1)
>         .build();
>
> return new BoostQuery(bq, queryBoost);
>
>
> Vadim

Re: Tracking that all query terms are matched in one document

Mikhail Khludnev-2
Vadim,
You can create a collector that checks Scorer.getChildren() (see
https://issues.apache.org/jira/browse/LUCENE-7628), but it's quite
cumbersome. I'd suggest avoiding this if possible. However, Elasticsearch
does something like this with named queries.
I gave a talk about this a few years ago:
https://www.youtube.com/watch?v=sGVyUdNGBgw
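
The idea of walking child scorers can be sketched with a stand-in tree in plain Java. The `Node` class below is hypothetical and is not a Lucene type. The sketch assumes that a leaf counts as matching a hit when it is currently positioned on the collected document; whether real child scorers are positioned that way is the cumbersome part referred to above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for a scorer tree; these are NOT Lucene classes. */
final class ScorerTreeWalk {

    /** Hypothetical node: a label, the document it is positioned on, and its children. */
    static final class Node {
        final String label;
        final int docId;
        final List<Node> children;

        Node(String label, int docId, Node... children) {
            this.label = label;
            this.docId = docId;
            this.children = Arrays.asList(children);
        }
    }

    /** Labels of the leaf nodes that are positioned on the given document. */
    static List<String> matchingLeaves(Node root, int doc) {
        List<String> out = new ArrayList<>();
        walk(root, doc, out);
        return out;
    }

    private static void walk(Node node, int doc, List<String> out) {
        if (node.children.isEmpty()) {
            if (node.docId == doc) {
                out.add(node.label);
            }
            return;
        }
        for (Node child : node.children) {
            walk(child, doc, out);
        }
    }

    private static int countLeaves(Node node) {
        if (node.children.isEmpty()) {
            return 1;
        }
        int total = 0;
        for (Node child : node.children) {
            total += countLeaves(child);
        }
        return total;
    }

    /** True when every leaf in the tree is positioned on the given document. */
    static boolean allLeavesMatch(Node root, int doc) {
        return matchingLeaves(root, doc).size() == countLeaves(root);
    }

    public static void main(String[] args) {
        Node root = new Node("boolean", 7,
                new Node("param_vendor", 7),
                new Node("param_model", 7),
                new Node("param_value", 9),  // positioned on another document
                new Node("param_name", 7));
        System.out.println(matchingLeaves(root, 7)); // prints [param_vendor, param_model, param_name]
        System.out.println(allLeavesMatch(root, 7)); // prints false
    }
}
```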

On Tue, Dec 5, 2017 at 12:36 PM, Vadim Gindin <[hidden email]> wrote:

> I'm not sure here that I will be able to track somehow that different terms
> were matched to the same document...
>
> I'm thinking more about little another way: when query scores some document
> - save the query term for that document somewhere. Probably it would be
> some map in some class SearchContext. I could write something like this:
>
> SearchContext sc = getSearchContext();                    // -  does such
> search context exist in Lucene? Maybe QueryContext
> sc.getDocTerms().get(docID).add(query.getTerm()));  // docTerms here is a
> Map<Int, List<String>> - where the key - is a document ID and the value -
> is a list of terms by whom this document was matched.
>
> I need to save somewhere the document ID and the term matched that
> document. Could somebody advise me an appropriate place?
>
> Regards,
> Vadim Gindin

--
Sincerely yours
Mikhail Khludnev

Re: Tracking that all query terms are matched in one document

Vadim Gindin
Thanks for your help. I'll try that.

On Tue, Dec 5, 2017 at 4:18 PM, Mikhail Khludnev <[hidden email]> wrote:

> Vadim,
> You can create a collector which checks Scorer.getChildren()
> https://issues.apache.org/jira/browse/LUCENE-7628 but it's way cumbersome.
> I'd suggest to avoid this if it's possible. However, Elastic does something
> like this with named queries or so.
> I've told about this few years ago
> https://www.youtube.com/watch?v=sGVyUdNGBgw

Re: Tracking that all query terms are matched in one document

Vadim Gindin
Hi Michael,

I've tried to implement this but ran into the following problem. As a
reminder, my query combines several ConstantScoreQuery clauses in a
BooleanQuery. I wrote a custom Collector as follows:

@Override
public void setScorer(Scorer scorer) throws IOException {
    // Keep the scorer so that collect() can inspect its children.
    this.scorer = scorer;
}

@Override
public void collect(int doc) throws IOException {
    System.out.println("doc=" + doc);
    diveIntoScorers(this.scorer);
}

When I dive recursively into the child scorers, I get an
UnsupportedOperationException. It happens because of the following code in
BooleanScorer:

@Override
public int score(LeafCollector collector, Bits acceptDocs, int min, int max) throws IOException {
  fakeScorer.doc = -1;
  collector.setScorer(fakeScorer);

Later, fakeScorer throws the exception.

How did you implement similar functionality?
How can I avoid this?

Thanks,
Vadim Gindin

On Fri, Dec 8, 2017 at 2:01 PM, Vadim Gindin <[hidden email]> wrote:

> Thank's for your help. I'll try that.

Re: Tracking that all query terms are matched in one document

Mikhail Khludnev-2
There are two algorithms for scoring a disjunction: term-at-a-time and
doc-at-a-time. The former was called BooleanScorer and the latter was called
BooleanScorer2.
As I recall, they have since been drastically renamed and/or replaced with
BulkScorer or something similar. In any case, you need to find a way to
prevent term-at-a-time scoring, which is when FakeScorer is injected.
You need to make it score doc-at-a-time. As I said, it's a long way to go.
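
The difference between the two strategies can be sketched in plain Java. This is a toy model and not Lucene code: the posting lists are sorted arrays of document IDs, every term has a fixed weight, and all names are assumptions. Both methods produce the same scores, but only the doc-at-a-time loop knows which terms matched a document at the moment that document is scored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy comparison of term-at-a-time and doc-at-a-time disjunction scoring (not Lucene code). */
final class DisjunctionScoringModes {

    /** Term-at-a-time: consume each posting list in full, accumulating into a score array. */
    static double[] termAtATime(int[][] postings, double[] weights, int maxDoc) {
        double[] scores = new double[maxDoc];
        for (int t = 0; t < postings.length; t++) {
            for (int doc : postings[t]) {
                scores[doc] += weights[t];
            }
        }
        return scores;
    }

    /**
     * Doc-at-a-time: advance all posting lists together in docID order.
     * matchedTerms must be empty on entry; afterwards matchedTerms.get(doc)
     * holds the indexes of the terms that matched doc.
     */
    static double[] docAtATime(int[][] postings, double[] weights, int maxDoc,
                               List<List<Integer>> matchedTerms) {
        double[] scores = new double[maxDoc];
        int[] pos = new int[postings.length]; // cursor into each posting list
        for (int d = 0; d < maxDoc; d++) {
            matchedTerms.add(new ArrayList<>());
        }
        while (true) {
            // The next document is the smallest docID under any cursor.
            int doc = Integer.MAX_VALUE;
            for (int t = 0; t < postings.length; t++) {
                if (pos[t] < postings[t].length) {
                    doc = Math.min(doc, postings[t][pos[t]]);
                }
            }
            if (doc == Integer.MAX_VALUE) {
                break; // every list is exhausted
            }
            // Score the document and advance every list positioned on it.
            for (int t = 0; t < postings.length; t++) {
                if (pos[t] < postings[t].length && postings[t][pos[t]] == doc) {
                    scores[doc] += weights[t];
                    matchedTerms.get(doc).add(t);
                    pos[t]++;
                }
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        int[][] postings = {{0, 2, 3}, {2, 3}, {3}};
        double[] weights = {5.0, 3.0, 4.0};
        List<List<Integer>> matched = new ArrayList<>();
        System.out.println(Arrays.toString(termAtATime(postings, weights, 4)));         // prints [5.0, 0.0, 8.0, 12.0]
        System.out.println(Arrays.toString(docAtATime(postings, weights, 4, matched))); // prints [5.0, 0.0, 8.0, 12.0]
        System.out.println(matched.get(3)); // prints [0, 1, 2]
    }
}
```

In the term-at-a-time loop a document's score is only complete after all lists have been consumed, so there is no per-document moment at which the individual term matches could be inspected. Real implementations are more elaborate than this model.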

On Wed, Dec 13, 2017 at 11:55 AM, Vadim Gindin <[hidden email]> wrote:

> Hi Michael,
>
> I've tried to implement such case but faced with the following problem. I
> recall, that my Query is combined with several ConstantScoreQuery with
> BooleanQuery. I wrote custom Collector as follows:
>
> @Override
> public void setScorer(Scorer scorer) throws IOException {
>     this.scorer = scorer;
>
> }
>
> @Override
> public void collect(int doc) throws IOException {
>     System.out.println("doc=" + doc);
>     diveIntoScorers(this.scorer);
> }
>
> and, when I'm diving recursively to child scorers I'm facing new
> UnsupportedOperationException error. It happens because of the following
> code in BooleanScorer:
>
> @Override
> public int score(LeafCollector collector, Bits acceptDocs, int min,
> int max) throws IOException {
>   fakeScorer.doc = -1;
>   collector.setScorer(fakeScorer);
>
> Later fakeScorer throws an Exception.
>
> How did you implement your similar functionality?
> How to avoid this?
>
> Thanks,
> Vadim Gindin
>



--
Sincerely yours
Mikhail Khludnev

Re: Tracking that all query terms are matched in one document

Vadim Gindin
Thank you

On Wed, Dec 13, 2017 at 3:32 PM, Mikhail Khludnev <[hidden email]> wrote:

> There are two algorithms for scoring a disjunction: term-at-a-time and
> doc-at-a-time. The former was called BooleanScorer and the latter was called
> BooleanScorer2. As I recall, they were drastically renamed and/or replaced
> with BulkScorer or so. Anyway, you need to find a way to prevent
> term-at-a-time scoring, which is when FakeScorer is injected, and make it
> score doc-at-a-time. As I told you, it's a long way around.
>
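
To illustrate the doc-at-a-time idea mentioned in the quoted reply, here is a minimal sketch in plain Java. It is a toy model and not Lucene code: the class DocAtATimeSketch, the Clause type and the allMatchedFactor parameter are all hypothetical names invented for this example. Each optional clause is modelled as a sorted array of doc IDs with a constant score (the scores 5, 5, 3 and 4 mirror the boosts in the query from this thread; the factor 2 is an assumption). Because the clauses are advanced in lockstep, every clause that matches the current document is known at the same moment, which is what makes the "all terms matched" check possible.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of doc-at-a-time scoring of a disjunction. Not Lucene code. */
final class DocAtATimeSketch {

    /** One optional clause: the doc IDs it matches (sorted ascending) and its constant score. */
    static final class Clause {
        final int[] docs;
        final float score;

        Clause(int[] docs, float score) {
            this.docs = docs;
            this.score = score;
        }
    }

    /**
     * Scores documents one at a time. For each document the scores of all matching
     * clauses are summed; if every clause matched, the sum is multiplied by
     * allMatchedFactor (an assumed, caller-supplied coefficient).
     */
    static Map<Integer, Float> score(List<Clause> clauses, float allMatchedFactor) {
        int[] pos = new int[clauses.size()];          // current position in each clause
        Map<Integer, Float> result = new TreeMap<>(); // doc ID -> final score

        while (true) {
            // The next document is the smallest doc ID any clause currently points at.
            // Integer.MAX_VALUE serves as the "no more docs" sentinel.
            int doc = Integer.MAX_VALUE;
            for (int i = 0; i < clauses.size(); i++) {
                int[] docs = clauses.get(i).docs;
                if (pos[i] < docs.length) {
                    doc = Math.min(doc, docs[pos[i]]);
                }
            }
            if (doc == Integer.MAX_VALUE) {
                break; // every clause is exhausted
            }

            // Visit every clause positioned on this document and advance it.
            float sum = 0f;
            int matched = 0;
            for (int i = 0; i < clauses.size(); i++) {
                Clause c = clauses.get(i);
                if (pos[i] < c.docs.length && c.docs[pos[i]] == doc) {
                    sum += c.score;
                    matched++;
                    pos[i]++;
                }
            }

            // All clauses matched this document: apply the multiplier.
            if (matched == clauses.size()) {
                sum *= allMatchedFactor;
            }
            result.put(doc, sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Clause> clauses = List.of(
                new Clause(new int[] {1, 3, 7}, 5f),  // stands in for param_vendor
                new Clause(new int[] {3, 7, 9}, 5f),  // stands in for param_model
                new Clause(new int[] {3, 4}, 3f),     // stands in for param_value
                new Clause(new int[] {2, 3}, 4f));    // stands in for param_name
        System.out.println(score(clauses, 2f));
    }
}
```

In this made-up data only doc 3 is matched by all four clauses, so its summed score of 17 is doubled to 34, while the other documents keep their plain sums. A term-at-a-time scorer would instead process one clause's whole doc list before moving to the next, so at no single point would it see all the clauses that match a given document, which is consistent with the child scorers being unusable in that mode.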