Scorer.iterator() - how to implement correctly

Vadim Gindin
Hi

I'm implementing a custom Query with a corresponding custom Weight and Scorer.

I'm trying to implement the Scorer.iterator() method. It should return an
iterator over the documents that match the query, right? There are a lot of
DocIdSetIterator subclasses.

1. How do I choose the correct one?
2. How do I implement the Scorer.iterator() method correctly?

I've tried DocIdSetIterator.all(context.reader().maxDoc());

But as far as I can see, it returns all documents.
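
To make the difference concrete, here is a minimal, self-contained sketch in plain Java. It does not use Lucene: `SimpleDocIterator`, `all`, `ofSortedDocs` and `collect` are hypothetical names that only mimic the general shape of a doc-ID iterator (`nextDoc()`, `advance(target)` and a `NO_MORE_DOCS` sentinel). It illustrates why an "all documents" iterator matches everything below maxDoc, while an iterator backed by a sorted list of matching doc IDs (for example, the posting list of a term) yields only the matches.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a doc-ID iterator; this is NOT Lucene's DocIdSetIterator. */
abstract class SimpleDocIterator {
    /** Sentinel returned once the iterator is exhausted. */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Moves to the next matching doc ID, or returns NO_MORE_DOCS when exhausted. */
    abstract int nextDoc();

    /** Moves forward to the first matching doc ID that is >= target. */
    int advance(int target) {
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }

    /** Matches every doc ID in [0, maxDoc), regardless of any query. */
    static SimpleDocIterator all(final int maxDoc) {
        return new SimpleDocIterator() {
            private int doc = -1;

            @Override
            int nextDoc() {
                if (doc != NO_MORE_DOCS) {
                    doc = (doc + 1 < maxDoc) ? doc + 1 : NO_MORE_DOCS;
                }
                return doc;
            }
        };
    }

    /** Matches only the given doc IDs, which must be sorted in ascending order. */
    static SimpleDocIterator ofSortedDocs(final int... sortedDocIds) {
        return new SimpleDocIterator() {
            private int index = -1;

            @Override
            int nextDoc() {
                if (index < sortedDocIds.length) {
                    index++;
                }
                return index < sortedDocIds.length ? sortedDocIds[index] : NO_MORE_DOCS;
            }
        };
    }

    /** Drains an iterator into a list, the way a collector would visit matches. */
    static List<Integer> collect(SimpleDocIterator it) {
        List<Integer> docs = new ArrayList<>();
        for (int doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        System.out.println(collect(all(5)));             // prints [0, 1, 2, 3, 4]
        System.out.println(collect(ofSortedDocs(1, 3))); // prints [1, 3]
    }
}
```

In this toy model, a scorer that hands back the `all(...)` variant would report every document as a hit, which matches the behavior described above.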

My task looks simple. I need to return a constant score that depends on the
matched field, e.g. 3f for the field "model" and 5f for the field "vendor".

I'm creating a subquery for each field and assigning its score with a
custom Query that is almost the same as TermQuery except for Weight.scorer().

Any help is appreciated.

Regards,
Vadim Gindin

Re: Scorer.iterator() - how to implement correctly

Adrien Grand
There are many implementations because each query typically needs a custom
DocIdSetIterator implementation. It looks like your use-case doesn't need a
custom query, though: you could use a TermQuery wrapped in a constant-score
query (see my reply to the other question you asked).

In reply to Vadim Gindin <[hidden email]>, Fri, Dec 1, 2017 at 08:24.

Re: Scorer.iterator() - how to implement correctly

Vadim Gindin
Hi Adrien.

I tried ConstantScoreQuery earlier, and there is a problem: it
returns score = 0.0 for my configuration with BooleanQuery. I debugged it and
found that this happens because of the following:

@Override
public Weight createWeight(IndexSearcher searcher, boolean needsScores, float boost) throws IOException {
  final Weight innerWeight = searcher.createWeight(query, false, 1f);
  if (needsScores) {
    return new ConstantScoreWeight(this, boost) {

As you can see, innerWeight is created with needsScores=false, and later
innerWeight.scorerSupplier will return null. That leads to a final score
of 0.0.

Moreover, I'm trying to build up my logic from a simple first step. That's why
I wanted to implement something simple and decided to write it from scratch.

OK, as you say, the iterator is the reason all documents are returned. *So how
do I properly implement scorer.iterator()?*

Many thanks for your help!
Regards
Vadim Gindin

In reply to Adrien Grand <[hidden email]>, Fri, Dec 1, 2017 at 1:11 PM.

Re: Scorer.iterator() - how to implement correctly

Vadim Gindin
Adrien.

I've found a working solution. Here is how it computes the iterator:

this.iterator = context.reader().postings(query.getTerm(), PostingsEnum.ALL);
if (this.iterator == null) this.iterator = DocIdSetIterator.empty();


Is that implementation correct?
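
The null check in the snippet above matters because a segment may not contain the term at all, in which case the postings lookup returns null. As a rough, library-free illustration of that pattern (plain Java, not the Lucene API; `ToySegment` and its methods are hypothetical names), a lookup that returns null for an unknown term can be normalized to an empty iterator so that callers can iterate unconditionally:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

/** Toy per-segment inverted index; this is NOT a Lucene class. */
class ToySegment {
    private final Map<String, int[]> postingsByTerm = new HashMap<>();

    /** Registers the sorted doc IDs that contain the given term. */
    void index(String term, int... sortedDocIds) {
        postingsByTerm.put(term, sortedDocIds);
    }

    /** Returns the sorted doc IDs for the term, or null if the segment never saw it. */
    int[] postings(String term) {
        return postingsByTerm.get(term);
    }

    /** Null-safe variant: an unknown term yields an empty iterator instead of null. */
    PrimitiveIterator.OfInt matchingDocs(String term) {
        int[] docs = postings(term);
        return (docs == null ? IntStream.empty() : IntStream.of(docs)).iterator();
    }

    public static void main(String[] args) {
        ToySegment segment = new ToySegment();
        segment.index("acme", 2, 7);

        PrimitiveIterator.OfInt hits = segment.matchingDocs("acme");
        while (hits.hasNext()) {
            System.out.println("match: " + hits.nextInt());
        }
        // A term missing from this segment simply produces no matches.
        System.out.println(segment.matchingDocs("unknown").hasNext()); // prints false
    }
}
```

Separately, when only doc IDs are needed and scores are constant, requesting fewer postings features than PostingsEnum.ALL (for instance PostingsEnum.NONE) is likely sufficient, though that is a suggestion rather than something stated in this thread.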

In reply to Vadim Gindin <[hidden email]>, Sun, Dec 3, 2017 at 4:43 PM.

Re: Scorer.iterator() - how to implement correctly

Adrien Grand
It is correct... but ConstantScoreQuery is the way to go with your
use-case. It should not return scores of 0 unless you are misusing the API
in some way. Please share the code that you use in order to build your
query.

In reply to Vadim Gindin <[hidden email]>, Mon, Dec 4, 2017 at 11:10.

Re: Scorer.iterator() - how to implement correctly

Vadim Gindin
Adrien, you're right. I've checked it again and it works now. Probably
I had an error in my index that caused the wrong behavior, or, yes, I was
misusing the API. Here is my code:

BooleanQuery.Builder expected = new BooleanQuery.Builder();

Query param_vendor = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_vendor", queryStr))), 5f);
Query param_model = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_model", queryStr))), 5f);
Query param_value = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_value", queryStr))), 3f);
Query param_name = new BoostQuery(
        new ConstantScoreQuery(new TermQuery(new Term("param_name", queryStr))), 4f);

BooleanQuery bq = expected
        .add(param_vendor, BooleanClause.Occur.SHOULD)
        .add(param_model, BooleanClause.Occur.SHOULD)
        .add(param_value, BooleanClause.Occur.SHOULD)
        .add(param_name, BooleanClause.Occur.SHOULD)
        .setMinimumNumberShouldMatch(1)
        .build();

return new BoostQuery(bq, queryBoost);

ConstantScoreQuery always returns a score of 1, which is multiplied by the
corresponding field boost and finally by the query boost. Now it
works.
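
As a worked check of that arithmetic, here is a small library-free sketch in plain Java. It assumes that each matching SHOULD clause contributes its field boost (constant score 1 times the boost), that the scores of the matching clauses are summed, and that the sum is multiplied by the outer query boost. `ScoreSketch` and `score` are hypothetical names, and the outer boost of 2 used in `main` is an arbitrary example value, not one taken from this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of the scoring described above; this is NOT Lucene code. */
class ScoreSketch {
    /** Field boosts copied from the query built above. */
    static final Map<String, Float> FIELD_BOOSTS = new LinkedHashMap<>();
    static {
        FIELD_BOOSTS.put("param_vendor", 5f);
        FIELD_BOOSTS.put("param_model", 5f);
        FIELD_BOOSTS.put("param_value", 3f);
        FIELD_BOOSTS.put("param_name", 4f);
    }

    /** Sums (1 * fieldBoost) over the matched fields, then applies the query boost. */
    static float score(Set<String> matchedFields, float queryBoost) {
        float sum = 0f;
        for (Map.Entry<String, Float> clause : FIELD_BOOSTS.entrySet()) {
            if (matchedFields.contains(clause.getKey())) {
                sum += 1f * clause.getValue(); // constant score 1, scaled by the field boost
            }
        }
        return queryBoost * sum;
    }

    public static void main(String[] args) {
        Set<String> matched = new HashSet<>(Arrays.asList("param_vendor", "param_value"));
        System.out.println(score(matched, 2f)); // (5 + 3) * 2, prints 16.0
    }
}
```

A document that matches none of the four fields would score 0 in this model; in the real query such a document is not returned at all because of setMinimumNumberShouldMatch(1).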

Thanks a lot!

Regards,
Vadim Gindin

In reply to Adrien Grand <[hidden email]>, Mon, Dec 4, 2017 at 3:17 PM.
