different score from different replica of same shard

different score from different replica of same shard

Bernd Fehling
Hello list,

I have a question to better understand scoring within a shard in SolrCloud.

I see different scores from different replicas of the same shard.
Is this normal and, if so, why?

My understanding until now was that replicas within a shard are always identical,
and that the same query sent to each replica of a shard always gives the same score.

Can someone help me to understand this?

Regards
Bernd

Re: different score from different replica of same shard

Markus Jelsma-2
Hello Bernd,

This is normal for NRT replicas, because segment merging and the purging of
deleted documents are not synchronized between replicas. As a result, the
term statistics behind TF, IDF and norms become slightly different on each
replica.

You can either use ExactStatsCache, which fetches the term counts before
scoring so that all replicas use the same counts, or change the replica
type to TLOG. With TLOG, segments are fetched from the leader and are thus
identical.

Regards,
Markus
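
To make the effect concrete, here is a minimal Python sketch. It assumes the textbook BM25 IDF formula (which, as far as I know, is what Lucene's BM25Similarity uses) and entirely made-up counts; the function name and the numbers are illustrative, not taken from a real index. Both replicas hold the same 1000 live documents, but replica B still carries 50 deleted documents that have not been merged away, 10 of which contain the term.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """Textbook BM25 IDF: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical replica A: deletes already merged away.
idf_a = bm25_idf(doc_count=1000, doc_freq=100)

# Hypothetical replica B: same live documents, but 50 deleted documents
# are still counted in the segment statistics, 10 of them with the term.
idf_b = bm25_idf(doc_count=1050, doc_freq=110)

print(f"replica A idf = {idf_a:.4f}")
print(f"replica B idf = {idf_b:.4f}")
print("scores differ:", idf_a != idf_b)
```

The same live content therefore yields slightly different IDF values until both replicas have merged away the same deletes.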


Re: different score from different replica of same shard

Bernd Fehling
In reply to this post by Bernd Fehling
Hello Markus,

thanks a lot.
Is TLOG also available in Solr 6.6.6, or only in 8.x and up?

I will first try ExactStatsCache.
It should be added as an invariant to the request handler, right?

Comparing the replica index directories, they differ in size, and
the index version and generation are different, as is Max Doc.
But Num Docs is the same.

Regards,
Bernd



Re: different score from different replica of same shard

Markus Jelsma-2
Hello Bernd,

I see the different replica types in the 7.1 manual [1] but not in the 6.6
one. ExactStatsCache should work in 6.6; just add it to solrconfig.xml, not
to the request handler [2]. It will slow down searches due to the added
overhead.

Regards,
Markus

[1] https://lucene.apache.org/solr/guide/7_1/shards-and-indexing-data-in-solrcloud.html#types-of-replicas
[2] https://lucene.apache.org/solr/guide/6_6/distributed-requests.html
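
For reference, a sketch of what that solrconfig.xml entry might look like, based on the distributed-requests page linked above. The element sits directly under the top-level config element, not inside a requestHandler; the surrounding elements and comments here are only placeholders, so check the guide for your exact version before relying on it.

```xml
<config>
  <!-- ... existing settings such as luceneMatchVersion, updateHandler, etc. ... -->

  <!-- Collect term statistics from the shards before scoring. -->
  <statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>

  <!-- ... existing requestHandler definitions stay unchanged ... -->
</config>
```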


Re: different score from different replica of same shard

Vincent Brehin
Hello Bernd and Markus,
Here is a very instructive article by the creator of TLOG mode (introduced
in 7.0, by the way):
https://medium.com/@caomanhdat317/indexing-flow-of-solrcloud-sharding-distributed-systems-1-bba411bf8994
It helped me when architecting our replication policy.
This is not an easy matter; it is a tradeoff between consistency and
performance. Solr's choice is to be eventually consistent, and this "lag" is
the price to pay for scalability and fault tolerance.
TLOG replicas are not as real-time as NRT ones (obviously), so that could be
a problem too: the replicas lag behind the leader (for score stats and
results). But TLOG makes your replicas eventually identical.
In my experience, TLOG comes at a price when you have many replicas and
*a lot* of index updates: the leader is doing the indexing and is constantly
replying to update requests from replicas, which leads to high memory use
and thread contention, a bad situation.
PULL replicas are safer (the Solr process has no extra load, only the
filesystem), but they can't become leader, so there is no fault tolerance
here. And there is still a lag between leader and replica.
You could maybe stick a search session to a specific replica, in order to
have consistent results for a given client, if that's important for your
use case.
Hope this helps. Regards,
Vincent
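
The session-stickiness idea could be sketched on the client side roughly as follows. This is only an illustration in Python: the function name, the session ids and the replica URLs are all hypothetical, and how the chosen URL is then queried is left to the client. It hashes the session id so that the same client always lands on the same replica as long as the set of replicas does not change.

```python
import hashlib
from typing import Sequence


def pick_replica(session_id: str, replica_urls: Sequence[str]) -> str:
    """Map a session id to one replica URL in a stable, repeatable way."""
    if not replica_urls:
        raise ValueError("at least one replica URL is required")
    # Sort so the choice does not depend on the order the URLs arrive in.
    ordered = sorted(replica_urls)
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]


# Hypothetical replica core URLs for one shard.
replicas = [
    "http://solr1.example.com:8983/solr/mycoll_shard1_replica1",
    "http://solr2.example.com:8983/solr/mycoll_shard1_replica2",
    "http://solr3.example.com:8983/solr/mycoll_shard1_replica3",
]

chosen = pick_replica("session-abc", replicas)
print(chosen)
```

Note that a session is remapped if a replica is added or removed, so scores can still shift at that moment.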


Re: different score from different replica of same shard

Walter Underwood
In reply to this post by Markus Jelsma-2
Yes, check performance before turning on the stats cache in prod.

When we tested the LRUStatsCache in 6.6.2, searches were 11X slower.

It should be possible to do distributed IDF with little extra overhead.
Infoseek was doing that in 1995 and the patent on the technique has
expired.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)
