Solr relevancy score different on replicated nodes

Solr relevancy score different on replicated nodes

Ashish Bisht
Versions: Solr 7.4.0, ZooKeeper 3.4.11. Architecture: two boxes, Machine-1 and
Machine-2, each holding a single Solr instance.

We have a collection that was single shard and single replica, i.e. s=1
and rf=1.

A few days back we added a replica to it, but the score for the same query
differs between the replicas.

http://Machine-1:8983/solr/MyTestCollection/select?q=%22data%22+OR+(data)&rows=10&fl=score&defType=edismax&qf=search_field+content&wt=json

"response":{"numFound":5836,"start":0,"maxScore":4.418847,"docs":[

whereas on the other machine (replica):

http://Machine-2:8983/solr/MyTestCollection/select?q=%22data%22+OR+(data)&rows=10&fl=score&defType=edismax&qf=search_field+content&wt=json

"response":{"numFound":5836,"start":0,"maxScore":4.4952264,"docs":[

The maxScore is different.

We know relevancy is affected by sharding, but we did not expect replication to
affect it, since the same documents are copied to the other node. The score
explanation shows that docCount and docFreq differ between the replicas.

idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
1.050635000 docCount :10020.000000000 docFreq :3504.0000000

idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
1.068795100 docCount :10291.000000000 docFreq :3534.0000000
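
For reference, the two idf values can be reproduced from the docCount and docFreq pairs with a few lines of Python. This is only a sketch of the formula printed in the explain output; the function and variable names are made up for illustration.

```python
import math

def bm25_idf(doc_count, doc_freq):
    """idf as printed in the explain output: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

# docCount / docFreq pairs copied from the two explain outputs
idf_a = bm25_idf(10020, 3504)
idf_b = bm25_idf(10291, 3534)
print(round(idf_a, 6), round(idf_b, 6))
```

Both results agree with the values reported by the replicas, which suggests the score gap comes from the differing index statistics and not from the scoring formula itself.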

Update: I tried the same on a different collection, and both instances give the same score. It seems to be an issue with this particular collection.

How can we correct the original collection?


--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr relevancy score different on replicated nodes

Erick Erickson
See particularly point 3 here and to a lesser extent point 2.
https://support.lucidworks.com/s/question/0D58000003LRpijCAD/the-number-of-results-returned-is-not-constant-every-time-i-query-solr

For point two (the internal Lucene doc IDs are different) you can
easily correct it by adding sort=score desc, solrId asc to the query.
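
A minimal sketch of building such a request, assuming `solrId` stands for the collection's uniqueKey field (substitute your own field name). The helper name and the base URL in the usage line are illustrative only.

```python
from urllib.parse import urlencode

def select_url(base_url, query, unique_key="solrId", rows=10):
    """Build a /select URL whose sort breaks score ties on the unique key."""
    params = {
        "q": query,
        "rows": rows,
        "fl": f"{unique_key},score",
        # Ties on score are ordered by the unique key instead of by the
        # internal Lucene doc ID, which can differ between replicas.
        "sort": f"score desc, {unique_key} asc",
    }
    return f"{base_url}/select?{urlencode(params)}"

url = select_url("http://Machine-1:8983/solr/MyTestCollection", "data")
print(url)
```

This only stabilizes the order of documents whose scores tie; it does not make the scores themselves equal across replicas.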

That article was written before TLOG and PULL replicas came into the
picture. Since those replica types all have the
exact same index structure you shouldn't have this problem in that case.

Best,
Erick

On Fri, Jan 4, 2019 at 3:40 AM AshB <[hidden email]> wrote:

>
> Version Solr 7.4.0 zookeeper 3.4.11 Achitecture Two boxes Machine-1,Machine-2
> holding single instances of solr
>
> We are having a collection which was single shard and single replica i.e s=1
> and rf=1
>
> Few days back we tried to add replica to it.But the score for same query is
> coming different from different replicas.
>
> http://Machine-1:8983/solr/MyTestCollection/select?q=%22data%22+OR+(data)&rows=10&fl=score&defType=edismax&qf=search_field+content&wt=json
>
> "response":{"numFound":5836,"start":0,"maxScore":*4.418847*,"docs":[
>
> whereas on another machine(replica)
>
> http://Machine-2:8983/solr/MyTestCollection/select?q=%22data%22+OR+(data)&rows=10&fl=score&defType=edismax&qf=search_field+content&wt=json
>
> "response":{"numFound":5836,"start":0,"maxScore":*4.4952264*,"docs":[
>
> The maxScore is different.
>
> Relevancy gets affected due to sharding but replication was not expected as
> same documents get copied to other node. score explaination gives issue with
> docCount and docFreq uneven.
>
> idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
> 1.050635000 docCount :*10020.000000000* docFreq :*3504.0000000*
>
> idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:
> 1.068795100
>
> docCount :*10291.000000000* docFreq :*3534.0000000*
>
> Is this expected?What could be wrong here?Please suggest
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr relevancy score different on replicated nodes

Mikhail Khludnev-2
In reply to this post by Ashish Bisht
Replicated segments might have different deleted documents by design.
Precise numbers can be achieved via exact stats. see
https://lucene.apache.org/solr/guide/6_6/distributed-requests.html#DistributedRequests-ConfiguringstatsCache_DistributedIDF_



--
Sincerely yours
Mikhail Khludnev

Re: Solr relevancy score different on replicated nodes

Ashish Bisht
In reply to this post by Erick Erickson
Hi Erick,

I have updated my post: I am not facing this problem in a new collection.

As per 3), I can try deleting a replica and adding it again, but I am unsure
which of the two to delete (I don't know which replica is giving the correct
score for the query).

Both replicas return the same number of docs for a match-all query. It's strange
that docCount and docFreq differ in the query explain output.

Regards
Ashish




Re: Solr relevancy score different on replicated nodes

Erick Erickson
Ashish:

Deleting and re-adding a replica is not a solution. Even if you did,
that would then be identical only until you started indexing again,
then the stats could skew a bit.

When you index to NRT replicas, the wall clock times that cause the
commits to trigger will be different due to network delays. What
happens essentially is that the doc gets indexed to the leader at time
X but hits the replica Y milliseconds later. So on leader, the
autocommit interval expires at time X+Z (Z being your autocommit
interval) but X+Y+Z on the follower. However, some additional docs may
have already been indexed on the leader but not yet on the follower
when the autocommit trigger happens so the newly-closed segment on the
leader can have docs that the newly-closed segment on the  follower
does not have.
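
A toy model of the timing argument above may help. It is not Solr code: the arrival times, the per-doc network delays, and the rule that the autocommit timer starts when the first doc arrives are all assumptions made for illustration.

```python
def first_segment(arrival_ms, autocommit_ms):
    """Return the doc numbers that land in the first committed segment.

    Assumes the autocommit timer starts when the first doc arrives and that
    every doc arriving up to the commit time is included in that segment.
    """
    commit_at = min(arrival_ms) + autocommit_ms
    return [doc for doc, t in enumerate(arrival_ms) if t <= commit_at]

AUTOCOMMIT_MS = 1000                        # Z in the text (made-up value)
leader_arrivals = [0, 400, 900, 990, 1500]  # when each doc hits the leader
network_delay = [20, 20, 20, 60, 20]        # Y per doc (made-up values)
follower_arrivals = [t + d for t, d in zip(leader_arrivals, network_delay)]

leader_segment = first_segment(leader_arrivals, AUTOCOMMIT_MS)
follower_segment = first_segment(follower_arrivals, AUTOCOMMIT_MS)
print(leader_segment, follower_segment)
```

In this made-up run the fourth doc makes it into the leader's first segment but misses the follower's commit, so the two replicas end up with the same documents split across segments differently.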

The point is that the termfreq does _not_ change when a document is
deleted in some segment (and remember that an update is really a
delete followed by an add). The data associated with deleted docs is
not purged until segments are merged. Further, the decision about
which segments to merge is influenced by how many documents are
deleted in each.

All of which means that the tf/idf statistics are different (slightly)
and you either have to use distributed IDF or just live with it.

You're saying that the document count of live documents is different,
and that's more concerning. Is this true for brief intervals or is it
true when there is _no_ indexing going on _and_ your autocommit
interval is allowed to expire? In that case it's a different problem.
However, if the condition is transitory and goes away if you stop
indexing, then it's the same issue I outlined above; autocommit is
happening at different wall-clock times.

Best,
Erick

On Fri, Jan 4, 2019 at 11:12 AM Ashish Bisht <[hidden email]> wrote:

>
> Hi Erick,
>
> I have updated that I am not facing this problem in a new collection.
>
> As per 3) I can try deleting a replica and adding it again, but the
> confusion is which one out of two should I delete.(wondering which replica
> is giving correct score for query)
>
> Both replicas give same number of docs while doing all query.Its strange
> that in query explain docCount and docFreq is differing.
>
> Regards
> Ashish
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr relevancy score different on replicated nodes

Ashish Bisht
Hi Erick,

Thank you for the details, but it doesn't look like a time difference in
autocommit caused this issue. As I said, if I run a retrieve-all query or a
keyword query on both servers, they return the correct number of docs; only the
relevancy score takes different values.

I waited for a brief period with no indexing going on, and the discrepancy was
still there. So I went ahead and deleted the follower replica (assuming the
leader replica should be in the correct state). After adding the new replica
again, the issue no longer appears.

We will monitor whether it appears again in the future.

Regards
Ashish




Re: Solr relevancy score different on replicated nodes

Erick Erickson
You misunderstand my point. The wall clock times _will_ be
different on leader and follower. It follows that the
documents contained in the individual segments on
the leader and follower will _not_ be identical.

This leads to _deleted_ documents being in different
segments on the leader and follower. Which also means
that the merge decisions will eventually merge different
segments.

Now remember that over time when you update a doc,
the doc is "marked as deleted", but some of the stats
e.g. termfrequency _still_ include the data for the
deleted docs and will until the segment is merged.

So the term frequency for some term on the leader
will be slightly different than on the follower and thus
the scoring will differ depending on which replica
gets the query. Etc.
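
To make the effect concrete, here is a small sketch with invented numbers. The `Segment` class and the merge rule are a simplified model written for this example, not Lucene's actual data structures: the only thing it borrows from the thread is that statistics keep counting deleted docs until a merge purges them.

```python
import math
from dataclasses import dataclass

@dataclass
class Segment:
    docs: int                   # docs with the field, deleted ones included
    docs_with_term: int         # docs containing the term, deleted ones included
    deleted: int = 0            # docs marked deleted but not yet purged
    deleted_with_term: int = 0  # deleted docs that contain the term

def index_stats(segments):
    """docCount and docFreq as the scorer would see them (deletes still counted)."""
    return (sum(s.docs for s in segments),
            sum(s.docs_with_term for s in segments))

def merge(segments):
    """Merging rewrites the segments into one and drops the deleted docs."""
    live = sum(s.docs - s.deleted for s in segments)
    live_with_term = sum(s.docs_with_term - s.deleted_with_term for s in segments)
    return Segment(live, live_with_term)

def idf(doc_count, doc_freq):
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

# Hypothetical replicas holding the same live docs: the follower has not
# merged yet, the leader has already merged the same two segments.
follower = [Segment(600, 200), Segment(400, 150, deleted=100, deleted_with_term=40)]
leader = [merge(follower)]

print(index_stats(follower), idf(*index_stats(follower)))
print(index_stats(leader), idf(*index_stats(leader)))
```

Both hypothetical replicas would return the same live documents for a query, yet their docCount and docFreq differ, and so does the idf.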

The fact that you deleted and re-added the follower
supports the above. And your scores will skew as
you continue to update documents over time.

Generally this isn't something that people concern
themselves with, but if it's important to you, you can
try enabling ExactStatsCache, see:
https://lucene.apache.org/solr/guide/6_6/distributed-requests.html

Best,
Erick

On Sun, Jan 6, 2019 at 10:25 PM Ashish Bisht <[hidden email]> wrote:

>
> Hi Erick,
>
> Thank you for the details,but doesn't look like a time difference in
> autocommit caused this issue.As I said if I do retrieve all query/keyword
> query on both server,they returned correct number of docs,its just relevancy
> score is taking diff values.
>
> I waited for brief period,still discrepancy was coming(no indexing also).So
> I went ahead deleting the follower node(thinking leader replica should be in
> correct state).After adding the new replica again,the issue is not
> appearing.
>
> We will monitor same if it appears in future.
>
> Regards
> Ashish
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr relevancy score different on replicated nodes

Ashish Bisht
Thank you Erick for explaining.

In my scenario, I stopped indexing and updates and waited for 1 day.
I restarted Solr too. Shouldn't the replica and the leader come to the same state
after such a long period? As you said, this gets corrected by segment
merging; I hope that is an internal process and no manual activity is required.

For us the score matters because we use it to display some scenarios on search,
and it gave changing values. As of now we depend on a single
shard and replica, but in the future we might need more replicas.
Will scheduling indexing and updates outside peak query hours help?

I tried the exact stats cache while debugging a score difference during
sharding. It didn't help much. Anyhow, that's a different topic.

Thanks again,

Regards
Ashish Bisht






Re: Solr relevancy score different on replicated nodes

Erick Erickson
bq. Shouldn't both replica and leader come to same state
after this much long period.

No. After that long, the docs will be the same, all the docs
present on one replica will be present and searchable on
the other. However, they will be in different segments so the
"stats skew" will remain.

But displaying the scores isn't a good reason to worry about
this. Frankly, that's almost always a mistake. Scores are
meaningless outside of ranking the docs _in a single
query_. Because a doc in one query got a score of 10 but
some other doc in some other query scored 5 doesn't say
anything at all about whether one was "twice as good" as
another. Even within the same query, the same two
scores don't mean one doc is "twice as good".

I think this is a waste of effort frankly. At best, I've seen
UIs where they display, say, 1 to 5 stars that are just
showing the percentile that the particular doc had
_relative to the max score of that query_, unrelated
to any other query.
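
A rough sketch of that kind of display value, assuming the client reads `maxScore` and the per-document scores from the same response. The function name is invented for this example.

```python
def percent_of_max(scores, max_score):
    """Express each score as a percentage of the query's maxScore.

    The result is only meaningful within one query against one replica;
    it says nothing about scores from any other query.
    """
    if max_score <= 0:
        return [0.0 for _ in scores]
    return [100.0 * s / max_score for s in scores]

percentages = percent_of_max([4.0, 2.0, 1.0], max_score=4.0)
print(percentages)
```

Because the divisor is the maxScore of the response that produced the scores, the top hit is always 100 percent, whatever the absolute score values happen to be.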

If you insist (and again I think it's a mistake) you can
optimize periodically, but if you're using anything
earlier than Solr 7.5 that has its own traps and I do
NOT recommend it unless you can do it every time
you change your index. See:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
and
https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/

On Tue, Jan 8, 2019 at 7:28 AM Ashish Bisht <[hidden email]> wrote:

>
> Thank you Erick for explaining.
>
> In my senario, I stopped indexing and updates too and waited for 1 day.
> Restarted solr too.Shouldn't both replica and leader come to same state
> after this much long period. As you said this gets corrected by segment
> merging, hope it is internal process itself and no manual activity required.
>
> For us score matters as we are using it to display some scenarios on search
> and it gave changing values.As of now we are dependent of single
> shard-replica but in future we might need more replicas
> Will planning indexing and updates outside peak query hour help?
>
> I have tried the exact cache while debugging score difference during
> sharding.Didn't help much.Anyhow that's a different topic.
>
> Thanks again,
>
> Regards
> Ashish Bisht
>
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr relevancy score different on replicated nodes

Ashish Bisht
Hi Erick,

Your statement "At best, I've seen UIs where they display, say, 1 to 5
stars that are just showing the percentile that the particular doc had
_relative to the max score_" is something we are trying to achieve, but we
are dealing in percentages rather than stars (ratings).

The change in maxScore per node is messing this up.

I was wondering whether it is possible to make one complete request (for a term) go
through one replica, i.e. if we could tell the client which replica served the
first request, then subsequent paginated requests could go through
that replica until the keyword changes. Do you think this is possible, or a good
idea? If yes, is there a way in Solr to know which replica served a request?
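
One way to sketch the idea on the client side, without asking Solr which replica answered, is to derive the replica from the keyword itself so that every page for that keyword goes to the same core. Everything here is an assumption for illustration: the core URLs are hypothetical, the helper names are invented, and `distrib=false` is used on the assumption that a single-shard core can answer the query from its own index.

```python
import hashlib
from urllib.parse import urlencode

# Hypothetical core URLs, one per replica of the single shard.
REPLICAS = [
    "http://Machine-1:8983/solr/MyTestCollection_shard1_replica_n1",
    "http://Machine-2:8983/solr/MyTestCollection_shard1_replica_n2",
]

def pick_replica(keyword, replica_urls):
    """Map a keyword to one replica deterministically via a stable hash."""
    digest = hashlib.sha1(keyword.encode("utf-8")).digest()
    return replica_urls[int.from_bytes(digest[:8], "big") % len(replica_urls)]

def page_url(keyword, start, replica_urls, rows=10):
    """Build the URL for one page; all pages of a keyword share a replica."""
    params = {
        "q": keyword,
        "start": start,
        "rows": rows,
        "fl": "id,score",
        "distrib": "false",  # answer from this core only, no fan-out
    }
    return f"{pick_replica(keyword, replica_urls)}/select?{urlencode(params)}"

first_page = page_url("data", 0, REPLICAS)
second_page = page_url("data", 10, REPLICAS)
print(first_page)
print(second_page)
```

Hashing on the keyword spreads different keywords across the replicas, although a very popular keyword would still send all of its traffic to one of them.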

Regards
Ashish





Re: Solr relevancy score different on replicated nodes

eaph
Hello,

To a certain extent, I agree with Erick that this isn't a problem, but
looks like one.  The nature of TF*IDF is such that you will see different
scores for the same query over time on the same replica, or different
replicas for the same query with most replication schemes. This is mildly
annoying when the score is displayed to the user, although I have found
most end users do not pay that much attention to the floating point score.
Testers do.  On a small index with high write/delete traffic and homogenous
docs, I've seen it cause document re-orderings when the same query is
repeated and sent to different replicas such as for paging, and that is
noticeable to end users.

How big is your index, and how different are the percentages you are
seeing?  This is a much more pronounced problem on smaller indices; it is
possible this is a problem with your test setup, but not production.

Your solution of directing users to a consistent replica will solve the
change in values over a session-sized window of time.  With a single
shard, you could use a Master/Slave setup and direct queries at a given
slave.  This has a number of operational consequences though, as it means
you will lose the benefits of SolrCloud.

Mikhail's suggestion to use ExactStats would be cleaner:
https://lucene.apache.org/solr/guide/6_6/distributed-requests.html#DistributedRequests-ConfiguringstatsCache_DistributedIDF_


Elizabeth

On Fri, Jan 11, 2019 at 3:56 AM Ashish Bisht <[hidden email]>
wrote:

> Hi Erick,
>
> Your statement "*At best, I've seen UIs where they display, say, 1 to 5
> stars that are just showing the percentile that the particular doc had
> _relative to the max score*"  is something we are trying to achieve,but we
> are dealing in percentages rather stars(ratings)
>
> Change in MaxScore per node is messing it.
>
> I was thinking if it possible to make one complete request(for a term) go
> though one replica,i.e if to the client we could tell which replica hit the
> first request and subsequently further paginated requests should go though
> that replica until keyword is changed.Do you think it is possible or a good
> idea?If yes is there a way in solr to know which replica served request?
>
> Regards
> Ashish
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>

Re: Solr relevancy score different on replicated nodes

Erick Erickson
What Elizabeth said.

Really, this is an intractable problem. Even in the TLOG
and PULL replica case, the replicas of an index getting updates
will still fire their replication requests at different wall-clock
times. Even if that were coordinated, the vagaries of
networks etc. would _still_ mean the various replicas
would see slightly different "snapshots" of the index.
True, the window would be smaller....

The only situations I've seen where the scores on different
replicas are always identical is when the index is optimized,
which isn't recommended except if you can do it
all the time. Or TLOG and PULL replicas are used and
the index is not undergoing continuous updates.

As for locking subsequent requests to a set of nodes, the
idea has been bandied about but usually falls down when
it's realized that this has the potential to unevenly distribute
the load.

Best,
Erick

On Fri, Jan 11, 2019 at 3:13 AM Elizabeth Haubert
<[hidden email]> wrote:

>
> Hello,
>
> To a certain extent, I agree with Eric, that this isn't a problem, but
> looks like one.  The nature of TF*IDF is such that you will see different
> scores for the same query over time on the same replica, or different
> replicas for the same query with most replication schemes. This is mildly
> annoying when the score is displayed to the user, although I have found
> most end users do not pay that much attention to the floating point score.
> Testers do.  On a small index with high write/delete traffic and homogenous
> docs, I've seen it cause document re-orderings when the same query is
> repeated and sent to different replicas such as for paging, and that is
> noticeable to end users.
>
> How big is your index, and how different are the percentages you are
> seeing?  This is a much more pronounced problem on smaller indices; it is
> possible this is a problem with your test setup, but not production.
>
> Your solution at directing users to a consistent replica will solve the
> change in values over a session-sized window of time.   With a single
> shard, you could use a Master/Slave setup, direct queries at a given
> slave.  This has a number of operational consequences though, as it means
> you will lose the benefits of SolrCloud.
>
> Mikhail's suggestion to use ExactStats would be cleaner:
> https://lucene.apache.org/solr/guide/6_6/distributed-requests.html#DistributedRequests-ConfiguringstatsCache_DistributedIDF_
>
>
> Elizabeth