correlation between score and term frequency


Alexander Kubias
Hi!

I have a question about the correlation between the score value and the
term frequency. Let's assume that we have one index about one set of
documents. In addition to that, let's assume that there is only one term
in a query.

If we now search for the term "car" and get a certain score value X, and
if we then search for the term "football" and get the same score value X.
Is it now sure that both values X are the same?

Could you explain, what correlation between the score value and the term
frequency exists in my scenario?

Thanks for your help!

Best regards,
alex



Re: correlation between score and term frequency

Grant Ingersoll-2
Not sure I follow. You get back the same score for two different
queries and you wonder why?

The best way to see how a score is calculated is to use the explain  
(debug) functionality in Solr.
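
As a rough sketch of the suggestion above: Solr returns a per-hit score explanation when the request carries the `debugQuery` parameter. The host, port, handler path, and the field name `text` below are assumptions for illustration only; none of them come from this thread.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port and path to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_debug_query_url(term, field="text"):
    """Build a select URL that asks Solr to include score explanations.

    `field` is an assumed field name, not one mentioned in the thread.
    """
    params = {
        "q": f"{field}:{term}",  # single-term query, as in the question
        "fl": "*,score",         # return stored fields plus the score
        "debugQuery": "true",    # adds a debug section with an explanation per hit
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


# One URL per query from the question; compare the two explanations side by side.
print(build_debug_query_url("car"))
print(build_debug_query_url("football"))
```

Fetching both URLs and reading the explanation for the top hit shows which factors (tf, idf, field norm, query norm) contributed to each score.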

-Grant

On Oct 1, 2007, at 10:06 AM, [hidden email] wrote:

> Hi!
>
> I have a question about the correlation between the score value and the
> term frequency. Let's assume that we have one index about one set of
> documents. In addition to that, let's assume that there is only one term
> in a query.
>
> If we now search for the term "car" and get a certain score value X, and
> if we then search for the term "football" and get the same score value X.
> Is it now sure that both values X are the same?
>
> Could you explain, what correlation between the score value and the term
> frequency exists in my scenario?
>
> Thanks for your help!
>
> Best regards,
> alex
>
>

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com




Re: correlation between score and term frequency

Joseph Doehr
In reply to this post by Alexander Kubias

Hi Alex,

Do you mean that you would like to know whether both results have the
same relevance across all of the indexed content, and whether the two
results are directly comparable?


[hidden email] wrote:

> I have a question about the correlation between the score value and the
> term frequency. Let's assume that we have one index about one set of
> documents. In addition to that, let's assume that there is only one term
> in a query.
>
> If we now search for the term "car" and get a certain score value X, and
> if we then search for the term "football" and get the same score value X.
> Is it now sure that both values X are the same?
>
> Could you explain, what correlation between the score value and the term
> frequency exists in my scenario?


Re: correlation between score and term frequency

Mike Klaas
In reply to this post by Alexander Kubias
On 1-Oct-07, at 7:06 AM, [hidden email] wrote:

> Hi!
>
> I have a question about the correlation between the score value and the
> term frequency. Let's assume that we have one index about one set of
> documents. In addition to that, let's assume that there is only one term
> in a query.
>
> If we now search for the term "car" and get a certain score value X, and
> if we then search for the term "football" and get the same score value X.
> Is it now sure that both values X are the same?
>
> Could you explain, what correlation between the score value and the term
> frequency exists in my scenario?

If the field has norms, there is a correlation, but the tf is
unrecoverable from the score because of field length normalization.
Query normalization also makes it difficult to compare scores from
query to query.

See http://lucene.apache.org/java/docs/scoring.html to start out, in
particular the link to the Similarity class javadocs.
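
To make the point about field length normalization concrete, here is a minimal sketch of single-term scoring. It assumes the classic Lucene default tf-idf factors (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)) and ignores boosts, coord, and the lossy one-byte encoding of norms. The function names and the sample numbers are made up for illustration; check the Similarity javadocs for the formula your version actually uses.

```python
import math


def tf(freq):
    """Term-frequency factor: grows with the square root of the raw count."""
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    """Inverse document frequency: rarer terms get a larger factor."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def length_norm(num_terms):
    """Field length normalization: longer fields are scaled down."""
    return 1.0 / math.sqrt(num_terms)


def single_term_score(freq, field_length, doc_freq, num_docs):
    """Simplified score of one document for an unboosted one-term query."""
    term_idf = idf(doc_freq, num_docs)
    # For a single unboosted term the query norm is 1 / sqrt(idf^2).
    query_norm = 1.0 / math.sqrt(term_idf ** 2)
    return tf(freq) * term_idf ** 2 * query_norm * length_norm(field_length)


# Made-up index statistics: 1000 documents, the term occurs in 9 of them.
NUM_DOCS, DOC_FREQ = 1000, 9

# Term occurs 4 times in a 100-term field versus once in a 25-term field.
score_a = single_term_score(4, 100, DOC_FREQ, NUM_DOCS)
score_b = single_term_score(1, 25, DOC_FREQ, NUM_DOCS)
print(score_a, score_b)

# Same tf and field length, but a more common term (in 99 documents).
score_common = single_term_score(4, 100, 99, NUM_DOCS)
print(score_common)
```

In this sketch the first two documents end up with (numerically) the same score although their term frequencies are 4 and 1, so the tf cannot be read back from the score. The third score differs only because the term's document frequency differs, which is one reason scores for "car" and "football" are not directly comparable.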

-Mike

Re: correlation between score and term frequency

Alexander Kubias
In reply to this post by Joseph Doehr
Yes, that was the meaning of my question! Can you answer it?

-----Original Message-----
From: Joseph Doehr [mailto:[hidden email]]
Sent: Monday, October 1, 2007 20:00
To: [hidden email]
Subject: Re: correlation between score and term frequency



Hi Alex,

Do you mean that you would like to know whether both results have the
same relevance across all of the indexed content, and whether the two
results are directly comparable?


[hidden email] wrote:
> I have a question about the correlation between the score value and
> the term frequency. Let's assume that we have one index about one set
> of documents. In addition to that, let's assume that there is only one
> term in a query.
>
> If we now search for the term "car" and get a certain score value X,
> and if we then search for the term "football" and get the same score
> value X. Is it now sure that both values X are the same?
>
> Could you explain, what correlation between the score value and the
> term frequency exists in my scenario?