Function boosts...

Function boosts...

escher2k
I had a question about the way boosting works: is it a final boost applied to the score that is returned?
For instance, in the LinearFloatFunction (LinearFloatFunction(ValueSource source, float slope, float intercept)),
is the ValueSource the "core" score returned by Lucene that gets boosted?

From http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html:
score(q,d) = coord(q,d) · queryNorm(q) · ∑ ( tf(t in d) · idf(t)^2 · t.getBoost() · norm(t,d) )
So, is ValueSource really score(q,d), and hence LinearFloatFunction computes
Final Score = score(q,d) * slope + intercept ?

Thanks.

RE: Function boosts...

Graham Stead-2
I believe two concepts are getting slightly mixed here: the
LinearFloatFunction, which is a Solr FunctionQuery, and the original Lucene
scoring methodology. FunctionQueries are not part of vanilla Lucene, so you
will not explicitly see them mentioned in the Lucene similarity documents.

The best way to understand how FunctionQueries are applied is to use the
Solr explanations (&debugQuery=1, I believe).

From my experience, each Function Query you add is treated as another term
in the summation. E.g., if the search query has 2 terms and 1 function query
is added, you will see 3 terms summed to yield the score. The function query
result is multiplied by queryNorm(q), making the effect a bit hard to
predict sometimes.
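
As a rough numeric illustration of the paragraph above (a toy model, not Solr's or Lucene's actual code), the sketch below sums one addend per term plus one addend for the function query. The helper names, the sample numbers, and the simplification that coord(q,d) is 1 are all assumptions made for this example; queryNorm is modelled in the classic style as 1 / sqrt(sum of squared clause weights).

```python
import math


def query_norm(clause_weights):
    """Classic-style normalization: 1 / sqrt(sum of squared clause weights)."""
    return 1.0 / math.sqrt(sum(w * w for w in clause_weights))


def total_score(term_clauses, func_value, func_boost):
    """Return (addends, total) for term clauses plus one function query.

    term_clauses: list of (tf, idf, boost, norm) tuples, one per query term.
    func_value:   value the function yields for this document (hypothetical).
    func_boost:   boost applied to the function query.
    coord(q,d) is assumed to be 1, i.e. every clause matches.
    """
    # Unnormalized weight: idf * boost for a term, just the boost for the function.
    weights = [idf * boost for (_, idf, boost, _) in term_clauses] + [func_boost]
    qn = query_norm(weights)

    # Each term contributes tf * idf^2 * boost * norm, scaled by queryNorm.
    term_parts = [tf * idf * idf * boost * norm * qn
                  for (tf, idf, boost, norm) in term_clauses]
    # The function query contributes its value times its boost, scaled by queryNorm.
    func_part = func_value * func_boost * qn

    parts = term_parts + [func_part]
    return parts, sum(parts)


# Two terms plus one function query give three addends.
parts, total = total_score([(1.0, 2.0, 1.0, 0.5), (1.0, 3.0, 1.0, 0.5)], 4.0, 1.0)
print(len(parts), total)
```

Because queryNorm depends on every clause in the query, the function's addend changes whenever the terms change, which is one way to read "a bit hard to predict".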

Hope this helps,
-Graham

> -----Original Message-----
> From: escher2k [mailto:[hidden email]]
> Sent: Thursday, December 28, 2006 3:20 PM
> To: [hidden email]
> Subject: Function boosts...
>
>
> I had a question about the way boosting works  - is it a
> final boost on the score that is returned ?
> For instance, in the LinearFloatFunction
> (LinearFloatFunction(ValueSource source, float slope, float
> intercept)), is the ValueSource is the "core" score returned
> by Lucene that gets boosted.
>
> From,
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
> score(q,d) = coord(q,d) · queryNorm(q) · ∑ ( tf(t in d) · idf(t)^2 · t.getBoost() · norm(t,d) )
> So, is ValueSource really score(q,d) and hence LinearFloatFunction does,
> Final Score = score(q,d) * slope + intercept ?
>
> Thanks.
> --
> View this message in context:
> http://www.nabble.com/Function-boosts...-tf2892636.html#a8081654
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


RE: Function boosts...

Chris Hostetter-3
: I believe two concepts are getting slightly mixed here: the
: LinearFloatFunction, which is a Solr FunctionQuery, and the original Lucene
: scoring methodology. FunctionQueries are not part of vanilla Lucene, so you
: will not explicitly see them mentioned in the Lucene similarity documents.

furthermore, the "Lucene Scoring Formula" is based very heavily on simple
BooleanQueries containing TermQueries ... when you start looking at more
exotic queries (like PhraseQueries, SpanQueries, etc...) it's no longer
as simple.  FunctionQueries are about as exotic as you can get.

: The best way to understand how FunctionQueries are applied is to use the
: Solr explanations (&debugQuery=1, I believe).

I just want to reiterate that point ... when trying to understand
anything about scoring, explain is your friend ... this is doubly true
with function queries.

: From my experience, each Function Query you add is treated as another term
: in the summation. E.g., if the search query has 2 terms and 1 function query
: is added, you will see 3 terms summed to yield the score. The function query
: result is multiplied by queryNorm(q), making the effect a bit hard to
: predict sometimes.

correct. the Sigma in the Lucene scoring equation is across all of the
hypothetical term queries contained in an outermost hypothetical boolean
query.  when dealing with a function query, all of the "t" based terms
(tf, idf, t.getBoost, and norm(t,d)) don't exist ... instead you have
only the function value, and any boost you've applied to the function query
(which strictly speaking is the "t.getBoost()" from the original
equation, even though it's not a term)
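
to make that concrete, here is a minimal sketch of what a single function query clause would contribute under the description above. it is an illustration only, not the Solr source: the function names are hypothetical, and it assumes the ValueSource supplies a per-document number (for example a numeric field value) rather than score(q,d), which the thread itself does not spell out.

```python
def linear_float_function(source_value, slope, intercept):
    """Linear transform of the per-document source value (assumed input)."""
    return source_value * slope + intercept


def function_clause_score(source_value, slope, intercept, boost, query_norm):
    """Addend for one function query: function value * boost * queryNorm.

    There is no tf, idf, or norm(t,d) here; the boost plays the role of
    t.getBoost() in the original equation.
    """
    value = linear_float_function(source_value, slope, intercept)
    return value * boost * query_norm


# Hypothetical numbers: source value 10, slope 0.5, intercept 2, boost 1, queryNorm 0.25.
print(function_clause_score(10.0, 0.5, 2.0, 1.0, 0.25))
```

the explain output remains the authority on what any real query actually does.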



-Hoss