Document "freshness" and Boost Functions


Luis Neves-3

Hello,
Reading the javadocs for the DisMaxRequestHandler, I see that it is possible to use
"Boost Functions" to influence the score. How would that work in order to
improve the score of recent documents? (I have a timestamp field in the
schema)... I'm assuming it's possible (right?), but I can't figure out the syntax.

--
Luis Neves





Re: Document "freshness" and Boost Functions

Bertrand Delacretaz
On 1/17/07, Luis Neves <[hidden email]> wrote:

> ...I see that it is possible to use
> "Boost Functions" to influence the score. How would that work in order to
> improve the score of recent documents? (I have a timestamp field in the
> schema)...

I've been using expressions like these in boolean queries, based on a
"broadcast_date" field:

_val_:"linear(recip(rord(broadcast_date),1,1000,1000),11,0)"

Where recip computes an age-based score, and linear is used to boost it.
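
For reference, a rough sketch of the arithmetic behind that expression. It assumes the definitions given in the Solr function query documentation: recip(x,m,a,b) computes a/(m*x+b), linear(x,m,c) computes m*x+c, and rord(field) is the reverse ordinal of the field value, so the most recent date gets 1. The Python helpers are illustrative stand-ins, not Solr code, and the sample rord values are made up.

```python
def recip(x, m, a, b):
    """Reciprocal function as documented for Solr: a / (m*x + b)."""
    return a / (m * x + b)


def linear(x, m, c):
    """Linear function as documented for Solr: m*x + c."""
    return m * x + c


def freshness_boost(rord):
    """Mirror linear(recip(rord(broadcast_date),1,1000,1000),11,0).

    `rord` stands in for the reverse ordinal of the document's date
    (1 = newest distinct date in the index).
    """
    return linear(recip(rord, 1, 1000, 1000), 11, 0)


# Hypothetical reverse ordinals: newest, 1000th newest, 10000th newest.
for rord in (1, 1000, 10000):
    print(rord, round(freshness_boost(rord), 3))
```

The boost starts just under 11 for the newest document, halves by the time rord reaches 1000, and keeps decaying smoothly after that. Since rord is an ordinal and not an age in days, the curve depends on how many distinct dates are indexed.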

See http://incubator.apache.org/solr/docs/api/org/apache/solr/search/QueryParsing.html,
and also the list archives; these functions have been discussed
before.

I'm not sure off the top of my head how to use this with dismax queries though.

-Bertrand

Re: Document "freshness" and Boost Functions

kkrugler
>On 1/17/07, Luis Neves <[hidden email]> wrote:
>
>>...I see that it is possible to use
>>"Boost Functions" to influence the score. How would that work in order to
>>improve the score of recent documents? (I have a timestamp field in the
>>schema)...
>
>I've been using expressions like these in boolean queries, based on a
>"broadcast_date" field:
>
>_val_:"linear(recip(rord(broadcast_date),1,1000,1000),11,0)"
>
>Where recip computes an age-based score, and linear is used to boost it.
>
>See
>http://incubator.apache.org/solr/docs/api/org/apache/solr/search/QueryParsing.html,
>and also the list archives; these functions have been discussed
>before.
>
>I'm not sure off the top of my head how to use this with dismax
>queries though.

There's another trick, described by Andrzej here:

http://www.gossamer-threads.com/lists/lucene/java-user/43457

You add another field, such as "days", in which you write the same token
(e.g. "1") a number of times that matches the age of your document
(counted from some reasonable base date). Then add +days:1 to your
query, and Lucene winds up boosting the results by how recent they are.
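
A minimal sketch of how such a field value could be generated at indexing time. It assumes one reading of the trick: the token is repeated once per day elapsed between the base date and the document's date, so newer documents carry more copies. The function name and the example dates are hypothetical, and the sketch does not model how Lucene's term-frequency and length-norm settings turn the repeat count into a score.

```python
from datetime import date


def days_field_value(doc_date, base_date, token="1"):
    """Build the text for the extra "days" field.

    Repeats `token` once per day between `base_date` and `doc_date`,
    so a newer document gets a higher term frequency for that token.
    Documents older than the base date get an empty field.
    """
    days_since_base = max((doc_date - base_date).days, 0)
    return " ".join([token] * days_since_base)


# Hypothetical base date and documents.
base = date(2007, 1, 1)
older = days_field_value(date(2007, 1, 3), base)
newer = days_field_value(date(2007, 1, 17), base)
print(len(older.split()), len(newer.split()))
```

Every document on or after the base date then matches days:1, and only the repeat count differs between them. A base date far in the past makes the field long, so it is worth picking one close to the oldest document you care about.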

-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"

Re: Document "freshness" and Boost Functions

Chris Hostetter-3
In reply to this post by Bertrand Delacretaz

: > "Boost Functions" to influence the score. How would that work in order to
: > improve the score of recent documents? (I have a timestamp field in the

: I've been using expressions like these in boolean queries, based on a
: "broadcast_date" field:
:
: _val_:"linear(recip(rord(broadcast_date),1,1000,1000),11,0)"

: I'm not sure off the top of my head how to use this with dismax queries though.

With the dismax request handler, you can specify a "bq" param which takes
in a raw Lucene query for boosting -- the query above with the _val_ syntax
would work there -- but the DisMax handler also has explicit support for
boost function parsing with the "bf" param, so you could say...

http://localhost:8983/solr/search?qt=dismax&q=hoss&bf=linear(recip(rord(broadcast_date),1,1000,1000),11,0)

http://incubator.apache.org/solr/docs/api/org/apache/solr/request/DisMaxRequestHandler.html
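
If you build these URLs from code, the parentheses, commas and quotes in the function need URL-encoding. A small sketch, assuming the host, path and field name from the example URL above; the dismax_url helper is hypothetical and only assembles the request string, it does not talk to Solr.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Taken from the example URL above; adjust for your own install.
BASE_URL = "http://localhost:8983/solr/search"
BOOST_FUNCTION = "linear(recip(rord(broadcast_date),1,1000,1000),11,0)"


def dismax_url(user_query, use_bf=True):
    """Build a dismax request URL carrying the freshness boost.

    use_bf=True  -> pass the bare function in the "bf" param
    use_bf=False -> wrap it in the _val_ syntax and pass it in "bq"
    """
    params = {"qt": "dismax", "q": user_query}
    if use_bf:
        params["bf"] = BOOST_FUNCTION
    else:
        params["bq"] = '_val_:"%s"' % BOOST_FUNCTION
    return BASE_URL + "?" + urlencode(params)


url = dismax_url("hoss")
print(url)
# Decoding the query string gives back the original function text.
print(parse_qs(urlsplit(url).query)["bf"][0])
```

Both variants carry the same function; "bf" just saves you the _val_ wrapper and its quoting.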

-Hoss