Date range boost


Date range boost

stefano nicolai-2
Hi all,

I'm using Solr to search inside a catalogue of many items (around 3
million).
All of these items have a field containing the date they were created
(it's a string field at the moment, since that is the type I have in my DB).

I want to give a higher score to the ones with the most recent date, but I
ended up with something like this (at query time):

[...] AND (publ_date:[2007-00-00 TO 2007-12-31]^3.0 |
            publ_date:[2006-00-00 TO 2006-12-31]^2.6 |
            publ_date:[2005-00-00 TO 2005-12-31]^2.3 |
            publ_date:[2004-00-00 TO 2004-12-31]^2.0 |
            publ_date:[2003-00-00 TO 2003-12-31]^1.6 |
            publ_date:[2002-00-00 TO 2002-12-31]^1.3 |
            publ_date:[2001-00-00 TO 2001-12-31]^1.0 |
            publ_date:[2000-00-00 TO 2000-12-31]^0.6 |
            publ_date:[1990-00-00 TO 1999-12-31]^0.3 |
            publ_date:[0000-00-00 TO 2010-12-31])  [...]

It doesn't look like the best available solution to me, though.

Any suggestions on how I can implement this differently?

Re: Date range boost

Bertrand Delacretaz
On 3/12/07, stefano nicolai <[hidden email]> wrote:

> ...All of these items have a field containing the date they were created
> (it's a string field at the moment, as i have this type inside my DB).
>
> I want to give a higher score to the ones with the most recent date...

You should be able to use boost functions for this, see for example
http://www.mail-archive.com/solr-user@.../msg01877.html

and

http://lucene.apache.org/solr/api/org/apache/solr/search/QueryParsing.html#parseFunction(java.lang.String,%20org.apache.solr.schema.IndexSchema)
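
A minimal sketch of what such a request might look like, written in Python. It assumes a dismax-style handler registered under the name "dismax" that accepts a bf (boost function) parameter; the host, port, search text and the constants passed to recip are made up for illustration and do not come from this thread:

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical Solr location; adjust to your own install.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_boosted_query(user_query, date_field="publ_date"):
    """Build a select URL that adds a recency boost function to the query.

    rord(field) is the reverse ordinal of the indexed value, so the
    lexicographically largest (newest yyyy-mm-dd) value gets 1.
    recip(x,m,a,b) evaluates to a / (m*x + b). The constants below are
    illustrative assumptions and need tuning against real data.
    """
    params = {
        "q": user_query,
        "qt": "dismax",  # assumed handler name from solrconfig.xml
        "bf": f"recip(rord({date_field}),1,1000,1000)",
    }
    return SOLR_SELECT + "?" + urlencode(params)

url = build_boosted_query("guitar")
print(url)
# Decode the query string again to check the boost function survived encoding.
print(parse_qs(url.split("?", 1)[1])["bf"][0])
```

How the handler is selected depends on the Solr version and configuration, so treat the qt value as a placeholder.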

-Bertrand

Re: Date range boost

Stefano Nicolai
Bertrand Delacretaz wrote:

> On 3/12/07, stefano nicolai <[hidden email]> wrote:
>
>> ...All of these items have a field containing the date they were created
>> (it's a string field at the moment, as i have this type inside my DB).
>>
>> I want to give a higher score to the ones with the most recent date...
>
> You should be able to use boost functions for this, see for example
> http://www.mail-archive.com/solr-user@.../msg01877.html
>
> and
>
> http://lucene.apache.org/solr/api/org/apache/solr/search/QueryParsing.html#parseFunction(java.lang.String,%20org.apache.solr.schema.IndexSchema)
>
>
> -Bertrand
>
Thanks for the answer. I'm trying to play with the linear/recip
functions to get the results I want.

Reading the old threads, I ended up at this link, which describes a
different solution:
http://www.gossamer-threads.com/lists/lucene/java-user/43457

What do you think about Andrej's idea (copied and pasted here)?
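
To get a feel for the shapes of the two functions, here is a small Python sketch. It only reproduces the arithmetic, recip(x,m,a,b) = a / (m*x + b) and linear(x,m,c) = m*x + c; the input x stands in for whatever the date is turned into (for example a reverse rank where 1 is the newest item), and the constants are arbitrary assumptions:

```python
def recip(x, m, a, b):
    """Reciprocal shape: a / (m*x + b). Falls off quickly, then flattens."""
    return a / (m * x + b)

def linear(x, m, c):
    """Linear shape: m*x + c. Changes by the same amount per step of x."""
    return m * x + c

# x = hypothetical reverse rank by date (1 = newest of ~3 million items).
for rank in (1, 1000, 10000, 3000000):
    print(rank, recip(rank, 1, 1000, 1000), linear(rank, -0.000001, 3.0))
```

With these constants recip gives exactly 0.5 at rank 1000 and keeps shrinking toward zero without ever going negative, while a linear function with a negative slope eventually crosses zero unless m and c are chosen with the size of the index in mind.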

"Add a separate field, say "days", in which you will put as many "1" as
many days elapsed since the epoch (not necessarily since 1 Jan 1970 -
pick a date that makes sense for you). Then, if you want to prioritize
newer documents, just add "+days:1" to your query. Voila - the final
results are a sum of other score factors plus a score factor that is
higher for more recent document, containing more 1-s"
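
As I understand the idea, the field value would be built roughly like this at index time (a Python sketch; the epoch of 1 Jan 2000 and the function name days_field are my own assumptions, not part of the quoted post):

```python
from datetime import date

# Hypothetical epoch: pick a date older than the oldest item in the catalogue.
EPOCH = date(2000, 1, 1)

def days_field(created, epoch=EPOCH):
    """Return one "1" token per day elapsed between epoch and created.

    Newer documents get more tokens, so the term "1" occurs more often in
    them and a query clause on days:1 scores them higher.
    Dates before the epoch yield an empty field.
    """
    elapsed = max((created - epoch).days, 0)
    return " ".join(["1"] * elapsed)

print(days_field(date(2000, 1, 4)))               # three days after the epoch
print(len(days_field(date(2007, 3, 12)).split())) # token count for a recent item
```

An item from 2007 ends up with a few thousand tokens in this field, which is why I'm asking about the weight of this clause.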



I'm just wondering if this way may boost the date field too much
compared to the other fields?

Re: Date range boost

Chris Hostetter-3

: I'm just wondering if this way may boost the date field too much
: compared to the other fields?

Either way, you can tweak the effect on the score with
boosts ... but I suspect you'll find the FunctionQuery approach a lot
easier to deal with.



-Hoss