variation on boosting recent documents gives exception


Michael Lackhoff-2
Since my field to measure recency is not a date field but a string field
(with only year-numbers in it), I tried a variation on the suggested
boost function for recent documents:
  recip(sub(2015,min(sortyear,2015)),1,10,10)
But this gives an exception when used in a boost or bf parameter.
I guess the reason is that the math functions don't work on a string
field, even if it contains only numbers. Is that guess right? And if so,
is there a function I can use to convert the field to something numeric?
Or are there other problems with my function?
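
For reference, Solr documents recip(x,m,a,b) as a/(m*x+b). The following is a minimal Python sketch of the values the function above would produce if sortyear were a numeric field. The helper name recency_boost is hypothetical and only used for illustration; it is not part of Solr.

```python
def recip(x, m, a, b):
    """Python mirror of Solr's recip(x,m,a,b), which is a / (m*x + b)."""
    return a / (m * x + b)


def recency_boost(sortyear, current_year=2015):
    """Hypothetical helper mirroring recip(sub(2015,min(sortyear,2015)),1,10,10)."""
    # min() clamps years in the future to the current year, so the age is never negative
    age_in_years = current_year - min(sortyear, current_year)
    return recip(age_in_years, 1, 10, 10)


if __name__ == "__main__":
    # Newer documents get a boost close to 1.0, older ones decay towards 0
    for year in (2015, 2010, 2005, 1995):
        print(year, round(recency_boost(year), 3))
```

With m=1, a=10, b=10 the boost halves once a document is ten years old, so the shape of the curve is controlled mainly by the ratio of a and b to m.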

Another related question: as you can see, the current year (2015) is
hard-coded. Is there an easy way to get the current year within the
function? Messing around with NOW looks very complicated.

-Michael

RE: variation on boosting recent documents gives exception

Gonzalo Rodriguez
Hello Michael,

You can always change the type of your sortyear field to an int, or create an int version of it and use copyField to populate it.

Using NOW/YEAR rounds the current date down to the start of the year; you can read more about this in the Javadoc: http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/util/DateMathParser.html

You can test it using the example collection:
  http://localhost:8983/solr/collection1/select?q=*:*&boost=recip(ms(NOW/YEAR,manufacturedate_dt),3.16e-11,1,1)&fl=id,manufacturedate_dt,score,[explain]&defType=edismax
Then check the explain field for the numeric value given to NOW/YEAR vs. NOW/HOUR, etc.
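
The constant 3.16e-11 in that boost is roughly 1 / (milliseconds per year), so m*x in recip becomes the document's age in years. A small Python sketch of that arithmetic follows; the names MS_PER_YEAR and date_boost are illustrative only, and a 365.25-day year is an assumption.

```python
# Approximate number of milliseconds in one year (assuming 365.25 days)
MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000


def date_boost(age_ms, m=3.16e-11, a=1, b=1):
    """Mirror of recip(ms(NOW/YEAR,manufacturedate_dt),3.16e-11,1,1).

    age_ms stands in for the value of ms(...), i.e. the difference
    between the two dates in milliseconds.
    """
    return a / (m * age_ms + b)


if __name__ == "__main__":
    # 1 / MS_PER_YEAR is close to the 3.16e-11 used in the query
    print(1 / MS_PER_YEAR)
    # Boost for documents that are 0, 1, 5 and 10 years old
    for years in (0, 1, 5, 10):
        print(years, round(date_boost(years * MS_PER_YEAR), 3))
```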


Gonzalo


Re: variation on boosting recent documents gives exception

Michael Lackhoff-2
On 13.02.2015 at 11:18, Gonzalo Rodriguez wrote:

> You can always change the type of your sortyear field to an int, or create an int version of it and use copyField to populate it.

But that would require me to reindex. Would be nice to have some type
conversion available within a function query.

> And using NOW/YEAR will round the current date to the start of the year, you can read more about this in the Javadoc: http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/util/DateMathParser.html
>
> You can test it using the example collection: http://localhost:8983/solr/collection1/select?q=*:*&boost=recip(ms(NOW/YEAR,manufacturedate_dt),3.16e-11,1,1)&fl=id,manufacturedate_dt,score,[explain]&defType=edismax and checking the explain field for the numeric value given to NOW/YEAR vs NOW/HOUR, etc.

The definition of *_dt fields in the example schema is 'date', but my
field is text, or (t)int if I have to reindex.

To compare against this int field I need another (comparable) int.
ms(NOW/YEAR,manufacturedate_dt) is a number, but a huge one (a count of
milliseconds), which is very difficult to relate sensibly to a value
like '2015'.

Your suggestion would only work if I changed my year to a date like
2015-01-01T00:00:00Z, which is not a sensible format for a publication
year and cannot even be created easily with copyField.

What I need is a real year number, not a date truncated to the year,
which is only accessible as the number of milliseconds since the epoch
(January 1st, 1970, 00:00:00 UTC) and is not very handy.
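
One possible way around the epoch-milliseconds detour is to compute the year on the client and substitute it into the original function before sending the request. The Python sketch below assumes that sortyear has been indexed as a numeric field and that the query is built in client code; build_boost and build_query_string are hypothetical helper names, not part of Solr or any Solr client.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_boost(year_field="sortyear", now=None):
    """Return the boost function with the current year filled in."""
    if now is None:
        now = datetime.now(timezone.utc)
    year = now.year  # a plain year number such as 2015
    return f"recip(sub({year},min({year_field},{year})),1,10,10)"


def build_query_string(boost):
    """URL-encode the request parameters, including the boost function."""
    params = {"q": "*:*", "defType": "edismax", "boost": boost}
    return urlencode(params)


if __name__ == "__main__":
    boost = build_boost(now=datetime(2015, 2, 13, tzinfo=timezone.utc))
    print(boost)
    print(build_query_string(boost))
```

This keeps the function query itself unchanged and moves only the "what year is it" question out of Solr.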

-Michael