# document boost Classic List Threaded 6 messages Open this post in threaded view
|

## document boost

 Hello folks, We're trying to use Lucene's scoring to do a fairly basic thing: give a document (in this case, we index "articles") a boost based on an integer value that we know at index-time.  We want the  document boost to affect the final document score linearly. We thought that assigning a document boost based on this value would do the trick, but the behavior we're seeing doesn't match what we expect given the online documentation.  In fact, we see that a linear increase in document boost yields an exponential increase in the 'fieldNorm' component of the score for each term of the query that matched the document.    Here's a small table of values that relate the document boost we pass in to the fieldNorm contribution returned by Lucene: boost  fieldNorm 1.0    0.3125    =  (5/16, 2^-1.678) 2.0    20.0      =  (2^4.3219, 2^1 * 10) 3.0    256.0     =  (2^8) 4.0    1280.0    =  (2^10.3219, 2^7 * 10) 5.0    5120.0    =  (2^12.3219, 2^9 * 10) 6.0    16384.0   =  (2^14.0) 7.0    40960.0   =  (2^15.3219, 2^12 * 10) 8.0    81920.0   =  (2^16.3219, 2^13 * 10) 10.0   327680.0  =  (2^18.3219, 2^15 * 10) This example is using a query with two terms against a document that contains those terms and a few others, in one searchable field. Is this the way document boost is supposed to work?  Or have we misconfigured something? If we cannot use document boost to affect scoring linearly, is there some other technique we can use? By the way, we're using SOLR to access Lucene.  We can give more information if necessary, such as our SOLR schema.xml, if folks think that would help explain things.  Let us know what other information we can provide. Thanks, Mike
Open this post in threaded view
|

## Re: document boost

 I would say you def misconfigured something. Doubling your doc boost will double your fieldNorm approximately (I think the precision isn't perfect). I don't know what your doing wrong in such a small test, but your fieldNorm should *not* be exploding like that. Can you post some code? - Mark Mike Grafton wrote: > Hello folks, > > We're trying to use Lucene's scoring to do a fairly basic thing: give a > document (in this case, we index "articles") a boost based on an integer > value that we know at index-time.  We want the  document boost to affect the > final document score linearly. > > We thought that assigning a document boost based on this value would do the > trick, but the behavior we're seeing doesn't match what we expect given the > online documentation.  In fact, we see that a linear increase in document > boost yields an exponential increase in the 'fieldNorm' component of the > score for each term of the query that matched the document.    Here's a > small table of values that relate the document boost we pass in to the > fieldNorm contribution returned by Lucene: > > boost  fieldNorm > 1.0    0.3125    =  (5/16, 2^-1.678) > 2.0    20.0      =  (2^4.3219, 2^1 * 10) > 3.0    256.0     =  (2^8) > 4.0    1280.0    =  (2^10.3219, 2^7 * 10) > 5.0    5120.0    =  (2^12.3219, 2^9 * 10) > 6.0    16384.0   =  (2^14.0) > 7.0    40960.0   =  (2^15.3219, 2^12 * 10) > 8.0    81920.0   =  (2^16.3219, 2^13 * 10) > 10.0   327680.0  =  (2^18.3219, 2^15 * 10) > > This example is using a query with two terms against a document that > contains those terms and a few others, in one searchable field. > > Is this the way document boost is supposed to work?  Or have we > misconfigured something? If we cannot use document boost to affect scoring > linearly, is there some other technique we can use? > > By the way, we're using SOLR to access Lucene.  We can give more information > if necessary, such as our SOLR schema.xml, if folks think that would help > explain things.  Let us know what other information we can provide. > > Thanks, > Mike > >   --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email]
Open this post in threaded view
|

## Re: document boost

 Thanks for your help, Mark.  We can start by posting our SOLR config files, although I'm not sure if that will be helpful (we don't see much in there regarding boosts).  See attached.  How SOLR actually configures and interfaces with Lucene is a bit of an unknown to us, so I'm not sure we can get down to the raw Lucene configuration and interaction. That being said, in addition to the SOLR config files, see our attached XML which we post to SOLR to add the document to the index.How do you know that boost should affect fieldNorm linearly? Is there some code you can point us to?  We looked through the Lucene source for a while, but it was kind of hard to track this down. One note: we're on an old version of Lucene - a nightly build between 2.0.0 and 2.1.0.MikeOn 1/30/08, Mark Miller <[hidden email]> wrote: I would say you def misconfigured something. Doubling your doc boostwill double your fieldNorm approximately (I think the precision isn'tperfect).I don't know what your doing wrong in such a small test, but your fieldNorm should *not* be exploding like that.Can you post some code?- MarkMike Grafton wrote:> Hello folks,>> We're trying to use Lucene's scoring to do a fairly basic thing: give a > document (in this case, we index "articles") a boost based on an integer> value that we know at index-time.  We want the  document boost to affect the> final document score linearly.> > We thought that assigning a document boost based on this value would do the> trick, but the behavior we're seeing doesn't match what we expect given the> online documentation.  In fact, we see that a linear increase in document > boost yields an exponential increase in the 'fieldNorm' component of the> score for each term of the query that matched the document.    Here's a> small table of values that relate the document boost we pass in to the > fieldNorm contribution returned by Lucene:>> boost  fieldNorm> 1.0    0.3125    =  (5/16, 2^-1.678)> 2.0    20.0      =  (2^4.3219, 2^1 * 10)> 3.0    256.0     =  (2^8)> 4.0    1280.0    =  (2^10.3219, 2^7 * 10) > 5.0    5120.0    =  (2^12.3219, 2^9 * 10)> 6.0    16384.0   =  (2^14.0)> 7.0    40960.0   =  (2^15.3219, 2^12 * 10)> 8.0    81920.0   =  (2^16.3219, 2^13 * 10)> 10.0   327680.0  =  (2^18.3219, 2^15 * 10) >> This example is using a query with two terms against a document that> contains those terms and a few others, in one searchable field.>> Is this the way document boost is supposed to work?  Or have we > misconfigured something? If we cannot use document boost to affect scoring> linearly, is there some other technique we can use?>> By the way, we're using SOLR to access Lucene.  We can give more information > if necessary, such as our SOLR schema.xml, if folks think that would help> explain things.  Let us know what other information we can provide.>> Thanks,> Mike>>--------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email]For additional commands, e-mail: [hidden email] --------------------------------------------------------------------- To unsubscribe, e-mail: [hidden email] For additional commands, e-mail: [hidden email] schema.xml (15K) Download Attachment solrconfig.xml (17K) Download Attachment solr_post.xml (386 bytes) Download Attachment
Open this post in threaded view
|