Boost One Term Query


Boost One Term Query

java_user_
Boosting a one-term query does not have an effect on the score.

For example:
apple

Has the same score as:
apple^3

But repeating the term does raise the score:
apple apple apple

I expected the score to go up when boosting a one term query.  Is that a wrong expectation?

Thanks!

Re: Boost One Term Query

hossman

first off: if you are looking at the score from the "Hits" class, bear in
mind those scores are "pseudo-normalized" and don't mean much.

second: a "query" doesn't have a score, a document has a score relative to
a query ... scores can't be compared between different queries.

third: there is a "queryNorm" that comes into play. it's designed to keep
scores "manageable"; you can read more about it (and how to change it if you
want) in the scoring documentation.  you should also look at the
"Explanation" info for each query/doc to make sure you understand what's
going on.






-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


Re: Boost One Term Query

java_user_
Thanks for the response Hoss.

The score I receive is from the Explanation object.  The score stays the same regardless of how I boost the single term.

The score of the query:
apple

Is the same as the score of the query:
apple^3

I am surprised by the result of the test.  Would you expect "apple" and "apple^3" to receive the same score?

Thanks


Re: Boost One Term Query

Yonik Seeley-2
On Dec 6, 2007 2:31 PM, java_user_ <[hidden email]> wrote:

> Thanks for the response Hoss.
>
> The score I receive is from the Explanation object.  The score stays the
> same regardless of how I boost the single term.
>
> The score of the query:
> apple
>
> Is the same as the score of the query:
> apple^3

This boosts apple 3 times in relation to the other query clauses.  If
there are no other query clauses, it's a bit meaningless.
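
To make the "in relation to the other query clauses" point concrete, here is a minimal sketch of how a clause boost shifts the balance inside a two-clause query. It models only the classic query-side normalization (each clause's idf * boost divided by the square root of the sum of squares). The class name, method name, and idf values are made up for illustration and are not Lucene API.

```java
// Hypothetical model of query-side clause weights; not Lucene's real classes.
final class RelativeBoostSketch {

    /** Returns idf[i] * boost[i] / sqrt(sum of squared raw weights) for each clause. */
    static double[] normalizedWeights(double[] idfs, double[] boosts) {
        double sumOfSquares = 0.0;
        for (int i = 0; i < idfs.length; i++) {
            double raw = idfs[i] * boosts[i];
            sumOfSquares += raw * raw;
        }
        double norm = 1.0 / Math.sqrt(sumOfSquares);
        double[] weights = new double[idfs.length];
        for (int i = 0; i < idfs.length; i++) {
            weights[i] = idfs[i] * boosts[i] * norm;
        }
        return weights;
    }

    public static void main(String[] args) {
        double[] idfs = {2.0, 2.0}; // assumed idf for "apple" and "banana"
        double[] plain = normalizedWeights(idfs, new double[] {1.0, 1.0});   // apple banana
        double[] boosted = normalizedWeights(idfs, new double[] {3.0, 1.0}); // apple^3 banana
        System.out.println("apple banana   ratio: " + plain[0] / plain[1]);
        System.out.println("apple^3 banana ratio: " + boosted[0] / boosted[1]);
    }
}
```

With a second clause present, the boost changes how much "apple" counts relative to "banana". With only one clause there is nothing for it to be relative to.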

> I am surprised by the result of the test.  Would you expect "apple" and
> "apple^3" to receive the same score?

Lucene does some "weighting" of the query that causes this to happen.

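class Query { [...]
  /** Expert: Constructs and initializes a Weight for a top-level query. */
  public Weight weight(Searcher searcher)
    throws IOException {
    // rewrite into primitive queries
    Query query = searcher.rewrite(this);
    // build the un-normalized Weight
    Weight weight = query.createWeight(searcher);
    // sum of squared clause weights (boosts included)
    float sum = weight.sumOfSquaredWeights();
    // with the default Similarity this is 1/sqrt(sum)
    float norm = getSimilarity(searcher).queryNorm(sum);
    // scale every clause weight by norm
    weight.normalize(norm);
    return weight;
  }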
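
The arithmetic behind that method can be traced by hand. The sketch below is a simplified stand-in, not Lucene code: it assumes each clause's raw weight is idf * boost and that queryNorm is 1/sqrt(sum of squares), and it ignores the per-document side of the score (tf, field norm, coord), which is the same for all three queries in the original post. The class name, method names, and the idf value are hypothetical.

```java
// Simplified model of the normalization step in weight(); not Lucene's real classes.
final class QueryNormSketch {

    /** Assumed default: queryNorm = 1 / sqrt(sumOfSquaredWeights). */
    static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /**
     * Sum of the normalized query weights for a flat query whose clauses
     * all target the same term. One boost value per clause.
     */
    static double totalQueryWeight(double idf, double... boosts) {
        double sumOfSquares = 0.0;
        for (double boost : boosts) {
            double raw = idf * boost;
            sumOfSquares += raw * raw;
        }
        double norm = queryNorm(sumOfSquares);
        double total = 0.0;
        for (double boost : boosts) {
            total += idf * boost * norm;
        }
        return total;
    }

    public static void main(String[] args) {
        double idf = 2.5; // made-up idf for "apple"
        System.out.println("apple             -> " + totalQueryWeight(idf, 1.0));
        System.out.println("apple^3           -> " + totalQueryWeight(idf, 3.0));
        System.out.println("apple apple apple -> " + totalQueryWeight(idf, 1.0, 1.0, 1.0));
    }
}
```

In this model a single clause always normalizes to a query weight of about 1, whatever the boost, because the boost appears in both the raw weight and the norm. Three identical clauses each get 1/sqrt(3), so together they contribute about sqrt(3), which is consistent with "apple apple apple" scoring higher than "apple".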

Are you simply curious about this, or is it causing you a problem somehow?

-Yonik


Re: Boost One Term Query

java_user_
I was hoping to boost the entire query to give the query more weight compared to other queries.

Instead of boosting my entire query, I may just multiply the resulting score by the weight (or something like that).
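
A rough sketch of that "multiply afterwards" idea, assuming the raw scores have already been collected into an array and that the per-query weight is a number chosen by the application. The class and method names are hypothetical, and nothing here makes scores from different queries comparable on its own.

```java
// Hypothetical post-processing step; works on plain floats, not on Lucene objects.
final class WeightedScoreSketch {

    /** Returns a new array with every raw score multiplied by queryWeight. */
    static float[] applyWeight(float[] rawScores, float queryWeight) {
        float[] weighted = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            weighted[i] = rawScores[i] * queryWeight;
        }
        return weighted;
    }

    public static void main(String[] args) {
        float[] appleScores = {0.5f, 0.25f};               // made-up scores for "apple"
        float[] weighted = applyWeight(appleScores, 3.0f); // application-chosen weight
        for (float score : weighted) {
            System.out.println(score);
        }
    }
}
```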



Re: Boost One Term Query

Erick Erickson
I don't believe you can compare scores across queries in any meaningful
way.

This sounds a lot like you're trying to solve some problem and have decided
that boosting and comparing scores across queries is the answer. In other
words, the XY problem.

Perhaps if you explained what you're trying to accomplish someone could
suggest an alternative...

Best
Erick


Re: Boost One Term Query

Jens Grivolla-3
Erick Erickson wrote:
> I don't believe you can compare scores across queries in any meaningful
> way.

I actually investigated this to some degree in my thesis, comparing
different participating systems from the TREC campaigns.  It turns out
that some systems' scores (e.g. the top scores for a given query)
correlate quite well with the quality of the result (relevance of the
returned documents), whereas for others this is not at all the case.  I
don't think there was a Lucene-based system in there, though.

So in some cases comparing scores across queries can be used (to a
limited degree) as a confidence measure, an indicator of expected
usefulness of the results for the user.

Jens
