LTR original score feature

Brian Yee
I wanted to get some opinions on using the original score feature. The original score produced by Solr is intuitively a very important feature. In my data set I'm seeing that the original score varies wildly between different queries. This makes sense since the score generated by Solr is not normalized across all queries. However, won't this mess with our training data? If this feature is 3269.4 for the top result for one query, and then 32.7 for the top result for another query, it does not mean that the first document was 10x more relevant to its query than the second document. I am using a normalize param within Ranklib, but that only normalizes features between each other, not within one feature, right? How are people handling this? Am I missing something?
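One common way to make the original score comparable across queries is to rescale it within each query, e.g. min-max normalization per query so the top score of every query maps to the same range. This is a general practice, not something prescribed by the thread; the data layout below is illustrative. A minimal Python sketch:

```python
# Hypothetical sketch: per-query min-max normalization of the
# original-score feature. The (query_id, score) row layout is an
# assumption for illustration, not RankLib's or Solr's format.

def normalize_per_query(rows):
    """rows: list of (query_id, score) pairs.
    Returns scores rescaled to [0, 1] within each query."""
    by_query = {}
    for qid, score in rows:
        by_query.setdefault(qid, []).append(score)
    bounds = {qid: (min(s), max(s)) for qid, s in by_query.items()}
    out = []
    for qid, score in rows:
        lo, hi = bounds[qid]
        out.append((score - lo) / (hi - lo) if hi > lo else 0.0)
    return out

# The raw scores differ by 100x across queries, but after per-query
# rescaling the top document of each query gets the same value.
rows = [("q1", 3269.4), ("q1", 1200.0), ("q2", 32.7), ("q2", 10.0)]
print(normalize_per_query(rows))  # → [1.0, 0.0, 1.0, 0.0]
```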

Re: LTR original score feature

Michael Alcorn
What you're suggesting is that there's a "nonlinear relationship
<http://blog.minitab.com/blog/adventures-in-statistics-2/what-is-the-difference-between-linear-and-nonlinear-equations-in-regression-analysis>"
between the original score (the input variable) and some measure of
"relevance" (the output variable). Nonlinear models like decision trees
(which include LambdaMART) and neural networks (which include RankNet) can
handle these types of situations, assuming there's enough data. The
nonlinear phenomena you brought up are also probably part of the reason why
pairwise models tend to perform better than pointwise models
<https://www.quora.com/What-are-the-differences-between-pointwise-pairwise-and-listwise-approaches-to-Learning-to-Rank>
in learning-to-rank tasks.
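The pairwise point can be sketched as follows: a pairwise model trains on document *pairs drawn from the same query*, so the model only ever sees score differences between documents on the same query's scale, and the wildly different absolute ranges across queries drop out. This is a generic illustration, not a description of any specific RankLib model:

```python
# Hypothetical sketch: building pairwise training examples within one
# query. Because both documents of a pair share a query, only the
# relative original score matters, not its absolute magnitude.

def pairwise_examples(docs):
    """docs: list of (feature_score, relevance_label) for ONE query.
    Returns (score_difference, preference) training pairs."""
    pairs = []
    for si, li in docs:
        for sj, lj in docs:
            if li > lj:  # first doc judged more relevant than second
                pairs.append((si - sj, 1))
    return pairs

# Query A has large scores, query B small ones, yet each pair compares
# documents on a single query's scale.
print(pairwise_examples([(3200.0, 3), (1500.0, 1)]))  # → [(1700.0, 1)]
print(pairwise_examples([(32.0, 2), (10.0, 0)]))      # → [(22.0, 1)]
```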

On Fri, Jan 12, 2018 at 1:52 PM, Brian Yee <[hidden email]> wrote:

> I wanted to get some opinions on using the original score feature. The
> original score produced by Solr is intuitively a very important feature. In
> my data set I'm seeing that the original score varies wildly between
> different queries. This makes sense since the score generated by Solr is
> not normalized across all queries. However, won't this mess with our
> training data? If this feature is 3269.4 for the top result for one query,
> and then 32.7 for the top result for another query, it does not mean that
> the first document was 10x more relevant to its query than the second
> document. I am using a normalize param within Ranklib, but that only
> normalizes features between each other, not within one feature, right? How
> are people handling this? Am I missing something?
>