MoreLikeThis mlt.qf boosting


Clas Rydergren
Hi! I have been testing the MoreLikeThis feature in Solr. I have
indexed a subset of Wikipedia with the fields title (the title of the
Wikipedia page) and content (the Wikipedia page content). When
performing a MoreLikeThis request on this index as:

http://server:8983/solr/mlt?stream.body=google+yahoo&mlt.fl=title,content&mlt.interestingTerms=details&mlt.boost=true&mlt.mintf=0&fl=title&mlt.qf=title^1000.0+content^0.1

I get the following (manually compressed) output:

<str name="title">Sequoia Capital</str>
<str name="title">Google</str>
<str name="title">Google Translate</str>
<lst name="interestingTerms">
<float name="content:yahoo">0.1</float>
<float name="content:google">0.08868032</float>
</lst>
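
The request behind this output can also be assembled parameter by parameter, which makes the field list and the boosts easier to vary while testing. A minimal Python sketch, assuming the same host and handler path as in the URL above; the helper name build_mlt_url is hypothetical and not part of Solr:

```python
from urllib.parse import urlencode

def build_mlt_url(base="http://server:8983/solr/mlt",
                  qf="title^1000.0 content^0.1"):
    """Assemble the MoreLikeThis request URL from its parameters.

    The parameter names and values are copied from the request in this
    post; only the base URL default and this helper are assumptions.
    """
    params = [
        ("stream.body", "google yahoo"),      # text to find similar documents for
        ("mlt.fl", "title,content"),          # fields used for similarity
        ("mlt.interestingTerms", "details"),  # return the terms with their boosts
        ("mlt.boost", "true"),
        ("mlt.mintf", "0"),
        ("fl", "title"),                      # fields returned per document
        ("mlt.qf", qf),                       # per-field boosts under test
    ]
    # urlencode percent-encodes "^" and "," and turns spaces into "+",
    # so the result is equivalent to the hand-written URL, not identical.
    return base + "?" + urlencode(params)

print(build_mlt_url())
```

Passing a different qf string, for example build_mlt_url(qf="title^10 content^1"), changes only the boosts and leaves the rest of the request untouched.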

Note that the document with Google in the title is ranked lower than
the Sequoia Capital document. I have two questions:

Firstly: Why are the "interestingTerms" entries prefixed with
"content"? Does that mean that the MLT query is run against the
"content" field only? If so, how do I adjust the search to include both
title and content (both are also copied to a field called "text")?

Secondly, and possibly related to the first question: regardless of
the qf boosts (currently 1000.0 and 0.1), the second search result (the
one with Google in the title) is not ranked higher in the MLT search.
Why is that? This indicates to me that the MLT field boosting does not
work as I expect. I would have liked to see the document with the
Google title ranked first. How should I "boost" to achieve that?

Cheers
Clas