BM25F in Solr

Jan Høydahl / Cominvent

There have been several discussions in the past about how to do BM25F scoring in Solr.
People have mentioned BlendedTermQuery, and Lucene 8.0 added a new BM25FQuery.

What I mainly want is to normalize the doc freq (and hence the IDF) across fields, so
that, for example, the title field uses the same doc freq as the body field. Ideally it
should work in any query parser, including edismax.
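
To make the problem concrete, here is a minimal sketch of how per-field doc freq produces different IDF values for the same term. The idf formula is the one used by Lucene's BM25Similarity; the class name, field names and counts are made up for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: the same term gets a different IDF in each field. */
final class PerFieldIdfDemo {

    /** BM25 IDF as computed by Lucene's BM25Similarity: ln(1 + (N - n + 0.5) / (n + 0.5)). */
    static double idf(long docFreq, long docCount) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args) {
        long docCount = 1_000; // hypothetical: every document has both fields

        // Hypothetical doc freqs for one term in two fields.
        Map<String, Long> docFreqByField = new LinkedHashMap<>();
        docFreqByField.put("title", 5L);  // rare in titles
        docFreqByField.put("body", 200L); // common in bodies

        // Each field is scored with its own IDF, so a title match weighs far more.
        for (Map.Entry<String, Long> e : docFreqByField.entrySet()) {
            System.out.printf("%s idf=%.3f%n", e.getKey(), idf(e.getValue(), docCount));
        }
    }
}
```

With these numbers the title IDF comes out several times larger than the body IDF, which is the skew I would like to get rid of.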

Has anyone succeeded in this, or found some other workaround that achieves
a normalized IDF across fields?

An approximation could be to always use the doc freq from the largest field in the index,
e.g. body, but I am not sure whether that can be done in a Similarity.
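
As a rough sketch of that approximation, in plain Java and not tied to any Lucene or Solr API: look up the term's doc freq in one reference field and reuse the resulting IDF for every queried field. The class, method and field names below are hypothetical, and how to hook this into a Similarity or query parser is exactly the open question.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: one IDF per term, taken from a reference field. */
final class SharedDocFreqDemo {

    /** BM25 IDF as computed by Lucene's BM25Similarity: ln(1 + (N - n + 0.5) / (n + 0.5)). */
    static double idf(long docFreq, long docCount) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Returns the IDF computed from the reference field's doc freq, to be reused for all fields. */
    static double sharedIdf(Map<String, Long> docFreqByField, String referenceField, long docCount) {
        Long docFreq = docFreqByField.get(referenceField);
        if (docFreq == null) {
            throw new IllegalArgumentException("No doc freq for field: " + referenceField);
        }
        return idf(docFreq, docCount);
    }

    public static void main(String[] args) {
        long docCount = 1_000; // hypothetical index size

        // Hypothetical doc freqs for one term.
        Map<String, Long> docFreqByField = new HashMap<>();
        docFreqByField.put("title", 5L);
        docFreqByField.put("body", 200L);

        // Both title and body matches would be weighted with this single value.
        double shared = sharedIdf(docFreqByField, "body", docCount);
        System.out.printf("shared idf=%.3f%n", shared);
    }
}
```

One caveat with this variant: a term that occurs in title but never in body would get doc freq 0 from the reference field, so taking the maximum doc freq across the queried fields might be a safer choice.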

Jan Høydahl, search solution architect
Cominvent AS -