[lucy-user] tf-idf/cosine similarity

[lucy-user] tf-idf/cosine similarity

Joel Reymont
Does Lucy support TF-IDF/cosine similarity like Lucene does?

Is there 'more like this' Lucene functionality or can it be easily implemented?

        Thanks, Joel

--------------------------------------------------------------------------
- mac osx device driver ninja, kernel extensions and user-land usb drivers
---------------------+------------+---------------------------------------
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
---------------------+------------+---------------------------------------

Re: [lucy-user] tf-idf/cosine similarity

Marvin Humphrey
On Wed, Feb 23, 2011 at 11:13:59PM +0000, Joel Reymont wrote:
> Does Lucy support TF-IDF/cosine similarity like Lucene does?
 
Yes.  The default scoring model is the same as Lucene's.
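
For readers new to the model, here is a minimal sketch of TF-IDF weighting with
cosine similarity. It is not Lucy's or Lucene's actual code: the function names
are hypothetical, and the weighting (square root of term frequency, log-scaled
inverse document frequency) is an assumption chosen to resemble the classic
Lucene formula. Real scoring also involves field norms and boosts.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized doc.

    Assumed weighting: sqrt(tf) * (1 + ln(N / (df + 1))).
    """
    n = len(docs)
    # Document frequency: number of docs in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: math.sqrt(count) * (1.0 + math.log(n / (df[term] + 1.0)))
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Documents with identical term counts score close to 1.0 against each other, and
documents that share no terms score 0.0.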

> Is there 'more like this' Lucene functionality or can it be easily implemented?

Lucy does not provide a MoreLikeThisQuery.  In theory, it's not difficult to
implement, but some of the APIs you would need are not public yet.
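
As a rough illustration of the general idea behind a "more like this" feature,
the sketch below picks the highest-weighted terms of a source document and ranks
other documents by how many of those terms they contain. It works on in-memory
token lists and uses no Lucy API. The names `interesting_terms` and
`more_like_this` and the `max_terms` cutoff are hypothetical, and the weighting
is an assumption rather than the algorithm of any particular implementation.

```python
import math
from collections import Counter


def interesting_terms(source_doc, corpus, max_terms=3):
    """Return up to max_terms terms of source_doc, ranked by tf * idf."""
    n = len(corpus)
    df = Counter(term for doc in corpus for term in set(doc))
    tf = Counter(source_doc)
    scored = {
        term: count * (1.0 + math.log(n / (df[term] + 1.0)))
        for term, count in tf.items()
    }
    # Highest weight first; ties broken alphabetically for determinism.
    ranked = sorted(scored, key=lambda term: (-scored[term], term))
    return ranked[:max_terms]


def more_like_this(source_index, corpus, max_terms=3):
    """Return indexes of other docs sharing selected terms, best match first."""
    terms = set(interesting_terms(corpus[source_index], corpus, max_terms))
    hits = []
    for i, doc in enumerate(corpus):
        if i == source_index:
            continue
        overlap = len(terms & set(doc))
        if overlap:
            hits.append((overlap, i))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [i for _, i in hits]
```

In a real engine the selected terms would typically be combined into an OR
query and scored by the normal ranking function instead of a raw overlap count.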

MoreLikeThisQuery has been discussed before on the lucy-dev list.  I have
misgivings about the algorithm that the Lucene implementation uses because the
results are noisy.

    http://lucy.markmail.org/thread/rb5ruelwomgaj7lp

Best,

Marvin Humphrey