I recently got my hands dirty with Nutch, especially in extending its
I learned that Nutch/Lucene implement their document retrieval model using
a TF vector-based approach. I wonder whether other document models, such as
fuzzy set or probabilistic models, are implemented in Nutch/Lucene.
The objective of proposing and implementing a number of document models
is to enable us to further improve document ranking in Nutch.
Please understand that I am not questioning the efficiency of the current
Nutch document ranking. I would just like to see more options in Nutch,
especially in how documents are modelled and how well each model performs.
Is it worthwhile to move in this direction? Please comment.