Hi everyone,
I am trying to use Mahout's lucene.vectors to pull data from a Lucene index. The index contains web page content crawled by Nutch. The indexed fields include title, url, id, text and category.
I know I can use lucene.vectors to fetch the data from the index and convert it to vectors. What I cannot work out is how to tell the tool which Lucene field contains the label. In my case, the category field is the label field.