I'm working on an e-commerce project and I have to classify products into their corresponding categories (we have about 10,000 categories). After doing some research, I found that Mahout looks like the solution to my problem.
I want to do offline classification, so I installed Mahout and Hadoop for that purpose.
Q1: Is Mahout suitable for classifying products into categories?
My available data (training and test): product title and product description.
Q2: Is it better to store this data in text files or to index it in Solr?
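To make the text-file option concrete, here is a minimal sketch of what it could look like, assuming one directory per category and one file per product holding the title and description. As far as I understand, this is the kind of directory tree that Mahout's `seqdirectory` tool can convert to SequenceFiles, but the layout, the tuple shape, and the helper names `safe_name` and `write_products` are only assumptions for illustration, not something Mahout prescribes.

```python
import os
import re
import tempfile


def safe_name(text):
    """Turn a category name or product id into a filesystem-safe name."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", text).strip("_")


def write_products(products, root):
    """Write each product to <root>/<category>/<product_id>.txt.

    `products` is an iterable of (product_id, category, title, description)
    tuples. This tuple shape is hypothetical, chosen only for the sketch.
    Returns the list of file paths that were written.
    """
    paths = []
    for product_id, category, title, description in products:
        category_dir = os.path.join(root, safe_name(category))
        os.makedirs(category_dir, exist_ok=True)
        path = os.path.join(category_dir, safe_name(product_id) + ".txt")
        # One document per product: title on the first line, then description.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(title + "\n" + description + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Made-up sample rows, only to show the resulting layout.
    sample = [
        ("p1", "Phones & Tablets", "Acme phone", "A 5-inch phone"),
        ("p2", "Kitchen", "Steel kettle", "A 1.7 litre electric kettle"),
    ]
    for written in write_products(sample, tempfile.mkdtemp()):
        print(written)
```

With this layout the category label is carried by the directory name, so training and test sets can be split by moving files around without touching their contents.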
How would you recommend implementing the classification process?
I am new to Hadoop and Mahout, but I'm very interested in these tools.