Require answer for configuration and other issues


Arun Kumar Sharma
Hi Nutch Geeks,
 
    I urgently need answers to the following questions:
 
1. Is there any documentation or tutorial available that helps us understand the config parameters in the different config files under the conf directory?
 
2. Suppose I want to parse different types of documents, such as PDF, MS Word, MS Excel, PPT, etc. What do I need to do? What are the necessary config parameters for parsing documents of different MIME types?
 
3. If I want to crawl the local filesystem, what do I need to add to urls.txt and crawl-urlfilter.txt?
 
   Thanks in advance for an early response.


Regards,
 
Arun Kumar Sharma (Tech Lead - Java/J2EE)
Mob: +91.981.529.5761




               