problems: crawling specific domain

How can I crawl a specific domain only (like What do I have to change to make things work correctly? I tried changing crawl-urlfilter.txt, but Nutch started crawling outside my domain after some time.
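For context, the kind of domain restriction I am after can be sketched as follows. This is only a minimal Python illustration of ordered "+"/"-" regex rules, not Nutch code. It assumes the rules are tried top to bottom, that the first matching rule decides, and that a URL matching no rule is rejected. `example.com` is a hypothetical placeholder domain, and `RULES` and `accepts` are names made up for this sketch.

```python
import re

# Hypothetical rule list in the "+pattern" / "-pattern" style.
# example.com is a placeholder, not the real domain.
RULES = [
    r"-^(file|ftp|mailto):",                    # skip non-HTTP schemes
    r"+^http://([a-z0-9]*\.)*example\.com/",    # accept the domain and its subdomains
    r"-.",                                      # reject everything else
]

def accepts(url, rules=RULES):
    """Return True if the first rule that matches the URL is a '+' rule."""
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if re.search(pattern, url):
            return sign == "+"
    # Assumption: a URL that matches no rule is rejected.
    return False
```

Under these assumptions, a link that points outside the placeholder domain falls through to the final `-.` rule and is dropped, which is the behaviour I expected from my filter file.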

I am using Nutch 0.9 in standalone mode (without Hadoop). Can anyone give me some idea of how to merge the indexes from different crawls into a single index?

--mohammad monirul hoque