adding [-numFetchers numFetchers] to crawl


Brian Tingle
How do I set the number of map tasks when I run a command like the following?


hadoop jar nutch-1.0.job org.apache.nutch.crawl.Crawl




I think I'm going to try out the change below. Is there any reason not
to do it? Or is Crawl supposed to be more of a demo, so that I should
write a script or my own crawler class instead?


> diff

<         ("Usage: Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN N]");
>         ("Usage: Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN N] [-numFetchers n]");

>     int numFetchers = -1;

>       } else if ("-numFetchers".equals(args[i])) {
>           numFetchers = Integer.parseInt(args[i+1]);
>           i++;

<       Path segment = generator.generate(crawlDb, segments, -1, topN,
>       Path segment = generator.generate(crawlDb, segments, numFetchers, topN, System
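
For anyone who wants to try the option handling from the diff above without building Nutch, here is a minimal stand-alone sketch of just the parsing step. The class name CrawlArgs and the method parseNumFetchers are hypothetical names made up for this illustration and are not part of Nutch. The default of -1 is simply the value the unpatched call already passes to generate(). The bounds check is an addition that the diff does not have.

```java
/** Hypothetical stand-alone parser mirroring the option loop in the diff. */
final class CrawlArgs {
  // Same value the unpatched code hard-codes in the generate() call.
  static final int DEFAULT_NUM_FETCHERS = -1;

  /** Returns the value given after -numFetchers, or the default if the flag is absent. */
  static int parseNumFetchers(String[] args) {
    int numFetchers = DEFAULT_NUM_FETCHERS;
    for (int i = 0; i < args.length; i++) {
      if ("-numFetchers".equals(args[i])) {
        // Assumption: fail with a clear message instead of an
        // ArrayIndexOutOfBoundsException when the value is missing.
        if (i + 1 >= args.length) {
          throw new IllegalArgumentException("-numFetchers requires a value");
        }
        numFetchers = Integer.parseInt(args[i + 1]);
        i++; // skip the value that was just consumed
      }
    }
    return numFetchers;
  }

  public static void main(String[] args) {
    System.out.println("numFetchers = " + parseNumFetchers(args));
  }
}
```

In the patched Crawl class the parsed value would then replace the hard-coded -1 in the generator.generate(...) call, as in the last line of the diff.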