Exception in dedup

scott green
Hi,

I am running the Nutch spider in Eclipse and ran into the following exception:

06/11/20 03:15:47 INFO mapred.MapTask: opened part-0.out
06/11/20 03:15:47 WARN mapred.LocalJobRunner: job_5ay3nq
java.lang.ArrayIndexOutOfBoundsException: -1
    at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:109)
    at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:174)
    at org.apache.hadoop.mapred.MapTask$3.next(MapTask.java:201)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:44)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:213)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:105)
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:393)
    at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:433)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:116)

Could you please tell me how to fix this problem?

-- Scott