Issue with trunk (rev 496535)

Sean Dean-3
I have now hit the same error on two separate fetches, both using the new Hadoop 0.10.1. The first came up on my regular fetch using my large Nutch DB, but to rule out any problems with that (possibly related to the new fetch statuses) I created a brand new DB using the standard DMOZ inject. Just now that failed too, with the same error.

Here is the output:

2007-01-17 01:50:21,480 WARN  mapred.LocalJobRunner - job_m5rew8
java.lang.NullPointerException
        at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2158)
        at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:1892)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:498)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:191)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:109)
2007-01-17 01:50:21,629 FATAL fetcher.Fetcher - Fetcher: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:402)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:469)
        at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:504)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
        at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:476)

It's not failing right away when the fetch starts, but at around 100k URLs, by which point my Hadoop map file has grown to about 2.4 GB.