ArrayIndexOutOfBoundsException during invert link phase

kkrugler
Hi all,

Has anybody else seen the java.lang.ArrayIndexOutOfBoundsException
error displayed in the Diagnostic Text column of the jobdetail.jsp page
when running 0.8?

This seems to happen occasionally during the invert links phase. The
stack trace looks like:

java.lang.ArrayIndexOutOfBoundsException
  at java.util.zip.CRC32.update(CRC32.java:43)
  at org.apache.nutch.fs.NFSDataInputStream$Checker.read(NFSDataInputStream.java:92)
  at org.apache.nutch.fs.NFSDataInputStream$PositionCache.read(NFSDataInputStream.java:156)
  at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
  at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
  at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
  at java.io.DataInputStream.readFully(DataInputStream.java:176)
  at org.apache.nutch.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
  at org.apache.nutch.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
  at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:378)
  at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:301)
  at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:323)
  at org.apache.nutch.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:60)
  at org.apache.nutch.mapred.MapTask$2.next(MapTask.java:106)
  at org.apache.nutch.mapred.MapRunner.run(MapRunner.java:48)
  at org.apache.nutch.mapred.MapTask.run(MapTask.java:116)
  at org.apache.nutch.mapred.TaskTracker$Child.main(TaskTracker.java:603)

For our most recent trial, I see this 15 times out of 4840 map
attempts (along with 25 socket timeout errors, thus 4800 actual maps
completed).

I see that Rod Taylor reported an error from the same general
location (http://issues.apache.org/jira/browse/NUTCH-170), but his
reported stack had one additional entry, between the MapTask$2.next
and SequenceFileRecordReader.next calls:

org.apache.nutch.segment.SegmentReader$InputFormat$1.next(SegmentReader.java:80)

Seems like there might be a bug hiding in this area of the code. I'm
going to wrap some extra debugging around it to get more info when an
error does occur.
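
As a rough illustration of the kind of extra debugging meant here, the sketch below guards a CRC32.update call so that a bad offset or length is reported together with its context. This is not the Nutch code: the names ChecksumDebug, checkedUpdate and streamPos are hypothetical, and the idea that a negative length (for example a -1 end-of-stream return value) reaches the checksum call is only an assumption worth ruling out, not something the trace confirms.

```java
import java.util.zip.CRC32;

// Hypothetical debugging helper; not part of Nutch.
class ChecksumDebug {

    /**
     * Updates the checksum, but first validates the range and, if it is
     * invalid, fails with a message that records every value involved.
     */
    static void checkedUpdate(CRC32 sum, byte[] buf, int off, int len, long streamPos) {
        if (off < 0 || len < 0 || off > buf.length - len) {
            throw new IllegalStateException(
                "bad checksum range: off=" + off + " len=" + len
                + " buf.length=" + buf.length + " streamPos=" + streamPos);
        }
        sum.update(buf, off, len);
    }

    public static void main(String[] args) {
        CRC32 sum = new CRC32();
        byte[] buf = new byte[16];

        // A valid range is passed straight through to CRC32.update.
        checkedUpdate(sum, buf, 0, buf.length, 0L);

        // Assumed failure mode: a length of -1 passed on unchecked.
        try {
            checkedUpdate(sum, buf, 0, -1, 16L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Logging the offset, length, buffer size and stream position in one message should make it easier to tell a short read from a corrupted buffer when the error shows up in the Diagnostic Text column.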

Thanks,

-- Ken

Re: ArrayIndexOutOfBoundsException during invert link phase

Bryan A. P. Pendleton
Just to chime in, I've also seen this bug, but only once so far.

On 2/4/06, Ken Krugler <[hidden email]> wrote:

>
> Hi all,
>
> Has anybody else seen the java.lang.ArrayIndexOutOfBoundsException
> error displayed in Diagnostic Text column of the jobdetail.jsp page
> when running 0.8?
>
> This occasionally seems to happen during the invert links phase. The
> stack crawl looks like:
>
> java.lang.ArrayIndexOutOfBoundsException
>   at java.util.zip.CRC32.update(CRC32.java:43)
>   at org.apache.nutch.fs.NFSDataInputStream$Checker.read(NFSDataInputStream.java:92)
>   at org.apache.nutch.fs.NFSDataInputStream$PositionCache.read(NFSDataInputStream.java:156)
>   at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
>   at java.io.DataInputStream.readFully(DataInputStream.java:176)
>   at org.apache.nutch.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:55)
>   at org.apache.nutch.io.DataOutputBuffer.write(DataOutputBuffer.java:89)
>   at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:378)
>   at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:301)
>   at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:323)
>   at org.apache.nutch.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:60)
>   at org.apache.nutch.mapred.MapTask$2.next(MapTask.java:106)
>   at org.apache.nutch.mapred.MapRunner.run(MapRunner.java:48)
>   at org.apache.nutch.mapred.MapTask.run(MapTask.java:116)
>   at org.apache.nutch.mapred.TaskTracker$Child.main(TaskTracker.java:603)
>
> For our most recent trial, I see this 15 times out of 4840 map
> attempts (along with 25 socket timeout errors, thus 4800 actual maps
> completed).
>
> I see that Rod Taylor reported an error from the same general
> location (http://issues.apache.org/jira/browse/NUTCH-170), but his
> reported stack had one additional entry:
>
> org.apache.nutch.segment.SegmentReader$InputFormat$1.next(SegmentReader.java:80)
>
> Between the MapTask$2.next and the SequenceFileRecordReader.next calls.
>
> Seems like there might be a bug hiding in this area of the code. I'm
> going to wrap some extra debugging around it to get more info when an
> error does occur.
>
> Thanks,
>
> -- Ken
> --
>



--
Bryan A. Pendleton
Ph: (877) geek-1-bp