JVM error while parsing

JVM error while parsing

uygaryuzsuren
Hi everyone,

I am using Hadoop-0.2.0 and Nutch-0.8, and at the moment I am trying to
complete a depth-1 crawl using DFS and MapReduce. However, after the
fetch step, I get the JVM crash below on one or more task trackers during
the parsing step. It makes no difference whether I use only the default
parsers or also the additional ones (PDF, Excel, etc.). My task trackers
run on AMD X2 64-bit machines, and my JVM version is 1.5.0_06.

Has anyone faced such a problem at the parse stage? Or how do you think
I can track down the cause of this JVM crash? The error report is:

060530 144113 task_0007_m_000010_0  Using Signature impl: org.apache.nutch.crawl.MD5Signature
060530 144113 task_0007_m_000010_0 5.0391704E-6%/crawl/segments/20060521171305/content/part-00004/data:0+12303612
060530 144114 task_0007_m_000010_0  Using URL normalizer: org.apache.nutch.net.BasicUrlNormalizer
060530 144114 task_0007_m_000007_0 0.084114%/crawl/segments/20060521171305/content/part-00011/data:0+12493176
060530 144115 task_0007_m_000007_0 0.09551566%/crawl/segments/20060521171305/content/part-00011/data:0+12493176
060530 144115 task_0007_m_000007_0 #
060530 144115 task_0007_m_000007_0 # An unexpected error has been detected by HotSpot Virtual Machine:
060530 144115 task_0007_m_000007_0 #
060530 144115 task_0007_m_000007_0 #  SIGSEGV (0xb) at pc=0x0000003d1d247c10, pid=25093, tid=182894086496
060530 144115 task_0007_m_000007_0 #
060530 144115 task_0007_m_000007_0 # Java VM: Java HotSpot(TM) 64-Bit Server VM (1.5.0_06-b05 mixed mode)
060530 144115 task_0007_m_000007_0 # Problematic frame:
060530 144115 task_0007_m_000007_0 # C  [libc.so.6+0x47c10] printf_size+0x740
060530 144115 task_0007_m_000007_0 #
060530 144115 task_0007_m_000007_0 # An error report file with more information is saved as hs_err_pid25093.log
060530 144115 task_0007_m_000007_0 #
060530 144115 task_0007_m_000007_0 # If you would like to submit a bug report, please visit:
060530 144115 task_0007_m_000007_0 # http://java.sun.com/webapps/bugreport/crash.jsp
060530 144115 task_0007_m_000007_0 #
060530 144115 Server connection on port 51950 from 192.168.15.61: exiting
060530 144115 task_0007_m_000007_0 Child Error
java.io.IOException: Task process exit with nonzero status of 134.
        at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:242)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:145)


Thank you very much.
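
As a starting point for narrowing down the cause, here is a minimal sketch that pulls the key facts out of crash lines like the ones in the log above: the signal, the library named in the "Problematic frame" line, and the meaning of the child's exit status. The class CrashSummary and its method names are hypothetical and not part of Hadoop or Nutch. The exit-status decoding assumes the usual Unix convention of 128 + signal number, with Linux signal numbers; under that assumption, 134 would be signal 6 (SIGABRT), which is consistent with the JVM aborting itself after writing hs_err_pid25093.log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop or Nutch.
class CrashSummary {
    // HotSpot signal line: "SIGSEGV (0xb) at pc=0x..., pid=25093, tid=..."
    private static final Pattern SIGNAL =
        Pattern.compile("(SIG[A-Z]+) \\(0x[0-9a-f]+\\) at pc=(0x[0-9a-f]+), pid=(\\d+)");

    // HotSpot frame line: "C  [libc.so.6+0x47c10] printf_size+0x740"
    // The leading letter is the frame type; "C" marks a native frame.
    private static final Pattern FRAME =
        Pattern.compile("([CVJjv])\\s+\\[(\\S+?)\\+(0x[0-9a-f]+)\\]");

    /** Returns e.g. "SIGSEGV in pid 25093", or null if no signal is found in the line. */
    static String signalOf(String line) {
        Matcher m = SIGNAL.matcher(line);
        return m.find() ? m.group(1) + " in pid " + m.group(3) : null;
    }

    /** Returns e.g. "libc.so.6 (native frame)", or null if no frame is found in the line. */
    static String frameOf(String line) {
        Matcher m = FRAME.matcher(line);
        if (!m.find()) {
            return null;
        }
        String kind = m.group(1).equals("C") ? "native frame" : "JVM or Java frame";
        return m.group(2) + " (" + kind + ")";
    }

    /** Decodes an exit status, assuming the 128 + signal convention and Linux signal numbers. */
    static String describeExit(int status) {
        if (status > 128 && status <= 128 + 64) {
            int sig = status - 128;
            String name = sig == 6 ? " (SIGABRT)" : sig == 11 ? " (SIGSEGV)" : "";
            return "killed by signal " + sig + name;
        }
        return "exited with status " + status;
    }

    public static void main(String[] args) {
        String[] log = {
            "060530 144115 task_0007_m_000007_0 #  SIGSEGV (0xb) at pc=0x0000003d1d247c10, pid=25093, tid=182894086496",
            "060530 144115 task_0007_m_000007_0 # C  [libc.so.6+0x47c10] printf_size+0x740"
        };
        for (String line : log) {
            String signal = signalOf(line);
            String frame = frameOf(line);
            if (signal != null) System.out.println(signal);
            if (frame != null) System.out.println(frame);
        }
        System.out.println(describeExit(134));
    }
}
```

A native frame in libc.so.6 suggests the fault happened in C code called by the JVM or by a native library rather than in Java bytecode, so the full stack in hs_err_pid25093.log is probably the next thing to read.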

Re: JVM error while parsing

Stefan Groschupf-2
Hi,
I heard there is a bug in JVM 1.5_06 beta. Can you try an older version,
or maybe a 1.4 JVM, and report whether this happens with another JVM as well?
Thanks,
Stefan
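
When comparing JVMs, it may help to confirm which one the task processes actually run, since it may not be the one on your shell's PATH. A minimal sketch, assuming you can run a small class with the same java binary the task tracker uses; the class name JvmInfo is hypothetical, while the system property names are the standard ones:

```java
// Hypothetical helper for reporting the running JVM; not part of Hadoop or Nutch.
class JvmInfo {
    /** Builds a one-line description of the JVM from standard system properties. */
    static String describe() {
        return System.getProperty("java.vm.name")
            + " " + System.getProperty("java.vm.version")
            + ", java.version=" + System.getProperty("java.version")
            + ", os.arch=" + System.getProperty("os.arch");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Including this line in a follow-up report makes it easier to tell whether the crash is tied to one JVM build.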

On 30.05.2006 at 14:14, Uygar Yüzsüren wrote:

> Hi everyone,
>
> I am using Hadoop-0.2.0 and Nutch-0.8, and at the moment trying to  
> complete
> a 1-depth-crawl
> by using DFS and mapreduce structures. However, after a fetch step, I
> encounter the below JVM exception
> at one or more task trackers at the parsing step. It does not  
> differ whether
> I use only the default parsers,
> or I also use the additional ones (pdf excel etc.). My task  
> trackers work on
> AMD X2 64-bit machines
> and my JVM version is 1.5_06.
>
> Have you ever faced with such a problem at the parse stage?Or how  
> do you
> think I can spot the cause of
> this JVM exception?The error report is :
>
> 060530 144113 task_0007_m_000010_0  Using Signature impl:
> org.apache.nutch.crawl.MD5Signature
> 060530 144113 task_0007_m_000010_0
> 5.0391704E-6%/crawl/segments/20060521171305/content/part-00004/data:
> 0+12303612
> 060530 144114 task_0007_m_000010_0  Using URL normalizer:
> org.apache.nutch.net.BasicUrlNormalizer
> 060530 144114 task_0007_m_000007_0
> 0.084114%/crawl/segments/20060521171305/content/part-00011/data:0
> +12493176
> 060530 144115 task_0007_m_000007_0
> 0.09551566%/crawl/segments/20060521171305/content/part-00011/data:0
> +12493176
> 060530 144115 task_0007_m_000007_0 #
> 060530 144115 task_0007_m_000007_0 # An unexpected error has been  
> detected
> by HotSpot Virtual Machine:
> 060530 144115 task_0007_m_000007_0 #
> 060530 144115 task_0007_m_000007_0 #  SIGSEGV (0xb) at
> pc=0x0000003d1d247c10, pid=25093, tid=182894086496
> 060530 144115 task_0007_m_000007_0 #
> 060530 144115 task_0007_m_000007_0 # Java VM: Java HotSpot(TM) 64-
> Bit Server
> VM (1.5.0_06-b05 mixed mode)
> 060530 144115 task_0007_m_000007_0 # Problematic frame:
> 060530 144115 task_0007_m_000007_0 # C  [libc.so.6+0x47c10]
> printf_size+0x740
> 060530 144115 task_0007_m_000007_0 #
> 060530 144115 task_0007_m_000007_0 # An error report file with more
> information is saved as hs_err_pid25093.log
> 060530 144115 task_0007_m_000007_0 #
> 060530 144115 task_0007_m_000007_0 # If you would like to submit a bug
> report, please visit:
> 060530 144115 task_0007_m_000007_0 #
> http://java.sun.com/webapps/bugreport/crash.jsp
> 060530 144115 task_0007_m_000007_0 #
> 060530 144115 Server connection on port 51950 from 192.168.15.61:  
> exiting
> 060530 144115 task_0007_m_000007_0 Child Error
> java.io.IOException: Task process exit with nonzero status of 134.
>        at org.apache.hadoop.mapred.TaskRunner.runChild
> (TaskRunner.java:242)
>        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:145)
>
>
> Thank you very much.