Error Nutchwax Search

aonewa
I am using nutchwax-0.10.0 for search, but it shows only
"Search took 0.032 seconds. Hits 0-0 (out of about 0 total matching pages): "
It does not show the title, URL, or content. When I look at the log file catalina.out, it shows this error:

2007-11-27 16:44:02,330 INFO  NutchBean - query: com
2007-11-27 16:44:02,331 INFO  NutchBean - searching for 20 raw hits
2007-11-27 16:44:02,385 ERROR [jsp] - Servlet.service() for servlet jsp threw exception
java.lang.IllegalStateException
   at java.nio.charset.CharsetEncoder.encode(libgcj.so.7rh)
   at org.apache.hadoop.io.Text.encode(Text.java:375)
   at org.apache.hadoop.io.Text.set(Text.java:165)
   at org.apache.hadoop.io.Text.<init>(Text.java:71)
   at org.archive.access.nutch.Nutchwax.generateWaxKey(Nutchwax.java:449)
   at org.archive.access.nutch.NutchwaxBean.getCollectionQualifiedHitDetails(NutchwaxBean.java:70)
   at org.archive.access.nutch.NutchwaxBean.getSummary(NutchwaxBean.java:50)
   at org.apache.jsp.search_jsp._jspService(search_jsp.java:349)
   at org.apache.jasper.runtime.HttpJspBase.service(jasper5-runtime-5.5.23.jar.so)
   at javax.servlet.http.HttpServlet.service(tomcat5-servlet-2.4-api-5.5.23.jar.so)
   at org.apache.jasper.servlet.JspServletWrapper.service(jasper5-compiler-5.5.23.jar.so)
   at org.apache.jasper.servlet.JspServlet.serviceJspFile(jasper5-compiler-5.5.23.jar.so)
   at org.apache.jasper.servlet.JspServlet.service(jasper5-compiler-5.5.23.jar.so)
   at javax.servlet.http.HttpServlet.service(tomcat5-servlet-2.4-api-5.5.23.jar.so)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(catalina-5.5.23.jar.so)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(catalina-5.5.23.jar.so)
   at org.apache.catalina.core.StandardWrapperValve.invoke(catalina-5.5.23.jar.so)
   at org.apache.catalina.core.StandardContextValve.invoke(catalina-5.5.23.jar.so)
   at org.apache.catalina.core.StandardHostValve.invoke(catalina-5.5.23.jar.so)
   at org.apache.catalina.valves.ErrorReportValve.invoke(catalina-5.5.23.jar.so)
   at org.apache.catalina.core.StandardEngineValve.invoke(catalina-5.5.23.jar.so)
   at org.apache.catalina.connector.CoyoteAdapter.service(catalina-5.5.23.jar.so)
   at org.apache.coyote.http11.Http11Processor.process(tomcat-http-5.5.23.jar.so)
   at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(tomcat-http-5.5.23.jar.so)
   at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(tomcat-util-5.5.23.jar.so)
   at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(tomcat-util-5.5.23.jar.so)
   at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(tomcat-util-5.5.23.jar.so)
   at java.lang.Thread.run(libgcj.so.7rh)

Why does this happen, and how can I solve this problem?

Re: Error Nutchwax Search

jibjoice
I have the same problem. Please help me.

Re: Error Nutchwax Search

stack-3
Try SUN's JDK.  You are using the default gcj java on your, I presume,
red hat 7 linux install.  It looks like it might have encoding issues.

St.Ack

P.S. IIRC, this question has been answered already on this list.  Also,
nutchwax has its own list that would be more appropriate to questions of
this sort.  See
http://archive-access.sourceforge.net/projects/nutch/mail-lists.html


Re: Error Nutchwax Search

aonewa
Hadoop is using gcj Java, but St.Ack said to try Sun's JDK. Does that mean I have to modify code in Hadoop, yes or no?


Re: Error Nutchwax Search

Stefan Groschupf
Just install the Sun JDK on your machine and update the $JAVA_HOME
environment variable.
That should be all you need to do.
No hadoop modification necessary.
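
To confirm which JVM a given process really runs on, a small check like the one below may help. It is only a sketch: the class name WhichJvm and the helper looksLikeGcj are made up for illustration, and the "gcj"/"GNU" string match is a heuristic assumption, not a documented way to detect GCJ. The system properties it reads are the standard ones.

```java
import java.util.Locale;

public class WhichJvm {
    // Heuristic only (an assumption): a GCJ runtime is expected to mention
    // "gcj" or "GNU" in its VM name or vendor string.
    static boolean looksLikeGcj(String vmName, String vendor) {
        String s = (String.valueOf(vmName) + " " + String.valueOf(vendor))
                .toLowerCase(Locale.ROOT);
        return s.contains("gcj") || s.contains("gnu");
    }

    public static void main(String[] args) {
        String vmName = System.getProperty("java.vm.name");
        String vendor = System.getProperty("java.vendor");

        // Standard system properties that identify the running JVM.
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.vendor  = " + vendor);
        System.out.println("java.vm.name = " + vmName);

        if (looksLikeGcj(vmName, vendor)) {
            System.out.println("This looks like GCJ, not the Sun JDK.");
        }
    }
}
```

Running this from a shell only reports the JVM that shell picks up. The servlet container may be started with a different one, so printing the same properties from inside the web application (for example from a JSP) is the more reliable check for the search side.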

On Dec 11, 2007, at 11:54 PM, aonewa wrote:

>
> hadoop use gcj java but St.Ack said to try SUN's JDK that means  
> modify code
> in hadoop, yes or no?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101tec Inc.
Menlo Park, California, USA
http://www.101tec.com



Re: Error Nutchwax Search

aonewa
My machine now has Sun Java jdk1.6.0_01 and $JAVA_HOME is already set. I search through Tomcat, which I installed with the command "yum install tomcat"; it uses a JVM that does not match the one used at index time. I want to know how to configure it.


Re: Error Nutchwax Search

Ted Dunning-3
In reply to this post by aonewa

Hadoop *normally* uses the Sun JDK.  Using gcj successfully would be a bit
of a surprise.


On 12/11/07 11:54 PM, "aonewa" <[hidden email]> wrote:

>
> hadoop use gcj java but St.Ack said to try SUN's JDK that means modify code
> in hadoop, yes or no?

Re: Error Nutchwax Search

Andrzej Białecki-2
Ted Dunning wrote:
> Hadoop *normally* uses the Sun JDK.  Using gcj successfully would be a bit
> of a surprise.

GCJ 4.2 does NOT work. With minor tweaks it's possible to compile all
Hadoop classes, including contrib, but it doesn't run properly. The
offending class is org.apache.hadoop.io.Text (CharsetEncoder works
differently from the Sun implementation, perhaps it's broken). This
class (Text) is widely used throughout Hadoop, so it won't work with GCJ
for now ...
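
The trace at the top of the thread fails in java.nio.charset.CharsetEncoder.encode, called from Text.encode. A rough way to probe a JVM for this is sketched below. It is not Hadoop's code: the class EncoderProbe and its method are made up for illustration, and it assumes that reusing one encoder across calls is what matters, which I have not confirmed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

public class EncoderProbe {
    // A single encoder reused for every call (assumption: reuse is the
    // interesting case). Not thread-safe; this probe is single-threaded.
    private static final CharsetEncoder ENCODER = Charset.forName("UTF-8")
            .newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);

    // Encodes s to UTF-8 and returns the number of bytes produced.
    // An IllegalStateException from the encoder propagates unchanged.
    static int encodedLength(String s) {
        try {
            ByteBuffer bytes = ENCODER.encode(CharBuffer.wrap(s));
            return bytes.remaining();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not encodable: " + e);
        }
    }

    public static void main(String[] args) {
        // "com" is the query from the log; the other samples are arbitrary.
        String[] samples = {"com", "http://example.org/", "Bia\u0142ecki"};
        for (int i = 0; i < samples.length; i++) {
            try {
                System.out.println("sample " + i + ": "
                        + encodedLength(samples[i]) + " bytes");
            } catch (IllegalStateException e) {
                System.out.println("sample " + i + ": encoder threw " + e);
            }
        }
    }
}
```

If every sample prints a byte count on one JVM and an IllegalStateException shows up on another, that would point at the encoder implementation rather than at the index or the query.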

--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: Error Nutchwax Search

Ted Dunning-3

I guess it would be even more of a surprise, then.

:-)


On 12/12/07 1:36 PM, "Andrzej Bialecki" <[hidden email]> wrote:

>> Using gcj successfully would be a bit of a surprise.
>
> GCJ 4.2 does NOT work.


Re: Error Nutchwax Search

Owen O'Malley-4
In reply to this post by Andrzej Białecki-2

On Dec 12, 2007, at 1:36 PM, Andrzej Bialecki wrote:

> Ted Dunning wrote:
>> Hadoop *normally* uses the Sun JDK.  Using gcj successfully would be
>> a bit of a surprise.
>
> GCJ 4.2 does NOT work. With minor tweaks it's possible to compile all
> Hadoop classes, including contrib, but it doesn't run properly. The
> offending class is org.apache.hadoop.io.Text (CharsetEncoder works
> differently from the Sun implementation, perhaps it's broken). This
> class (Text) is widely used throughout Hadoop, so it won't work with
> GCJ for now ...

If anyone knows of specific problems or workarounds, it would be
great to share them. I thought that gcj was still missing a lot of
the java 1.5 libraries...

Actually, the piece I'd love to see working under gcj is the hdfs
client. I bet gcj would perform better than using jni in libhdfs.

-- Owen

Re: Error Nutchwax Search

Andrzej Białecki-2
Owen O'Malley wrote:

> If anyone knows of specific problems or workarounds, it would be great
> to share them. I thought that gcj was still missing a lot of the
> java 1.5 libraries...

AFAIK few GUI applications run successfully - AWT / Swing support is
still shaky, but most other APIs are in good shape.

>
> Actually, the piece I'd love to see working under gcj is the hdfs
> client. I bet gcj would perform better than using jni in libhdfs.

I tried to find some info about this bug in GCJ 4.3; perhaps it's fixed
(I don't know what input caused this error, but there was some work done
on CharsetEncoder since the 4.2 release).

