Nutch 2.2.1 pseudo dist, errors

BlackIce
Hi,

This is my first attempt at running Nutch in pseudo-distributed mode. When I
try to run any nutch command from the /runtime/deploy folder, I get the
following error:

hduser@bl4ck1c3:/usr/local/nutch2/runtime/deploy$ bin/nutch inject urls
Warning: $HADOOP_HOME is deprecated.

14/03/18 16:19:33 INFO crawl.InjectorJob: InjectorJob: starting at 2014-03-18 16:19:33
14/03/18 16:19:33 INFO crawl.InjectorJob: InjectorJob: Injecting urlDir: urls
Exception in thread "main" java.lang.NoSuchMethodError: org.slf4j.spi.LocationAwareLogger.log(Lorg/slf4j/Marker;Ljava/lang/String;ILjava/lang/String;[Ljava/lang/Object;Ljava/lang/Throwable;)V
    at org.apache.commons.logging.impl.SLF4JLocationAwareLog.debug(SLF4JLocationAwareLog.java:133)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:139)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:206)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:185)
    at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:237)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:522)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:505)
    at org.apache.hadoop.mapreduce.JobContext.<init>(JobContext.java:80)
    at org.apache.hadoop.mapreduce.Job.<init>(Job.java:100)
    at org.apache.hadoop.mapreduce.Job.<init>(Job.java:104)
    at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:37)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
    at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

Re: Nutch 2.2.1 pseudo dist, errors

lewis john mcgibbney
Hi BlackIce,

On Wed, Mar 19, 2014 at 3:07 PM, <[hidden email]> wrote:

>
> Hi,
>
> This is my first attempt at running Nutch in pseudo-distributed mode. When I
> try to run any nutch command from the /runtime/deploy folder, I get the
> following error:
>

Which version of Hadoop?
Check the classpath for the offending libraries.
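
A minimal sketch of such a classpath check, assuming a Hadoop 1.x layout
where the bundled jars sit under $HADOOP_HOME/lib and the Nutch job archive
lives in runtime/deploy. The default Hadoop path and the helper name
list_slf4j_jars are illustrative assumptions, not something stated in this
thread. A NoSuchMethodError inside org.slf4j often means that two
incompatible SLF4J versions are visible at the same time, so the sketch
simply lists every SLF4J jar it can find in both places:

```shell
#!/bin/sh
# Hypothetical helper: list every SLF4J-related jar below a directory.
list_slf4j_jars() {
    find "$1" -type f -name '*slf4j*.jar' 2>/dev/null | sort
}

# Assumed locations; adjust both to your installation.
HADOOP_DIR="${HADOOP_HOME:-/usr/local/hadoop}"
NUTCH_DEPLOY="${NUTCH_DEPLOY:-/usr/local/nutch2/runtime/deploy}"

echo "SLF4J jars shipped with Hadoop:"
list_slf4j_jars "$HADOOP_DIR"

echo "SLF4J entries packed into the Nutch job archive(s):"
for job in "$NUTCH_DEPLOY"/*.job; do
    # Skip the unexpanded glob when no .job file exists.
    [ -f "$job" ] || continue
    # A .job file is a zip archive, so unzip can list its contents.
    unzip -l "$job" | grep -i 'slf4j' || true
done
```

If the two listings show different SLF4J versions, that mismatch is a
plausible candidate for the error above.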


Re: Nutch 2.2.1 pseudo dist, errors

BlackIce
Thanks Lewis, it is Hadoop 1.2.1.

I did that, and it now runs up to the Solr dedup job, which fails with:

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
        at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:340)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
        ... 8 more

Did I miss a lib?

Thanks

Re: Nutch 2.2.1 pseudo dist, errors

Talat Uyarer
Hi,

Could you set your plugins directory as an absolute path in
nutch-site.xml?

<property>
  <name>plugin.folders</name>
  <value>plugins</value>
  <description>Directories where nutch plugins are located.  Each
  element may be a relative or absolute path.  If absolute, it is used
  as is.  If relative, it is searched for on the classpath.</description>
</property>
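
A rough sketch for checking what the property is currently set to, assuming
the config file is conf/nutch-site.xml and that <value> sits on the line
directly after <name>, as in the snippet above. The helper names
plugin_folders_value and is_absolute are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the plugin.folders value from a Nutch config file.
# Assumes <value> is on the line right after the matching <name> line.
plugin_folders_value() {
    sed -n '/<name>plugin\.folders<\/name>/{
n
s/.*<value>\(.*\)<\/value>.*/\1/p
}' "$1"
}

# Hypothetical helper: succeed only if the given path starts with a slash.
is_absolute() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Assumed default location of the config file; pass another path as $1.
CONF="${1:-conf/nutch-site.xml}"
if [ -f "$CONF" ]; then
    value=$(plugin_folders_value "$CONF")
    if is_absolute "$value"; then
        echo "plugin.folders is absolute: $value"
    else
        echo "plugin.folders is relative (or unset): '$value'"
    fi
fi
```

With the default shown above, the script would report the value "plugins"
as relative.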


--
Talat UYARER
Websitesi: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: Nutch 2.2.1 pseudo dist, errors

BlackIce
Do you mean the one located in /nutch/runtime/local?

Re: Nutch 2.2.1 pseudo dist, errors

BlackIce
I triple-checked the jars in the lib dirs and set the plug-in dir to an
absolute path.

This is the output I get in the terminal window:

14/03/21 17:53:07 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: starting...
14/03/21 17:53:07 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
14/03/21 17:53:07 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/03/21 17:53:08 INFO mapred.JobClient: Running job: job_201403211736_0018
14/03/21 17:53:09 INFO mapred.JobClient:  map 0% reduce 0%
14/03/21 17:53:14 INFO mapred.JobClient: Task Id : attempt_201403211736_0018_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
    at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:340)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
    ... 8 more

14/03/21 17:53:14 INFO mapred.JobClient: Task Id : attempt_201403211736_0018_m_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
    at org.apache.hadoop.mapreduce.JobContext.getInputFormatClass(JobContext.java:187)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:722)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:340)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
    ... 8 more


Next step: downgrade to Java 6? (I'm on 8.)

