running Nutch

running Nutch

ilango gurusamy
Hi,
 I am trying to run Nutch by following the instructions in the tutorial.
 The environment is SUSE Linux 10, JDK 1.4.2, Nutch 0.7.1, and Tomcat 5.
 
 I get the following errors:
 linux:/usr/local/nutch/nutch071 # bin/nutch crawl urls -dir crawl depth 3 -topN 50
 060307 200146 parsing file:/usr/local/nutch/nutch071/conf/nutch-default.xml
 060307 200147 parsing file:/usr/local/nutch/nutch071/conf/crawl-tool.xml
 060307 200147 parsing file:/usr/local/nutch/nutch071/conf/nutch-site.xml
 060307 200147 No FS indicated, using default:local
 Exception in thread "main" java.lang.RuntimeException: crawl already exists.
         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:121)
 linux:/usr/local/nutch/nutch071 # export JAVA_HOME
 linux:/usr/local/nutch/nutch071 # bin/nutch crawl urls -dir crawl depth 3 -topN 50
 060307 200325 parsing file:/usr/local/nutch/nutch071/conf/nutch-default.xml
 060307 200325 parsing file:/usr/local/nutch/nutch071/conf/crawl-tool.xml
 060307 200325 parsing file:/usr/local/nutch/nutch071/conf/nutch-site.xml
 060307 200325 No FS indicated, using default:local
 Exception in thread "main" java.lang.RuntimeException: crawl already exists.
         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:121)
 linux:/usr/local/nutch/nutch071 # NUTCH_JAVA_HOME=//usr/lib/jvm/java-1.4.2
 linux:/usr/local/nutch/nutch071 # NUTCH_JAVA_HOME=/usr/lib/jvm/java-1.4.2
 linux:/usr/local/nutch/nutch071 # export NUTCH_JAVA_HOME
 linux:/usr/local/nutch/nutch071 # echo $NUTCH_JAVA_HOME
 /usr/lib/jvm/java-1.4.2
 linux:/usr/local/nutch/nutch071 # bin/nutch crawl urls -dir crawl depth 3
 run java in /usr/lib/jvm/java-1.4.2
 060307 201624 parsing file:/usr/local/nutch/nutch071/conf/nutch-default.xml
 060307 201625 parsing file:/usr/local/nutch/nutch071/conf/crawl-tool.xml
 060307 201625 parsing file:/usr/local/nutch/nutch071/conf/nutch-site.xml
 060307 201625 No FS indicated, using default:local
 Exception in thread "main" java.lang.RuntimeException: crawl already exists.
         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:121)
 
 
 I set JAVA_HOME and NUTCH_JAVA_HOME to the base of my JVM installation, but I am not sure what is going on.
 
 I would really appreciate any help. Thanks a lot.
 
 ilango
 
 
 
               

Re: running Nutch

D.Saravanaraj
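Delete the crawl folder, which was probably created by the previous crawl. The crawl command fails with "crawl already exists" when the directory given to -dir is already there.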
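
As a rough sketch of that cleanup step, the small shell helper below removes a stale output directory before the crawl is started again. The function name clean_crawl_dir is hypothetical and not part of Nutch; it assumes the output directory is the one passed to -dir ("crawl" in the commands above) and that nothing in it needs to be kept.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): remove a stale crawl output
# directory so the crawl tool can create it afresh.
clean_crawl_dir() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "usage: clean_crawl_dir <dir>" >&2
        return 2
    fi
    if [ -e "$dir" ]; then
        echo "removing existing $dir"
        rm -rf -- "$dir"
    else
        echo "$dir does not exist, nothing to remove"
    fi
}

# Example, run from the Nutch install directory used in this thread:
#   cd /usr/local/nutch/nutch071
#   clean_crawl_dir crawl
# and then rerun the crawl command from the first post.
```

Checking first with ls -ld crawl shows whether the directory really is there; removing it discards whatever the earlier run wrote.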


Re: running Nutch

ilango gurusamy
Hi,
 I successfully ran Nutch. Thanks for the tip. Strangely, I remember deleting the crawl directory before, but anyway, your tip worked for me.
 
 By the way, Saravanaraj, are you from TN? What are your research interests with Nutch?
 
 ilango

"D.Saravanaraj" <[hidden email]> wrote: Delete the crawl folder which would have been created in the previous crawl.
