Empty "incoming anchor text"

Zhen Zhen

Hi all, sorry for having another question so soon :P.

After deploying Nutch, when I clicked on the "anchors" link under
each URL, the page came up with an empty "incoming anchor text". Is this
normal?

thanks

Zhen

Which tutorial to use for getting Nutch 9.12 up and running on a single machine?

Jp Mutch
Hello,
   
  I'm new to Nutch.
   
  I selected Nutch 9.12 dev because I need to use a
Java 1.5 local development environment.
I am able to build Nutch 9.12 successfully on Java 1.5, with very
little effort. Great packaging of distribution.
   
  [Aside: Only one problem: Kept giving typical
compiler warnings due to template class mismatches,
in core as well as many plugins.].
   
  My questions are regarding crawling and testing/searching:
Due to my local requirements, initially I just need to run all of nutch
on a single machine in its local filesystem, without really needing
Hadoop or DFS [I don't mind if they are running "under the hood"].
  Later on if the initial study is successful, I will of course
  switch to the full blown Nutch with Hadoop+DFS+Distributed Search.
   
  (Q1) What tutorial do I need to follow to get Nutch 9.12
to crawl and index on a single machine?
(a) The Nutch 0.8 tutorial
http://lucene.apache.org/nutch/tutorial8.html ?
OR
(c) The new Hadoop tutorial
http://wiki.apache.org/nutch/NutchHadoopTutorial ?
   
  (Q2) Can I run [crawl+search] Nutch 9.12 or later on a single Windows XP
  machine with Cygwin+Tomcat 5.5?
   
  Appreciate any help.
  Thanks a lot!
   
  -jp


 

Re: Which tutorial to use for getting Nutch 9.12 up and running on a single machine?

peter decrem
The Nutch version 0.8 tutorial has a section on this and it is pretty straightforward.  Remember to change the nutch-site.xml file and fill in your username.

I have had mixed results with Cygwin and Nutch (so make backups etc.).

Cheers
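
For readers following along, here is a minimal sketch of the nutch-site.xml change mentioned above. It assumes that "fill in your username" refers to the crawler agent properties such as http.agent.name, which the post itself does not name; the value "my-test-crawler" and the helper render_site_config are hypothetical, made up for illustration. The script only prints a Hadoop-style configuration document, it does not touch any Nutch install.

```python
import xml.etree.ElementTree as ET


def render_site_config(properties):
    """Render a Hadoop-style <configuration> document from a dict.

    Hypothetical helper for illustration; not part of Nutch.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumption: http.agent.name is the property the post means.
# "my-test-crawler" is a placeholder; substitute your own identifier.
site_xml = render_site_config({"http.agent.name": "my-test-crawler"})
print(site_xml)
```

The printed text would go into conf/nutch-site.xml; check the tutorial for the full list of properties it asks you to set.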



-----Original Message-----
From: Jp Mutch <[hidden email]>
Date: Mon, 18 Sep 2006 10:48:47
To:[hidden email]
Subject: Which tutorial to use for getting Nutch 9.12 up and running on a single machine?


Re: Which tutorial to use for getting Nutch 9.12 up and running on a single machine?

Richard Braman-2
In reply to this post by Jp Mutch
Jp Mutch wrote:
>    
>   My questions are regarding crawling and testing/searching:
> Due to my local requirements, initially I just need to run all of nutch
> on a single machine in its local filesystem, without really needing
> Hadoop or DFS [I don't mind if they are running "under the hood"].
>   Later on if the initial study is successful, I will of course
>   switch to the full blown Nutch with Hadoop+DFS+Distributed Search.
>    
>  
Hadoop is run no matter what.  It's no big deal unless there is a Hadoop
bug; several have come along but have been fixed.
Hadoop needs a tmp directory to execute jobs in the distributed
fashion.  I usually point mine to C:\tmp.  Hadoop will also create some
directories related to its filesystem.  The main directories you will
work with will be your crawl directory and its subfolders crawldb, linkdb,
indexes, and segments.
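
As a rough sanity check of the layout described above, the following sketch reports which of those subfolders are missing from a crawl directory. It assumes "lindb" means linkdb; the helper name missing_crawl_subdirs is hypothetical, and the demo runs against a throwaway temporary directory rather than a real crawl.

```python
import os
import tempfile

# Subfolders named in the post above ("lindb" read as linkdb).
EXPECTED_SUBDIRS = ("crawldb", "linkdb", "indexes", "segments")


def missing_crawl_subdirs(crawl_dir):
    """Return the expected subfolders that do not exist under crawl_dir."""
    return [
        name
        for name in EXPECTED_SUBDIRS
        if not os.path.isdir(os.path.join(crawl_dir, name))
    ]


# Demo on a temporary directory that has only two of the four subfolders.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("crawldb", "segments"):
        os.mkdir(os.path.join(demo_dir, name))
    demo_missing = missing_crawl_subdirs(demo_dir)
    print(demo_missing)
```

Pointing missing_crawl_subdirs at your own crawl directory after a run gives a quick hint whether every step of the crawl completed.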
>   (Q1) What tutorial do I need to follow to get Nutch 9.12
> to crawl and index on a single machine?
> (a) The Nutch 0.8 tutorial
> http://lucene.apache.org/nutch/tutorial8.html ?
> OR
> (c) The new Hadoop tutorial
> http://wiki.apache.org/nutch/NutchHadoopTutorial ?
>    
>  
The 0.8 tutorial would work; there are some additional notes on Windows on the wiki.
>   (Q2) Can I run [crawl+search] Nutch 9.12 or later on a single Windows XP
>   machine with Cygwin+Tomcat 5.5?
>  
Yes

>    
>   Appreciate any help.
>   Thanks a lot!
>    
>   -jp