indexing or search problem?

indexing or search problem?

Rocio Chongtay
Hi,

How can I check whether my indexing has gone well if, so far, I cannot search?

I have followed the step-by-step guide all the way through indexing and setting up the GUI in Tomcat.

In my indexes/part-00000 folder I can see files like:

_28k.f0  _28k.f2  _28k.f4  _28k.fdt  _28k.fnm  _28k.prx  _28k.tis   index.done
_28k.f1  _28k.f3  _28k.f5  _28k.fdx  _28k.frq  _28k.tii  deletable  segments

Is that the folder that should go as the value of searcher.dir
in the conf/nutch-site.xml file?

I am able to see the indexed documents using Luke, but if I
search with the GUI or from the command line (see below) I get zero hits:

 bin/nutch org.apache.nutch.searcher.NutchBean home
Total hits: 0

I would appreciate any help.

Rocio Chongtay

 
Re: indexing or search problem?

Marko Bauhardt-2

On 04.08.2006 at 12:33, Rocio Chongtay wrote:

> Hi,
>

Hi

> How can I check whether my indexing has gone well if, so far, I cannot
> search?
>
> I have followed the step by step guide all the way to indexing and  
> setting the GUI in tomcat.
>
> in my indexes/part-00000 folder I can see files like:
>
> _28k.f0  _28k.f2  _28k.f4  _28k.fdt  _28k.fnm  _28k.prx  _28k.tis   index.done
> _28k.f1  _28k.f3  _28k.f5  _28k.fdx  _28k.frq  _28k.tii  deletable  segments
>
> is that the folder that should go as a value in the searcher.dir
> in the conf/nutch-site.xml file?

searcher.dir has to be set to an absolute path, and it should point to the crawl directory itself, e.g. /home/rocio/crawl
The crawl directory should contain:
segments
indexes
crawldb
linkdb

This should work.
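
A quick way to check that layout is sketched below. It assumes a POSIX shell; the function name check_crawl_dir is made up for illustration and is not part of Nutch. It only looks for the four subdirectories listed above and does not inspect the index itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): verify that a crawl directory
# contains the four subdirectories listed above.
check_crawl_dir() {
    crawl_dir=$1
    missing=0
    for sub in segments indexes crawldb linkdb; do
        if [ ! -d "$crawl_dir/$sub" ]; then
            # Report each expected subdirectory that is absent.
            echo "missing: $crawl_dir/$sub"
            missing=1
        fi
    done
    # Exit status 0 means all four subdirectories were found.
    return "$missing"
}
```

For example, check_crawl_dir /home/rocio/crawl && echo "layout looks complete" prints the message only when all four subdirectories exist.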

Marko

Re: indexing or search problem?

Rocio Chongtay
Hi Marko,

thanks so much for your help,

IT WORKED!! (:

I just moved the indexes directory into the crawl directory. With the command given in the tutorial, indexes gets created outside the crawl directory:

> bin/nutch index indexes crawl/linkdb crawl/segments/*

So it might help if the indexing part of the tutorial were changed to:

bin/nutch index crawl/indexes crawl/linkdb crawl/segments/*

And in the Search section, it would help to add that the absolute path has to be set in conf/nutch-site.xml, for example:

<property>
   <name>searcher.dir</name>
   <value>/home/rocio/crawl</value>
</property>

This would save a lot of time for an absolute Nutch beginner like me. Thanks again, now I can continue exploring the great possibilities of Nutch.
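
For anyone who already ran the tutorial command and ended up with indexes outside the crawl directory, here is a rough sketch of the fix I did by hand. It assumes a POSIX shell; the function name relocate_indexes is made up for illustration and is not part of Nutch. It moves the directory and prints a searcher.dir property with the absolute path, which still has to be pasted into conf/nutch-site.xml manually.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): move a top-level "indexes"
# directory into the crawl directory, then print a searcher.dir property
# that uses the crawl directory's absolute path.
relocate_indexes() {
    indexes_dir=$1
    crawl_dir=$2
    # Only move when the index sits outside the crawl directory
    # and nothing named "indexes" exists inside it yet.
    if [ -d "$indexes_dir" ] && [ ! -e "$crawl_dir/indexes" ]; then
        mv "$indexes_dir" "$crawl_dir/indexes" || return 1
    fi
    # Resolve the absolute path of the crawl directory.
    abs_crawl=$(cd "$crawl_dir" && pwd) || return 1
    printf '<property>\n   <name>searcher.dir</name>\n   <value>%s</value>\n</property>\n' "$abs_crawl"
}
```

Run from the directory where the tutorial commands were executed, for example: relocate_indexes indexes crawl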

Rocio
