No results when searching via the web


Ricardo Ramirez-4
Hello,

I've set up a nutch crawler on a Debian Etch server (nutch-0.9, tomcat-5.5, sun-java-1.5) and I can't get results to show up on the web interface. I deployed the nutch war file to tomcat and modified nutch-site.xml to point to my crawl directory. Running the search from the command line yields results (e.g., running bin/nutch org.apache.nutch.searcher.NutchBean <query>). I've tried symlinking the crawl directory into /var/lib/tomcat5.5/webapps (where the nutch webapp is located), but that didn't work. What do I need to do to get results?

Thanks,

Ricardo Ramirez

Re: No results when searching via the web

Jason Boss
Hi,

Are you starting tomcat in your main nutch directory?

Jason


On Fri, Jun 20, 2008 at 3:02 PM, Ricardo Ramirez <[hidden email]> wrote:
> Hello,
>
> I've set up a nutch crawler on a Debian Etch server (nutch-0.9, tomcat-5.5, sun-java-1.5) and I can't get results to show up on the web interface. I deployed the nutch war file to tomcat and modified nutch-site.xml to point to my crawl directory. Running the search from the command line yields results (e.g., running bin/nutch org.apache.nutch.searcher.NutchBean <query>). I've tried symlinking the crawl directory into /var/lib/tomcat5.5/webapps (where the nutch webapp is located), but that didn't work. What do I need to do to get results?
>
> Thanks,
>
> Ricardo Ramirez

RE: No results when searching via the web

Howie Wang

Have you checked whether the nutch-site.xml under the Tomcat webapps
directory has the search.dir setting pointing to your crawl directory?

Howie
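
A minimal sketch of the check Howie describes, written in Python purely for illustration. It assumes nutch-site.xml uses the usual <configuration>/<property>/<name>/<value> layout. The post calls the setting search.dir; in Nutch 0.9 it is, as far as I know, spelled searcher.dir, so the name is a parameter here and should be adjusted to whatever your file actually contains. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_property(root, wanted):
    """Return the stripped <value> of the property named `wanted`, or None."""
    for prop in root.iter("property"):
        name = prop.findtext("name", default="").strip()
        if name == wanted:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def read_property(config_path, wanted="searcher.dir"):
    """Parse one nutch-site.xml file and look up a single property."""
    return find_property(ET.parse(config_path).getroot(), wanted)
```

Calling read_property() once on the copy under the Nutch install directory and once on the copy unpacked inside the Tomcat webapp (likely under the webapp's WEB-INF/classes directory, which is an assumption about the war layout) shows whether both point at the same crawl directory. A result of None means the property is not set in that file, in which case Nutch falls back to its default.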


> Date: Fri, 20 Jun 2008 20:00:51 -0700
> From: [hidden email]
> To: [hidden email]
> Subject: Re: No results when searching via the web
>
> Hi,
>
> Are you starting tomcat in your main nutch directory?
>
> Jason

Re: No results when searching via the web

jthompson-2
Ricardo,

I'm running on Ubuntu and I have the same problem.  Are you using the
/etc/init.d/tomcat5.5 script to start and stop your server?  When I do that,
my search can't find anything.  However, when I use
/usr/share/tomcat5.5/bin/catalina.sh to start the server, it works
perfectly.  I've read through both scripts briefly and have yet to figure
out what's causing the different behavior (or if there is anything wrong
with just starting via the catalina script).  Hopefully this helps!

John

On Fri, Jun 20, 2008 at 8:18 PM, Howie Wang <[hidden email]> wrote:

>
> Have you checked whether the nutch-site.xml under the Tomcat webapps
> directory has the search.dir setting pointing to your crawl directory?
>
> Howie

Re: No results when searching via the web

Ricardo Ramirez-4
In reply to this post by Jason Boss
Hello,

Responses to several messages are inline.

Jason Boss wrote:
> Are you starting tomcat in your main nutch directory?

No, I'm starting tomcat via debian's invoke-rc.d script.

Howie Wang wrote:
> Have you checked whether the nutch-site.xml under the Tomcat webapps
> directory has the search.dir setting pointing to your crawl
> directory?

nutch-site.xml is pointed to the correct directory. If I understand
correctly, this is so I don't have to have the directory be ./crawl
relative to the current working directory of tomcat.

John Thompson wrote:
> I'm running on Ubuntu and I have the same problem.  Are you using the
>  /etc/init.d/tomcat5.5 script to start and stop your server?  When I
> do that, my search can't find anything.  However, when I use
> /usr/share/tomcat5.5/bin/catalina.sh to start the server, it works
> perfectly.  I've read through both scripts briefly and have yet to
> figure out what's causing the different behavior (or if there is
> anything wrong with just starting via the catalina script). Hopefully
> this helps!

When I use catalina.sh directly, I lose the nutch-0.9 webapp. I deployed
the WAR file via the Tomcat Manager so nutch lives at
/var/lib/tomcat5.5/webapps/nutch-0.9/

Ricardo

RE: No results when searching via the web

Howie Wang

Sorry if this isn't the right answer, but are you sure that all of
the nutch-site.xml files have the correct value? There's one
under the nutch dir, and another one will be unpacked
under the tomcat webapps dir from the war file.

If that's definitely correct, I might look for permissions issues
on the crawl directory. If tomcat runs as nobody, can it read
the crawl directory?  That sort of thing.

If that still doesn't help, turn on logging to see if it can tell
you what files are being accessed. You could also try running
tomcat through strace or attaching strace to tomcat to see
what files are being accessed, and failing.

hth,
Howie


> Date: Sat, 21 Jun 2008 21:54:31 -0400
> From: [hidden email]
> To: [hidden email]
> Subject: Re: No results when searching via the web
>
> Hello,
>
> Responses to several messages are inline.
>
> Jason Boss wrote:
> > Are you starting tomcat in your main nutch directory?
>
> No, I'm starting tomcat via debian's invoke-rc.d script.
>
> Howie Wang wrote:
> > Have you checked whether the nutch-site.xml under the Tomcat webapps
> > directory has the search.dir setting pointing to your crawl
> > directory?
>
> nutch-site.xml is pointed to the correct directory. If I understand
> correctly, this is so I don't have to have the directory be ./crawl
> relative to the current working directory of tomcat.
>
> John Thompson wrote:
> > I'm running on Ubuntu and I have the same problem.  Are you using the
> >  /etc/init.d/tomcat5.5 script to start and stop your server?  When I
> > do that, my search can't find anything.  However, when I use
> > /usr/share/tomcat5.5/bin/catalina.sh to start the server, it works
> > perfectly.  I've read through both scripts briefly and have yet to
> > figure out what's causing the different behavior (or if there is
> > anything wrong with just starting via the catalina script). Hopefully
> > this helps!
>
> When I use catalina.sh directly, I lose the nutch-0.9 webapp. I deployed
> the WAR file via the Tomcat Manager so nutch lives at
> /var/lib/tomcat5.5/webapps/nutch-0.9/
>
> Ricardo


Re: No results when searching via the web

Jason Boss
Post your tomcat logs so we can get some better ideas of what is wrong.

It should show that you are searching an index and give you some type
of clue as to what is going on...

Jason

On Sat, Jun 21, 2008 at 10:40 PM, Howie Wang <[hidden email]> wrote:

>
> Sorry if this isn't the right answer, but are you sure that all of
> the nutch-site.xml files have the correct value? There's one
> under the nutch dir, and another one will be unpacked
> under the tomcat webapps dir from the war file.
>
> If that's definitely correct, I might look for permissions issues
> on the crawl directory. If tomcat runs as nobody, can it read
> the crawl directory?  That sort of thing.
>
> If that still doesn't help, turn on logging to see if it can tell
> you what files are being accessed. You could also try running
> tomcat through strace or attaching strace to tomcat to see
> what files are being accessed, and failing.
>
> hth,
> Howie

Re: No results when searching via the web

Ricardo Ramirez-4
In reply to this post by Howie Wang
Howie Wang wrote:
> Sorry if this isn't the right answer, but are you sure that all of
> the nutch-site.xml files have the correct value? There's one
> under the nutch dir, and another one will be unpacked
> under the tomcat webapps dir from the war file.
>
> If that's definitely correct, I might look for permissions issues
> on the crawl directory. If tomcat runs as nobody, can it read
> the crawl directory?  That sort of thing.

[facepalm]

When I recreated the crawl directory I messed up the permissions.
Setting the files as world read and then restarting tomcat worked.

Thanks,

Ricardo
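
For anyone landing here with the same symptom, the permission problem can be spotted before restarting Tomcat. The following is a rough sketch in Python, not something posted in the thread: it walks a crawl directory and lists every entry that lacks world-read access (plus world-execute for directories, which is needed to traverse them). It only inspects the "other" permission bits, which matches the fix described above; if the Tomcat user owns the files or shares their group, those bits would need checking instead. The function name is hypothetical.

```python
import os
import stat


def world_unreadable(top):
    """Return paths at or below `top` that lack world-read access.

    Files need the other-read bit; directories need other-read and
    other-execute so they can be listed and entered.
    """
    dir_bits = stat.S_IROTH | stat.S_IXOTH
    problems = []

    def check(path, required):
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Entries that cannot be examined (e.g. broken symlinks) are reported too.
            problems.append(path)
            return
        if (mode & required) != required:
            problems.append(path)

    check(top, dir_bits)
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames:
            check(os.path.join(dirpath, name), dir_bits)
        for name in filenames:
            check(os.path.join(dirpath, name), stat.S_IROTH)
    return problems
```

An empty list from world_unreadable() on the crawl directory suggests permissions inside it are not the culprit. The sketch does not look at the parent directories above the crawl directory, which also have to be traversable by the Tomcat user.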