Normal search speeds

waterwheel
Asking again for the patience of the list: we're still working on speed.
What I need to know is whether we still have a 'problem' or whether the
following search speeds are normal for Nutch.

query: 'term life insurance'; first search 25 seconds, second search 6
seconds.
query: 'stratford bed and breakfast'; first search 8 seconds, second
search 2 seconds.
query: 'mortgage broker'; first search 6 seconds, second search 5 seconds.

Is this the type of speed you'd expect from a Nutch install?  I keep
feeling that it should be far faster than what we're seeing.

Specs: Nutch 0.7.1, merged index.  Dedicated server, 4 million pages
indexed, dual Xeon, 8 GB RAM, 3 SCSI hard drives in RAID 0.
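
For reference, the gap between first and repeat searches in the numbers above can be worked out with a few lines of Python. This is only an arithmetic sketch; the helper name repeat_speedup is made up for illustration and is not part of Nutch.

```python
# Timings in seconds, copied from the post above: (first search, second search).
timings = {
    "term life insurance": (25, 6),
    "stratford bed and breakfast": (8, 2),
    "mortgage broker": (6, 5),
}


def repeat_speedup(first_s, second_s):
    """Return how many times faster the repeat search was than the first."""
    return first_s / second_s


speedups = {query: repeat_speedup(*pair) for query, pair in timings.items()}

for query, ratio in speedups.items():
    first_s, second_s = timings[query]
    print(f"{query!r}: {first_s}s -> {second_s}s ({ratio:.1f}x faster on repeat)")
```

A large first-to-repeat gap often means the first run pays a one-time cost, such as reading index data from disk, though three data points cannot confirm that on their own.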

Re: Normal search speeds

Stefan Groschupf-2
This is very slow!
In my experience you can expect results in less than a second.
+ Check the memory settings of Tomcat.
+ You are not using NDFS, right?
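
To compare an installation against the sub-second expectation, the first and repeat runs of a query can be timed separately. The following is a minimal sketch, not a Nutch tool: time_repeated, within_budget and fake_search are hypothetical names, and fake_search only stands in for a real request to your own search front end.

```python
import time


def time_repeated(search, query, runs=3):
    """Call search(query) `runs` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        search(query)
        durations.append(time.perf_counter() - start)
    return durations


def within_budget(durations, budget_s=1.0):
    """True if every measured run finished within budget_s seconds."""
    return all(d <= budget_s for d in durations)


def fake_search(query):
    # Hypothetical stand-in for a real search request; replace it with a
    # call to your own installation before drawing any conclusions.
    time.sleep(0.01)
    return []


if __name__ == "__main__":
    runs = time_repeated(fake_search, "term life insurance")
    print([f"{d * 1000:.0f} ms" for d in runs], within_budget(runs))
```

Keeping the first run separate from the later ones matters here, because the reported numbers differ most between the first and second search.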


On 06.03.2006 at 00:23, Insurance Squared Inc. wrote:

> Asking again for the patience of the list, we're still working on  
> speed. I guess what I need to know is if we still have a 'problem'  
> or if the following search speeds are normal for nutch.
>
> query: 'term life insurance'; first search 25 seconds, second  
> search 6 seconds.
> query: 'stratford bed and breakfast'; first search 8 seconds,  
> second search 2 seconds
> query: 'mortgage broker'; first search 6 seconds, second search 5  
> seconds
>
> Is this the type of speed you'd expect from a nutch install?  I  
> keep feeling that it should be far faster than what we're seeing.
>
> Specs: nutch 0.71, merged index.  Dedicated server, 4 million pages  
> indexed, dual xeon, 8gigs RAM, 3 Scsi HD's in Raid 0.

---------------------------------------------
blog: http://www.find23.org
company: http://www.media-style.com



Re: Normal search speeds

waterwheel
That's correct, we're not using NDFS.  As far as I know it's an
out-of-the-box installation of Mandrake 2006, Tomcat, and Nutch.

Byron's suggestion of merging to one index cut search times by about 1/3
to 1/2.  I think we've already looked at the Tomcat memory settings, but
I'll ask our developer to look deeper.  I suspect that something's
cycling somewhere; it's hard for me to imagine a regular process taking
25 seconds when CPU and memory show nothing really happening.  (I also
suspect that the problem is not with Nutch, but instead with something
at the OS or Tomcat level, or with another system process that Nutch is
using.)
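
One way to tell a process that is waiting from one that is computing is to compare wall-clock time with CPU time for the same operation. The sketch below shows the idea in Python under loud assumptions: it measures only the Python process it runs in, simulated_wait is a made-up stand-in for a blocking step, and the same comparison for the Tomcat process would have to come from OS-level tools.

```python
import time


def wall_and_cpu(fn):
    """Run fn() once; return (wall_seconds, cpu_seconds) for the current process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    return wall_s, cpu_s


def mostly_waiting(wall_s, cpu_s, threshold=0.5):
    """True if less than `threshold` of the wall-clock time was spent on the CPU."""
    return wall_s > 0 and (cpu_s / wall_s) < threshold


def simulated_wait():
    # Hypothetical stand-in for a blocking step (disk read, network call, lock).
    time.sleep(0.2)


wall_s, cpu_s = wall_and_cpu(simulated_wait)
print(f"wall={wall_s:.3f}s cpu={cpu_s:.3f}s waiting={mostly_waiting(wall_s, cpu_s)}")
```

If wall-clock time is far larger than CPU time, the operation spent most of its time blocked, which would be consistent with a slow search on a machine whose CPU looks idle.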




Stefan Groschupf wrote:

> This is very slow!
> You can expect results in less than a second from my experience.
> + check memory settings of tomcat.
> + you do not use ndfs, right?

nutch 0.7.0 search performance measurement

Stefan Groschupf-2
Hi,
for people who found this interesting: I published some measurement
values that I took a long time ago.
http://www.find23.net/Web-Site/blog/A712F01B-4BB1-4FC6-AE95-E64988FBCC79.html
All time-related values are in milliseconds.
Don't take the values too seriously, but at least they give an idea.

Stefan


On 06.03.2006 at 02:03, Insurance Squared Inc. wrote:

> That's correct, we're not using ndfs.  As far as I know it's an out  
> of the box installation of Mandrake 2006, tomcat, and nutch.
>
> Byron's suggestion of merging to one index cut speeds by about 1/3  
> or 1/2.  I think we've already looked at the tomcat memory settings  
> but I'll ask our developer to look deeper.  I'm suspicious that  
> something's cycling somewhere, it's hard for me to imagine a  
> regular process taking 25 seconds when cpu and memory show nothing  
> really happening.  (I also suspect that the problem is not with  
> nutch, but instead with something at the OS or tomcat level, or  
> with another system process that nutch is using).

Re: Normal search speeds

Howie Wang
In reply to this post by waterwheel
If you want to narrow down whether it's a Tomcat issue, maybe you
could try running Nutch on another app server like Resin to see if
there's a difference. It's been a while since I used Tomcat, but I
did find the performance to be kind of slow. I think things are
supposed to be better now, but many claim that Resin is still faster.

Howie

>That's correct, we're not using ndfs.  As far as I know it's an out of the
>box installation of Mandrake 2006, tomcat, and nutch.
>
>Byron's suggestion of merging to one index cut speeds by about 1/3 or 1/2.  
>I think we've already looked at the tomcat memory settings but I'll ask our
>developer to look deeper.  I'm suspicious that something's cycling
>somewhere, it's hard for me to imagine a regular process taking 25 seconds
>when cpu and memory show nothing really happening.  (I also suspect that
>the problem is not with nutch, but instead with something at the OS or
>tomcat level, or with another system process that nutch is using).