Search Speed

waterwheel
I appreciate your patience as we try to get past our search speed
issues. We're getting closer: it seems we are seeing huge delays when
retrieving the summaries for the search results. Below are our
logs from a search; you can see that retrieving some of the
summaries took double-digit seconds. (I've left in the
comments from the developer.)

As we continue to dig deeper, I was wondering whether the folks here who
are more intimately familiar with the code have any immediate reaction as
to what the problem might be, given this additional info.

We've pretty much ruled out Tomcat as the source: we installed Resin and
the search speed was the same.

Nutch 0.71 running on Linux, dual Xeon, 8 GB of RAM, 3x SCSI drives in
RAID 0. Nothing else is running on the server. The index has about 4.5
million pages.

Thanks!

060308 104251 11 query: term life insurance
060308 104251 11 searching for 20 raw hits
060308 104253 11 total hits: 20859
060308 104253 11 Keren: get hits.
060308 104253 11 Keren: get details.
060308 104253 11 Keren: get summary.
060308 104253 12 Keren: getSegment().
060308 104253 13 Keren: getSegment().
060308 104253 12 Keren: getDocNo().
060308 104253 14 Keren: getSegment().
060308 104253 14 Keren: getDocNo().
060308 104253 13 Keren: getDocNo().
060308 104253 12 Keren: getParseText().
060308 104253 15 Keren: getSegment().
060308 104253 17 Keren: getSegment().
060308 104253 13 Keren: getParseText().
060308 104253 15 Keren: getDocNo().
060308 104253 15 Keren: getParseText().
060308 104253 18 Keren: getSegment().
060308 104253 14 Keren: getParseText().
060308 104253 16 Keren: getSegment().
060308 104253 16 Keren: getDocNo().
060308 104253 18 Keren: getDocNo().
060308 104253 16 Keren: getParseText().
060308 104253 17 Keren: getDocNo().
060308 104253 17 Keren: getParseText().
060308 104253 19 Keren: getSegment().
060308 104253 18 Keren: getParseText().
060308 104253 20 Keren: getSegment().
060308 104253 21 Keren: getSegment().
060308 104253 20 Keren: getDocNo().
060308 104253 21 Keren: getDocNo().
060308 104253 20 Keren: getParseText().
060308 104253 21 Keren: getParseText().
060308 104253 19 Keren: getDocNo().
060308 104253 19 Keren: getParseText().
060308 104253 19 Keren: getText().
060308 104253 19 Keren: Summarizer().getSummary. text length=3288
060308 104254 19 found resource common-terms.utf8 at
file:/var/jakarta-tomcat-4.1.31/webapps/ROOT/WEB-INF/classes/common-terms.utf8
060308 104254 18 Keren: getText().
060308 104254 18 Keren: Summarizer().getSummary. text length=4770
060308 104254 12 Keren: getText().
060308 104254 12 Keren: Summarizer().getSummary. text length=9442
060308 104257 20 Keren: getText().
060308 104257 20 Keren: Summarizer().getSummary. text length=4162
060308 104302 14 Keren: getText().
060308 104302 14 Keren: Summarizer().getSummary. text length=9364
060308 104302 13 Keren: getText().
060308 104302 13 Keren: Summarizer().getSummary. text length=9140
060308 104303 21 Keren: getText().
060308 104303 21 Keren: Summarizer().getSummary. text length=1107
060308 104304 17 Keren: getText().
060308 104304 17 Keren: Summarizer().getSummary. text length=3315
060308 104305 15 Keren: getText().
060308 104305 15 Keren: Summarizer().getSummary. text length=3261
060308 104305 16 Keren: getText().
060308 104305 16 Keren: Summarizer().getSummary. text length=492
060308 104305 11 Keren: get requestURL.
060308 104305 11 Keren: start try.
060308 104305 11 Keren: start detail.
060308 104305 11 Keren: detail: 0
060308 104305 11 Keren: detail: 1
060308 104305 11 Keren: detail: 2
060308 104305 11 Keren: detail: 3
060308 104305 11 Keren: detail: 4
060308 104305 11 Keren: detail: 5
060308 104305 11 Keren: detail: 6
060308 104305 11 Keren: detail: 7
060308 104305 11 Keren: detail: 8
060308 104305 11 Keren: detail: 9
060308 104306 11 Keren: doGet done.

There are 10 threads fetching summaries. After these threads are done,
the search results are returned as RSS. Let's look at the threads
separately:

060308 104253 12 Keren: getSegment().
060308 104253 12 Keren: getDocNo().
060308 104253 12 Keren: getParseText().
060308 104254 12 Keren: getText().
060308 104254 12 Keren: Summarizer().getSummary. text length=9442

Thread 12 took 1 second to get the parse text.

060308 104253 13 Keren: getSegment().
060308 104253 13 Keren: getDocNo().
060308 104253 13 Keren: getParseText().
060308 104302 13 Keren: getText().
060308 104302 13 Keren: Summarizer().getSummary. text length=9140

Thread 13 took 9 seconds to get the parse text.

060308 104253 14 Keren: getSegment().
060308 104253 14 Keren: getDocNo().
060308 104253 14 Keren: getParseText().
060308 104302 14 Keren: getText().
060308 104302 14 Keren: Summarizer().getSummary. text length=9364

Thread 14 took 9 seconds to get the parse text.

060308 104253 15 Keren: getSegment().
060308 104253 15 Keren: getDocNo().
060308 104253 15 Keren: getParseText().
060308 104305 15 Keren: getText().
060308 104305 15 Keren: Summarizer().getSummary. text length=3261

Thread 15 took 12 seconds to get the parse text.

060308 104253 16 Keren: getSegment().
060308 104253 16 Keren: getDocNo().
060308 104253 16 Keren: getParseText().
060308 104305 16 Keren: getText().
060308 104305 16 Keren: Summarizer().getSummary. text length=492

Thread 16 took 12 seconds to get the parse text.

060308 104253 17 Keren: getSegment().
060308 104253 17 Keren: getDocNo().
060308 104253 17 Keren: getParseText().
060308 104304 17 Keren: getText().
060308 104304 17 Keren: Summarizer().getSummary. text length=3315

Thread 17 took 11 seconds to get the parse text.

060308 104253 18 Keren: getSegment().
060308 104253 18 Keren: getDocNo().
060308 104253 18 Keren: getParseText().
060308 104254 18 Keren: getText().
060308 104254 18 Keren: Summarizer().getSummary. text length=4770

Thread 18 took 1 second to get the parse text.

060308 104253 19 Keren: getSegment().
060308 104253 19 Keren: getParseText().
060308 104253 19 Keren: getText().
060308 104253 19 Keren: Summarizer().getSummary. text length=3288

Thread 19 took less than 1 second to get the parse text.

I think the problem is in how these 10 concurrent threads run. I'm not
sure they really run concurrently. Thread 16 has the smallest text
length, 492, yet it took 12 seconds.
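
For anyone who wants to reproduce the per-thread numbers above, here is a
minimal Python sketch that computes the gap between getParseText() and
getText() for each thread. It assumes the log format shown in this post
(date as yymmdd, time as hhmmss, thread id, then the developer's "Keren:"
message). The function name parse_text_delays and the short sample excerpt
are only for illustration; they are not part of Nutch.

```python
import re
from datetime import datetime

# Matches lines such as "060308 104253 13 Keren: getParseText()."
LINE_RE = re.compile(r"^(\d{6}) (\d{6}) (\d+) Keren: (\w+)")


def parse_text_delays(log_text):
    """Return {thread_id: seconds between getParseText() and getText()}."""
    started = {}
    delays = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without the developer's "Keren:" marker
        date, clock, thread, step = match.groups()
        stamp = datetime.strptime(date + clock, "%y%m%d%H%M%S")
        thread = int(thread)
        if step == "getParseText":
            started[thread] = stamp
        elif step == "getText" and thread in started:
            elapsed = stamp - started.pop(thread)
            delays[thread] = int(elapsed.total_seconds())
    return delays


# A few lines copied from the log above (threads 13, 16 and 19).
sample = """\
060308 104253 13 Keren: getParseText().
060308 104253 16 Keren: getParseText().
060308 104253 19 Keren: getParseText().
060308 104253 19 Keren: getText().
060308 104302 13 Keren: getText().
060308 104305 16 Keren: getText().
"""

for thread, seconds in sorted(parse_text_delays(sample).items()):
    print(f"thread {thread}: {seconds} s")
```

The log only has one-second resolution, so a result of 0 means "less than
a second", not "instant".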


Re: Search Speed

Stefan Groschupf-2
How many segments do you have, and how big are they?
Try a disk I/O measurement tool or script. What does it say?
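
If no dedicated tool is at hand, a rough script can give a first
impression. The following is only a sketch in Python, not a proper
benchmark: it times random block reads from one large file. The function
name random_read_latencies, the read count and the 4096-byte block size
are arbitrary choices for illustration, and you would point it at a file
inside one of your segment directories.

```python
import os
import random
import time


def random_read_latencies(path, reads=100, block_size=4096):
    """Time `reads` random reads of `block_size` bytes; return seconds per read."""
    size = os.path.getsize(path)
    latencies = []
    # buffering=0 disables Python-level buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as handle:
        for _ in range(reads):
            offset = random.randrange(max(size - block_size, 1))
            start = time.perf_counter()
            handle.seek(offset)
            handle.read(block_size)
            latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return (average, worst) latency in milliseconds."""
    average = sum(latencies) / len(latencies)
    return average * 1000.0, max(latencies) * 1000.0
```

Call it as summarize(random_read_latencies(path)) with the path of a large
file. Keep in mind that the operating system caches recently read data in
memory, so repeated runs over the same file can look much faster than the
disks really are.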


On 08.03.2006 at 17:38, Insurance Squared Inc. wrote:

> I appreciate your patience as we try to get over our search speed
> issues.  We're getting closer - it seems we are having huge delays
> when retrieving the summaries for the various search results.
> Below are our logs from a search, you can see that retrieving some
> of the search summaries took into the double digit seconds.  (I've
> left in the comments from the developer).

---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net


Re: Search Speed

waterwheel
We've got about 6 or 8 segments, but we just merged our indexes in an
attempt to speed things up. Total hard drive space is something like
60-80 gigs, in that neighbourhood. Nothing there strikes me as suspicious.

I could look at disk I/O speeds, but I'm doubtful that's the issue.
We're running 3 SCSI drives at 10K RPM in a RAID 0 (striped, using
hardware) configuration across two channels. And we had the same speed
problems on a different server using software RAID 0 on SATA hard
drives. Moving between the two servers makes no difference in terms of
speed, which leads me away from thinking this is hardware.

It seems there's something specific to looking up the summary for each
site returned in the search; the rest of the search is fast.



Stefan Groschupf wrote:

> How many segments do you have, and how big are they?
> Try a disk I/O measurement tool or script. What does it say?

Re: Search Speed

keren nutch
In reply to this post by Stefan Groschupf-2
Hi Stefan,

Thank you for your reply.
We have 31 segments, totaling 106 GB.

Keren

Stefan Groschupf <[hidden email]> wrote:

> How many segments do you have, and how big are they?
> Try a disk I/O measurement tool or script. What does it say?

