How long does optimize take on your Solr installation?

How long does optimize take on your Solr installation?

Walter Underwood, Netflix
Please answer with the size of your index (post-optimize) and how long
an optimize takes. I'll collect the data and see if I can draw a line
through it.

190 MB, 55 seconds

$ du -sk /apps/wss/solr_home/data/index
191592  /apps/wss/solr_home/data/index
$  grep commit /apps/wss/tomcat/logs/stdout.log
Feb 28, 2008 11:55:11 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Feb 28, 2008 11:56:06 AM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
$ uname -a
Linux spiderman4 2.6.9-22.EL #1 SMP Mon Sep 19 17:52:20 EDT 2005 ppc64 ppc64
ppc64 GNU/Linux

wunder
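
The 55-second figure above is the gap between the two log timestamps. A minimal Python sketch of that arithmetic follows. It assumes the log keeps the two-line layout shown in the post (a timestamped header line followed by an `INFO:` line) and an English locale for the month and AM/PM names; the function names are illustrative and not part of Solr.

```python
from datetime import datetime

# Timestamp layout of the header lines quoted in the post,
# e.g. "Feb 28, 2008 11:55:11 AM". Month/AM-PM parsing assumes an English locale.
STAMP_FORMAT = "%b %d, %Y %I:%M:%S %p"


def parse_stamp(line):
    """Return the datetime at the start of a log header line, or None."""
    tokens = line.split()
    if len(tokens) < 5:
        return None
    try:
        return datetime.strptime(" ".join(tokens[:5]), STAMP_FORMAT)
    except ValueError:
        return None


def optimize_durations(lines):
    """Yield seconds from each optimize start to the next end_commit_flush."""
    last_stamp = None  # timestamp of the most recent header line
    started = None     # timestamp of the pending optimize start, if any
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is not None:
            last_stamp = stamp
        elif "start commit(optimize=true" in line:
            started = last_stamp
        elif "end_commit_flush" in line and started is not None:
            yield (last_stamp - started).total_seconds()
            started = None


# The four log lines from the post above.
SAMPLE = [
    "Feb 28, 2008 11:55:11 AM org.apache.solr.update.DirectUpdateHandler2 commit",
    "INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)",
    "Feb 28, 2008 11:56:06 AM org.apache.solr.update.DirectUpdateHandler2 commit",
    "INFO: end_commit_flush",
]

print(list(optimize_durations(SAMPLE)))  # [55.0]
```

Feeding it the output of the `grep commit` command shown above should give one duration per optimize recorded in the log.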

Re: How long does optimize take on your Solr installation?

sfox-2
767 MB 76 seconds

(single, local SATA 7200rpm disk, unloaded XServe G5)

Sean Fox


--
[hidden email] | Technical Director | SERC | Carleton College

Re: How long does optimize take on your Solr installation?

Grant Ingersoll-2
In reply to this post by Walter Underwood, Netflix
You might also want to collect the mergeFactor setting and the Lucene/Solr
versions in use.




RE: How long does optimize take on your Solr installation?

Alex Benjamen
In reply to this post by Walter Underwood, Netflix
It mostly depends on whether the index is completely new or incremental.

4 GB, 28 million docs, ~30 min (new index)
4 GB, 28 million docs, 30 s (incremental)
 
 

Re: How long does optimize take on your Solr installation?

Walter Underwood, Netflix
Good point. My numbers are from a full rebuild. Let's collect maximum
times, to keep it simple. --wunder

On 2/28/08 7:28 PM, "Alex Benjamen" <[hidden email]> wrote:

> It mostly depends on whether or not the index is completely new or incremental
>  
> 4Gb, 28MM docs, ~30min  (new index)
> 4Gb, 28MM docs, 30s  (incremental)
>  
>  
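
"Drawing a line" through the maximum times could be done with an ordinary least-squares fit. The sketch below uses only the three full-rebuild figures reported so far in this thread. It assumes 4 GB means 4096 MB and treats "~30min" as 1800 seconds. The reports come from different machines, so the resulting line is illustrative at best.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (index size in MB, optimize time in seconds) as reported in the thread.
# Assumptions: 4 GB taken as 4096 MB, "~30min" taken as 1800 s.
REPORTS = [(190, 55), (767, 76), (4096, 30 * 60)]

slope, intercept = fit_line(REPORTS)
print(f"{slope:.3f} s per MB, intercept {intercept:.1f} s")
```

With so few points, the largest index dominates the slope, which is one reason to collect more reports before trusting any rule of thumb.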


Re: How long does optimize take on your Solr installation?

Walter Underwood, Netflix
In reply to this post by Grant Ingersoll-2
And I could collect disk subsystem, JVM, processor, and so on, but
we'd have a seven-dimensional rule of thumb, which is kinda scary.

wunder

On 2/28/08 12:14 PM, "Grant Ingersoll" <[hidden email]> wrote:

> You might want to get info about mergeFactors, and Lucene/Solr
> versions in use.


Re: How long does optimize take on your Solr installation?

Yonik Seeley-2
In reply to this post by Walter Underwood, Netflix
On Fri, Feb 29, 2008 at 12:45 AM, Walter Underwood
<[hidden email]> wrote:
> Good point. My numbers are from a full rebuild. Let's collect maximum
>  times, to keep it simple. --wunder

You may see more variation than you expect since optimization is done
in stages of mergeFactor segments.  In the same environment, you could
add a single extra doc, and then an optimize would be faster than a
previous run because that add happened to force a bunch of merges.

-Yonik
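
The effect described above can be illustrated with a toy model. This is not Lucene's actual merge policy: it assumes every added document is flushed as its own one-document segment, and that whenever the last mergeFactor segments have equal size they are merged into one. Real installations buffer many documents per segment, so only the shape of the behaviour carries over.

```python
def segments_after(num_docs, merge_factor):
    """Toy model: segment sizes left after adding num_docs one at a time."""
    segments = []
    for _ in range(num_docs):
        segments.append(1)  # assumption: each add flushes a 1-doc segment
        # Cascade: merge while the last merge_factor segments are equal in size.
        while (len(segments) >= merge_factor
               and len(set(segments[-merge_factor:])) == 1):
            merged = sum(segments[-merge_factor:])
            del segments[-merge_factor:]
            segments.append(merged)
    return segments


for docs in (999, 1000):
    left = segments_after(docs, 10)
    print(docs, "docs ->", len(left), "segment(s) left for optimize to merge")
```

Under this model, 999 documents leave 27 segments for optimize to merge, while the 1000th add triggers a cascade that leaves a single segment, so the later optimize has almost nothing to do.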

Re: How long does optimize take on your Solr installation?

Norberto Meijome-2
On Fri, 29 Feb 2008 13:02:21 -0500
"Yonik Seeley" <[hidden email]> wrote:

> On Fri, Feb 29, 2008 at 12:45 AM, Walter Underwood
> <[hidden email]> wrote:
> > Good point. My numbers are from a full rebuild. Let's collect maximum
> >  times, to keep it simple. --wunder  
>
> You may see more variation than you expect since optimization is done
> in stages of mergeFactor segments.  In the same environment, you could
> add a single extra doc, and then an optimize would be faster than a
> previous run because that add happened to force a bunch of merges.

Hi all,

Does reporting "my optimise takes x minutes on this hardware with a data set this
big" actually tell us anything useful, beyond a rough idea of how long an
optimise operation could take for {those variables}? I mean, the data,
configuration, etc. you are working with make quite a bit of difference.

As Walter mentioned, so many variables can get scary... but they
could be grouped as:
1) your SOLR setup (schema, # of docs, configuration)

2) your hardware and OS configuration.

I would guess that, to get a proper understanding and provide the most useful
information, the two groups could be treated separately.

For example, for group 2), a sample SOLR configuration and data set could be
provided, along with a set of test scripts. Then anyone could report back on
how fast their hardware / config runs SOLR-PERF-TEST_1 (optimised for
overall speed) vs SOLR-PERF-TEST_2 (optimised for commit times) vs ... whatever.

I am not too sure how to build a standard test for the first group... maybe the
data and configuration examples from 2) would be useful enough for fine-tuning,
as examples (similar to MySQL's 'large' and 'huge' configurations)...

just a thought...
B

_________________________
{Beto|Norberto|Numard} Meijome

Do not take away the camel's hump, you may be stopping him from being a camel.

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been
Warned.