Solr Optimization Fail

Solr Optimization Fail

Rajinimaski
Hi,

 When we run an optimize, it should reduce the index size, right?

I have an index of about 6 GB (5 million documents). The index was built
with a commit after every 10,000 documents.

When I ran an optimize with the HTTP optimize command, the data size grew
to 12 GB. Why might this have happened?
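For reference, this is the sort of HTTP call I mean, assuming the stock
example host and core layout:

    curl http://localhost:8983/solr/update \
         -H 'Content-type:text/xml; charset=utf-8' \
         --data-binary '<optimize/>'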

Can anyone please suggest a fix?

Thanks
Rajani

RE: Solr Optimization Fail

Juan Pablo Mora
Maybe you are generating a snapshot of your index as part of the optimize?
Look for postCommit or postOptimize events in your solrconfig.xml.
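A quick way to check (path to your config assumed):

    grep -nE 'postCommit|postOptimize' conf/solrconfig.xml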

________________________________________
From: Rajani Maski [[hidden email]]
Sent: Friday, December 16, 2011 11:11
To: [hidden email]
Subject: Solr Optimization Fail


Re: Solr Optimization Fail

Rajinimaski
These listeners are commented out in my solrconfig.xml; see the snippet below.

<!-- The RunExecutableListener executes an external command from a
      hook such as postCommit or postOptimize.
         exe - the name of the executable to run
         dir - dir to use as the current working directory. default="."
         wait - the calling thread waits until the executable returns.
default="true"
         args - the arguments to pass to the program.  default=nothing
         env - environment variables to set.  default=nothing
      -->
    <!-- A postCommit event is fired after every commit or optimize command
    <listener event="postCommit" class="solr.RunExecutableListener">
      <str name="exe">solr/bin/snapshooter</str>
      <str name="dir">.</str>
      <bool name="wait">true</bool>
      <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
      <arr name="env"> <str>MYVAR=val1</str> </arr>
    </listener>
    -->
    <!-- A postOptimize event is fired only after every optimize command
    <listener event="postOptimize" class="solr.RunExecutableListener">
      <str name="exe">snapshooter</str>
      <str name="dir">solr/bin</str>
      <bool name="wait">true</bool>
    </listener>
    -->


When I optimize an index of about 400 MB, it reduces the data folder to
about 200 MB. But when the data is huge, optimize doubles it. Why is that?

Should optimization actually reduce the size of the data, or does it only
improve search query performance?
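For reference, I am measuring the size of the index directory on disk,
along these lines (path assumed):

    du -sh /path/to/solr/data/index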

On Fri, Dec 16, 2011 at 5:40 PM, Juan Pablo Mora <[hidden email]> wrote:

> Maybe you are generating a snapshot of your index as part of the optimize?
> Look for postCommit or postOptimize events in your solrconfig.xml.

Re: Solr Optimization Fail

Tomás Fernández Löbbe
In reply to this post by Juan Pablo Mora
Are you on Windows? There is a JVM bug that makes Solr keep the old files
even though they are no longer used. The files will eventually be removed,
but if you want them gone immediately, try optimizing twice: the second
optimize doesn't do much, but it will remove the old files.
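For example, assuming the stock example URL, the second optimize is just
the same call repeated:

    curl http://localhost:8983/solr/update \
         -H 'Content-type:text/xml; charset=utf-8' \
         --data-binary '<optimize/>'
    # run it again: this pass is cheap and lets the old files be deleted
    curl http://localhost:8983/solr/update \
         -H 'Content-type:text/xml; charset=utf-8' \
         --data-binary '<optimize/>'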

On Fri, Dec 16, 2011 at 9:10 AM, Juan Pablo Mora <[hidden email]> wrote:

> Maybe you are generating a snapshot of your index as part of the optimize?
> Look for postCommit or postOptimize events in your solrconfig.xml.

Re: Solr Optimization Fail

Rajinimaski
Oh yes, we are on Windows, using Java 1.6 and Solr 1.4.1.

OK, let me try that...

Thank you so much.

Regards,
Rajani



2011/12/16 Tomás Fernández Löbbe <[hidden email]>

> Are you on Windows? There is a JVM bug that makes Solr keep the old files
> even though they are no longer used. The files will eventually be removed,
> but if you want them gone immediately, try optimizing twice: the second
> optimize doesn't do much, but it will remove the old files.

Re: Solr Optimization Fail

Chris Hostetter-3

: Oh yes, we are on Windows, using Java 1.6 and Solr 1.4.1.

Apparently no one has ever written a FAQ entry on this, so I just added one...

https://wiki.apache.org/solr/FAQ#Why_doesn.27t_my_index_directory_get_smaller_.28immediately.29_when_i_delete_documents.3F_force_a_merge.3F_optimize.3F


: > Are you on Windows? There is a JVM bug that makes Solr keep the old files
: > even though they are no longer used. The files will eventually be removed,
: > but if you want them gone immediately, try optimizing twice: the second
: > optimize doesn't do much, but it will remove the old files.

(NOTE: it's not a JVM bug, it's just how the filesystem works on Windows:
you can't delete files that are currently open.)

-Hoss