slave index is bigger than master index

slave index is bigger than master index

Muneeb Ali
Hi,

I am using Solr 1.4 with a master-slave setup. We have one master and two slave servers. It was all working fine, but lately the Solr slaves have been behaving strangely. In particular, while replicating the index, the slave nodes die and always need a restart. Also, the index on the slave nodes is much bigger (336GB) than the master index (only 86GB).

I am guessing that it's not removing previous indices on the slave nodes when replicating? Has anyone faced similar issues?

Any help would be highly appreciated.

Thanks very much.

-Muneeb

Re: slave index is bigger than master index

Tommaso Teofili
Hi,
I think that you may be using a Lucene/Solr IndexDeletionPolicy that does
not remove old commits (and you aren't propagating solrconfig.xml via
replication).
You can configure this feature in solrconfig.xml inside the
<deletionPolicy> tag:

    <deletionPolicy class="solr.SolrDeletionPolicy">
      <!-- The number of commit points to be kept -->
      <str name="maxCommitsToKeep">1</str>
      <!-- The number of optimized commit points to be kept -->
      <str name="maxOptimizedCommitsToKeep">0</str>
      <!--
          Delete all commit points once they have reached the given age.
          Supports DateMathParser syntax e.g.

          <str name="maxCommitAge">30MINUTES</str>
          <str name="maxCommitAge">1DAY</str>
      -->
    </deletionPolicy>
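
On the point about propagating the config via replication: as far as I know, the Solr 1.4 ReplicationHandler can ship configuration files along with the index through a confFiles entry in its master section. The following is only a sketch under that assumption; the file names are placeholders and do not come from this thread.

```xml
<!-- Master side of solrconfig.xml. File names below are illustrative assumptions. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- Events on the master after which slaves may pull a new index version -->
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">optimize</str>
    <!-- Config files copied to the slaves; "a:b" sends file a and stores it as b on the slave -->
    <str name="confFiles">solrconfig_slave.xml:solrconfig.xml,schema.xml</str>
  </lst>
</requestHandler>
```

With a setup like this, a deletionPolicy change made in the slave config on the master would reach the slaves on the next replication instead of having to be edited on each node by hand.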

I hope this can be helpful.
Cheers,
Tommaso


AW: slave index is bigger than master index

Bastian S.
In reply to this post by Muneeb Ali
Hi,

Are you calling <optimize/> on the master to finally remove deleted documents and merge the index files?
Once a day is recommended:

http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations


cheers

-----Original Message-----
From: Muneeb Ali [mailto:[hidden email]]
Sent: Monday, 26 July 2010 15:37
To: [hidden email]
Subject: slave index is bigger than master index


Hi,

I am using Solr 1.4 with a master-slave setup. We have one master and two slave servers. It was all working fine, but lately the Solr slaves have been behaving strangely. In particular, while replicating the index, the slave nodes die and always need a restart. Also, the index on the slave nodes is much bigger (336GB) than the master index (only 86GB).

I am guessing that it's not removing previous indices on the slave nodes when replicating? Has anyone faced similar issues?

Any help would be highly appreciated.

Thanks very much.

-Muneeb

Re: AW: slave index is bigger than master index

Muneeb Ali
Yes I always run an optimize whenever I index on master. In fact I just ran an optimize command an hour ago, but it didn't make any difference.

Re: slave index is bigger than master index

Muneeb Ali
In reply to this post by Tommaso Teofili
I just checked my config file, and I do have the exact same values for the deletionPolicy tag as in your email, so I don't really think it could be this.

Re: slave index is bigger than master index

Peter
In reply to this post by Muneeb Ali
did you try an optimize on the slave too?

> Yes I always run an optimize whenever I index on master. In fact I just ran
> an optimize command an hour ago, but it didn't make any difference.
>  



Re: slave index is bigger than master index

Muneeb Ali
No, I didn't. I thought you aren't supposed to run optimize on slaves. Well, it doesn't matter now, as I think it's fixed. I just added a dummy document on the master, ran a commit call, and once that had executed, ran an optimize call. This triggered snapshooter to replicate the index, which somehow resulted in a normal index size on the slaves.

I still don't get what exactly happened there, and will be investigating this. If I find anything interesting, I will update this mailing list.

Thanks for all your input anyways,

-Muneeb

RE: slave index is bigger than master index

Bastian S.
In reply to this post by Peter
As far as I know this is not needed; the optimized index is automatically replicated to the
slaves. Therefore something seems to be really wrong with your setup. Maybe the slave index
got corrupted for some reason? Did you try deleting the data dir and restarting the slave to
get a freshly replicated index? Maybe worth a try.

good luck

-----Original Message-----
From: Peter Karich [mailto:[hidden email]]
Sent: Monday, 26 July 2010 16:54
To: [hidden email]
Subject: Re: slave index is bigger than master index

did you try an optimize on the slave too?

> Yes I always run an optimize whenever I index on master. In fact I
> just ran an optimize command an hour ago, but it didn't make any difference.
>  



Re: slave index is bigger than master index

Chris Hostetter-3
In reply to this post by Muneeb Ali

: No I didn't. I thought you aren't supposed to run optimize on slaves. Well

correct, you should make all changes to the master.

: but it doesn;t matter now, as I think its fixed now. I just added a dummy
: document on master, ran a commit call and then once that executed ran an
: optimize call. This triggered snapshooter to replicate the index, which
: somehow resulted in normal index size at slaves.

My hunch: are you running on Windows?

Windows filesystems have issues with trying to delete a file while
processes still have the file handle open.  Since Solr needs those "old"
file handles to continue serving requests while it opens up the "new" copy
of the index, those files wind up left on disk.  The *next* time a new
index is opened, it tries to delete those files again, and then the
deletes succeed...

http://wiki.apache.org/lucene-java/LuceneFAQ#Why_do_I_have_a_deletable_file_.28and_old_segment_files_remain.29_after_running_optimize.3F

...if you notice this situation happen again, check and see if you have a
"deletables" file.

-Hoss


Re: slave index is bigger than master index

Muneeb Ali
We have three dedicated servers for Solr, two for slaves and one for the master, all with Linux/Debian packages installed.

I understand that replication always copies the index over in exactly the form it has in the master index directory (or it is supposed to do that at least), and if the master index was optimized after indexing, one doesn't need to run an optimize call again on the master to optimize the slave's index. But in our case that's what fixed it, and I agree it is even more confusing now :s

Another problem is that we are serving live services from the slave nodes, so I don't want to affect live search while playing with the slave nodes' indices.

We will be running the indexing on the master node overnight tonight. Let's see if it does it again.

Re: slave index is bigger than master index

Peter

> We have three dedicated servers for solr, two for slaves and one for master,
> all with linux/debian packages installed.
>
> I understand that replication does always copies over the index in an exact
> form as in master index directory (or it is supposed to do that at least),
> and if the master index was optimized after indexing, one doesn't need to
> run an optimize call again on master to optimize the slave's index. But in
> our case thats what fixed it and I agree it is even more confusing now :s
>  

That's why I said: try it on the slaves too ;-)
In our case it also helped to shrink 2*index to 1*index.
I think the data which is necessary for the replication won't be cleaned up
before the next replication or before an optimize.
For us it was crucial to shrink the size because of limited
disc resources and to make sure that the next
replication does not increase the index to more than 3 times the initial size.

@muneeb so I think optimization is not necessary, or do you have disc
limitations too?
@Hoss or others: does this explanation sound logical?

> Another problem is, we are serving live services using slave nodes, so I
> dont want to effect the live search, while playing with slave nodes'
> indices.
>  

What do you mean here? Optimizing is too CPU expensive?

> We will be running the indexing on master node today over the night. Lets
> see if it does it again.
>  

Do you mean increase to double size?

Re: slave index is bigger than master index

Lance Norskog-2
Ah! You have junk files piling up in the slave index directory. When
this happens, you may have to remove data/index entirely. I'm not sure
if Solr replication will handle that, or if you have to copy the whole
index to reset it.

You said the slaves time out- maybe the files are so large that the
master & slave need socket timeouts changed? In solrconfig.xml, these
two lines control that. Maybe they need to be increased.

        <str name="httpConnTimeout">5000</str>
        <str name="httpReadTimeout">10000</str>
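
For context, in Solr 1.4 these two settings are, to the best of my knowledge, read from the slave section of the ReplicationHandler in solrconfig.xml, which is probably why they do not show up elsewhere in the example config. A sketch under that assumption; the master URL and poll interval are placeholders, not values from this thread:

```xml
<!-- Slave side of solrconfig.xml. masterUrl and pollInterval are assumed placeholder values. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Point this at the replication handler of the master -->
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <!-- How often the slave polls the master, in HH:mm:ss -->
    <str name="pollInterval">00:05:00</str>
    <!-- Connection and read timeouts used when talking to the master (believed to be milliseconds) -->
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">10000</str>
  </lst>
</requestHandler>
```

Raising httpReadTimeout would be the first thing to try if very large segment files are being cut off mid-transfer, though that is a guess until the slave logs confirm a timeout.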


On Tue, Jul 27, 2010 at 3:59 AM, Peter Karich <[hidden email]> wrote:

>
>> We have three dedicated servers for solr, two for slaves and one for master,
>> all with linux/debian packages installed.
>>
>> I understand that replication does always copies over the index in an exact
>> form as in master index directory (or it is supposed to do that at least),
>> and if the master index was optimized after indexing, one doesn't need to
>> run an optimize call again on master to optimize the slave's index. But in
>> our case thats what fixed it and I agree it is even more confusing now :s
>>
>
> Thats why I said: try it on the slaves too ;-)
> In our case it helped too to shrink 2*index to 1*index.
> I think the data which necessary for the replication won't cleanup
> before the next replication or before an optimize.
> For us it was crucial to shrink the size because of limited
> disc-resources and to make sure that the next
> replication does not increase the index to >3*times of the initial size.
>
> @muneeb so I think, optimization is not necessary or do you have disc
> limitations too?
> @Hoss or others: does this explanation sound logically?
>
>> Another problem is, we are serving live services using slave nodes, so I
>> dont want to effect the live search, while playing with slave nodes'
>> indices.
>>
>
> What do you mean here? Optimizing is too CPU expensive?
>
>> We will be running the indexing on master node today over the night. Lets
>> see if it does it again.
>>
>
> Do you mean increase to double size?
>



--
Lance Norskog
[hidden email]

Re: slave index is bigger than master index

Muneeb Ali
In reply to this post by Peter

Well, I do have disk limitations too, and that's why I think the slave nodes died
when replicating data from the master node (as it was just adding on top of
the existing index files).

:: What do you mean here? Optimizing is too CPU expensive?

What I meant by not playing around with the slave nodes is that I want to avoid
doing anything (including optimization on the slave nodes) that may affect live
search performance, unless I have no option.

:: Do you mean increase to double size?

yes, as it did before on replication. But I didn't get a chance to run the
indexer yesterday.

 

Re: slave index is bigger than master index

Muneeb Ali
In reply to this post by Lance Norskog-2

>> In solrconfig.xml, these two lines control that. Maybe they need to be increased.
>>     <str name="httpConnTimeout">5000</str>
>>     <str name="httpReadTimeout">10000</str>

Where do I add those in solrconfig? These lines don't seem to be present
in the example solrconfig file...

Re: slave index is bigger than master index

Muneeb Ali
In reply to this post by Lance Norskog-2
Where do these lines go in solr config?        

<str name="httpConnTimeout">5000</str>
<str name="httpReadTimeout">10000</str> 

Thanks,
-Muneeb

Re: slave index is bigger than master index

Peter
In reply to this post by Muneeb Ali
Hi Muneeb,

I fear you'll have no chance: replicating an index will use more disc
space on the slave nodes.
Of course, you could minimize disc usage AFTER the replication via the
'optimize-hack'.

But are you sure that the slave nodes die because of disc
limitations?
Try to observe the slave indices e.g. via jvisualvm ... maybe it is due
to RAM or CPU limitations,
because a replicating index will do autowarming (requires some more RAM)
and will do this in parallel to the old searcher which handles the requests
(requires some more CPU).

If RAM usage is the problem, try to reduce the caches (which could
negatively affect query time!)
and/or autowarming. Especially the filterCache requires some RAM.
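
To make the cache suggestion concrete, here is a sketch of the relevant entries in the query section of solrconfig.xml. The sizes and autowarm counts are made-up starting points for experimenting, not values recommended anywhere in this thread.

```xml
<!-- Inside solrconfig.xml on the slaves. All sizes and counts are assumed example values. -->
<query>
  <!-- Smaller filterCache with autowarming off, to reduce RAM use while a new searcher opens -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- Same idea for cached result windows -->
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <!-- Stored-document cache, which is normally left without autowarming -->
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query>
```

Setting autowarmCount to 0 trades slower first queries after each replication for lower peak memory, so it is worth comparing both settings under real traffic.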

So: measure before hacking :-) !

Regards,
Peter.

> Well I do have disk limitations too, and thats why I think slave nodes died,
> when replicating data from master node. (as it was just adding on top of
> existing index files).
>
> :: What do you mean here? Optimizing is too CPU expensive?
>
> What I meant by avoid playing around with slave nodes is that doing anything
> (including optimization on slave nodes) that may effect the live search
> performance, unless I have no option.
>
> :: Do you mean increase to double size?
>
> yes, as it did before on replication. But I didn't get a chance to run the
> indexer yesterday.
>
>  
>