Recovery - too many updates received since start

"Trym R. Møller"
Hi

I am seeing a Solr instance lose its connection to ZooKeeper and then
re-establish it. After Solr has reconnected to ZooKeeper, it begins to
recover.
The connection was lost for approximately 10 seconds, and meanwhile the
leader of the slice received some documents (maybe about 1000
documents). Solr's peer sync then fails with the log message:
Apr 21, 2012 10:13:40 AM org.apache.solr.update.PeerSync sync
WARNING: PeerSync: core=mycollection_slice21_shard1
url=zk-1:2181,zk-2:2181,zk-3:2181 too many updates received since start
- startingUpdates no longer overlaps with our currentUpdates

Looking into PeerSync and UpdateLog, I can see that 100 updates is the
maximum number of updates that a shard can be behind.
Is it correct that this is not configurable, and what are the reasons for
choosing 100?
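
As an illustration of the overlap condition named in the log message, here is a minimal Java sketch of a fixed-size window of recent update versions. It is a simplified model under stated assumptions, not Solr's PeerSync or UpdateLog code: the class RecentUpdatesWindow and its methods are hypothetical names, and the window size is a constructor argument rather than anything read from Solr.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model only; hypothetical class, not part of Solr.
public class RecentUpdatesWindow {
    private final int maxUpdates;                       // assumed window size, e.g. 100
    private final Deque<Long> versions = new ArrayDeque<>();

    public RecentUpdatesWindow(int maxUpdates) {
        this.maxUpdates = maxUpdates;
    }

    // Record a new update version, evicting the oldest once the window is full.
    public void add(long version) {
        if (versions.size() == maxUpdates) {
            versions.removeFirst();
        }
        versions.addLast(version);
    }

    // Copy of the versions currently in the window, oldest first.
    public List<Long> snapshot() {
        return new ArrayList<>(versions);
    }

    // True if at least one version from the starting snapshot is still in the window.
    public boolean overlaps(List<Long> startingUpdates) {
        Set<Long> current = new HashSet<>(versions);
        for (long v : startingUpdates) {
            if (current.contains(v)) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, with a window of 100, a snapshot taken before roughly 1000 further updates arrive no longer shares any version with the window, which mirrors the situation described in this post.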

I suspect that one must weigh the work needed to replicate the full
index against the performance loss/resource usage of increasing the size
of the UpdateLog?

Any comments on this are greatly appreciated.

Best regards Trym

Re: Recovery - too many updates received since start

Mark Miller-3

On Apr 24, 2012, at 9:31 AM, Trym R. Møller wrote:

> Hi
>
> I am seeing a Solr instance lose its connection to ZooKeeper and then re-establish it. After Solr has reconnected to ZooKeeper, it begins to recover.
> The connection was lost for approximately 10 seconds, and meanwhile the leader of the slice received some documents (maybe about 1000 documents). Solr's peer sync then fails with the log message:
> Apr 21, 2012 10:13:40 AM org.apache.solr.update.PeerSync sync
> WARNING: PeerSync: core=mycollection_slice21_shard1 url=zk-1:2181,zk-2:2181,zk-3:2181 too many updates received since start - startingUpdates no longer overlaps with our currentUpdates

You can configure the timeout here - I may have chosen a default that is too low based on some reports.

>
> Looking into PeerSync and UpdateLog, I can see that 100 updates is the maximum number of updates that a shard can be behind.
> Is it correct that this is not configurable, and what are the reasons for choosing 100?

Yonik chose this - I'll let him expand on it if he sees this. I think it's not configurable currently, but perhaps it should be, with the right caveats documented.


>
> I suspect that one must weigh the work needed to replicate the full index against the performance loss/resource usage of increasing the size of the UpdateLog?

Yeah, I think that is the gist of it. There may be another gotcha or two; I just don't remember at the moment. Yonik?

>
> Any comments on this are greatly appreciated.
>
> Best regards Trym

- Mark Miller
lucidimagination.com

Re: Recovery - too many updates received since start

Yonik Seeley-2-2
In reply to this post by "Trym R. Møller"
On Tue, Apr 24, 2012 at 9:31 AM, "Trym R. Møller" <[hidden email]> wrote:

> Hi
>
> I am seeing a Solr instance lose its connection to ZooKeeper and then
> re-establish it. After Solr has reconnected to ZooKeeper, it begins to
> recover.
> The connection was lost for approximately 10 seconds, and meanwhile the
> leader of the slice received some documents (maybe about 1000 documents).
> Solr's peer sync then fails with the log message:
> Apr 21, 2012 10:13:40 AM org.apache.solr.update.PeerSync sync
> WARNING: PeerSync: core=mycollection_slice21_shard1
> url=zk-1:2181,zk-2:2181,zk-3:2181 too many updates received since start -
> startingUpdates no longer overlaps with our currentUpdates
>
> Looking into PeerSync and UpdateLog, I can see that 100 updates is the
> maximum number of updates that a shard can be behind.
> Is it correct that this is not configurable, and what are the reasons for
> choosing 100?
>
> I suspect that one must weigh the work needed to replicate the full index
> against the performance loss/resource usage of increasing the size of the
> UpdateLog?

The peersync messages don't stream, so we need to limit how many docs
will be in memory at once.
If someone makes that streamable, I'd be more comfortable making the
limit configurable.
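
To make the memory argument concrete, here is a small Java sketch of the trade-off: when a sync response is built fully in memory rather than streamed, a cap on the number of updates also caps how much has to be held at once. This is only an illustration under stated assumptions. RecoveryChoice, choose, and worstCaseBytes are hypothetical names that do not exist in Solr, and the average document size is an invented input, not a measured value.

```java
// Hypothetical illustration; none of these names come from Solr.
public class RecoveryChoice {
    public enum Strategy { PEER_SYNC, FULL_REPLICATION }

    // Pick a recovery strategy from how many updates the replica missed.
    public static Strategy choose(int missedUpdates, int maxUpdates) {
        return missedUpdates <= maxUpdates
                ? Strategy.PEER_SYNC
                : Strategy.FULL_REPLICATION;
    }

    // Rough upper bound on bytes held in memory for one non-streamed response,
    // assuming every update carries a document of avgDocBytes bytes.
    public static long worstCaseBytes(int maxUpdates, long avgDocBytes) {
        return (long) maxUpdates * avgDocBytes;
    }
}
```

With the limit of 100 discussed in this thread and an assumed 10,000 bytes per document, worstCaseBytes(100, 10_000) gives 1,000,000 bytes, while choose(1000, 100) falls back to FULL_REPLICATION. Raising the cap raises that memory bound proportionally, which is the concern with making it configurable before the messages can stream.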

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10