Soft commit and new replica types


Soft commit and new replica types

Vadim Ivanov
Before 7.x, all replicas in SolrCloud were of the NRT type,
and the following rules applied:
https://stackoverflow.com/questions/45998804/when-should-we-apply-hard-commit-and-soft-commit-in-solr
and
https://lucene.apache.org/solr/guide/7_5/updatehandlers-in-solrconfig.html#commit-and-softcommit

But the new TLOG and PULL replica types muddle those explanations.
From the Ref Guide we have:
"NRT is the only type of replica that supports soft-commits..."
"If TLOG replica does become a leader, it will behave the same as if it was a NRT type of replica."
Does that mean that if we have no NRT replicas in the cluster, the
autoSoftCommit section in solrconfig.xml is ignored completely (even on the TLOG leader)?

<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>

Should we say that openSearcher in the autoCommit section is always true in that case?

<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>30000</maxTime>
  <maxSize>512m</maxSize>
  <openSearcher>false</openSearcher>
</autoCommit>

Does it mean that a new searcher always starts on all replicas when a hard commit happens on the leader?
A few words in the Ref Guide about the new replica types in the #commit-and-softcommit section would be useful.
--
Vadim


Re: Soft commit and new replica types

Edward Ribeiro
Some insights in the new replica types below:

On Sat, December 8, 2018 08:42, Vadim Ivanov <[hidden email]> wrote:

>
> From Ref guide we have:
> " NRT is the only type of replica that supports soft-commits..."
> "If TLOG replica does become a leader, it will behave the same as if it
> was a NRT type of replica."
> Does it mean, that if we do not have NRT replicas in the cluster then
> autoSoftCommit section in solrconfig.xml Ignored completely (even on TLOG
> leader)?
>

No, not completely. Both TLOG and PULL replicas periodically poll the
leader for changes in the index segment files and download those segments
from the leader. If a hard commit max time is defined in solrconfig.xml, the
polling interval of each replica is half that value. Otherwise, if a
soft commit max time is defined, the replicas use half the soft
commit max time as the interval. If neither is defined, the poll
interval is 3 seconds (hard-coded). See here:
https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
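
As a sketch of how that rule plays out, here is a solrconfig.xml fragment that reuses the values from the question, with the resulting intervals annotated. The comments assume the behavior described in the linked ReplicateFromLeader code; the numbers are illustrative only:

```xml
<!-- Illustrative solrconfig.xml fragment; values taken from the question above. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit maxTime is set, so it takes precedence:
       TLOG/PULL followers would poll the leader every 30000 / 2 = 15000 ms. -->
  <autoCommit>
    <maxTime>30000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Only consulted for the poll interval when autoCommit maxTime is absent;
       in that case followers would poll every 60000 / 2 = 30000 ms. -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>

  <!-- With neither maxTime defined, the poll interval falls back to 3 seconds. -->
</updateHandler>
```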

If a TLOG replica is the leader, it indexes locally and appends the doc to its
transaction log, as an NRT replica would, and it synchronously
replicates the data to the other TLOG replicas' transaction logs (PULL replicas
don't have transaction logs). But TLOG/PULL replicas don't support soft
commits or real-time gets, afaik.

>

> <autoSoftCommit>
>   <maxTime>60000</maxTime>
> </autoSoftCommit>
>
> Should we say that in autoCommit section openSearcher is always true in
> that case?
>
> <autoCommit>
>   <maxDocs>10000</maxDocs>
>   <maxTime>30000</maxTime>
>   <maxSize>512m</maxSize>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> Does it mean that new Searcher always starts on all replicas when hard
> commit happens on leader?


Nope. Or at least, the searcher is not created synchronously. Each non-leader
replica periodically fetches the index changes from the leader
and opens a new searcher to reflect those changes, as seen here:
https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
Note, though, that there is a potential delay between the leader's
hard commit and the other replicas fetching those changes
and opening a new searcher to reflect them.

PS: I am still digging into these new replica types, so I may have misunderstood
or missed some aspect of them.

Regards,
Edward

RE: Soft commit and new replica types

Vadim Ivanov
Thanks for the clues, Edward.
What bothers me is the new searcher start, warming, cache clearing... all that CPU-consuming stuff in my heavy-indexing scenario.
With NRT I had autoSoftCommit:   <maxTime>300000</maxTime>.
So I had a new searcher no more than once every 5 min on every replica.
To get more or less the same effect with a TLOG - PULL collection,
I suppose I have to set <autoCommit> :   <maxTime>300000</maxTime>
(yes, I understand that new searchers start asynchronously on the leader and the replicas).
Am I right?
--
Vadim



Re: Soft commit and new replica types

Erick Erickson
Not quite, 600000. The polling interval is half the commit interval....

This has always bothered me a little bit; I wonder about the utility of a
config param for it. We already have old-style replication with a
configurable polling interval. Under very heavy indexing loads, it
seems to me that either the tlogs will grow quite large or we'll be
pulling a lot of unnecessary segments across the wire, segments
that'll soon be merged away and the merged segment re-pulled.

Apparently, though, nobody's seen this "in the wild", so it's
theoretical at this point.

Re: Soft commit and new replica types

Vadim Ivanov

If the hard commit max time is 300 sec, then a commit happens every 300 sec on the TLOG leader, and new segments appear on the leader every 300 sec during indexing. The polling interval on the other replicas is 150 sec, but not every poll attempt fetches a new segment from the leader, afaiu. Erick, do you mean that on all the other TLOG replicas (non-leaders) a commit occurs on every poll?

Re: Soft commit and new replica types

Edward Ribeiro
Hi Vadim,

There is no commit on TLOG/PULL follower replicas, only on the leader.
Followers fetch the segments and **reload the core** every 150 seconds (if
there are new segments, I suppose). Yeah, followers don't pay the CPU
price of indexing, but there are still cache invalidation, autowarming,
etc., in addition to network and IO demand. Is that right, Erick?

Besides that, Erick is pointing out that under a heavy indexing workload
you could have either:

1. Very large transaction logs;

2. Very large numbers of segments. If that is the case, you could hit the
following scenario numerous times:
   2.1. follower replica downloads segments A and B from the leader;
   2.2. leader merges segments A + B into C;
   2.3. follower replicas discard A and B and download C on the next poll.

In the second case, the followers needlessly download segments that
will soon be merged anyway.

IMO, you should carefully evaluate whether TLOG/PULL really
suits your cluster setup and your indexing and querying workload.
You can perfectly well stay with an NRT setup if it suits you better. The videos
below provide a nice set of hints on when to choose NRT or some
combination of TLOG and PULL.

https://youtu.be/XIb8X3MwVKc

https://youtu.be/dkWy2ykzAv0

https://youtu.be/XqfTjd9KDWU

Regards,
Edward


Re: Soft commit and new replica types

Erick Erickson
> but not every poll attempt they fetch new segment from the leader

Ah, right. Ignore my comment. Commit will only occur on the followers
when there are new segments to pull down, so you're right: roughly
every second poll would find things to bring down and open a
new searcher.
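
Putting the numbers from this exchange together, here is a sketch of the setting Vadim proposed, with the timing annotated. The figures follow the reasoning above and are estimates, not measurements:

```xml
<!-- Sketch for a TLOG/PULL collection; timing comments are estimates. -->
<autoCommit>
  <!-- Leader hard-commits (and writes new segments) every 300 s while indexing. -->
  <maxTime>300000</maxTime>
</autoCommit>
<!-- Followers poll every 300000 / 2 = 150000 ms, but only a poll that finds
     new segments fetches them and opens a new searcher, so that happens
     roughly every second poll, i.e. about every 5 minutes. -->
```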

Re: Soft commit and new replica types

Tomás Fernández Löbbe
I think this is a good point. The tricky part is that if TLOG replicas
don't replicate often, their transaction logs will get too big too, so you
want the replication interval of TLOG replicas to be tied to the
auto(hard)Commit interval (by default at least). If you are using them for
search, you may also not want to open a searcher for each fetch... For PULL
replicas, maybe the best way is to use the autoSoftCommit interval to
define the polling interval. That said, I'm not sure using different
configurations is a good idea; some people may be mixing TLOG and PULL and
querying them both alike.

In the meantime, if you have different hosts for TLOG and PULL replicas,
one workaround is to define the autoCommit time with a system
property, and set it differently on TLOG vs. PULL nodes.
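
A minimal sketch of that workaround, assuming the property name solr.autoCommit.maxTime (the one used in Solr's stock solrconfig.xml) and purely illustrative values; the per-host values would be passed as JVM system properties at startup:

```xml
<!-- Shared solrconfig.xml: the commit interval comes from a system property,
     with a fallback default after the colon. -->
<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<!-- Hypothetical startup settings, one per host type:
       hosts with TLOG replicas:  -Dsolr.autoCommit.maxTime=60000
       hosts with PULL replicas:  -Dsolr.autoCommit.maxTime=600000
     Each replica then derives its poll interval from the value on its own node. -->
```

This only works cleanly when TLOG and PULL replicas never share a host, since the property applies to the whole node.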

> There is no commit on TLOG/PULL  follower replicas, only on the leader.
> Followers fetch the segments and **reload the core** every 150 seconds

Edward, "reload" shouldn't really happen in regular TLOG/PULL fetches. Are
you seeing reloads?


Re: Soft commit and new replica types

Edward Ribeiro
Hi Tomás,

No, I am not seeing reloads. I am trying to understand how hard commits,
soft commits, and transaction log updates interact in a TLOG cluster, for
both leader and follower replicas. For example, after getting new segments
from the leader, will the follower replica still apply the hard/soft
commit?

PS: congratulations on the Berlin Buzzwords' talk. :)

Thanks!


RE: Soft commit and new replica types

Vadim Ivanov
bq. after getting new segments from the leader the follower replica will still apply the hard/soft commit?
As described in one of the videos linked earlier in this thread, a follower TLOG replica looks for the max docid in the newly received segments
and purges older records from its transaction log. Then it starts a new searcher (which might loosely be called a soft commit).
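
A minimal sketch of the purge step described above, as a toy model rather than Solr code. It assumes each transaction log entry carries a monotonically increasing sequence number standing in for what the post calls "max docid"; the function name `purge_tlog` and the data shapes are hypothetical.

```python
def purge_tlog(tlog, fetched_segments):
    """Return the tlog entries NOT yet covered by the fetched segments.

    tlog: list of (seq, doc_id) tuples, oldest first (hypothetical shape).
    fetched_segments: list of segments, each a list of sequence numbers
    for the documents that segment contains (hypothetical shape).
    """
    covered = [seq for segment in fetched_segments for seq in segment]
    if not covered:
        # Nothing was fetched on this poll, so nothing can be purged.
        return list(tlog)
    max_seq = max(covered)
    # Keep only records newer than everything in the replicated index.
    return [(seq, doc_id) for seq, doc_id in tlog if seq > max_seq]


tlog = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
remaining = purge_tlog(tlog, [[1, 2], [3]])
```

In this model the follower would open a new searcher over the fetched segments after the purge; the record with sequence number 4 stays in the log because no replicated segment covers it yet.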
--
Vadim





Re: Soft commit and new replica types

Tomás Fernández Löbbe
> >
> > No, I am not seeing reloads.

Ah, good.


> > I am trying to understand the interactions
> > between hard commit, soft commit, transaction log update with a TLOG
> > cluster for both leader and follower replicas. For example, after getting
> > new segments from the leader the follower replica will still apply the
> > hard/soft commit?
>

Think about the hard commit as a flush of the latest updates to a segment
plus a checkpoint pointing to all the currently valid segments. That
checkpoint is also a file. The soft commit is similar to the hard commit in
the sense that it creates a segment and a pointer to the valid segments;
however, those segments may not be flushed to disk yet, and the checkpoint
is not in a file. *In addition* to creating segments, the commits in Solr
create searchers to get the latest view of the index (hard commits only when
openSearcher=true, soft commits always), but that doesn't really matter
in the context of replication.

The follower replica (a TLOG/PULL) will ask the leader for the last hard
commit and replicate all the segments and the file indicating the commit.
All the TLOG/PULL replica does after it replicates is open a searcher with
all the segments in that checkpoint. Two important notes here: 1) the
follower replica doesn't "perform" a commit, it copies it from the leader,
and 2) this "open a searcher" is not a soft/hard commit, it is just opening
a searcher (a "commit" usually involves creating segments).

* If in the leader (a TLOG replica) you do a soft commit, it'll never make
it to the follower, because the follower only replicates the latest hard
commit (see ReplicationHandler.indexCommitPoint).
* If in the follower (a TLOG replica) you do a soft commit, it won't make
any difference, because in the TLOG case, documents are not added to the
index (only to the transaction log). (See the UpdateCommand.IGNORE_INDEXWRITER
flag.)
* If in the follower (a PULL replica) you do a soft commit, it also won't
make any difference, because it doesn't receive the documents anyway (it
only replicates). The commit is skipped anyway (see
DistributedUpdateProcessor.processCommit).
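
The behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Solr code: the classes `Leader` and `Follower` and their methods are hypothetical, documents stand in for segments, and a tuple stands in for the commit checkpoint file.

```python
class Leader:
    """Toy TLOG leader: indexes and commits the way an NRT replica would."""

    def __init__(self):
        self.indexed = []        # every document indexed so far
        self.commit_point = ()   # snapshot written by the last hard commit
        self.searcher_view = ()  # what the leader's current searcher sees

    def add(self, doc):
        self.indexed.append(doc)

    def soft_commit(self):
        # Opens a searcher over everything indexed; writes no checkpoint.
        self.searcher_view = tuple(self.indexed)

    def hard_commit(self, open_searcher=False):
        # Writes the checkpoint; opens a searcher only if asked to.
        self.commit_point = tuple(self.indexed)
        if open_searcher:
            self.searcher_view = self.commit_point


class Follower:
    """Toy TLOG/PULL follower: copies the leader's last hard commit."""

    def __init__(self):
        self.searcher_view = ()

    def poll(self, leader):
        # Nothing new in the leader's last hard commit: no new searcher.
        if leader.commit_point == self.searcher_view:
            return False
        # Copy the commit and open a searcher on it; no commit is performed.
        self.searcher_view = leader.commit_point
        return True

    def soft_commit(self):
        # No effect: the follower does not add documents to its own index.
        pass


leader, follower = Leader(), Follower()
leader.add("doc1")
leader.soft_commit()                    # visible on the leader only
polled_after_soft = follower.poll(leader)
leader.hard_commit()                    # openSearcher=false on the leader
polled_after_hard = follower.poll(leader)
```

In this model a soft commit on the leader changes only the leader's own view, the follower picks up a document only after a hard commit, and a poll that finds no new commit opens no searcher.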

The transaction log is only used for recovery purposes (or realtime get).

I hope that clarifies things.

>
> > PS: congratulations on the Berlin Buzzwords' talk. :)
>
Thanks!

> >
> > Thanks!
> >
> > On Mon, Dec 10, 2018 at 9:24 PM Tomás Fernández Löbbe
> > <[hidden email]>
> > wrote:
> >
> > > I think this is a good point. The tricky part is that if TLOG replicas
> > > don't replicate often, their transaction logs will get too big too, so
> you
> > > want the replication interval of TLOG replicas to be tied to the
> > > auto(hard)Commit interval (by default at least). If you are using them
> for
> > > search, you may also not want to open a searcher for each fetch... for
> PULL
> > > replicas, maybe the best way is to use the autoSoftCommit interval to
> > > define the polling interval. That said, I'm not sure using different
> > > configurations is a good idea, some people may be mixing TLOG and PULL
> > and
> > > querying them both alike.
> > >
> > > In the meantime, if you have different hosts for TLOG and PULL
> replicas,
> > > one workaround you can have is to define the autoCommit time with a
> > system
> > > property, and use different properties for TLOGs vs PULL nodes.
> > >
> > > > There is no commit on TLOG/PULL  follower replicas, only on the
> leader.
> > > > Followers fetch the segments and **reload the core** every 150
> seconds
> > >
> > > Edward, "reload" shouldn't really happen in regular TLOG/PULL fetches.
> Are
> > > you seeing reloads?
> > >
> > > On Mon, Dec 10, 2018 at 4:41 PM Erick Erickson <
> [hidden email]>
> > > wrote:
> > >
> > > > bq. but not every poll attempt they fetch new segment from the leader
> > > >
> > > > Ah, right. Ignore my comment. Commit will only occur on the followers
> > > > when there are new segments to pull down, so your'e right, roughly
> > > > every second poll would commit find things to bring down and open a
> > > > new searcher.........
> > > > On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro
> > <[hidden email]>
> > > > wrote:
> > > > >
> > > > > Hi Vadim,
> > > > >
> > > > > There is no commit on TLOG/PULL  follower replicas, only on the
> leader.
> > > > > Followers fetch the segments and **reload the core** every 150
> seconds
> > > > (if
> > > > > there were new segments, I suppose). Yeah, followers don't pay the
> CPU
> > > > > price of indexing, but there are still cache invalidation,
> autowarming,
> > > > > etc, in addition to network and IO demand. Is that ritht, Erick?
> > > > >
> > > > > Besides that, Erick is pointing out that under a heavy indexing
> > > workload
> > > > > you could either have:
> > > > >
> > > > > 1. Very large transaction logs;
> > > > >
> > > > > 2. Very large numbers of segments. If that is the case, you could
> have
> > > > the
> > > > > following scenario numerous times:
> > > > >    2.1. follower replica downloads segment A and B from leader;
> > > > >    2.2 leader merges segments A + B into C;
> > > > >    2.3. follower replicas discard A and B and download C on next
> poll;
> > > > >
> > > > > Under the second condition followers needlessly downloaded segments
> > > that
> > > > > would eventually be merged.
> > > > >
> > > > > IMO, you should carefully evaluate if the use of TLOG/PULL is
> really
> > > > > recommended for your cluster setup, plus indexing and querying
> > > workload.
> > > > > You can very much stay with a NRT setup if it suits you better. The
> > > > videos
> > > > > below provide a nice set of hints for when to choose between NRT or
> > > some
> > > > > combination of TLOG and PULL.
> > > > >
> > > > > https://youtu.be/XIb8X3MwVKc
> > > > >
> > > > > https://youtu.be/dkWy2ykzAv0
> > > > >
> > > > > https://youtu.be/XqfTjd9KDWU
> > > > >
> > > > > Regards,
> > > > > Edward
> > > > >
> > > > > Em dom, 9 de dez de 2018 16:56, <[hidden email]
> > > > escreveu:
> > > > >
> > > > > >
> > > > > >  If hard commit max time is 300 sec then commit happens every 300
> > sec
> > > > on
> > > > > > tlog leader. And new segments pop up on the leader every 300 sec,
> > > > during
> > > > > > indexing. Polling interval on other replicas 150 sec, but not
> every
> > > > poll
> > > > > > attempt they fetch new segment from the leader, afaiu. Erick, do
> you
> > > > mean
> > > > > > that on all other  tlog replicas(not leaders) commit occurs every
> > > poll?
> > > > > > воскресенье, 09 декабря 2018г., 19:21 +03:00 от Erick Erickson
> > > > > > [hidden email] :
> > > > > >
> > > > > > >Not quite, 600000. The polling interval is half the commit
> > > > interval....
> > > > > > >
> > > > > > >This has always bothered me a little bit, I wonder at the
> utility
> > > of a
> > > > > > >config param. We already have old-style replication with a
> > > > > > >configurable polling interval. Under very heavy indexing loads,
> it
> > > > > > >seems to me that either the tlogs will grow quite large or
> we'll be
> > > > > > >pulling a lot of unnecessary segments across the wire, segments
> > > > > > >that'll soon be merged away and the merged segment re-pulled.
> > > > > > >
> > > > > > >Apparently, though, nobody's seen this "in the wild", so it's
> > > > > > >theoretical at this point.
> > > > > > >On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
> > > > > > < [hidden email]> wrote:
> > > > > > >
> > > > > > > Thanks, Edward, for the clues.
> > > > > > > What bothers me is the newSearcher start, warming, cache
> > > > > > > clearing... all that CPU-consuming stuff in my heavy-indexing
> > > > > > > scenario.
> > > > > > > With NRT I had autoSoftCommit maxTime: 300000.
> > > > > > > So I had a new Searcher no more often than every 5 min on every
> > > > > > > replica.
> > > > > > > To have more or less the same effect with a TLOG - PULL
> > > > > > > collection, I suppose I have to have autoCommit maxTime: 300000
> > > > > > > (yes, I understand that newSearchers start asynchronously on the
> > > > > > > leader and replicas).
> > > > > > > Am I right?
> > > > > > > --
> > > > > > > Vadim
> > > > > > >
> > > > > > >
> > > > > > >> -----Original Message-----
> > > > > > >> From: Edward Ribeiro [mailto:[hidden email]]
> > > > > > >> Sent: Sunday, December 09, 2018 12:42 AM
> > > > > > >> To:  [hidden email]
> > > > > > >> Subject: Re: Soft commit and new replica types
> > > > > > >>
> > > > > > >> Some insights in the new replica types below:
> > > > > > >>
> > > > > > >> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> > > > > > >> [hidden email] wrote:
> > > > > > >>
> > > > > > >>>
> > > > > > >>> From Ref guide we have:
> > > > > > >>> "NRT is the only type of replica that supports soft-commits..."
> > > > > > >>> "If TLOG replica does become a leader, it will behave the same
> > > > > > >>> as if it was a NRT type of replica."
> > > > > > >>> Does it mean that if we do not have NRT replicas in the
> > > > > > >>> cluster, then the autoSoftCommit section in solrconfig.xml is
> > > > > > >>> ignored completely (even on a TLOG leader)?
> > > > > > >>>
> > > > > > >>
> > > > > > >> No, not completely. Both TLOG and PULL nodes will periodically
> > > > > > >> poll the leader for changes in the index segment files and
> > > > > > >> download those segments from the leader. If the hard commit max
> > > > > > >> time is defined in solrconfig.xml, the polling interval of each
> > > > > > >> replica will be half that value. Otherwise, if the soft commit
> > > > > > >> max time is defined, the replicas will use half the soft commit
> > > > > > >> max time as the interval. If neither is defined, the poll
> > > > > > >> interval will be 3 seconds (hard coded). See here:
> > > > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
> > > > > > >>
> > > > > > >> If the TLOG replica is the leader, it will index locally and
> > > > > > >> append the doc to the transaction log as an NRT node would, and
> > > > > > >> it will synchronously replicate the data to the other TLOG
> > > > > > >> replicas' transaction logs (PULL nodes don't have transaction
> > > > > > >> logs). But TLOG/PULL replicas don't support soft commits or
> > > > > > >> real-time gets, afaik.
> > > > > > >>
> > > > > > >>> <autoSoftCommit>
> > > > > > >>>   <maxTime>60000</maxTime>
> > > > > > >>> </autoSoftCommit>
> > > > > > >>>
> > > > > > >>> Should we say that in autoCommit section openSearcher is
> > > > > > >>> always true in that case?
> > > > > > >>>
> > > > > > >>> <autoCommit>
> > > > > > >>>   <maxDocs>10000</maxDocs>
> > > > > > >>>   <maxTime>30000</maxTime>
> > > > > > >>>   <maxSize>512m</maxSize>
> > > > > > >>>   <openSearcher>false</openSearcher>
> > > > > > >>> </autoCommit>
> > > > > > >>>
> > > > > > >>> Does it mean that new Searcher always starts on all replicas
> > > > > > >>> when hard commit happens on leader?
> > > > > > >>
> > > > > > >>
> > > > > > >> Nope. Or at least, the searcher is not created synchronously.
> > > > > > >> Each non-leader replica will periodically fetch the index
> > > > > > >> changes from the leader and open a new searcher to reflect
> > > > > > >> those changes, as seen here:
> > > > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
> > > > > > >> But it's important to note the potential delay between the
> > > > > > >> leader's hard commit and the other replicas fetching those
> > > > > > >> changes from the leader and opening a new searcher to reflect
> > > > > > >> the latest changes.
> > > > > > >>
> > > > > > >> PS: I am still digging into these new replica types, so I may
> > > > > > >> have misunderstood or missed some aspect of them.
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >> Edward
> > > > > > >
> > > > > >
> > > >
> > >
>
>
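
As a rough illustration of the polling rule quoted in this thread (half the hard commit max time, otherwise half the soft commit max time, otherwise 3 seconds), here is a minimal Python sketch. The function name and the use of None for "not configured" are assumptions made for this example; it only restates the rule as described in the thread and is not Solr's actual code, which lives in ReplicateFromLeader.java.

```python
from typing import Optional

# Fallback quoted in the thread: 3 seconds when neither commit time is set.
DEFAULT_POLL_MS = 3_000


def poll_interval_ms(auto_commit_max_time_ms: Optional[int],
                     auto_soft_commit_max_time_ms: Optional[int]) -> int:
    """Return the follower polling interval in milliseconds.

    None stands for "maxTime not defined in solrconfig.xml" (an assumption
    of this sketch, not Solr's internal representation).
    """
    if auto_commit_max_time_ms is not None:
        # The hard commit interval takes precedence when it is defined.
        return auto_commit_max_time_ms // 2
    if auto_soft_commit_max_time_ms is not None:
        # Otherwise fall back to the soft commit interval.
        return auto_soft_commit_max_time_ms // 2
    return DEFAULT_POLL_MS


if __name__ == "__main__":
    print(poll_interval_ms(300_000, None))   # hard commit every 300 s
    print(poll_interval_ms(600_000, None))   # the 600000 suggested above
    print(poll_interval_ms(None, None))      # nothing configured
```

Under this rule, a hard commit max time of 300000 ms gives a 150-second poll, and 600000 ms is what it takes to poll every 5 minutes.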

Re: Soft commit and new replica types

Edward Ribeiro
Indeed! It clarified a lot, thank you. :) Now I know I was wrong about the
core reload, but the other aspects were more or less what I had been
expecting.

Do you think it's worth submitting a PR to the Reference Guide with those
explanations? I can take a stab at it.

Regards,
Edward

On Fri, Dec 14, 2018 at 3:08 AM Tomás Fernández Löbbe <[hidden email]>
wrote:

> > >
> > > No, I am not seeing reloads.
>
> Ah, good.
>
>
> > > I am trying to understand the interactions between hard commit, soft
> > > commit, and transaction log updates in a TLOG cluster, for both leader
> > > and follower replicas. For example, after getting new segments from
> > > the leader, will the follower replica still apply the hard/soft
> > > commit?
> >
>
> Think of the hard commit as a flush of the latest updates to a segment,
> plus a checkpoint pointing to all the currently valid segments. That
> checkpoint is also a file. The soft commit is similar to the hard commit
> in the sense that it creates a segment and a pointer to the valid
> segments; however, those segments may not be flushed to disk yet, and the
> checkpoint is not in a file. *In addition* to creating segments, commits
> in Solr create searchers to get the latest view of the index (hard commits
> only when openSearcher=true, soft commits always), but that doesn't really
> matter in the context of replication.
>
> The follower replica (TLOG or PULL) will ask the leader for the last hard
> commit and replicate all the segments and the file indicating the commit.
> All the TLOG/PULL replica does after it replicates is open a searcher with
> all the segments in that checkpoint. Two important notes here: 1) the
> follower replica doesn't "perform" a commit, it copies it from the leader,
> and 2) this "open a searcher" is not a soft/hard commit, it is just
> opening a searcher (a "commit" usually involves creating segments).
>
> * If you do a soft commit on the leader (a TLOG replica), it'll never make
> it to the follower, because the follower only replicates the latest hard
> commit (see ReplicationHandler.indexCommitPoint).
> * If you do a soft commit on the follower (a TLOG replica), it won't make
> any difference, because in the TLOG case documents are not added to the
> index (only to the transaction log). (See the
> UpdateCommand.IGNORE_INDEXWRITER flag.)
> * If you do a soft commit on the follower (a PULL replica), it also won't
> make any difference, because it doesn't receive the documents anyway (it
> only replicates). The commit is skipped anyway (see
> DistributedUpdateProcessor.processCommit).
>
> The transaction log is only used for recovery purposes (or realtime get).
>
> I hope that clarifies things.
>
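
The replication behaviour described above can be pictured with a toy model: the follower copies the leader's latest hard commit point and opens a searcher only when that point has changed, and leader-side soft commits never reach it. The sketch below is a simplified Python simulation under stated assumptions; the Leader and Follower classes, the generation counter, and the one-second tick loop are all hypothetical and do not correspond to Solr classes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Leader:
    hard_commit_gen: int = 0  # generation of the latest hard commit point
    soft_commits: int = 0     # soft commits stay local to the leader

    def hard_commit(self) -> None:
        self.hard_commit_gen += 1

    def soft_commit(self) -> None:
        self.soft_commits += 1


@dataclass
class Follower:
    replicated_gen: int = 0
    searchers_opened: int = 0

    def poll(self, leader: Leader) -> bool:
        """Copy the leader's latest hard commit point if it is new."""
        if leader.hard_commit_gen == self.replicated_gen:
            return False             # nothing new: no fetch, no searcher
        self.replicated_gen = leader.hard_commit_gen
        self.searchers_opened += 1   # opens a searcher; this is not a commit
        return True


def simulate(total_s: int, commit_every_s: int,
             poll_every_s: int) -> Tuple[int, int]:
    """Return (number of polls, number of searchers opened on the follower)."""
    leader, follower = Leader(), Follower()
    polls = 0
    for t in range(1, total_s + 1):
        if t % commit_every_s == 0:
            leader.hard_commit()     # the commit is applied before the poll
        if t % poll_every_s == 0:
            polls += 1
            follower.poll(leader)
    return polls, follower.searchers_opened


if __name__ == "__main__":
    print(simulate(1200, 300, 150))
```

With a 300-second hard commit and a 150-second poll, the model opens a searcher on every second poll, which matches the "roughly every second poll" estimate given elsewhere in this thread.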
> >
> > > PS: congratulations on the Berlin Buzzwords' talk. :)
> >
> Thanks!
>
> > >
> > > Thanks!
> > >
> > > On Mon, Dec 10, 2018 at 9:24 PM Tomás Fernández Löbbe
> > > <[hidden email]>
> > > wrote:
> > >
> > > > I think this is a good point. The tricky part is that if TLOG
> > > > replicas don't replicate often, their transaction logs will get too
> > > > big too, so you want the replication interval of TLOG replicas to be
> > > > tied to the auto(hard)Commit interval (by default at least). If you
> > > > are using them for search, you may also not want to open a searcher
> > > > for each fetch... For PULL replicas, maybe the best way is to use the
> > > > autoSoftCommit interval to define the polling interval. That said,
> > > > I'm not sure using different configurations is a good idea; some
> > > > people may be mixing TLOG and PULL and querying them both alike.
> > > >
> > > > In the meantime, if you have different hosts for TLOG and PULL
> > > > replicas, one workaround is to define the autoCommit time with a
> > > > system property, and use different properties for TLOG vs PULL nodes.
> > > >
> > > > > There is no commit on TLOG/PULL follower replicas, only on the
> > > > > leader. Followers fetch the segments and **reload the core** every
> > > > > 150 seconds
> > > >
> > > > Edward, "reload" shouldn't really happen in regular TLOG/PULL
> > > > fetches. Are you seeing reloads?
> > > >
> > > > On Mon, Dec 10, 2018 at 4:41 PM Erick Erickson <[hidden email]>
> > > > wrote:
> > > >
> > > > > bq. but not every poll attempt they fetch new segment from the
> > > > > leader
> > > > >
> > > > > Ah, right. Ignore my comment. Commit will only occur on the
> > > > > followers when there are new segments to pull down, so you're
> > > > > right, roughly every second poll would commit: find things to bring
> > > > > down and open a new searcher.........
> > > > > On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro <[hidden email]>
> > > > > wrote:
> > > > > >
> > > > > > Hi Vadim,
> > > > > >
> > > > > > There is no commit on TLOG/PULL follower replicas, only on the
> > > > > > leader. Followers fetch the segments and **reload the core** every
> > > > > > 150 seconds (if there were new segments, I suppose). Yeah,
> > > > > > followers don't pay the CPU price of indexing, but there is still
> > > > > > cache invalidation, autowarming, etc., in addition to network and
> > > > > > IO demand. Is that right, Erick?
> > > > > >
> > > > > > Besides that, Erick is pointing out that under a heavy indexing
> > > > > > workload you could have either:
> > > > > >
> > > > > > 1. Very large transaction logs;
> > > > > >
> > > > > > 2. Very large numbers of segments. If that is the case, you could
> > > > > > have the following scenario numerous times:
> > > > > >    2.1. follower replica downloads segments A and B from leader;
> > > > > >    2.2. leader merges segments A + B into C;
> > > > > >    2.3. follower replicas discard A and B and download C on the
> > > > > >         next poll;
> > > > > >
> > > > > > Under the second condition, followers needlessly download segments
> > > > > > that would eventually be merged.
> > > > > >
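
To put rough numbers on steps 2.1 to 2.3, the sketch below models a follower that syncs its local segment set to the leader's and counts the bytes it downloads. It is a minimal Python illustration under stated assumptions: the fetch function, the segment names, and the sizes (100, 100, 190) are hypothetical, and real Solr replication works on index files rather than a name-to-size dictionary.

```python
from typing import Dict, Set


def fetch(leader_segments: Dict[str, int], local: Set[str]) -> int:
    """Sync `local` to the leader's segment set; return bytes downloaded."""
    missing = set(leader_segments) - local
    downloaded = sum(leader_segments[name] for name in missing)
    local.intersection_update(leader_segments)  # drop segments merged away
    local.update(missing)
    return downloaded


if __name__ == "__main__":
    # Hypothetical sizes: A and B are 100 bytes each, merged C is 190 bytes.
    eager: Set[str] = set()
    total = fetch({"A": 100, "B": 100}, eager)  # 2.1: download A and B
    total += fetch({"C": 190}, eager)           # 2.3: discard A, B; get C
    late: Set[str] = set()
    only_c = fetch({"C": 190}, late)            # a poll after the merge
    print(total, only_c)
```

In this toy case the follower that polled before the merge transfers 390 bytes, against 190 for one that polled only after the merge; the extra 200 bytes are the "needless" download mentioned above.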
> > > > > > IMO, you should carefully evaluate whether the use of TLOG/PULL is
> > > > > > really recommended for your cluster setup, plus indexing and
> > > > > > querying workload.

Re: Soft commit and new replica types

Tomás Fernández Löbbe
Yes, that would be great.

Thanks

On Fri, Dec 14, 2018 at 5:38 PM Edward Ribeiro <[hidden email]>
wrote:

> Indeed! It clarified a lot, thank you. :) Now I know I was wrong about the
> core reload, but the other aspects were more or less what I had been
> expecting.
>
> Do you think it's worth submitting a PR to the Reference Guide with those
> explanations? I can take a stab at it.
>
> Regards,
> Edward