check softCommit , autocommit and hard commit count

Puppy Linux Distros
Hi,

I am trying to calculate the total number of soft commits, autocommits and
hard commits from the Solr logs. Can you please check whether the commands
below are correct?

Please also let me know how to find the total number of soft commits, hard
commits and autocommits from the logs.


1. totalcommit=`cat $solrlogfile | grep "start commit" | wc -l`

totalcommit = 41906


2. totalsoftcommit=`cat $solrlogfile | grep "start commit" | grep "softCommit=true" | wc -l`

totalsoftcommit = 921


3. totalhardcommits=`cat $solrlogfile | grep "start commit" | grep "softCommit=false" | grep "openSearcher=true" | wc -l`

totalhardcommits = 40982

4. totalautocommit=`cat $solrlogfile | grep "realtime" | wc -l`

totalautocommit = 3



When I did a soft commit, I could see an autocommit triggered after 15 min.
There are 921 soft commits in the logs, so there should be an equal number of
autocommits. I can see only 3 autocommits in the logs. Is that because a hard
commit was triggered immediately after each soft commit?

--
Regards,

Vivek CV

Re: check softCommit , autocommit and hard commit count

Shawn Heisey-2
On 11/30/2017 4:36 AM, Puppy Linux Distros wrote:

> I am trying to calculate the total number of softCommit , autocommit and
> hard commit from the solr logs. Can you please check whether the below
> commands are correct ?
>
> Let me know how to find the total softcommit, hardcommit and autocommit
> from the logs.
>
>
> 1. totalcommit=`cat $solrlogfile | grep "start commit" | wc -l`
>
> totalcommit = 41906
>
>
> 2. totalsoftcommit=`cat $solrlogfile | grep "start commit" | grep "softCommit=true" | wc -l`
>
> totalsoftcommit = 921

These look reasonable ... but be aware that the default logging config
will roll the solr.log file to a new empty file when it reaches 4
megabytes, which doesn't really take that long on a busy server, so if
you're only looking at "solr.log" you may have an incomplete picture.  I
personally change the roll size limit to 4 gigabytes so solr.log covers
a lot more time.

Solr restarts will *also* roll/archive logfiles, so you probably can't
just look through every file in the logs directory that starts with
"solr.log" -- it may be difficult to figure out exactly which files
apply to the current running instance.  It might turn out that I'm
completely wrong in that statement -- I haven't confirmed exactly what a
Solr restart actually does with the logfiles.
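
As a rough sketch of counting across rolled files, assuming every relevant file sits in one directory and its name starts with "solr.log": the scratch directory and the two sample log lines below are made up for illustration, and whether all matching files belong to the running instance is exactly the open question above.

```shell
# Hypothetical setup: a scratch directory holding a current log and one rolled log.
logdir=$(mktemp -d)
printf '%s\n' 'INFO  start commit{softCommit=false}' > "$logdir/solr.log"
printf '%s\n' 'INFO  start commit{softCommit=true}' 'INFO  end_commit_flush' > "$logdir/solr.log.1"

# Concatenate every file whose name starts with solr.log and count commit starts.
totalcommit=$(cat "$logdir"/solr.log* | grep -c "start commit")
echo "totalcommit=$totalcommit"

# Clean up the scratch directory.
rm -rf "$logdir"
```

Pointing the glob at the real logs directory instead of the scratch one gives a count over the current and rolled files together.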

> 3. totalhardcommits=`cat $solrlogfile | grep "start commit" | grep "softCommit=false" | grep "openSearcher=true" | wc -l`
>
> totalhardcommits = 40982

If you have configured autoCommit in solrconfig.xml and have set
openSearcher to false in that config, then there will be hard commits
that *don't* open a new searcher, so the "openSearcher=true" part will
not catch those commits.  Example configs in recent versions have
autoCommit set up this way, and this is the recommended config for
*everybody*.  The default autoCommit interval in the example configs is
15 seconds, which I think is a little too aggressive, but this kind of
commit is typically very fast, so I've never seen that config cause
problems.
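
A small sketch of what that means for the counts, with two caveats: the sample log lines are illustrative rather than copied from a real server, and a hard commit logged with openSearcher=false is only a candidate autoCommit, since a client can also send an explicit commit with that flag. The first part just subtracts the figures reported in this thread; the leftover is consistent with a few hard commits that did not open a searcher, but the thread does not confirm that.

```shell
# Figures reported in the thread.
reported_total=41906
reported_soft=921
reported_hard_open=40982   # hard commits that also matched openSearcher=true

# Commits matched by neither the soft filter nor the hard+openSearcher=true filter.
unaccounted=$((reported_total - reported_soft - reported_hard_open))
echo "unaccounted=$unaccounted"

# Illustrative sample log (made-up lines, not taken from a real server).
solrlogfile=$(mktemp)
cat > "$solrlogfile" <<'EOF'
INFO  start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}
INFO  end_commit_flush
EOF

# Keep only commit starts, then split on the flags.
commits=$(grep "start commit" "$solrlogfile")
soft=$(printf '%s\n' "$commits" | grep -c "softCommit=true")
hard=$(printf '%s\n' "$commits" | grep -c "softCommit=false")
hard_no_searcher=$(printf '%s\n' "$commits" | grep "softCommit=false" | grep -c "openSearcher=false")
echo "soft=$soft hard=$hard hard_no_searcher=$hard_no_searcher"

rm -f "$solrlogfile"
```

Counting hard commits on softCommit=false alone, and treating openSearcher as a secondary split, avoids silently dropping the commits that the original third command misses.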

The example configs do not have autoSoftCommit configured.  If users
want to automatically do commits for visibility, we recommend that they
use autoSoftCommit.

> 4. totalautocommit=`cat $solrlogfile | grep "realtime" | wc -l`
>
> totalautocommit = 3

These aren't autoCommits.  They are new searchers for the realtime get
handler, which is capable of accessing documents that haven't been
committed yet.  In addition to the index on disk, it searches the
transaction logs.  Opening a new realtime searcher should be very fast,
and they happen without any configuration. I'm not sure why you're only
seeing this happen three times here. Presumably in a log where there are
40000 total commits, you are doing a fair amount of indexing, so I would
have expected a new realtime searcher to have been created much more
frequently, even if there were no commits done at all.

Maybe the realtime get handler can use the standard searcher, and only
opens a new realtime searcher in cases where new documents have been
indexed but there hasn't been a recent commit that opens a new
searcher.  If that's the case, then I have no idea how long it would
wait before firing up a new realtime searcher.  I wouldn't expect that
to be very long ... so if your indexing/committing cycles are normally
very fast, maybe Solr doesn't feel it's necessary to open realtime
searchers very often.

Thanks,
Shawn


Re: check softCommit , autocommit and hard commit count

Puppy Linux Distros
Hello,

Thanks, Shawn. Can you provide a command to find the total number of
autocommits in solr.log?

--
Regards,

Vivek CV

Re: check softCommit , autocommit and hard commit count

Puppy Linux Distros
Hi,

Thanks, Shawn, for the help.

I think I should have added a few more details to my previous mail.

I know it's bad practice, but for various reasons our application fires
hard commits from code (on most /update requests) by invoking the /update
API with commit=true, and it very rarely uses soft commits. I will recommend
that the devs move towards more soft commits and make use of realtime
searchers in the future.

However, my current goal is to upgrade Solr to the latest version, 7.1.0, so
I need to measure the current traffic in Solr in order to make sensible
trade-offs on the new stack. The current stack is somewhat older, around
4.10, so I have to parse the current Solr logs.

We have our own log storage mechanism, so I have one month of solr.log
stored, and rotation/archiving isn't an issue here. Once I get hold of the
unique phrases that are logged for each type of commit (soft commit,
automatic hard commit, explicit hard commit), I can build some metrics of
the current traffic.

Our current stack still has the default autoCommit config, with
openSearcher=false and a 15s period. We currently don't have automatic soft
commits enabled; however, both soft commits and hard commits are invoked
explicitly from the application, so it's a bit hard to separate them in
solr.log unless I can find some unique phrases/regexes/words in the log
lines that each type of commit produces. Any input in this area would be
really helpful.

In addition, I just wanted to confirm: if there are no pending updates to be
written to disk, does autocommit really fire at its interval, or does it stay
idle when there is nothing to write? In other words, suppose I make a soft
commit at the 5th second and an explicit hard commit at the 10th second. Will
an autocommit still happen at the 15th second for no reason, given that the
hard commit at the 10th second has already written the changes to disk and
rebuilt the index? If autocommit stays idle in that case, it makes sense that
I see very few autocommit entries in the logs, since the application fires
hard commits very frequently.

Any help is appreciated.
Thanks in advance,

--
Regards,

Vivek CV

Re: check softCommit , autocommit and hard commit count

Erick Erickson
Neither commit does anything if no updates have been received.

But you don't need to wait for the devs to STOP DOING THAT ;). In
solrconfig.xml you can configure
IgnoreCommitOptimizeUpdateProcessorFactory, which makes Solr ignore
explicit commit and optimize requests sent by clients; see the ref guide.

Best,
Erick


Re: check softCommit , autocommit and hard commit count

Shawn Heisey-2
On 12/4/2017 1:53 AM, Puppy Linux Distros wrote:
> I know it's bad practice, but for various reasons our application fires
> hard commits from code (on most /update requests) by invoking the /update
> API with commit=true, and it very rarely uses soft commits. I will recommend
> that the devs move towards more soft commits and make use of realtime
> searchers in the future.

Anyone who AUTOMATICALLY says it's bad practice to send hard commits
doesn't fully understand all the mechanics.  It's true that soft commits
are recommended when you want to see index changes, but this is only
because they *MIGHT* be faster than hard commits.

It's actually the opening of the searcher that tends to be a performance
killer, and soft commits DO open a new searcher. There are situations in
which a soft commit is NOT any faster than a hard commit, but because it
MIGHT be faster, they are generally recommended.

There are plenty of users who never worry about the difference and
always use hard commits.  Most of the time these are long-time users who
got into Solr before version 4.0, when soft commits were introduced.

> In addition, I just wanted to confirm: if there are no pending updates to be
> written to disk, does autocommit really fire at its interval, or does it stay
> idle when there is nothing to write? In other words, suppose I make a soft
> commit at the 5th second and an explicit hard commit at the 10th second. Will
> an autocommit still happen at the 15th second for no reason, given that the
> hard commit at the 10th second has already written the changes to disk and
> rebuilt the index? If autocommit stays idle in that case, it makes sense that
> I see very few autocommit entries in the logs, since the application fires
> hard commits very frequently.

As Erick said, if there have been no changes to the index, then *any*
kind of commit will do nothing.  The automatic commits don't fire if
there have been no changes to the index.  I do not know what happens in
the log when a commit is requested that does nothing.

Thanks,
Shawn