when transaction logs are closing?

Bernd Fehling
I'm trying to figure out when transaction logs are closing.
Unfortunately the docs and guides are not very clear about this.

I tried every combination of commits with waitSearcher true/false,
expungeDeletes true/false, and openSearcher true/false,
and also optimize with maxSegments=1.
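
For reference, the combinations described here can be enumerated with a short script. This is only a sketch: the host, port, and core name (`localhost:8983`, `mycore`) are placeholders, and the functions `commit_urls` and `optimize_url` are illustrative names. It builds the update request URLs without sending them.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are placeholders.
BASE_URL = "http://localhost:8983/solr/mycore/update"


def commit_urls(base_url=BASE_URL):
    """Build one hard-commit URL per combination of the three boolean options."""
    urls = []
    for wait, expunge, open_searcher in product(("true", "false"), repeat=3):
        params = {
            "commit": "true",
            "waitSearcher": wait,
            "expungeDeletes": expunge,
            "openSearcher": open_searcher,
        }
        urls.append(base_url + "?" + urlencode(params))
    return urls


def optimize_url(base_url=BASE_URL, max_segments=1):
    """Build the URL for an optimize down to max_segments segments."""
    params = {"optimize": "true", "maxSegments": max_segments}
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    for url in commit_urls() + [optimize_url()]:
        print(url)
```

Each printed URL could then be requested with any HTTP client; the three boolean options give eight commit variants plus the optimize call.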

The stats of my updateHandler say
transaction_logs_total_number:  2
transaction_logs_total_size:    59287641

Running "lsof | grep tlog" still reports many open tlog files.
There are actually only 2 tlog files, but each Java process has handles
open to them.

But why are they not closed, even after a hard commit and an optimize
with maxSegments=1?
There is no need to keep the tlogs open. Everything is flushed to disk,
optimized, all nodes are up, running and in sync.

Can someone explain the rules for when tlogs are closed?

Regards
Bernd
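
A note on measuring this: raw `lsof | grep tlog` line counts are hard to interpret, so it can help to group the output by file and by process. The sketch below parses the field output of `lsof -F pfn` (one field per line: `p` for the PID, `f` for the descriptor, `n` for the name). The function name `tlog_handles` and the `/tlog/` path filter are assumptions made for illustration, not anything Solr provides.

```python
from collections import defaultdict


def tlog_handles(lsof_field_output):
    """Map each tlog path to the set of (pid, fd) pairs that hold it open.

    Expects the text produced by `lsof -F pfn`, where every line is one field:
    `p<pid>` starts a process, `f<fd>` starts a file, `n<name>` is its path.
    """
    handles = defaultdict(set)
    pid = fd = None
    for line in lsof_field_output.splitlines():
        if not line:
            continue
        kind, value = line[0], line[1:]
        if kind == "p":
            pid, fd = int(value), None
        elif kind == "f":
            fd = value
        elif kind == "n" and "/tlog/" in value:
            # Assumption: tlog files live in a directory named "tlog".
            handles[value].add((pid, fd))
    return handles
```

The input could come from something like `subprocess.run(["lsof", "-F", "pfn"], capture_output=True, text=True).stdout`. Comparing `len(handles)` (distinct files) with the total number of `(pid, fd)` pairs shows whether many output rows really correspond to many distinct descriptors.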

Re: when transaction logs are closing?

Emir Arnautović
Hi Bernd,
I did not look at the code, but I would guess never. Solr tends to keep a file handle open for each file that it uses, and it keeps the last N transaction logs. The current transaction log file is flushed and a new one is created when you issue a hard commit, with or without opening a searcher. At that moment it will delete the oldest one, but the number of tlogs will remain the same.

Hope that someone more into transaction logs will jump in and correct me if I am wrong.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 9 Oct 2017, at 09:21, Bernd Fehling <[hidden email]> wrote:
>
> I'm trying to figure out when transaction logs are closing.


Re: when transaction logs are closing?

alessandro.benedetti
In addition to what Emir mentioned, when Solr opens a new transaction log
file it deletes the older ones, subject to some conditions:
keep at least N records [1] and at most K files [2].
These limits are specified in solrconfig.xml (in the update handler section),
so retention can be record-related, file-related, or both.
So, potentially, it may delete none of them.

This blog post from Erick explains it quite well [3].
If you would like to take a look at the code, this class should help [4].



[1]  <str name="numRecordsToKeep">${solr.ulog.numRecordsToKeep:100}</str>
[2]  <str name="maxNumLogsToKeep">${solr.ulog.maxNumLogsToKeep:10}</str>
[3]  https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
[4] org.apache.solr.update.UpdateLog
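
To make the two limits concrete, here is a small simulation of the rule as described in this post: "keep at least N records and at most K files". It is a simplified model, not a copy of the code in `org.apache.solr.update.UpdateLog`; the function name `roll_tlog` and the representation of each log as a plain record count are assumptions made for illustration. The defaults 100 and 10 are the ones shown in [1] and [2].

```python
from collections import deque


def roll_tlog(old_logs, finished_records,
              num_records_to_keep=100, max_num_logs_to_keep=10):
    """Model what happens to old tlogs when the current one is finished.

    old_logs is a deque of record counts, newest first. The finished log is
    added to it after old logs that are no longer needed have been dropped.
    """
    total = sum(old_logs) + finished_records
    while old_logs:
        oldest = old_logs[-1]
        enough_records_without_it = total - oldest >= num_records_to_keep
        too_many_files = len(old_logs) >= max_num_logs_to_keep
        if enough_records_without_it or too_many_files:
            total -= oldest
            old_logs.pop()  # the oldest log is no longer needed
        else:
            break
    old_logs.appendleft(finished_records)
    return old_logs


if __name__ == "__main__":
    logs = deque()
    for _ in range(6):
        roll_tlog(logs, 40)  # six hard commits, 40 records each
        print(list(logs))
```

With 40 records per commit the model settles at three retained logs, because two logs hold only 80 records, which is below the 100-record minimum. With one record per commit it grows to the 10-file cap and stays there.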




-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io

Re: when transaction logs are closing?

Bernd Fehling
Thanks a lot Alessandro and Emir.

On 9 Oct 2017 at 13:40, alessandro.benedetti wrote:

> In addition to what Emir mentioned, when Solr opens a new Transaction Log
> file it will delete the older ones up to some conditions :
> keep at least N number of records [1] and max K number of files[2].

Re: when transaction logs are closing?

Erick Erickson
bq: Actually there are only 2 tlog files but each java process has
handles open to tlog.

I'm a little confused by this. Each core in each Solr JVM should have
a handle to its _own_ tlog open. So if JVM1 has 10 cores (replicas),
I'd expect 10 different tlogs to be open, one for each core. But I
don't expect two processes to have a handle open to the _same_ tlog
unless something's horribly wrong.

And as Emir and Alessandro point out, every hard commit of any flavor
should close the current tlog (for the replica/core) and open a new
one. Soft commits have no effect.
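
One way to check the "one set of tlogs per core" expectation is to list the tlog files on disk for each core. The sketch below assumes the common layout `<solr_home>/<core>/data/tlog/tlog.<number>`; the update log directory is configurable, so this path pattern and the function name `tlog_files_by_core` are assumptions for illustration only.

```python
from pathlib import Path


def tlog_files_by_core(solr_home):
    """Return {core directory name: sorted tlog file names} under solr_home.

    Assumes the layout <solr_home>/<core>/data/tlog/tlog.<number>.
    """
    result = {}
    for tlog_dir in sorted(Path(solr_home).glob("*/data/tlog")):
        core = tlog_dir.parent.parent.name
        result[core] = sorted(
            p.name for p in tlog_dir.iterdir() if p.name.startswith("tlog.")
        )
    return result


if __name__ == "__main__":
    import sys

    for core, files in tlog_files_by_core(sys.argv[1]).items():
        print(core, len(files), files)
```

Run with the Solr home directory as the only argument. Each core should show only its own files, and the per-core counts can be compared with the `transaction_logs_total_number` statistic.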

Oh, and please don't expungeDeletes or optimize if you can possibly
help it, there'll be a blog coming soon on why this is A Bad Idea
unless you have a pattern where you periodically (I'm thinking daily)
update your index and optimize as part of your process.

Best,
Erick

P.S. The reference guide is freely editable. Well, actually they're
just files in asciidoc format. It'd be great if you wanted to edit
them. I use Atom to edit them as it's free. If you do feel moved to do
this, just raise a JIRA and add the diff as a patch. There's also an
IntelliJ plugin that will do. I'm heavily editing the Near Real Time
page...

On Mon, Oct 9, 2017 at 5:35 AM, Bernd Fehling
<[hidden email]> wrote:

> Thanks a lot Alessandro and Emir.

Re: when transaction logs are closing?

Bernd Fehling
Hi Erick,

the reason for asking "when are transaction logs closed" was that there are
2 tlog files but about 180 open file handles.
The second reason was the statement you also make: "...every hard commit of any
flavor should close the current tlog".
This is true in a sense, but I would expect "close the current tlog" to mean
that the tlog is physically closed and its file handles are released.
In fact, the current tlog is only logically closed
with its END_MESSAGE "SOLR_TLOG_END" and is _still_ physically open
because it is part of the current index, e.g. for RTS.
I think this is a point which confuses most users about tlogs.
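
The difference between "logically closed" and "physically closed" can be checked from outside the JVM. The sketch below looks for the marker in the tail of a tlog file. It assumes that the marker is stored as plain ASCII bytes close to the end of the file; that detail, the 64-byte window, and the function name `looks_logically_closed` are assumptions for illustration. Whether the file is still physically open is a separate question that only a tool such as lsof can answer.

```python
import os

# Marker name as quoted in the post above.
END_MESSAGE = b"SOLR_TLOG_END"


def looks_logically_closed(path, tail_bytes=64):
    """Return True if the end marker occurs in the last tail_bytes of the file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(0, size - tail_bytes))
        return END_MESSAGE in f.read()


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, looks_logically_closed(name))
```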

Yes, optimize is part of our weekly bulk load process.
The question still stands: why are tlog files not physically closed
after a full optimize down to 1 segment?
From my point of view (with a huge static index built once a week),
the optimize is a kind of "sledgehammer" to force all data physically
to disk, wipe out all deleted docs, and free any OS resources.

Regards
Bernd


On 9 Oct 2017 at 19:46, Erick Erickson wrote:

> bq: Actually there are only 2 tlog files but each java process has
> handles open to tlog.
>
> I'm a little confused by this. Each core in each Solr JVM should have
> a handle to it's _own_ tlog open. So if JVM1 has 10 cores (replicas),
> I'd expect 10 different tlogs to be open, one for each core. But I
> don't expect two processes to have a handle open to the _same_ tlog
> unless something's horribly wrong.

Re: when transaction logs are closing?

Erick Erickson
Bernd:

bq: ...having 2 tlog files but about 180 open file handles was the
point asking the question...

This doesn't make any sense to me. I can't imagine that what you
describe is intended behavior _if_ all those file handles are open on
tlogs.

I suspect that they _aren't_ open on tlogs, however. This may well be
an XY problem: you're asking about file handles and tlogs when
your real question is "why are there 180 open file handles?". It's
always possible you _did_ ask that and I missed it, of course ;)

Anyway, each and every file in your _index_ will have an open file
handle for the duration of an open searcher. Whenever you search,
random parts of the index have to be read (OK, through the magic of
MMapDirectory, but that's not relevant) and opening/closing files
would be A Bad Thing for search performance, so they're opened when
a searcher is opened and closed when it's closed.

Each segment can be made up of a bunch of files, so you can have
_0.tim, _0.fdt, _0.fdx and the like for any given segment, _1.tim,
_1.fdt etc. Is there a rough correlation between the total number of
files in the collective indexes for all your replicas and the number
reported? There are 10-15 files open per segment...
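
A quick way to test that correlation is to count the files in an index directory, grouped by segment. The sketch below groups file names by their leading segment name (`_0`, `_1`, ...) and puts everything else, such as `segments_N` and `write.lock`, under `(other)`. The regular expression and the function name `files_per_segment` are assumptions for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# A segment file starts with "_" plus an alphanumeric name, then "." or "_".
SEGMENT_PREFIX = re.compile(r"^(_[0-9a-z]+)[._]")


def files_per_segment(index_dir):
    """Count the files in index_dir, keyed by segment name."""
    counts = Counter()
    for entry in Path(index_dir).iterdir():
        if not entry.is_file():
            continue
        match = SEGMENT_PREFIX.match(entry.name)
        counts[match.group(1) if match else "(other)"] += 1
    return counts


if __name__ == "__main__":
    import sys

    counts = files_per_segment(sys.argv[1])
    for segment, n in sorted(counts.items()):
        print(segment, n)
    print("total", sum(counts.values()))
```

Summing the totals over the index directories of all replicas gives a number that can be compared directly with the open-handle count.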

Best,
Erick

On Tue, Oct 10, 2017 at 12:04 AM, Bernd Fehling
<[hidden email]> wrote:

> Hi Erik,
>
> the cause of having 2 tlog files but about 180 open file handles
> was the point asking the question "when transaction logs are closing".