CDCR - how to deal with the transaction log files

CDCR - how to deal with the transaction log files

Xie, Sean
Once CDCR is enabled, the update log stores an unlimited number of entries. This causes the tlog folder to grow and grow, and the number of open files grows along with it. How can one reduce the number of open files and the number of tlog files? If this isn't taken care of properly, sooner or later the log size and the open-file count will exceed the system limits.

Thanks
Sean



Re: CDCR - how to deal with the transaction log files

Erick Erickson
This should not be the case if you are actively sending updates to the
target cluster. The tlog is used to store unsent updates, so if the
connection is broken for some time, the target cluster will have a
chance to catch up.

If you don't have the remote DC online and do not intend to bring it
online soon, you should turn CDCR off.
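
For reference, turning CDCR off is just a call to the CDCR request handler on the source collection, something along these lines (host and collection name are placeholders):

  curl "http://source-host:8983/solr/MY_COLLECTION/cdcr?action=STOP"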

Best,
Erick


Re: CDCR - how to deal with the transaction log files

Xie, Sean
I have monitored the CDCR process for a while; the updates are actively sent to the target without a problem. However, the tlog size and file count keep growing every day, and even when there are 0 updates to send the tlog files stay there.

The following is from the action=queues command. You can see that after about a month of running, the transaction logs have reached roughly 140K files totalling about 103 GB.
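
For reference, that output came from a request against the source collection along these lines (the host is a placeholder):

  curl "http://source-host:8983/solr/MY_COLLECTION/cdcr?action=QUEUES"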

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">465</int>
</lst>
<lst name="queues">
<lst name="some_zk_url_list">
<lst name="MY_COLLECTION">
<long name="queueSize">0</long>
<str name="lastTimestamp">2017-07-07T23:19:09.655Z</str>
</lst>
</lst>
</lst>
<long name="tlogTotalSize">102740042616</long>
<long name="tlogTotalCount">140809</long>
<str name="updateLogSynchronizer">stopped</str>
</response>

Any help? Or do I need to configure something else? The CDCR configuration pretty much follows the wiki:

On target:

  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="buffer">
      <str name="defaultState">disabled</str>
    </lst>
  </requestHandler>

  <updateRequestProcessorChain name="cdcr-processor-chain">
    <processor class="solr.CdcrUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

  <requestHandler name="/update" class="solr.UpdateRequestHandler">
    <lst name="defaults">
      <str name="update.chain">cdcr-processor-chain</str>
    </lst>
  </requestHandler>

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog class="solr.CdcrUpdateLog">
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>
    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:180000}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:30000}</maxTime>
    </autoSoftCommit>    
  </updateHandler>

On source:
  <requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
    <lst name="replica">
      <str name="zkHost">${TargetZk}</str>
      <str name="source">MY_COLLECTION</str>
      <str name="target">MY_COLLECTION</str>
    </lst>

    <lst name="replicator">
      <str name="threadPoolSize">1</str>
      <str name="schedule">1000</str>
      <str name="batchSize">128</str>
    </lst>

    <lst name="updateLogSynchronizer">
      <str name="schedule">60000</str>
    </lst>
  </requestHandler>

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog class="solr.CdcrUpdateLog">
      <str name="dir">${solr.ulog.dir:}</str>
    </updateLog>
    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:180000}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:30000}</maxTime>
    </autoSoftCommit>    
  </updateHandler>

Thanks.
Sean


Re: CDCR - how to deal with the transaction log files

Xie, Sean
I did another round of testing. The tlog on the target cluster is cleaned up once a hard commit is triggered. However, on the source cluster the tlog files stay there and never get cleaned up.

I'm not sure if there is any command to run manually to trigger the updateLogSynchronizer. The updateLogSynchronizer is already set to run every 10 seconds, but that doesn't seem to help.
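
For what it's worth, the only related check I found is the overall CDCR state, which reports the process and buffer status (host and collection name are placeholders):

  curl "http://source-host:8983/solr/MY_COLLECTION/cdcr?action=STATUS"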

Any help?

Thanks
Sean


RE: CDCR - how to deal with the transaction log files

Michael McCarthy
We have been experiencing this same issue for months now, with version 6.2.  No solution to date.


RE: CDCR - how to deal with the transaction log files

Xie, Sean
I did some source-code reading, and it looks like when lastProcessedVersion == -1 the synchronizer does nothing:

https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/CdcrUpdateLogSynchronizer.java

        // if we received -1, it means that the log reader on the leader has not yet started to read log entries
        // do nothing
        if (lastVersion == -1) {
          return;
        }

So I queried Solr to check, and here is the result:

/cdcr?action=LASTPROCESSEDVERSION
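
Issued against the source collection, roughly like this (the host is a placeholder):

  curl "http://source-host:8983/solr/MY_COLLECTION/cdcr?action=LASTPROCESSEDVERSION"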

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
<long name="lastProcessedVersion">-1</long>
</response>

Any idea what could cause this?


Sean



RE: CDCR - how to deal with the transaction log files

Xie, Sean
After several experiments and observations, I finally made it work.
The key point is that you also have to disable the buffer (disablebuffer) on the source cluster. I don't know why the wiki doesn't mention this, but I figured it out from the source code.
Once the buffer is disabled on the source cluster, lastProcessedVersion becomes a positive number, and when a hard commit happens the old, unused tlog files get deleted.
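
In case it helps, disabling the buffer is a one-off call to the CDCR handler on the source collection (the host is a placeholder), and it can be re-enabled later with action=ENABLEBUFFER:

  curl "http://source-host:8983/solr/MY_COLLECTION/cdcr?action=DISABLEBUFFER"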

Hope my findings help other users who run into the same issue.



Re: CDCR - how to deal with the transaction log files

Varun Thacker-4
After disabling the buffer, are you still seeing documents being replicated to the target cluster(s)?


Re: CDCR - how to deal with the transaction log files

Xie, Sean
Yes, documents are being sent to the target. Monitoring the output from "action=queues" (depending on your settings), you will see the document replication progress.

On the other hand, if the buffer is enabled, lastprocessedversion always returns -1. Reading the source code, the CdcrUpdateLogSynchronizer does not proceed with the cleanup when this value is -1.

Sean


Re: CDCR - how to deal with the transaction log files

Varun Thacker-4
Yeah it just seems weird that you would need to disable the buffer on the
source cluster though.

The docs say "Replicas do not need to buffer updates, and it is recommended
to disable buffer on the target SolrCloud" which means the source should
have it enabled.

But the fact that it's working for you proves otherwise. What version of
Solr are you running? I'll try reproducing this problem at my end and see
if it's a documentation gap or a bug.

On Mon, Jul 10, 2017 at 7:15 PM, Xie, Sean <[hidden email]> wrote:

> Yes. Documents are being sent to target. Monitoring the output from
> “action=queues”, depending your settings, you will see the documents
> replication progress.
>
> On the other hand, if enable the buffer, the lastprocessedversion is
> always returning -1. Reading the source code, the CdcrUpdateLogSynchroizer
> does not continue to do the clean if this value is -1.
>
> Sean
>
> On 7/10/17, 5:18 PM, "Varun Thacker" <[hidden email]> wrote:
>
>     After disabling the buffer are you still seeing documents being
> replicated
>     to the target cluster(s) ?

Re: CDCR - how to deal with the transaction log files

Xie, Sean
In reply to this post by Xie, Sean
My guess is that it's a documentation gap.

I ran a test in which I turned CDCR off with action=STOP while continuously sending documents to the source cluster. The tlog files kept growing; after each hard commit a new tlog file was created and the old files stayed there forever. As soon as I turned CDCR back on, the documents started replicating to the target.

After a hard commit and a scheduled updateLogSynchronizer run, the old tlog files were deleted.

Btw, I'm running 6.5.1.



On 7/10/17, 10:57 PM, "Varun Thacker" <[hidden email]> wrote:

    Yeah it just seems weird that you would need to disable the buffer on the
    source cluster though.
   
    The docs say "Replicas do not need to buffer updates, and it is recommended
    to disable buffer on the target SolrCloud" which means the source should
    have it enabled.
   
    But the fact that it's working for you proves otherwise. What version of
    Solr are you running? I'll try reproducing this problem at my end and see
    if it's a documentation gap or a bug.
   


Re: CDCR - how to deal with the transaction log files

jmyatt
glad to hear you found your solution!  I have been combing over this post and others on this discussion board many times and have tried so many tweaks to configuration, order of steps, etc, all with absolutely no success in getting the Source cluster tlogs to delete.  So incredibly frustrating.  If anyone has other pearls of wisdom I'd love some advice.  Quick hits on what I've tried:

- solrconfig exactly like Sean's (target and source respectively) except no autoSoftCommit
- I am also calling cdcr?action=DISABLEBUFFER (on source as well as on target) explicitly before starting since the config setting of defaultState=disabled doesn't seem to work
- when I create the collection on source first, I get the warning "The log reader for target collection {collection name} is not initialised".  When I reverse the order (create the collection on target first), no such warning (see the sketch after this list)
- tlogs replicate as expected, hard commits on both target and source cause tlogs to rollover, etc - all of that works as expected
- action=QUEUES on source reflects the queueSize accurately.  Also *always* shows updateLogSynchronizer state as "stopped" on the leader node
- action=LASTPROCESSEDVERSION on both source and target always seems correct (I don't see the -1 that Sean mentioned).
- I'm creating new collections every time and running full data imports that take 5-10 minutes. Again, all data replication, log rollover, and autocommit activity seems to work as expected, and logs on target are deleted.  It's just those pesky source tlogs I can't get to delete.
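
To be concrete, the setup sequence described above looks roughly like this as a Python sketch (hosts, collection and config names, and shard/replica counts are all hypothetical):

from urllib.request import urlopen

TARGET = "http://target-solr:8983/solr"          # hypothetical target node
SOURCE = "http://source-solr:8983/solr"          # hypothetical source node
COLL, CONFIG = "MY_COLLECTION", "my_config"      # hypothetical names

def create_collection(base_url):
    # Collections API CREATE; shard/replica counts are placeholders.
    urlopen(f"{base_url}/admin/collections?action=CREATE&name={COLL}"
            f"&numShards=1&replicationFactor=2&collection.configName={CONFIG}")

create_collection(TARGET)                              # target first avoids the warning
create_collection(SOURCE)
urlopen(f"{TARGET}/{COLL}/cdcr?action=DISABLEBUFFER")  # explicit on both sides, since
urlopen(f"{SOURCE}/{COLL}/cdcr?action=DISABLEBUFFER")  # defaultState=disabled didn't seem to stick
urlopen(f"{SOURCE}/{COLL}/cdcr?action=START")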

Re: CDCR - how to deal with the transaction log files

Xie, Sean
Try running a second data import or any other indexing job after replication of the first data import has completed.

My observation is that during the replication period (when there are docs in the queue), tlog cleanup is not triggered. So wait until the queue is 0, then submit a second batch and monitor the queue and tlogs again.
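
Roughly, this is how I wait for the queue to drain before submitting the next batch (hypothetical host and collection names; json.nl=map is only there so the named lists parse as plain JSON objects):

import json, time
from urllib.request import urlopen

SOURCE = "http://source-solr:8983/solr/MY_COLLECTION"   # hypothetical source leader URL

def total_queue_size():
    url = f"{SOURCE}/cdcr?action=QUEUES&wt=json&json.nl=map"
    data = json.load(urlopen(url))
    # "queues" is keyed by target zkHost, then by target collection name.
    return sum(coll["queueSize"]
               for targets in data["queues"].values()
               for coll in targets.values())

while total_queue_size() > 0:
    time.sleep(10)   # still replicating; cleanup of old tlogs won't happen yet

# Queue is empty: submit the next indexing batch, then watch the tlog directory
# shrink after the next hard commit and updateLogSynchronizer run.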

-- Thank you
Sean


Re: CDCR - how to deal with the transaction log files

jmyatt
Thanks for the suggestion - tried that today and still no luck.  Time to write a script to naively / blindly delete old logs and run that in cron. *sigh*
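
Something like this naive sketch is what I have in mind (the path, age cutoff, and keep-count are all made up): it removes only tlog files older than a cutoff and always leaves the newest few alone. A stopgap, not a real fix.

import glob, os, time

TLOG_DIR = "/var/solr/data/MY_COLLECTION_shard1_replica1/data/tlog"  # hypothetical path
KEEP_NEWEST = 20        # never touch the most recent tlog files
MAX_AGE_DAYS = 7

# Oldest first, so the slice below spares the newest KEEP_NEWEST files.
files = sorted(glob.glob(os.path.join(TLOG_DIR, "tlog.*")), key=os.path.getmtime)
cutoff = time.time() - MAX_AGE_DAYS * 86400

for path in files[:-KEEP_NEWEST]:
    if os.path.getmtime(path) < cutoff:
        os.remove(path)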

Re: CDCR - how to deal with the transaction log files

Varun Thacker-4
https://issues.apache.org/jira/browse/SOLR-11069 is tracking why
LASTPROCESSEDVERSION is always -1 on the source cluster.

On Fri, Jul 14, 2017 at 11:46 AM, jmyatt <[hidden email]> wrote:

> Thanks for the suggestion - tried that today and still no luck.  Time to
> write a script to naively / blindly delete old logs and run that in cron.
> *sigh*

Re: CDCR - how to deal with the transaction log files

Susheel Kumar-3
I just voted for https://issues.apache.org/jira/browse/SOLR-11069 to get it
resolved, as we are discussing starting to use CDCR soon.
