[jira] [Commented] (SOLR-11069) LASTPROCESSEDVERSION for CDCR is flawed when buffering is enabled


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-11069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126249#comment-16126249 ]

Erick Erickson commented on SOLR-11069:

Thanks for testing! So the net-net is that, apart from the tlog purging being a little confusing, this patch seems to fix CDCR?

On a relatively brief inspection of the code, the 10 tlog bit is unimportant. The loop in CdcrUpdateLog.addOldLog removes an old log if and only if nothing is pointing to it. In fact, I don't really see the reason for even testing it, assuming that the "if (!this.hasLogPointer(log)) {" line preserves the tlogs necessary for CDCR.
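
To make the point about the purge loop concrete, here is a rough, simplified sketch of that kind of logic. It is not the actual CdcrUpdateLog code: the class TlogPurgeSketch, the Tlog type, the "pinned" set (standing in for hasLogPointer) and the parameter names are all hypothetical, and the real method does more bookkeeping than this.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

public class TlogPurgeSketch {
    /** Hypothetical stand-in for a transaction log: just a name and a record count. */
    public static final class Tlog {
        public final String name;
        public final int numRecords;

        public Tlog(String name, int numRecords) {
            this.name = name;
            this.numRecords = numRecords;
        }
    }

    /**
     * Removes logs from the tail (oldest end) of the deque while a retention limit
     * is exceeded, but stops at the first log that something still points to.
     * "pinned" is an assumed stand-in for the hasLogPointer(log) check.
     * Returns the number of logs removed.
     */
    public static int purgeOldLogs(Deque<Tlog> logs, Set<String> pinned,
                                   int currRecords, int numRecordsToKeep, int maxLogs) {
        int removed = 0;
        while (!logs.isEmpty()) {
            Tlog oldest = logs.peekLast();
            // Either we hold more records than we need, or there are too many log files.
            boolean overLimit = currRecords - oldest.numRecords >= numRecordsToKeep
                    || logs.size() >= maxLogs;
            if (!overLimit) {
                break;
            }
            // A log that is still referenced is never removed, whatever the limits say.
            if (pinned.contains(oldest.name)) {
                break;
            }
            currRecords -= oldest.numRecords;
            logs.removeLast();
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        Deque<Tlog> logs = new ArrayDeque<>();
        for (int i = 0; i < 12; i++) {
            logs.addFirst(new Tlog("tlog." + i, 5)); // newest at the head, oldest at the tail
        }
        int removed = purgeOldLogs(logs, Set.of("tlog.2"), 60, 100, 10);
        System.out.println("removed=" + removed + " remaining=" + logs.size());
    }
}
```

In this sketch the file-count limit only decides whether a removal is attempted; the pointer check decides whether it actually happens, so a log that a reader still references survives even when more than maxLogs files exist.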

I'm not sure we need to fix, on this ticket, the fact that tlogs aren't getting purged quite the way we'd expect. Perhaps raise another one? Especially if this behavior is also present on 6.1, which I believe it is. With the infinite bootstrapping, CDCR is pretty broken; with the tlog retention, it is just a little confusing.

> LASTPROCESSEDVERSION for CDCR is flawed when buffering is enabled
> -----------------------------------------------------------------
>                 Key: SOLR-11069
>                 URL: https://issues.apache.org/jira/browse/SOLR-11069
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: CDCR
>    Affects Versions: 7.0
>            Reporter: Amrit Sarkar
>            Assignee: Erick Erickson
>         Attachments: SOLR-11069.patch
> The {{LASTPROCESSEDVERSION}} (abbreviated LPV) action for CDCR breaks down because the buffer log is poorly initialised and maintained on the core nodes of either the source or the target cluster.
> If the buffer is enabled for cores of either the source or the target cluster, it returns {{-1}}, *irrespective of the number of entries in the tlog read by the {{leader}}* node of each shard of the respective collection in the respective cluster. Once the buffer is disabled, it starts reporting the correct LPV for each core.
> Due to the same flawed behavior, the Update Log Synchroniser may not work as expected, i.e. it provides an incorrect seek position for the {{non-leader}} nodes to advance to. I am not sure whether this is intended behavior for sync, but it surely doesn't feel right.
