[jira] [Commented] (SOLR-12413) Solr ignores aliases.json from ZooKeeper at startup

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518949#comment-16518949 ]

Gus Heck commented on SOLR-12413:
---------------------------------

I had previously figured out that the "-1 value for version" approach works for the reason you give, but that solution subtly ties this in-code magic number to the current, correct order of operations. If at any time the code saves without updating first (that would be a bug, but probably a race-condition type of bug that might not show up immediately), we wind up overwriting the existing aliases.json with a blank one. By using 0, which is a valid version number, and handling it as a special case with an automatic increment to version 1, we are guaranteed never to overwrite a long-standing aliases.json node. The only case where we can get in trouble due to order of operations is the much narrower one of an aliases.json node that has just been replaced manually while Solr was offline. I also like that it doesn't muddle the meaning of a -1 value for a version.

So I feel like the patch I supplied is more robust, if a bit more complicated. Its significance is also clearer when reading the code, I think. If these reasons are unconvincing, I can go with the smaller fix based on a -1 value for version :)
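
To make the version collision concrete, here is a minimal, self-contained sketch. It is not the attached patch and does not use Solr's classes: `Snapshot`, `strictlyNewer`, and `newerOrReplacingEmpty` are hypothetical names, and the special case shown is only one possible reading of "handling 0 as a special case". It shows why a local placeholder at version 0 and a freshly recreated node at version 0 compare as equal, and how special-casing the placeholder avoids ignoring the node.

```java
public class AliasVersionSketch {

    /** Hypothetical stand-in for an aliases snapshot: JSON content plus znode version. */
    static final class Snapshot {
        final String json;
        final int version;

        Snapshot(String json, int version) {
            this.json = json;
            this.version = version;
        }
    }

    /** Placeholder used before anything has been read from ZooKeeper (version 0). */
    static final Snapshot EMPTY = new Snapshot("{}", 0);

    /** Behaviour described in the report: accept only a strictly higher version. */
    static boolean strictlyNewer(Snapshot local, Snapshot fromZk) {
        return Integer.compare(local.version, fromZk.version) < 0;
    }

    /**
     * Assumed variant: if the local state is still the EMPTY placeholder,
     * a version-0 node from ZooKeeper is accepted as well.
     */
    static boolean newerOrReplacingEmpty(Snapshot local, Snapshot fromZk) {
        if (local == EMPTY && fromZk.version == 0) {
            return true;
        }
        return strictlyNewer(local, fromZk);
    }

    public static void main(String[] args) {
        // A node that was cleared and re-put while Solr was down starts at version 0.
        Snapshot recreated = new Snapshot("{\"collection\":{\"a\":\"b\"}}", 0);

        System.out.println(strictlyNewer(EMPTY, recreated));         // false: node is ignored
        System.out.println(newerOrReplacingEmpty(EMPTY, recreated)); // true: node is loaded
    }
}
```

In this sketch an established local snapshot (any instance other than `EMPTY`) is still protected: a version-0 node never replaces it, which mirrors the point about not overwriting a long-standing node.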

> Solr ignores aliases.json from ZooKeeper at startup
> ---------------------------------------------------
>
>                 Key: SOLR-12413
>                 URL: https://issues.apache.org/jira/browse/SOLR-12413
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud
>    Affects Versions: 7.2.1
>         Environment: A SolrCloud cluster with ZooKeeper (one node is enough to reproduce).
> Solr 7.2.1.
> ZooKeeper 3.4.6.
>            Reporter: Gaël Jourdan
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: SOLR-12413-nocommit.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since upgrading to 7.2.1, we ran into an issue where Solr ignores the _aliases.json_ file stored in ZooKeeper.
>  
> +Steps to reproduce the problem:+
>  # SolrCloud cluster is down
>  # Direct update of _aliases.json_ file in ZooKeeper with Solr ZkCLI *without using Collections API* :
>  ** {{java ... org.apache.solr.cloud.ZkCLI -zkhost ... -cmd clear /aliases.json}}
>  ** {{java ... org.apache.solr.cloud.ZkCLI -zkhost ... -cmd put /aliases.json "new content"}}
>  # SolrCloud cluster is started => _aliases.json_ not taken into account
>  
> +Analysis:+ 
> Digging a bit into the code, what actually causes the issue is that, when starting, Solr now checks the metadata of the _aliases.json_ file, and if the version from ZooKeeper is lower than or equal to the local version, it keeps the local version.
> When it starts, Solr has a local version of 0 for the aliases, but ZooKeeper also has version 0 of the file because we just recreated it. So Solr ignores the ZooKeeper configuration and never has a chance to load the aliases.
>  
> Relevant parts of Solr code are:
>  * [https://github.com/apache/lucene-solr/blob/branch_7_2/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java] : line 1562 : method setIfNewer
> {code:java}
> /**
> * Update the internal aliases reference with a new one, provided that its ZK version has increased.
> *
> * @param newAliases the potentially newer version of Aliases
> */
> private boolean setIfNewer(Aliases newAliases) {
>   synchronized (this) {
>     int cmp = Integer.compare(aliases.getZNodeVersion(), newAliases.getZNodeVersion());
>     if (cmp < 0) {
>       LOG.debug("Aliases: cmp={}, new definition is: {}", cmp, newAliases);
>       aliases = newAliases;
>       this.notifyAll();
>       return true;
>     } else {
>       LOG.debug("Aliases: cmp={}, not overwriting ZK version.", cmp);
>       assert cmp != 0 || Arrays.equals(aliases.toJSON(), newAliases.toJSON()) : aliases + " != " + newAliases;
>       return false;
>     }
>   }
> }{code}
>  * [https://github.com/apache/lucene-solr/blob/branch_7_2/solr/solrj/src/java/org/apache/solr/common/cloud/Aliases.java] : line 45 : the "empty" Aliases object with default version 0
> {code:java}
> /**
> * An empty, minimal Aliases primarily used to support the non-cloud solr use cases. Not normally useful
> * in cloud situations where the version of the node needs to be tracked even if all aliases are removed.
> * A version of 0 is provided rather than -1 to minimize the possibility that if this is used in a cloud
> * instance data is written without version checking.
> */
> public static final Aliases EMPTY = new Aliases(Collections.emptyMap(), Collections.emptyMap(), 0);{code}
>  
> Note that a workaround is to force ZooKeeper to always have a version greater than 0 for the _aliases.json_ file (for instance, by never clearing the file and only overwriting it in place).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
