[jira] [Commented] (SOLR-12413) Solr ignores aliases.json from ZooKeeper at startup

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518891#comment-16518891 ]

David Smiley commented on SOLR-12413:
-------------------------------------

Good catch on identifying why my proposed test was fundamentally flawed; I wasn't quite sure yet.  I can also see that it's probably impossible to do a unit test for this.

Attached is a "nocommit" patch that hacks ZkController.createClusterZkNodes so that the default aliases.json has "alias1" pointing to "collection1".  It also has a shortened version of the flawed test that merely checks whether querying "alias1" works from the get-go.  I wanted to see if Aliases.EMPTY with a zkNodeVersion of -1 works.  Note the additional asserts as well.

My rationale for why this works: the first aliases operation to occur is update(), which sets ZkStateReader.AliasesManager.aliases to whatever ZooKeeper has, and that will carry a good ZK version (not -1).  applyModificationAndExportToZk will only ever be called _after_ this point, after which we never see the '-1' again.  This isn't to say your patch doesn't also solve the problem, but if we agree this "-1" solution works too, then it's way simpler (no additional lines of code except some assertions).
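To make the "-1" reasoning concrete, here is a minimal standalone sketch, not the attached patch. It assumes only the strict "newer version wins" comparison quoted from setIfNewer in the issue description; the classes AliasesVersionSketch, Aliases and AliasesHolder are simplified, hypothetical stand-ins rather than the real Solr types.

```java
import java.util.Collections;
import java.util.Map;

public class AliasesVersionSketch {

    /** Hypothetical stand-in for Solr's Aliases: an alias map plus the znode version. */
    static final class Aliases {
        final Map<String, String> map;
        final int zNodeVersion;

        Aliases(Map<String, String> map, int zNodeVersion) {
            this.map = map;
            this.zNodeVersion = zNodeVersion;
        }
    }

    /** Hypothetical stand-in for ZkStateReader.AliasesManager. */
    static final class AliasesHolder {
        private Aliases aliases;

        AliasesHolder(Aliases initial) {
            this.aliases = initial;
        }

        // Same comparison as the quoted setIfNewer: replace only when the
        // incoming znode version is strictly greater than the local one.
        synchronized boolean setIfNewer(Aliases newAliases) {
            if (Integer.compare(aliases.zNodeVersion, newAliases.zNodeVersion) < 0) {
                aliases = newAliases;
                return true;
            }
            return false;
        }
    }

    /** Returns true if a freshly created znode (version 0) replaces the initial empty aliases. */
    static boolean loadsFreshZnode(int emptyVersion) {
        AliasesHolder holder = new AliasesHolder(new Aliases(Collections.emptyMap(), emptyVersion));
        Aliases fromZk = new Aliases(Collections.singletonMap("alias1", "collection1"), 0);
        return holder.setIfNewer(fromZk);
    }

    public static void main(String[] args) {
        System.out.println("EMPTY at version 0 loads ZK aliases:  " + loadsFreshZnode(0));
        System.out.println("EMPTY at version -1 loads ZK aliases: " + loadsFreshZnode(-1));
    }
}
```

In this model, starting from version 0 rejects a znode that is also at version 0, while starting from -1 accepts it. The sketch says nothing about write paths such as applyModificationAndExportToZk, which is what the extra assertions in the patch are meant to cover.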

> Solr ignores aliases.json from ZooKeeper at startup
> ---------------------------------------------------
>
>                 Key: SOLR-12413
>                 URL: https://issues.apache.org/jira/browse/SOLR-12413
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud
>    Affects Versions: 7.2.1
>         Environment: A SolrCloud cluster with ZooKeeper (one node is enough to reproduce).
> Solr 7.2.1.
> ZooKeeper 3.4.6.
>            Reporter: Gaël Jourdan
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: SOLR-12413-nocommit.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since upgrading to 7.2.1, we ran into an issue where Solr ignores the _aliases.json_ file stored in ZooKeeper.
>  
> +Steps to reproduce the problem:+
>  # SolrCloud cluster is down
>  # Direct update of _aliases.json_ file in ZooKeeper with Solr ZkCLI *without using Collections API* :
>  ** {{java ... org.apache.solr.cloud.ZkCLI -zkhost ... -cmd clear /aliases.json}}
>  ** {{java ... org.apache.solr.cloud.ZkCLI -zkhost ... -cmd put /aliases.json "new content"}}
>  # SolrCloud cluster is started => _aliases.json_ not taken into account
>  
> +Analysis:+ 
> Digging into the code, the cause is that, at startup, Solr now checks the metadata of the _aliases.json_ file, and if the version from ZooKeeper is lower than or equal to the local version, it keeps the local version.
> When it starts, Solr has a local version of 0 for the aliases but ZooKeeper also has a version of 0 of the file because we just recreated it. So Solr ignores ZooKeeper configuration and never has a chance to load aliases.
>  
> Relevant parts of Solr code are:
>  * [https://github.com/apache/lucene-solr/blob/branch_7_2/solr/solrj/src/java/org/apache/solr/common/cloud/ZkStateReader.java] : line 1562 : method setIfNewer
> {code:java}
> /**
> * Update the internal aliases reference with a new one, provided that its ZK version has increased.
> *
> * @param newAliases the potentially newer version of Aliases
> */
> private boolean setIfNewer(Aliases newAliases) {
>   synchronized (this) {
>     int cmp = Integer.compare(aliases.getZNodeVersion(), newAliases.getZNodeVersion());
>     if (cmp < 0) {
>       LOG.debug("Aliases: cmp={}, new definition is: {}", cmp, newAliases);
>       aliases = newAliases;
>       this.notifyAll();
>       return true;
>     } else {
>       LOG.debug("Aliases: cmp={}, not overwriting ZK version.", cmp);
>       assert cmp != 0 || Arrays.equals(aliases.toJSON(), newAliases.toJSON()) : aliases + " != " + newAliases;
>       return false;
>     }
>   }
> }{code}
>  * [https://github.com/apache/lucene-solr/blob/branch_7_2/solr/solrj/src/java/org/apache/solr/common/cloud/Aliases.java] : line 45 : the "empty" Aliases object with default version 0
> {code:java}
> /**
> * An empty, minimal Aliases primarily used to support the non-cloud solr use cases. Not normally useful
> * in cloud situations where the version of the node needs to be tracked even if all aliases are removed.
> * A version of 0 is provided rather than -1 to minimize the possibility that if this is used in a cloud
> * instance data is written without version checking.
> */
> public static final Aliases EMPTY = new Aliases(Collections.emptyMap(), Collections.emptyMap(), 0);{code}
>  
> Note that a workaround is to force ZooKeeper to always have a version greater than 0 for the _aliases.json_ file (for instance, by not clearing the file and just overwriting it again and again).
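
The reproduction steps and the workaround both hinge on how ZooKeeper versions a znode: a newly created node starts at data version 0, and each overwrite increments it. The toy model below illustrates that under stated assumptions. FakeZnode is a hypothetical in-memory class, not the ZooKeeper client API, and it assumes that "clear" deletes the node and that "put" creates it when missing and overwrites it otherwise.

```java
public class ZnodeVersionSketch {

    /** Toy in-memory model of one znode's data version (hypothetical, not the ZooKeeper API). */
    static final class FakeZnode {
        private boolean exists;
        private int version;
        private String data;

        // Assumed behaviour of "clear": the node is removed entirely.
        void clear() {
            exists = false;
            data = null;
        }

        // Assumed behaviour of "put": create at version 0 if missing, else bump the version.
        void put(String newData) {
            if (!exists) {
                exists = true;
                version = 0;
            } else {
                version++;
            }
            data = newData;
        }

        int version() {
            return version;
        }
    }

    /** The startup check from the issue: ZK content is used only if its version is strictly greater. */
    static boolean wouldBeLoaded(int localVersion, FakeZnode node) {
        return Integer.compare(localVersion, node.version()) < 0;
    }

    /** Reproduction path: clear the file, then put new content. */
    static FakeZnode clearThenPut() {
        FakeZnode node = new FakeZnode();
        node.put("old content");
        node.clear();
        node.put("new content");
        return node;
    }

    /** Workaround path: overwrite the existing file without clearing it. */
    static FakeZnode overwriteOnly() {
        FakeZnode node = new FakeZnode();
        node.put("old content");
        node.put("new content");
        return node;
    }

    public static void main(String[] args) {
        int localVersion = 0; // the version of the "empty" Aliases in 7.2.1
        System.out.println("clear + put loaded:  " + wouldBeLoaded(localVersion, clearThenPut()));
        System.out.println("overwrite loaded:    " + wouldBeLoaded(localVersion, overwriteOnly()));
    }
}
```

With a local version of 0, the clear-then-put path leaves the node at version 0 and is ignored, while the overwrite-only path leaves it at version 1 and is picked up.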



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
