[jira] [Updated] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated SOLR-11444:
    Attachment: SOLR_11444_Aliases.patch

New patch.  All existing tests pass.  Probably ready to commit but would love a review on some points.

New behavior: collection references in the URL path can now be comma delimited lists, just as is already possible with the little-known {{collection}} parameter.  Thus you can now do {{http://localhost:8983/solr/collection1,collection2/select?...}}.  The point is to treat both options consistently, which in turn makes the code that processes them simpler and more maintainable, and removes gotcha edge cases that were present.  I propose that the {{collection}} parameter become purely internal in 8.0 (or be removed entirely?) and thus not be supported in SolrJ, since I think it's needless -- similar to {{qt}}.
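
To illustrate the idea (this is not code from the patch), here is a minimal sketch of pulling a comma delimited collection reference out of a request path. The class and method names ({{CollectionRefs}}, {{splitCollections}}, {{collectionsFromPath}}) are hypothetical, and the sketch assumes the path has the shape {{/solr/<collections>/<handler>}}; the real parsing in HttpSolrCall is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Solr.
class CollectionRefs {

    /** Splits a reference such as "collection1,collection2" into names, dropping blanks. */
    static List<String> splitCollections(String ref) {
        List<String> names = new ArrayList<>();
        if (ref == null) {
            return names;
        }
        for (String part : ref.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Extracts the collection segment from a path assumed to look like "/solr/c1,c2/select". */
    static List<String> collectionsFromPath(String path) {
        String[] segments = path.split("/");
        // segments[0] is "" (leading slash), segments[1] is "solr",
        // segments[2] is the (possibly comma delimited) collection reference.
        if (segments.length < 3) {
            return new ArrayList<>();
        }
        return splitCollections(segments[2]);
    }

    public static void main(String[] args) {
        System.out.println(collectionsFromPath("/solr/collection1,collection2/select"));
    }
}
```

The same {{splitCollections}} step could serve both the path segment and the {{collection}} parameter, which is the consistency argument made above.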

* {{request()}}: The {{collection}} parameter now takes top precedence over the argument/param to the method.  Although this might seem to be a break in semantics, I doubt it is, since the code I replaced in this class (in {{sendRequest()}}) used to compose the URL to Solr differently depending on whether a {{collection}} parameter was present.  After all, HttpSolrCall (Solr side) considers {{collection}} first (assuming the path isn't a core name).  FYI [~[hidden email]]
* New: you can now index (update) documents to an alias (or collection list) that references more than one collection.  It's routed to the first in the list. This change matches Solr's existing behavior (as implemented by HttpSolrCall).
* {{sendRequest()}}: improved clarity of gathering the URL list; no intended change in behavior.
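
As a rough sketch of the update routing rule described in the list above (again, not the patch itself): resolve each name in the reference through an alias map, then send the update to the first resulting collection. The {{AliasRouting}} class, its methods, and the plain {{Map<String, String>}} alias representation are assumptions made for this example, and only one level of alias resolution is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for illustration only; not part of Solr.
class AliasRouting {

    /**
     * Expands a comma delimited reference into collection names.
     * A name found in the alias map is replaced by the alias's own
     * comma delimited target list; any other name is kept as-is.
     */
    static List<String> resolve(String ref, Map<String, String> aliases) {
        List<String> result = new ArrayList<>();
        for (String raw : ref.split(",")) {
            String name = raw.trim();
            if (name.isEmpty()) {
                continue;
            }
            String target = aliases.get(name);
            if (target == null) {
                result.add(name); // not an alias: treat as a collection name
            } else {
                for (String t : target.split(",")) {
                    String collection = t.trim();
                    if (!collection.isEmpty()) {
                        result.add(collection);
                    }
                }
            }
        }
        return result;
    }

    /** Picks the collection an update is routed to: the first in the resolved list. */
    static String updateTarget(String ref, Map<String, String> aliases) {
        List<String> collections = resolve(ref, aliases);
        if (collections.isEmpty()) {
            throw new IllegalArgumentException("No collection in reference: " + ref);
        }
        return collections.get(0);
    }
}
```

With an alias {{logs}} that maps to {{logs_2017,logs_2018}}, {{updateTarget("logs", aliases)}} in this sketch returns {{logs_2017}}.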

HttpSolrCall & V2HttpCall  (FYI [~noble.paul])
* Most changes are just a refactor to improve the code.
* Collections in the path are parsed comma-delimited now to be consistent with {{collection}} param.
* {{getAuthCtx()}}: Now trusts/honors {{collectionList}} when present, instead of duplicating or adding special-case logic to detect the collections, which makes it easier to maintain.  [~anshumg] do you think this is fine?

* Updated to more thoroughly test all the ways that one can refer to collection lists and aliases. This now includes comma delimited collection references in the URL path.
* Test indexing with CloudSolrClient to multi-collection alias.

* Simplified a bit, removing one method.  FYI [~ichattopadhyaya].  Perhaps instead of keeping getAlias and removing getCollectionName, the reverse could be done?  I dunno, I could go either way.  There is a caller that specifically wants to know whether the name was alias-resolved, and that would be awkward to detect using getCollectionName.

SQL handler, SolrSchema
* getTableMap: instead of attempting to expand the alias to its target collection, simply pretend the alias is itself a table/collection.  I believe this should work, whereas the code it replaces incorrectly assumed that an alias maps to one collection when in fact it's (potentially) a comma delimited list -- and I believe the related code in streaming expressions doesn't support collection references that are comma delimited.  That could be added, but I left it as a TODO.  FYI [~risdenk]
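
A minimal sketch of the "alias is itself a table" approach, under stated assumptions: {{TableNames}} and {{tableNames}} are hypothetical names, and aliases are modeled as a plain map from alias name to a comma delimited target list. The actual getTableMap builds table objects rather than a set of names.

```java
import java.util.Collection;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch for illustration only; not the SolrSchema code.
class TableNames {

    /**
     * Returns the names to expose as SQL tables: every collection name,
     * plus every alias name used directly as a table name. The alias
     * targets are deliberately not expanded here.
     */
    static SortedSet<String> tableNames(Collection<String> collections,
                                        Map<String, String> aliases) {
        SortedSet<String> names = new TreeSet<>(collections);
        names.addAll(aliases.keySet()); // alias name stands in as a table
        return names;
    }
}
```

Because the alias value is never parsed in this sketch, it makes no assumption about whether the alias points at one collection or several.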

> Improve Aliases.java and comma delimited collection list handling
> -----------------------------------------------------------------
>                 Key: SOLR-11444
>                 URL: https://issues.apache.org/jira/browse/SOLR-11444
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrCloud
>            Reporter: David Smiley
>            Assignee: David Smiley
>         Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch
> While starting to look at SOLR-11299 I noticed some brittleness in assumptions about Strings that refer to a collection.  Sometimes they are in fact references to comma separated lists, which appears to have been added with the introduction of collection aliases (an alias can refer to a comma delimited list).  So Java's type system kind of goes out the window when we do this.  In one case this leads to a bug -- CloudSolrClient will throw an NPE if you try to write to such an alias.  Sending an update via HTTP will allow it and send it to the first in the list.
> So this issue is about refactoring and some little improvements pertaining to Aliases.java plus certain key spots that deal with collection references.  I don't think I want to go as far as changing the public SolrJ API, except for adding documentation on what's possible.
