[jira] [Commented] (SOLR-12590) Improve Solr resource loader coverage in the ref guide


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/SOLR-12590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576762#comment-16576762 ]

Christine Poerschke commented on SOLR-12590:
--------------------------------------------



bq. ... Do you have the bandwidth to test this assertion? ...

Hmm, OK, so I've explored reaching the {{// delegate to the class loader (looking into $INSTANCE_DIR/lib jars)}} code path in [ZkSolrResourceLoader|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/core/src/java/org/apache/solr/cloud/ZkSolrResourceLoader.java#L122] for large learning-to-rank models; here are some notes from that:

* We have a [ManagedModelStore|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedModelStore.java] using {{"/schema/model-store"}} as the REST endpoint. Conceptually, if the model store itself weren't there (in ZooKeeper), then looking elsewhere locally might be an option; having said that:
** if there is a (small) model store then perhaps one would wish to keep that, and any additional (large) model store should be separate.
** {{SolrResourceLoader}} has a {{managedResourceRegistry}}, but it's not immediately obvious from a quick look whether {{ZkSolrResourceLoader}} (or something else) has an equivalent that would look locally if the resource is not there in ZooKeeper.

* Models use features and we have a [ManagedFeatureStore|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.4.0/solr/contrib/ltr/src/java/org/apache/solr/ltr/store/rest/ManagedFeatureStore.java] using {{"/schema/feature-store"}} as the REST endpoint.
** If there were a concept of a (small/regular) model store in ZooKeeper and an (additional/larger) model store locally, then similarly an additional large feature store locally might be logical.
** In such a hypothetical scenario, could models in the large model store use features from the small feature store, and vice versa? What if both places have models with the same name?
** Current code detail: features are conceptually organised into [feature stores|https://lucene.apache.org/solr/guide/7_4/learning-to-rank.html#feature-stores] akin to namespaces, but in terms of implementation they are all persisted in the same place, i.e. {{_schema_feature-store.json}}, matching the {{"/schema/feature-store"}} upload REST endpoint.
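
To make the same-name question above concrete, here is a minimal sketch of what a two-place lookup could look like. It is purely illustrative: {{TwoTierStoreSketch}}, {{putZk}}, {{putLocal}}, {{resolve}} and {{collisions}} are hypothetical names, the two maps merely stand in for a ZooKeeper-backed store and a local one, and the ZooKeeper-first ordering is an assumption, not something the current Solr code does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch only: none of these names exist in Solr. */
class TwoTierStoreSketch {
  // Stand-in for the (small) store persisted in ZooKeeper.
  private final Map<String, String> zkStore = new LinkedHashMap<>();
  // Stand-in for a hypothetical (large) store on the local filesystem.
  private final Map<String, String> localStore = new LinkedHashMap<>();

  void putZk(String name, String definition) {
    zkStore.put(name, definition);
  }

  void putLocal(String name, String definition) {
    localStore.put(name, definition);
  }

  /** Assumed policy: look in the ZooKeeper stand-in first, then fall back to the local one. */
  Optional<String> resolve(String name) {
    String fromZk = zkStore.get(name);
    if (fromZk != null) {
      return Optional.of(fromZk);
    }
    return Optional.ofNullable(localStore.get(name));
  }

  /** Names defined in both places; under the assumed policy the local copy is shadowed. */
  Set<String> collisions() {
    Set<String> both = new TreeSet<>(zkStore.keySet());
    both.retainAll(localStore.keySet());
    return both;
  }

  public static void main(String[] args) {
    TwoTierStoreSketch stores = new TwoTierStoreSketch();
    stores.putZk("modelA", "small definition kept in ZooKeeper");
    stores.putLocal("modelA", "large definition kept locally");
    stores.putLocal("bigModel", "another large definition kept locally");

    System.out.println("modelA resolves to: " + stores.resolve("modelA").orElse("(none)"));
    System.out.println("bigModel resolves to: " + stores.resolve("bigModel").orElse("(none)"));
    System.out.println("shadowed names: " + stores.collisions());
  }
}
```

Even in this toy form a policy decision is unavoidable: either one place silently shadows the other, or uploads that collide have to be rejected, and the same choice would have to be made again for features referenced across the two places.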

So from this exploration I think the wrapper model concept introduced in SOLR-11250 is currently the only way to support large models (without changing ZooKeeper's max file size limit).

> Improve Solr resource loader coverage in the ref guide
> ------------------------------------------------------
>
>                 Key: SOLR-12590
>                 URL: https://issues.apache.org/jira/browse/SOLR-12590
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: documentation
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>            Priority: Major
>         Attachments: SOLR-12590.patch
>
>
> In SolrCloud, storing large resources (e.g. binary machine learned models) on the local filesystem should be a viable alternative to increasing ZooKeeper's max file size limit (1MB), but there are undocumented complications.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
