Accessing other core in SolrCloud mode

gnandre
Hi,

I have a custom Solr plugin that uses the following logic to access another
core on the same Solr instance:

request.getCore().getCoreContainer().getCore(otherCoreName)

where request is an object of type SolrQueryRequest.

This works fine in master-slave mode.

Now if I try to use the same logic in SolrCloud mode, it does not work,
because what was the core name above is now a collection name, and the core
name is something like otherCoreName_shard1_replica_n1. It is also possible
that the collection is sharded in SolrCloud mode, so that it exists only
partially on each of two or more separate Solr instances.

How would I change the above logic for accessing the other Solr core so that
it works in SolrCloud mode?

Re: Accessing other core in SolrCloud mode

Erick Erickson
This kind of seems like an XY problem. Why do you want to get to the other core?
If you need to run the same query on multiple cores… you shouldn’t be thinking
that way, think “collections” rather than cores. And you can use “collection aliasing”
(see the Collections API CREATEALIAS command) to alias to multiple _collections_.
The rest is automatic.
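
For illustration, a CREATEALIAS request can be assembled as below. This is
only a sketch: the base URL, the alias name and the collection names are
hypothetical placeholders, and the class itself is not part of Solr. It just
builds the Collections API URL with plain JDK string handling.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper (not a Solr class): builds a Collections API
// CREATEALIAS URL that points one alias at several collections.
final class CreateAliasUrl {

    static String build(String solrBaseUrl, String aliasName, List<String> collections) {
        // CREATEALIAS takes the target collections as a comma-separated list.
        String joined = String.join(",", collections);
        return solrBaseUrl + "/admin/collections?action=CREATEALIAS"
                + "&name=" + URLEncoder.encode(aliasName, StandardCharsets.UTF_8)
                + "&collections=" + URLEncoder.encode(joined, StandardCharsets.UTF_8);
    }
}
```

Calling CreateAliasUrl.build("http://localhost:8983/solr", "allDocs",
Arrays.asList("collA", "collB")) gives a URL that can be issued with any HTTP
client. Queries sent to the alias are then run against both collections.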

If that’s irrelevant, please tell us _why_ your custom code needs to access otherCore,
maybe there’s something built in. It’s just that much of the time trying to force
stand-alone logic on SolrCloud is making it harder than it needs to be.

Best,
Erick

> On Jan 6, 2020, at 2:39 PM, Arnold Bronley <[hidden email]> wrote:
>
> Hi,
>
> I have a custom Solr plugin that uses the following logic to access another
> core on the same Solr instance:
>
> request.getCore().getCoreContainer().getCore(otherCoreName)
>
> where request is an object of type SolrQueryRequest.
>
> This works fine in master-slave mode.
>
> Now if I try to use the same logic in SolrCloud mode, it does not work,
> because what was the core name above is now a collection name, and the core
> name is something like otherCoreName_shard1_replica_n1. It is also possible
> that the collection is sharded in SolrCloud mode, so that it exists only
> partially on each of two or more separate Solr instances.
>
> How would I change the above logic for accessing the other Solr core so that
> it works in SolrCloud mode?

Re: Accessing other core in SolrCloud mode

gnandre
Hi Erick,

Thanks for replying. I know that I should work at the collection level in
SolrCloud mode and leave dealing with cores to SolrCloud. I am also aware
of the collection aliasing feature.

However, the plugins that I am trying to migrate to SolrCloud have use
cases like the following:
1. One of the plugins requires another core so that it can fetch some data
from it. This data is used in the pre-indexing phase for the main core, which
is when the plugin is invoked.
2. Another plugin requires some other cores so that it can execute
MoreLikeThis queries against them. (The MoreLikeThis like method returns a
Lucene Query object, so I assume that it works at the Solr core level rather
than the collection level, unless we somehow convert it to a SolrQuery object
and execute it through some Solr client.)

On Mon, Jan 6, 2020 at 3:28 PM Erick Erickson <[hidden email]> wrote:

> This kind of seems like an XY problem. Why do you want to get to the other core?
> If you need to run the same query on multiple cores… you shouldn’t be thinking
> that way, think “collections” rather than cores. And you can use “collection aliasing”
> (see the Collections API CREATEALIAS command) to alias to multiple _collections_.
> The rest is automatic.
>
> If that’s irrelevant, please tell us _why_ your custom code needs to access otherCore,
> maybe there’s something built in. It’s just that much of the time trying to force
> stand-alone logic on SolrCloud is making it harder than it needs to be.
>
> Best,
> Erick

Re: Accessing other core in SolrCloud mode

Erick Erickson
For <1>, hmmm. Apart from redesigning, you can create
a single-shard collection. The trick is to co-locate one replica
of the “otherCollection” on _every_ Solr instance your main
collection is on. You’ll have to look at the ZK collection information
to know exactly what the otherCore name is; it’ll be something like
otherCollection_shard1_replica_n1, where “n1” is the changeable bit.

Other than that, a replica is just another core at that level, so you can
use it just like you would any other core, at least for read purposes.
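
As a rough sketch of the lookup step: instead of reading the ZK collection
information as suggested above, the code below takes a simpler route and
picks the co-located replica out of the core names present on the local node
(for example the names the CoreContainer reports; the exact accessor varies
by Solr version). It assumes the default naming pattern
<collection>_shardN_replica_<type><number>, which is a convention, not a
guarantee. The class and method names are made up for illustration.

```java
import java.util.Collection;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper (not a Solr class): finds the local core that is a
// replica of the given collection, based on the default core naming scheme.
final class LocalReplicaFinder {

    static Optional<String> findLocalCore(Collection<String> localCoreNames, String collection) {
        // Assumed default pattern, e.g. otherCollection_shard1_replica_n1.
        // The optional letter is the replica type (n, t or p); older
        // versions omit it. Custom shard names will not match.
        Pattern pattern = Pattern.compile(
                Pattern.quote(collection) + "_shard\\d+_replica_[ntp]?\\d+");
        return localCoreNames.stream()
                .filter(name -> pattern.matcher(name).matches())
                .sorted() // deterministic choice if several replicas are local
                .findFirst();
    }
}
```

The returned name can then be passed to
request.getCore().getCoreContainer().getCore(name) as in the original
plugin. An empty result means no replica of that collection is hosted on
this node, which is exactly the case the co-location trick is meant to avoid.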

There’s some “cross core join” code that does something similar, I
don’t have the class name to hand, you’ll have to go spelunking if
that seems similar to what you need.

For <2>, MoreLikeThis is supported in SolrCloud. Your problem is
slightly different. That said, to support MLT in a single collection,
Solr has to be able to send the necessary parameters to one replica
of each shard. I _think_ you could use the same pattern in
your custom code (and maybe from the client?) to essentially send
MLT to other collections. I think a place to start is the patch at:
https://issues.apache.org/jira/browse/SOLR-788
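
One way to try the "from the client" route, sketched under assumptions: the
code below is not the SOLR-788 patch. It builds a plain select request that
uses Solr's MLT query parser ({!mlt qf=...}docId) against another collection.
The host, collection, field and document id are placeholders, and it assumes
the seed document is indexed in the target collection.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Solr class): builds a select URL that runs a
// MoreLikeThis query parser request against a given collection.
final class MltRequestUrl {

    static String build(String solrBaseUrl, String collection,
                        String docId, String similarityField, int rows) {
        // {!mlt qf=<field>}<id> asks for documents similar to the one with
        // that unique key, comparing on the given field.
        String query = "{!mlt qf=" + similarityField + "}" + docId;
        return solrBaseUrl + "/" + collection + "/select"
                + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&rows=" + rows;
    }
}
```

Because the request goes to the collection rather than to a named core,
SolrCloud decides which replicas serve it, so the plugin does not need to
know any core names for this use case.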

Good Luck!
Erick

> On Jan 6, 2020, at 4:58 PM, Arnold Bronley <[hidden email]> wrote:
>
> Hi Erick,
>
> Thanks for replying. I know that I should work at the collection level in
> SolrCloud mode and leave dealing with cores to SolrCloud. I am also aware
> of the collection aliasing feature.
>
> However, the plugins that I am trying to migrate to SolrCloud have use
> cases like the following:
> 1. One of the plugins requires another core so that it can fetch some data
> from it. This data is used in the pre-indexing phase for the main core, which
> is when the plugin is invoked.
> 2. Another plugin requires some other cores so that it can execute
> MoreLikeThis queries against them. (The MoreLikeThis like method returns a
> Lucene Query object, so I assume that it works at the Solr core level rather
> than the collection level, unless we somehow convert it to a SolrQuery object
> and execute it through some Solr client.)