Data colocation hint to solr index

Data colocation hint to solr index

maninder batth
Hi,
At my company, we serve car manuals for different car manufacturers and
their various makes and models. Searches are almost always done within the
context of a car manufacturer, year, make and model. Is there a way in Solr
to create indexes based on these criteria? Currently, a single index contains
all manufacturers, makes and models, which has pushed it past a terabyte.
If we could teach Solr to co-locate all the data for a particular
manufacturer, make and model, that would be ideal.
Is this possible?

Regards,
Jim
Re: Data colocation hint to solr index

Shalin Shekhar Mangar
If you're using SolrCloud then you can use composite IDs such as
<make>!doc-id to co-locate documents belonging to a manufacturer together
and at query time, you can add _route_=<make>! to the request to route it
to the correct node.
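
A minimal sketch of what that could look like from a client, assuming the collection uses the compositeId router. The helper names, the "make_s" field and the sample values are made up for illustration; only the "<make>!doc-id" key format and the _route_ parameter come from the description above.

```python
from urllib.parse import urlencode


def composite_id(make: str, doc_id: str) -> str:
    """Build a compositeId-style key of the form '<make>!<doc-id>'."""
    if "!" in make:
        # '!' separates the route prefix from the rest of the id
        raise ValueError("route prefix must not contain '!'")
    return f"{make}!{doc_id}"


def routed_query_params(make: str, query: str) -> str:
    """Build the query string for a search routed by manufacturer."""
    return urlencode({"q": query, "_route_": f"{make}!"})


# Hypothetical document: the 'make_s' field name is an assumption.
doc = {"id": composite_id("honda", "manual-123"), "make_s": "honda"}
print(doc["id"])
print(routed_query_params("honda", "brake pads"))
```

Every document indexed with the same prefix hashes to the same shard, so a query carrying the matching _route_ value only needs to visit that shard.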

On Mon, Nov 3, 2014 at 11:00 PM, maninder batth <[hidden email]>
wrote:

> Hi,
> In my company, we serve car manuals for different car manufacturers with
> their various makes and models. Typically, the search is always done within
> context of a car manufacturer, year, make and model. Is there a way in Solr
> to create indexes based on these criteria? Currently, the index contains all
> manufacturers, makes and models. This causes index to go over a terabyte.
> Hence, if we could teach Solr to co-locate all the data for a particular
> manufacturer, make and model, that would be an ideal thing to do.
> I was wondering if this is possible?
>
> Regards,
> Jim
>



--
Regards,
Shalin Shekhar Mangar.
Re: Data colocation hint to solr index

maninder batth
Thank you for the recommendation on composite IDs. We currently use Solr 3.x.
After reading up on composite IDs, it sounds like they are a Solr 4.x feature.
Is something similar available in Solr 3.x? Also, we do not use SolrCloud.

On Mon, Nov 3, 2014 at 12:41 PM, Shalin Shekhar Mangar <
[hidden email]> wrote:

> If you're using SolrCloud then you can use composite IDs such as
> <make>!doc-id to co-locate documents belonging to a manufacturer together
> and at query time, you can add _route_=<make>! to the request to route it
> to the correct node.

Re: Data colocation hint to solr index

Erick Erickson
You have a TB-scale index and you're not using SolrCloud? Are
you using master/slave or otherwise splitting up your index? Because
if you're not, then please ship me some of your hardware because it
must be awesome.

Which is a tongue-in-cheek way of saying there must be lots of details
you aren't telling us that would help us help you.


Best,
Erick

On Mon, Nov 3, 2014 at 11:45 AM, maninder batth <[hidden email]> wrote:

> Thank you for the recommendation on composite IDs. We currently use Solr 3.x.
> After reading on composite IDs, it sounds like a feature of Solr 4.x. Is
> something similar available in Solr 3.x also? Also, we do not use SolrCloud.

Re: Data colocation hint to solr index

Shawn Heisey-2
In reply to this post by maninder batth
On 11/3/2014 12:45 PM, maninder batth wrote:
> Thank you for the recommendation on composite IDs. We currently use Solr 3.x.
> After reading on composite IDs, it sounds like a feature of Solr 4.x. Is
> something similar available in Solr 3.x also? Also, we do not use SolrCloud.

The compositeId router is part of SolrCloud, which is only available in
Solr 4.0 and newer. On 3.x, you normally have to handle all shard routing
outside of Solr. If you happen to be using the dataimport handler already,
it might be possible to configure it so that each shard's JDBC query
selects only the documents that belong on that shard.
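
As a rough sketch of what "routing outside of Solr" could mean on 3.x: the client picks a shard from the manufacturer name with a stable hash and sends both updates and queries for that manufacturer to that one shard. The host names, the "make" field and the function names are placeholders, not anything from an actual setup, and this is only one possible scheme.

```python
import hashlib
from urllib.parse import urlencode

# Hypothetical shard locations; replace with the real Solr 3.x instances.
SHARDS = [
    "solr1.example.com:8983/solr",
    "solr2.example.com:8983/solr",
    "solr3.example.com:8983/solr",
]


def shard_for(make: str, shards=SHARDS) -> str:
    """Map a manufacturer to one shard with a stable (non-random) hash."""
    key = make.strip().lower().encode("utf-8")
    digest = hashlib.md5(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(shards)
    return shards[index]


def update_url(make: str) -> str:
    """URL to which documents for this manufacturer are posted."""
    return f"http://{shard_for(make)}/update"


def select_url(make: str, query: str) -> str:
    """URL for a search that only touches this manufacturer's shard."""
    # 'make' as a field name is an assumption about the schema.
    params = urlencode({"q": query, "fq": f'make:"{make}"'})
    return f"http://{shard_for(make)}/select?{params}"


print(update_url("Honda"))
print(select_url("Honda", "brake pads"))
```

Note that a plain modulo over the shard count moves most manufacturers to a different shard if the number of shards changes, so a fixed lookup table from manufacturer to shard may be easier to live with.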

Thanks,
Shawn