Maximum number of fields allowed in a Solr document

alexw
Hi,

We are in the process of designing a Solr app where we might have
millions of documents, and within each document we might have
thousands of dynamic fields. These fields are small and contain only
an integer, which needs to be retrievable and sortable.

My questions are:

1. Is there a limit on the number of fields allowed per document?
2. What is the performance impact of such a design?
3. Has anyone done this before and is it a wise thing to do?

Thanks,

Alex

Re: Maximum number of fields allowed in a Solr document

Otis Gospodnetic-2
Hi Alex,

There is no built-in limit. The limit is going to be dictated by your hardware resources. In particular, this sounds like a memory-intensive app because of sorting on lots of different fields. You didn't mention the size of your index, but that's a factor, too. Once in a while people on the list mention cases with lots and lots of fields, so I'd check the mailing list archives.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----

> From: Alex Wang <[hidden email]>
> To: "[hidden email]" <[hidden email]>
> Sent: Thu, November 26, 2009 12:47:36 PM
> Subject: Maximum number of fields allowed in a Solr document
>
> Hi,
>
> We are in the process of designing a Solr app where we might have
> millions of documents, and within each document we might have
> thousands of dynamic fields. These fields are small and contain only
> an integer, which needs to be retrievable and sortable.
>
> My questions are:
>
> 1. Is there a limit on the number of fields allowed per document?
> 2. What is the performance impact of such a design?
> 3. Has anyone done this before and is it a wise thing to do?
>
> Thanks,
>
> Alex


Re: Maximum number of fields allowed in a Solr document

alexw
Thanks, Otis, for the reply. Yes, this will be pretty memory intensive.
The index consists of 5 cores with a maximum of 500K documents per
core. I did search the archives before but did not find any definitive
answer. Thanks again!

Alex



On Nov 27, 2009, at 11:09 PM, Otis Gospodnetic wrote:

> Hi Alex,
>
> There is no built-in limit. The limit is going to be dictated by
> your hardware resources. In particular, this sounds like a
> memory-intensive app because of sorting on lots of different fields.
> You didn't mention the size of your index, but that's a factor, too.
> Once in a while people on the list mention cases with lots and lots
> of fields, so I'd check the mailing list archives.
>
> Otis

Re: Maximum number of fields allowed in a Solr document

Lance Norskog-2
Lucene creates an array of one item per document for every field you
sort on. If you sort on a thousand fields, Lucene will create 1000
different arrays of 500K ints. I assume there is some sort of cache of
these arrays. In Solr, it is also possible to sort using a function as
the relevance value. This is rather slow, and caches no data between
queries.
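
As a rough back-of-the-envelope sketch of the estimate above, the Python below multiplies out the figures mentioned in this thread (1000 sort fields, 500K documents per core, 5 cores). It assumes 4 bytes per int and one array per sorted field per core; the function name `sort_cache_bytes` is made up for illustration, and real memory use will depend on the Lucene/Solr version and on how the arrays are cached.

```python
# Back-of-the-envelope estimate of the memory held by per-field sort arrays.
# Assumption: one 4-byte int per document for every field that is sorted on.
BYTES_PER_INT = 4


def sort_cache_bytes(num_docs: int, num_sort_fields: int,
                     bytes_per_value: int = BYTES_PER_INT) -> int:
    """Return the bytes needed for one array of num_docs values per sort field."""
    return num_docs * num_sort_fields * bytes_per_value


# Figures taken from the thread: 500K documents per core, 1000 sort fields, 5 cores.
per_core = sort_cache_bytes(num_docs=500_000, num_sort_fields=1_000)
all_cores = per_core * 5

print(f"per core:  {per_core / 2**30:.2f} GiB")
print(f"all cores: {all_cores / 2**30:.2f} GiB")
```

Under these assumptions the arrays come to about 2 GB per core, or about 10 GB across all five cores, if every one of the thousand fields ends up being sorted on.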

You may want to do sorting in your front-end applications, or get
database ids from Solr and do sorting in the database query.
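
A minimal sketch of the front-end sorting idea, assuming the documents have already been fetched from Solr and parsed into Python dictionaries. The field name `price_7_i` and the helper `sort_docs_by_field` are hypothetical, chosen only for illustration; documents that lack the field are placed last.

```python
from typing import Any, Dict, List


def sort_docs_by_field(docs: List[Dict[str, Any]], field: str,
                       descending: bool = False) -> List[Dict[str, Any]]:
    """Sort already-fetched documents by an integer field.

    Documents that do not contain the field are appended at the end.
    """
    present = [d for d in docs if field in d]
    missing = [d for d in docs if field not in d]
    present.sort(key=lambda d: d[field], reverse=descending)
    return present + missing


# Hypothetical documents; "price_7_i" stands in for one of the dynamic int fields.
docs = [
    {"id": "a", "price_7_i": 30},
    {"id": "b", "price_7_i": 10},
    {"id": "c"},
    {"id": "d", "price_7_i": 20},
]

sorted_ids = [d["id"] for d in sort_docs_by_field(docs, "price_7_i")]
print(sorted_ids)  # ['b', 'd', 'a', 'c']
```

Note that this only gives a correct global ordering when the whole result set is fetched; sorting a single page of results on the client is not equivalent to sorting in Solr.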

On Mon, Nov 30, 2009 at 7:14 AM, Alex Wang <[hidden email]> wrote:

> Thanks, Otis, for the reply. Yes, this will be pretty memory intensive.
> The index consists of 5 cores with a maximum of 500K documents per
> core. I did search the archives before but did not find any definitive
> answer. Thanks again!
>
> Alex

--
Lance Norskog
[hidden email]