Solr Size Limitation upto 32 KB files


Kranthi Kumar K

Hi,

We are currently using Solr 4.2.1 in our project, and everything has been going well. Recently, however, we ran into an issue with Solr Data Import: it does not import files larger than 32766 bytes (i.e., about 32 KB) and throws two exceptions:

  1. java.lang.IllegalArgumentException
  2. org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException

Please find the attached screenshot for reference.

We have searched many forums and did not find an exact solution for this issue. We did find a post suggesting that changing the field type from string to ‘text_general’ might solve it. Please have a look at the thread below:

https://stackoverflow.com/questions/29445323/adding-a-document-to-the-index-in-solr-document-contains-at-least-one-immense-t

Schema.xml:

Changed from:

‘<field name="text" type="string_rev" indexed="true" stored="false" multiValued="true" />’

 

Changed to:

‘<field name="text" type="text_general " indexed="true" stored="false" multiValued="true" />’

 

We tried this, but Solr still does not import files larger than 32 KB (32766 bytes).

Could you please let us know how to fix this issue? We’ll be awaiting your reply.

Thanks & Regards,

Kranthi Kumar.K,

Software Engineer,

Ccube Fintech Global Services Pvt Ltd.,

Email/Skype: [hidden email],

Mobile: +91-8978078449.

Re: Solr Size Limitation upto 32 KB files

Bernd Fehling
Hi,
I don't know the limits in Solr 4.2.1, but the Solr 6.6 Reference Guide
says the following about the StrField class under Field Types:
"String (UTF-8 encoded string or Unicode). Strings are intended for
small fields and are not tokenized or analyzed in any way.
They have a hard limit of slightly less than 32K."

If you are trying to add larger content, then you have to "chop" it
yourself and add it as a multivalued field. This can be done within a self-written loader.
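
As a rough illustration of the "chop" step, here is a minimal Python sketch. It assumes the limit is the 32766 bytes quoted in the original post and that it applies to the UTF-8 encoded value; the names chop_utf8 and MAX_TERM_BYTES are made up for this example and are not part of Solr. It only shows the splitting, not the loader around it.

```python
MAX_TERM_BYTES = 32766  # assumed limit, taken from the original post


def chop_utf8(text, max_bytes=MAX_TERM_BYTES):
    """Split text into pieces whose UTF-8 encoding is at most max_bytes each.

    Pieces are cut on character boundaries, so no multi-byte character
    is ever split in the middle.
    """
    if max_bytes < 4:
        # A single UTF-8 character can take up to 4 bytes.
        raise ValueError("max_bytes must be at least 4")
    pieces = []
    current = []
    current_size = 0
    for ch in text:
        size = len(ch.encode("utf-8"))
        if current_size + size > max_bytes:
            # The next character would overflow the piece, so close it.
            pieces.append("".join(current))
            current = []
            current_size = 0
        current.append(ch)
        current_size += size
    if current:
        pieces.append("".join(current))
    return pieces


if __name__ == "__main__":
    pieces = chop_utf8("a" * 70000)
    print([len(p.encode("utf-8")) for p in pieces])
```

Each returned piece could then be added as one value of the multivalued field. Note that a match spanning two pieces would not be found, so whether chopping is acceptable depends on the use case.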

Don't forget, Solr/Lucene is an indexer and not a fulltext engine.

Regards
Bernd


On 02.01.19 at 10:23, Kranthi Kumar K wrote:


Re: Solr Size Limitation upto 32 KB files

Erick Erickson
Adding to what Bernd said, _string_ fields that large are almost always
a result of misunderstanding the use case, especially if you
find yourself searching with the q=field:*word* pattern.

If you're trying to search within the string you need a
TextField-based type, not a StrField.
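
To illustrate the difference, here is a simplified Python model, not Solr code. It assumes an untokenized string field produces one term holding the whole value, while a tokenized field produces one term per word; plain whitespace splitting stands in for a real analyzer chain, and largest_term_bytes and MAX_TERM_BYTES are hypothetical names for this sketch.

```python
MAX_TERM_BYTES = 32766  # assumed limit, taken from the original post


def largest_term_bytes(value, tokenized):
    """Return the UTF-8 size of the largest term the value would yield.

    tokenized=False models a string field: the whole value is one term.
    tokenized=True models a text field: whitespace-separated words are terms.
    """
    terms = value.split() if tokenized else [value]
    return max((len(t.encode("utf-8")) for t in terms), default=0)


if __name__ == "__main__":
    document = "solr indexes terms " * 3000
    for tokenized in (False, True):
        size = largest_term_bytes(document, tokenized)
        status = "over" if size > MAX_TERM_BYTES else "within"
        print(f"tokenized={tokenized}: largest term is {size} bytes ({status} the limit)")
```

In this model the same content exceeds the limit as a single untokenized term, but stays far below it once it is broken into words.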

Best,
Erick

On Wed, Jan 2, 2019 at 4:03 AM Bernd Fehling
<[hidden email]> wrote:


Re: Solr Size Limitation upto 32 KB files

Jan Høydahl / Cominvent
In reply to this post by Kranthi Kumar K
You don't say exactly how you index those documents. But check out the requestParsers tag in solrconfig.xml, see https://lucene.apache.org/solr/guide/6_6/requestdispatcher-in-solrconfig.html#RequestDispatcherinSolrConfig-requestParsersElement

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
