Solr5.4 - Indexing a big file (size = 2.4Go)

Solr5.4 - Indexing a big file (size = 2.4Go)

Bruno Mannina-2
Dear Solr users,

I get an "invalid content length" error when I try to index an XML file that is 2.4 GB in size.

I use the SimplePostTool, as described in the documentation, on my Ubuntu server:

>bin/post -port 1234 -c mycollection /home/bruno/2013.xml

It works with smaller files but not with this one, so I suppose the size is the problem.

Is there a parameter I can change to allow big files?

In solrconfig.xml I changed formdataUploadLimitInKB to 4096 and multipartUploadLimitInKB to 4096000, without success.

Do you have any idea?

Many thanks for your help,

Best regards,
Bruno



---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

Re: Solr5.4 - Indexing a big file (size = 2.4Go)

Erick Erickson
Why do you want to index a 2G file in the first place? You can't really do anything with it.

If you deliver it to a browser, the browser will churn forever. If you try to export it, it'll suck up your bandwidth terribly.

If it's a bunch of individual docs (in Solr's XML format), about the only thing that makes sense is to break it up.

This sounds like an XY problem: you've asked how to do X (index a 2G file) without telling us Y (what the use case is).

Best,
Erick

On Wed, May 30, 2018 at 7:18 AM, Bruno Mannina
<[hidden email]> wrote:

> I get an "invalid content length" error when I try to index an XML file
> that is 2.4 GB in size.

RE: Solr5.4 - Indexing a big file (size = 2.4Go)

cleonard
Is it a single 2.4 GB document, or does the 2.4 GB file contain several documents?

There are some limits in solrconfig.xml. Perhaps you are hitting multipartUploadLimitInKB?

    <requestParsers enableRemoteStreaming="true"
                    multipartUploadLimitInKB="2048000"
                    formdataUploadLimitInKB="2048"
                    addHttpRequestToContext="false"/>
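
As a rough sanity check, the limits mentioned in this thread can be compared with the file size. This is only a sketch: the 2.4 GB figure and the two limit values come from the messages above, while treating "KB" as 1024 bytes is an assumption. The last line also compares the file size with the largest signed 32-bit value. Whether a 32-bit length is involved anywhere in the post tool is an assumption that nothing in this thread confirms.

```python
# File size reported in the thread ("2.4 Go"), read as decimal gigabytes.
FILE_SIZE_BYTES = 2_400_000_000

# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def limit_in_bytes(limit_kb: int, bytes_per_kb: int = 1024) -> int:
    """Convert a *LimitInKB setting to bytes (1 KB = 1024 bytes is assumed)."""
    return limit_kb * bytes_per_kb


def fits(limit_kb: int) -> bool:
    """Return True if the file is no larger than the given limit."""
    return FILE_SIZE_BYTES <= limit_in_bytes(limit_kb)


# 2048000 is the value in the snippet above; 4096000 is the value Bruno tried.
for limit_kb in (2048000, 4096000):
    verdict = "fits" if fits(limit_kb) else "too small"
    print(f"multipartUploadLimitInKB={limit_kb}: {verdict}")

print("larger than a signed 32-bit integer:", FILE_SIZE_BYTES > INT32_MAX)
```

Under these assumptions, 2048000 KB is below the file size but 4096000 KB is not, so the multipart limit by itself may not explain the failure once it has been raised.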



RE: Solr5.4 - Indexing a big file (size = 2.4Go)

Bruno Mannina
In reply to this post by Erick Erickson
Hi Erick,

I need to index this file because my boss gave it to me.

It contains around 1.5M docs.

I think I will split the file and index the pieces. That will work better.

Thanks
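
One way the split could be done is sketched below. It assumes the file is in Solr's XML update format, that is, a single <add> root element containing <doc> elements; the thread does not confirm this. The function name `split_solr_xml`, the `part-NNNNN.xml` naming, and the default chunk size are all made up for illustration. The file is streamed with `iterparse`, so it is never loaded into memory in one piece.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_solr_xml(src, out_dir, docs_per_file=50000):
    """Stream top-level <doc> elements from src into numbered <add> files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []   # paths of the chunk files created so far
    out = None     # currently open chunk file, if any
    count = 0      # docs written to the current chunk

    def open_next():
        path = out_dir / f"part-{len(written):05d}.xml"
        written.append(path)
        handle = path.open("w", encoding="utf-8")
        handle.write("<?xml version='1.0' encoding='UTF-8'?>\n<add>\n")
        return handle

    def close(handle):
        handle.write("</add>\n")
        handle.close()

    context = ET.iterparse(str(src), events=("start", "end"))
    _, root = next(context)  # "start" event of the root element
    depth = 1                # we are now inside the root
    for event, elem in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth != 1:
            continue  # nested element, or the root itself closing
        # elem is a direct child of the root; nested docs stay inside it.
        if elem.tag == "doc":
            if out is None:
                out = open_next()
            out.write(ET.tostring(elem, encoding="unicode").strip() + "\n")
            count += 1
            if count == docs_per_file:
                close(out)
                out, count = None, 0
        root.clear()  # drop processed children to keep memory use flat

    if out is not None:
        close(out)
    return written
```

For example, `split_solr_xml("/home/bruno/2013.xml", "/home/bruno/2013-parts")` would write the chunks to a (hypothetical) `2013-parts` directory. Each chunk could then be posted with the same bin/post command as in the first message, with the path changed to the chunk file.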
