Lucene vs Solr Indexing Speed on Sample data Issue!!!

Argho Chatterjee
Hello everyone,

I posted a question on stackoverflow.com after performing a few POCs.

My hardware is a laptop with a single Intel Core i3 processor (4 CPUs as
reported by "dxdiag") and 8 GB of RAM.

My question link:
http://stackoverflow.com/questions/30823314/lucene-vs-solr-indexning-speed-for-sampe-data

No one has been able to solve it so far.
I hope the question I posted is understandable.

Could anyone please help me with the indexing speed of Solr (way
slower) vs. Lucene (way faster)?

I am trying to build a module for real-time indexing and querying, and the
traffic is high. My POC passes with Lucene for handling high indexing
traffic, but Solr is not able to keep up.

Again, my machine spec:
HP, Intel Core i3, 8 GB RAM, TB HDD.

Please let me know if there is a problem with Solr or if I am doing anything
wrong.

Thanks
Argho

Re: Lucene vs Solr Indexing Speed on Sample data Issue!!!

Malcolm Upayavira Holmes
Please post the original question here, so that everything people need
to review your question is included within this thread!

Oh, and for a high-throughput system, 8 GB of RAM doesn't sound like much. A
Lucene index, whether inside Solr or not, benefits from a lot of RAM.

Thanks!

Upayavira

On Mon, Jun 15, 2015, at 05:19 AM, Argho Chatterjee wrote:

> My hardware is a laptop with a single Intel Core i3 processor (4 CPUs as
> reported by "dxdiag") and 8 GB of RAM.
>
> My question link:
> http://stackoverflow.com/questions/30823314/lucene-vs-solr-indexning-speed-for-sampe-data

Re: Lucene vs Solr Indexing Speed on Sample data Issue!!!

Ted Dunning
And what does high throughput actually mean in terms of number of documents
per second and bytes (or terms) per document?
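
One way to put numbers on this question is to time an indexing run and report documents per second, average bytes per document, and bytes per second. The sketch below is only an illustration: the class and method names are made up for this example and are not part of Lucene or Solr, and the figures in `main` are hypothetical, not measurements from this thread.

```java
import java.util.Locale;

final class ThroughputReport {

    /** Documents indexed per second of wall-clock time. */
    static double docsPerSecond(long docs, long elapsedNanos) {
        return docs / (elapsedNanos / 1e9);
    }

    /** Average size of one document in bytes. */
    static double bytesPerDoc(long docs, long totalBytes) {
        return (double) totalBytes / docs;
    }

    /** One-line summary that can be pasted into a post. */
    static String describe(long docs, long totalBytes, long elapsedNanos) {
        double mibPerSecond = totalBytes / (elapsedNanos / 1e9) / (1024.0 * 1024.0);
        return String.format(Locale.ROOT, "%.0f docs/s, %.0f bytes/doc, %.2f MiB/s",
                docsPerSecond(docs, elapsedNanos),
                bytesPerDoc(docs, totalBytes),
                mibPerSecond);
    }

    public static void main(String[] args) {
        // Hypothetical run: 10,000 documents, 20,000,000 bytes in total, 2 seconds.
        System.out.println(describe(10_000, 20_000_000, 2_000_000_000L));
    }
}
```

Reporting all three figures for both the Lucene and the Solr run would make the two tests directly comparable.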



On Mon, Jun 15, 2015 at 11:56 AM, Upayavira wrote:

> Oh, and for a high-throughput system, 8 GB of RAM doesn't sound like much. A
> Lucene index, whether inside Solr or not, benefits from a lot of RAM.

RE: Lucene vs Solr Indexing Speed on Sample data Issue!!!

Fuad Efendi
From my experience, here is a "high throughput" example:

Using a single-threaded SolrJ client, I can index (for example) 1000 documents per second, and that is the maximum "speed" for one thread.
Using 12 threads, I can index 12000 documents per second, because the Solr server has 8 cores and 75% of the processing is CPU-bound.


You can do this easily with Solr + SolrJ; with Lucene you will need much more development effort, but the result is the same.
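
The scaling described above comes from keeping several requests in flight at once. Below is a rough sketch of that pattern using only the JDK. `BatchSink` is a hypothetical stand-in for whatever actually sends documents to the index (in a real client it might wrap a SolrJ `SolrClient.add(...)` call), and the thread count and batch size are illustrative assumptions, not recommendations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class ParallelIndexer {

    /** Hypothetical stand-in for the call that sends one batch to the index. */
    interface BatchSink {
        void send(List<String> batch) throws Exception;
    }

    /**
     * Splits docs into batches and sends them from a fixed pool of worker threads.
     * Returns the number of documents the sink accepted without throwing.
     */
    static long indexAll(List<String> docs, int threads, int batchSize, BatchSink sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong sent = new AtomicLong();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            List<String> batch = new ArrayList<>(docs.subList(start, end));
            pool.submit(() -> {
                try {
                    sink.send(batch);
                    sent.addAndGet(batch.size());
                } catch (Exception e) {
                    // A real client would retry or log; this sketch just skips the batch.
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for all queued batches to finish before reporting the count.
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sent.get();
    }
}
```

Running the same document set with `threads` set to 1 and then to 12 is a simple way to check whether a given setup scales the way the example above suggests.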



Thanks,


http://www.tokenizer.ca

-----Original Message-----
From: Ted Dunning
Sent: June-15-15 3:17 PM
Subject: Re: Lucene vs Solr Indexing Speed on Sample data Issue!!!

And what does high throughput actually mean in terms of number of documents per second and bytes (or terms) per document?




RE: Lucene vs Solr Indexing Speed on Sample data Issue!!!

Fuad Efendi
In reply to this post by Ted Dunning
In general, "out-of-the-box", pre-configured Solr is slower than Lucene with no configuration at all.

From another viewpoint, single-threaded HTTP access is I/O-bound: there is a network round trip of 50 ms before Solr spends 5 nanoseconds indexing the document. Using 128 parallel threads on the client side and fine-tuning Tomcat will help.
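
The round-trip argument can be turned into a back-of-the-envelope bound: a blocking client thread cannot complete more than one request per round trip, so the ceiling grows with the number of threads. The sketch below is a simplified model under stated assumptions. It ignores server-side processing time, the `docsPerRequest` parameter (batching several documents into one request) is an addition for illustration that the post above does not mention, and the class and method names are hypothetical.

```java
final class LatencyBound {

    /** Upper bound on requests per second for blocking clients with a fixed round trip. */
    static double maxRequestsPerSecond(int threads, double roundTripMillis) {
        return threads * (1000.0 / roundTripMillis);
    }

    /** Same bound expressed in documents, assuming a fixed number of documents per request. */
    static double maxDocsPerSecond(int threads, double roundTripMillis, int docsPerRequest) {
        return maxRequestsPerSecond(threads, roundTripMillis) * docsPerRequest;
    }

    public static void main(String[] args) {
        System.out.println(maxDocsPerSecond(1, 50.0, 1));    // prints 20.0
        System.out.println(maxDocsPerSecond(128, 50.0, 1));  // prints 2560.0
        System.out.println(maxDocsPerSecond(1, 50.0, 1000)); // prints 20000.0
    }
}
```

With a 50 ms round trip and one document per request, a single thread tops out at 20 documents per second no matter how fast the server indexes, which is why parallel client threads make such a difference in this model.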




-----Original Message-----
From: Fuad Efendi
Sent: June-15-15 4:18 PM
Subject: RE: Lucene vs Solr Indexing Speed on Sample data Issue!!!

From my experience, "high throughput" example:

Using a single-threaded SolrJ client, I can index (for example) 1000 documents per second, and that is the maximum "speed" for one thread.
Using 12 threads, I can index 12000 documents per second, because the Solr server has 8 cores and 75% of the processing is CPU-bound.

