Ingestion not scaling horizontally as I add more cores to Solr

Ingestion not scaling horizontally as I add more cores to Solr

Shashank Pedamallu
Hi,



I’m trying to find the upper limit of ingestion throughput and have run the following experiments. In each experiment, I’m ingesting random documents with 5 fields.


Number of cores    Documents ingested per second per core
1                  89000
3                  33000
5                  18000


As you can see, the per-core ingestion rate is not scaling horizontally as I add more cores. Instead, the total ingestion rate for the Solr JVM tops out at around 90k documents per second.
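
For reference, the plateau can be checked with a quick back-of-the-envelope calculation. The sketch below only multiplies the per-core rates from the table by the number of cores; the names `per_core_rate` and `total_rate` are made up for illustration.

```python
# Per-core ingestion rates from the table above: {number of cores: docs/sec per core}
per_core_rate = {1: 89000, 3: 33000, 5: 18000}


def total_rate(cores: int, rate_per_core: int) -> int:
    """Aggregate docs/sec across all cores in the single Solr JVM."""
    return cores * rate_per_core


totals = {cores: total_rate(cores, rate) for cores, rate in per_core_rate.items()}

for cores, total in totals.items():
    print(f"{cores} core(s): {total} docs/sec in total")
```

The totals stay between roughly 89k and 99k documents per second in all three runs, which is why the per-core figure drops as cores are added.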


From the iostat and top commands, I do not see any bottleneck in IOPS or CPU, respectively. CPU usage is around 65%, and a sample of iostat output is below:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          55.32    0.00    2.33    1.64    0.00   40.71

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda5           2523.00     45812.00    298312.00      45812     298312


Can someone please guide me on how to debug this further and root-cause the bottleneck that is preventing ingestion from scaling horizontally?


Thanks,

Shashank

Re: Ingestion not scaling horizontally as I add more cores to Solr

Gus Heck
Ingested how? Sounds like your document sending mechanism is maxed, not the
solr cluster...


--
http://www.the111shift.com

Re: Ingestion not scaling horizontally as I add more cores to Solr

Shashank Pedamallu
Hi Gus,

Thanks for the reply. I’m sending via JMeter running on my local machine to Solr running on a remote VM.

Thanks,
Shashank



Re: Ingestion not scaling horizontally as I add more cores to Solr

Gus Heck
OK, then here are a few things to check...

   - Did you set up an actual multi-node cluster, or are you running
   this all on one box?
   - Are you configuring JMeter to send with multiple threads?
   - Are they all sending to the same node, or are you distributing across
   nodes? Is there a load balancer?
   - Are you sending from a machine on the same network as the machines in
   the Solr cluster?
   - If you are sending requests up to the cloud from your local machine,
   that is frequently a slow link.
   - Also, don't forget to check your ZooKeeper cluster's health... if it's
   bogged down, that will slow down Solr.

If you have all machines on the same network, many threads, load balancing,
and no questionable equipment (or networking limitations put in place by
IT) in the middle, then something (either CPU or network interface) should
be maxed out somewhere on at least one machine, either on the JMeter side
or the Solr side.

-Gus


--
http://www.the111shift.com

Re: Ingestion not scaling horizontally as I add more cores to Solr

Erick Erickson
And I'd add
- are you sending one document at a time or batching them up? See:
https://lucidworks.com/2015/10/05/really-batch-updates-solr-2/

Best,
Erick


Re: Ingestion not scaling horizontally as I add more cores to Solr

Shashank Pedamallu
In reply to this post by Gus Heck
- Did you set up an actual multi-node cluster, or are you running this all on one box?
Sorry, I should have mentioned this earlier. I’m running Solr in non-cloud mode. It is just a single-node Solr.

- Are you configuring JMeter to send with multiple threads?
Yes, multiple threads, each looping a fixed number of times.

- Are they all sending to the same node, or are you distributing across nodes? Is there a load balancer?
Yes, all to the same node, since there is only one node.

- If you are sending requests up to the cloud from your local machine, that is frequently a slow link.
Not a public cloud. It is our private one.

- Are you sending one document at a time or batching them up?
Batching them up, about 1000 documents per request.
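
For anyone who wants to reproduce this kind of load without JMeter, the setup described above (several sender threads, each posting batches of about 1000 documents) could be sketched roughly as follows. This is a hypothetical Python stand-in, not the actual JMeter test plan: the host, core name, thread count, and helper names are all assumptions, and it only relies on Solr accepting a JSON array of documents POSTed to a core's /update handler.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(docs, size=1000):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_update_request(base_url, core, batch):
    """Build (but do not send) a JSON update request for one batch."""
    url = f"{base_url}/solr/{core}/update"
    body = json.dumps(batch).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send one prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status


def index_all(base_url, core, docs, threads=8, batch_size=1000):
    """Send all batches to one core using a pool of sender threads."""
    requests = (
        build_update_request(base_url, core, batch)
        for batch in batches(docs, batch_size)
    )
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(send, requests))
```

A call such as `index_all("http://localhost:8983", "core1", docs)` (hypothetical host and core name) would push every batch through the thread pool. Commits are left out of the sketch on purpose.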

Thanks,
Shashank


Re: Ingestion not scaling horizontally as I add more cores to Solr

Erick Erickson
OK, so I'm assuming your indexer indexes to 1, 3 and 5 separate cores
depending on how many are available, right? And these cores are essentially
totally independent.

I'd guess your gating factor is your ingestion process. Try spinning up two
identical ones from two separate clients. Eventually you should be able to
max out your CPU as you add cores. The fact that your indexing rate is
fairly constant at 90K docs/sec is a red flag that that's the rate you're
feeding docs to Solr.

At some point you'll max out your CPU and that'll be the limit.

Best,
Erick


Re: Ingestion not scaling horizontally as I add more cores to Solr

Shashank Pedamallu
They are separate cases. In the first attempt I was ingesting into only 1 core, then into 3 cores, and then into 5 cores. Yes, they are completely independent cores.

I think I was not reading the iostat output correctly. With the -x option, the ‘avgrq-sz’ value is constantly above 300. From some reading online, I understand that a three-digit value for this parameter is a red flag. I’m now trying to run the experiments on a better disk.
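
As a rough cross-check, an average request size can also be estimated from the plain iostat sample in the first post by dividing throughput by tps. The sketch below assumes avgrq-sz is reported in 512-byte sectors and that iostat's kB means 1024 bytes; the function name is made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: avgrq-sz is counted in 512-byte sectors
KB_BYTES = 1024     # assumption: iostat's kB is 1024 bytes

# Values from the iostat sample in the first post (device sda5)
tps = 2523.00
kb_read_per_s = 45812.00
kb_written_per_s = 298312.00


def avg_request_sectors(tps, kb_read_per_s, kb_written_per_s):
    """Estimate the average I/O request size in sectors."""
    bytes_per_s = (kb_read_per_s + kb_written_per_s) * KB_BYTES
    return bytes_per_s / tps / SECTOR_BYTES


sectors = avg_request_sectors(tps, kb_read_per_s, kb_written_per_s)
print(f"about {sectors:.1f} sectors per request")
```

Under those assumptions the sample works out to a little over 270 sectors (roughly 136 kB) per request, which is in the same range as the avgrq-sz values above. A large average request size may also just reflect large sequential writes, so columns such as await and %util in the -x output are probably worth checking alongside it.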

Yes, the intent is to max out the cpu to find the maximum load the system can handle.

Thanks,
Shashank


Re: Ingestion not scaling horizontally as I add more cores to Solr

Shawn Heisey-2
In reply to this post by Shashank Pedamallu
On 1/10/2018 12:58 PM, Shashank Pedamallu wrote:
> As you can see, the number of documents being ingested per core is not scaling horizontally as I'm adding more cores. Rather the total number of documents getting ingested for Solr JVM is being topped around 90k documents per second.

I would call 90K documents per second a very respectable speed.  I can't
get my indexing to happen at anywhere near that rate.  My indexing is
not multi-threaded, though.

>  From the iostats and top commands, I do not see any bottlenecks with the iops or cpu respectively, CPU usage is around 65% and a sample of iostats is below:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>
>            55.32    0.00    2.33    1.64    0.00   40.71
>
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
>
> sda5           2523.00     45812.00    298312.00      45812     298312

Nearly 300 megabytes per second write speed?  That's a LOT of data.
This storage must be quite a bit better than a single spinning disk.
You won't get that kind of sustained transfer speed out of standard
spinning disks unless they are using something like RAID10 or RAID0.
This transfer speed is also well beyond the capabilities of Gigabit
Ethernet.
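
To put a number on the Gigabit Ethernet comparison, here is a small sketch that converts the kB_wrtn/s figure from the quoted iostat sample into bits per second. It assumes iostat's kB means 1024 bytes and uses the nominal 1 Gbit/s line rate; the names are made up for illustration.

```python
KB_BYTES = 1024  # assumption: iostat's kB is 1024 bytes
GIGABIT_ETHERNET_BITS_PER_S = 1_000_000_000  # nominal line rate

kb_written_per_s = 298312.00  # kB_wrtn/s from the quoted iostat sample


def to_bits_per_s(kb_per_s):
    """Convert a kB/s figure to bits per second."""
    return kb_per_s * KB_BYTES * 8


write_bits_per_s = to_bits_per_s(kb_written_per_s)
ratio = write_bits_per_s / GIGABIT_ETHERNET_BITS_PER_S

print(f"disk writes: {write_bits_per_s / 1e9:.2f} Gbit/s")
print(f"that is {ratio:.2f}x the nominal Gigabit Ethernet rate")
```

So the disk is writing at well over twice what a single Gigabit link could carry, before any protocol overhead is taken into account.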

When Gus asked whether you were sending documents to the cloud from your
local machine, I don't think he was referring to a public cloud.  I
think he assumed you were running SolrCloud, so "cloud" was probably
referring to your Solr installation, not a public cloud service.  If I
had to guess, I think the intent was to find out what caliber of machine
you're using to send the indexing requests.

I don't know if the bottleneck is on the client side or the server side.
But I would imagine that with everything on a single machine, you may
not be able to get the ingestion rate to go much higher.

Is the jmeter running on a different machine from Solr or on the same
machine?

Thanks,
Shawn

Re: Ingestion not scaling horizontally as I add more cores to Solr

Shashank Pedamallu
Thank you very much for the reply, Shawn.

- Is the jmeter running on a different machine from Solr or on the same machine?
Solr is running on a dedicated VM. I’ve also tried splitting the client requests across multiple machines, but the result was no different. So I don’t think the bottleneck is on the client side.

Thanks,
Shashank



Re: Ingestion not scaling horizontally as I add more cores to Solr

Kevin Risden-3
When you say "multiple machines", were these all local machines, VMs, or
something else? I worked with a group once that used laptops to benchmark a
service, and a WiFi network limit turned out to be causing the weird
results. LAN connections or, even better, a dedicated client machine would
help push more documents.

Kevin Risden


Re: Ingestion not scaling horizontally as I add more cores to Solr

Shashank Pedamallu
Thank you for the reply, Kevin. I was using 6 VMs from our private cloud. Five of them acted as clients, each ingesting data into its own independent core. The sixth VM hosts Solr and receives the ingest requests for all cores. Since they are all on the same network, I don’t think network bandwidth should be the limit for the volume of requests I’m sending.

Thanks,
Shashank


Re: Ingestion not scaling horizontally as I add more cores to Solr

Shawn Heisey-2
On 1/11/2018 11:50 AM, Shashank Pedamallu wrote:
> Thank you for the reply Kevin. I was using 6 vms from our private cloud. 5 among them, I was using as clients to ingest data on 5 independent cores. One vm is hosting the Solr which is where all ingest requests are received for all cores. Since they are all on same network, I think they should not be limited by the network bandwidth for the amount of requests I’m sending.

How large are the documents that you are indexing?  If they are 1K
(which would be a pretty small document), then 90K of them per second is
about 88 megabytes per second of raw data, which is near the practical
upper-end bandwidth limit of a gigabit ethernet connection.  The
theoretical maximum for gigabit ethernet is 125 megabytes per second,
but protocol overhead (at the ethernet, IP, and TCP layers) typically
limits the real-world achievable throughput of TCP-based communication
over gigabit to something lower, perhaps 100 megabytes per second. 
Additional overhead from the HTTP layer and the request format (javabin,
xml, json, csv, etc) would reduce it a little bit more.  If the
documents are bigger than 1K, then it would require even more network
bandwidth.
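
The arithmetic above can be written out as a small sketch. The 1K document size is the hypothetical figure from this message, not a measured value, and the 100 megabytes per second "practical" limit is the rough estimate given above; the function and constant names are made up for illustration.

```python
# Theoretical gigabit payload ceiling: 1,000,000,000 bits/s divided by 8.
GIGABIT_THEORETICAL_BYTES_PER_SEC = 1_000_000_000 // 8

# Rough real-world TCP throughput over gigabit (an estimate, not a measurement).
GIGABIT_PRACTICAL_BYTES_PER_SEC = 100_000_000


def raw_bytes_per_sec(docs_per_sec, doc_size_bytes):
    """Raw document payload per second, ignoring HTTP and format overhead."""
    return docs_per_sec * doc_size_bytes


def max_docs_per_sec(link_bytes_per_sec, doc_size_bytes):
    """Most documents per second a link could carry at a given document size."""
    return link_bytes_per_sec // doc_size_bytes


DOC_SIZE_BYTES = 1024   # hypothetical 1K document
DOCS_PER_SEC = 90_000   # aggregate rate reported in this thread

payload = raw_bytes_per_sec(DOCS_PER_SEC, DOC_SIZE_BYTES)
payload_mib = payload / (1024 * 1024)
share_of_practical = payload / GIGABIT_PRACTICAL_BYTES_PER_SEC
ceiling = max_docs_per_sec(GIGABIT_PRACTICAL_BYTES_PER_SEC, DOC_SIZE_BYTES)

print(f"raw payload: {payload_mib:.1f} MiB/s")
print(f"share of practical gigabit limit: {share_of_practical:.0%}")
print(f"docs/s ceiling at 1K per doc: {ceiling}")
```

Plugging in the real average document size (and the real link speed, if it is not gigabit) would show whether the observed plateau lines up with what the network can carry.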

If the VMs are on different physical hosts, then you are likely to need
an actual network connection for traffic between them.  Having them all
on the same physical host might actually increase the amount of
available network bandwidth, because the traffic might never need to
leave the machine and travel over a real physical network.  But if they
are all on the same physical host, then there would be less disk I/O
bandwidth available.  Running all VMs on the same physical host is not
recommended for production, because it means that the entire
installation goes down if the physical host dies.

Thanks,
Shawn