Solr index slow response


Solr index slow response

Aaron Yingcai Sun
Hello, Solr!


We are having performance issues when sending documents to Solr for indexing. The response time is very slow and sometimes unpredictable.


The Solr server is running on quite a powerful machine: 32 CPUs, 400 GB RAM, with 300 GB reserved for Solr. While this is happening, CPU usage is around 30% and memory usage is 34%; I/O also looks fine according to iotop, and the storage is SSD.


Our application sends 100 documents to Solr per request, JSON encoded; the size is around 5 MB each time. Sometimes the response time is under 1 second, sometimes it can be 300 seconds, and the slow responses happen very often.


"Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 3600000ms; if 1000000 uncommited docs"


There are around 100 clients sending these documents at the same time, but each client makes a blocking call, waiting for the HTTP response before sending the next request.


I tried making the number of documents per request smaller, such as 20, but I still see slow response times now and then, like 80 seconds.


Could you give some hints on how to improve the response time? Solr does not seem very loaded, so there must be a way to make the responses faster.


BRs

//Aaron




Re: Solr index slow response

Emir Arnautović
Hi Aaron,
Which version of Solr? How did you configure your heap? Is it standalone Solr or SolrCloud? A single server? Do you use a monitoring tool? Do you see spikes and pauses, or is CPU usage constant?

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/





Re: Solr index slow response

Aaron Yingcai Sun
Hello, Emir,

Thanks for the reply. Here are the Solr version and heap info; it is a standalone single Solr server. I don't have a monitoring tool connected, only 'top'. I have not seen a CPU spike so far; when the slow responses happen, CPU usage is not high at all, around 30%.


# curl 'http://.../solr/admin/info/system?wt=json&indent=true'
{
  "responseHeader":{
    "status":0,
    "QTime":27},
  "mode":"std",
  "solr_home":"/ardome/solr",
  "lucene":{
    "solr-spec-version":"6.5.1",
    "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 - jimczi - 2017-04-21 12:23:42",
    "lucene-spec-version":"6.5.1",
    "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 - jimczi - 2017-04-21 12:17:15"},
  "jvm":{
    "version":"1.8.0_144 25.144-b01",
    "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
    "spec":{
      "vendor":"Oracle Corporation",
      "name":"Java Platform API Specification",
      "version":"1.8"},
    "jre":{
      "vendor":"Oracle Corporation",
      "version":"1.8.0_144"},
    "vm":{
      "vendor":"Oracle Corporation",
      "name":"Java HotSpot(TM) 64-Bit Server VM",
      "version":"25.144-b01"},
    "processors":32,
    "memory":{
      "free":"69.1 GB",
      "total":"180.2 GB",
      "max":"266.7 GB",
      "used":"111 GB (%41.6)",
      "raw":{
        "free":74238728336,
        "total":193470136320,
        "max":286331502592,
        "used":119231407984,
        "used%":41.64103736566334}},
    "jmx":{
      "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
      "classpath":"...",
      "commandLineArgs":["-Xms100G",
        "-Xmx300G",
        "-DSTOP.PORT=8079",
        "-DSTOP.KEY=..",
        "-Dsolr.solr.home=..",
        "-Djetty.port=8983"],
      "startTime":"2019-03-18T09:35:27.892Z",
      "upTimeMS":9258422}},
  "system":{
    "name":"Linux",
    "arch":"amd64",
    "availableProcessors":32,
    "systemLoadAverage":14.72,
    "version":"3.0.101-311.g08a8a9d-default",
    "committedVirtualMemorySize":2547960700928,
    "freePhysicalMemorySize":4530696192,
    "freeSwapSpaceSize":3486846976,
    "processCpuLoad":0.3257436126790475,
    "processCpuTime":93869450000000,
    "systemCpuLoad":0.3279781055816521,
    "totalPhysicalMemorySize":406480175104,
    "totalSwapSpaceSize":4302303232,
    "maxFileDescriptorCount":32768,
    "openFileDescriptorCount":385,
    "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
    "uptime":" 13:09pm  up 5 days 21:23,  7 users,  load average: 14.72, 12.28, 11.48\n"}}






Re: Solr index slow response

Toke Eskildsen-2
In reply to this post by Aaron Yingcai Sun
On Mon, 2019-03-18 at 10:47 +0000, Aaron Yingcai Sun wrote:
> The Solr server is running on quite a powerful machine: 32 CPUs, 400 GB RAM,
> with 300 GB reserved for Solr, [...]

300GB for Solr sounds excessive.

> Our application sends 100 documents to Solr per request, JSON encoded;
> the size is around 5 MB each time. Sometimes the response time is
> under 1 second, sometimes it can be 300 seconds, and the slow
> responses happen very often.
> ...
> There are around 100 clients sending these documents at the same
> time, but each client makes a blocking call, waiting for the HTTP
> response before sending the next request.

100 clients * 5MB/batch = at most 500MB. Or maybe you meant 100 clients
* 100 documents * 5MB/document = at most 50GB? Either way it is a long
way from 300GB, and the stats you provide further down the thread
indicate that you are overprovisioning quite a lot:

"memory":{
      "free":"69.1 GB",
      "total":"180.2 GB",
      "max":"266.7 GB",
      "used":"111 GB (%41.6)",

Intermittent slow response times are a known effect of having large
Java heaps, due to stop-the-world garbage collections.

Try dialing Xmx _way_ down: If your batches are only 5MB each, try
Xmx=20g or less. I know that the stats above say that Solr uses 111GB,
but the JVM has a tendency to expand the heap quite a lot when it is
getting hammered. If you want to check beforehand, you can see how much
memory is freed by full GCs in the GC log.

- Toke Eskildsen, Royal Danish Library
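
For illustration, a minimal sketch of JVM options that would test this suggestion on a Java 8 HotSpot JVM; the 20g heap size and the log path are assumptions, not values taken from the setup above:

    # pin the heap low and log GC activity, including stop-the-world pauses
    -Xms20g -Xmx20g
    -Xloggc:/var/solr/logs/gc.log
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    -XX:+PrintGCApplicationStoppedTime

With -XX:+PrintGCApplicationStoppedTime enabled, the length of each pause in the GC log can be compared directly against the slow request times.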



Re: Solr index slow response

Emir Arnautović
In reply to this post by Aaron Yingcai Sun
Hi Aaron,
Without looking too much into the numbers, my bet would be that the large heap is causing the issues. I would decrease it significantly (<30GB) and see if that is enough for your max load. Also, disable swap or reduce swappiness to the minimum.

In any case, you should install some monitoring tool that would help you do better analysis when you run into problems. One such tool is our monitoring solution: https://sematext.com/spm

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/





Re: Solr index slow response

Emir Arnautović
One more thing - it is considered a good practice to use the same value for Xmx and Xms.

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
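
For example, assuming the instance is started through the stock bin/solr.in.sh script (the 20g value below is only illustrative), the heap can be pinned like this:

    # solr.in.sh: SOLR_HEAP sets both -Xms and -Xmx to the same value
    SOLR_HEAP="20g"
    # equivalent explicit form:
    # SOLR_JAVA_MEM="-Xms20g -Xmx20g"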





Re: Solr index slow response

Aaron Yingcai Sun
In reply to this post by Toke Eskildsen-2
I'm a bit confused: why would a large heap size make it slower? Doesn't a large heap give it enough room so that it isn't busy doing GC all the time?

My HTTP/JSON request contains 100 documents; the total size of the 100 documents is around 5 MB, and there are ~100 clients sending those requests continuously.

Previously the JVM was set to a max of 32 GB and the speed was even worse; now it's running with min 100 GB, max 300 GB, and it uses around 100 GB.


This page suggests using a smaller number of documents per request: https://wiki.apache.org/solr/SolrPerformanceProblems


So I tried to reduce the number, but I still get lots of responses with large QTime:

190318-142652.695-160214 DBG1:doc_count: 10 , doc_size: 609  KB, Res code: 200, QTime: 47918 ms, Request time: 47921 ms.
190318-142652.704-160179 DBG1:doc_count: 10 , doc_size: 568  KB, Res code: 200, QTime: 36919 ms, Request time: 36922 ms.
190318-142652.780-160197 DBG1:doc_count: 10 , doc_size: 609  KB, Res code: 200, QTime: 36082 ms, Request time: 36084 ms.
190318-142652.859-160200 DBG1:doc_count: 10 , doc_size: 569  KB, Res code: 200, QTime: 36880 ms, Request time: 36882 ms.
190318-142653.131-160148 DBG1:doc_count: 10 , doc_size: 608  KB, Res code: 200, QTime: 37222 ms, Request time: 37224 ms.
190318-142653.154-160211 DBG1:doc_count: 10 , doc_size: 541  KB, Res code: 200, QTime: 37241 ms, Request time: 37243 ms.
190318-142653.223-163490 DBG1:doc_count: 10 , doc_size: 589  KB, Res code: 200, QTime: 37174 ms, Request time: 37176 ms.
190318-142653.359-160154 DBG1:doc_count: 10 , doc_size: 592  KB, Res code: 200, QTime: 37008 ms, Request time: 37011 ms.
190318-142653.497-163491 DBG1:doc_count: 10 , doc_size: 583  KB, Res code: 200, QTime: 24828 ms, Request time: 24830 ms.
190318-142653.987-160208 DBG1:doc_count: 10 , doc_size: 669  KB, Res code: 200, QTime: 23900 ms, Request time: 23902 ms.
190318-142654.114-160208 DBG1:doc_count: 10 , doc_size: 544  KB, Res code: 200, QTime: 121 ms, Request time: 122 ms.
190318-142654.233-160208 DBG1:doc_count: 10 , doc_size: 536  KB, Res code: 200, QTime: 113 ms, Request time: 115 ms.
190318-142654.354-160208 DBG1:doc_count: 10 , doc_size: 598  KB, Res code: 200, QTime: 116 ms, Request time: 117 ms.
190318-142654.466-160208 DBG1:doc_count: 10 , doc_size: 546  KB, Res code: 200, QTime: 107 ms, Request time: 108 ms.
190318-142654.586-160208 DBG1:doc_count: 10 , doc_size: 566  KB, Res code: 200, QTime: 114 ms, Request time: 115 ms.
190318-142654.687-160208 DBG1:doc_count: 10 , doc_size: 541  KB, Res code: 200, QTime: 96 ms, Request time: 98 ms.
190318-142654.768-160208 DBG1:doc_count: 10 , doc_size: 455  KB, Res code: 200, QTime: 75 ms, Request time: 77 ms.
190318-142654.870-160208 DBG1:doc_count: 10 , doc_size: 538  KB, Res code: 200, QTime: 97 ms, Request time: 98 ms.
190318-142654.967-160208 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 92 ms, Request time: 93 ms.
190318-142655.096-160208 DBG1:doc_count: 10 , doc_size: 672  KB, Res code: 200, QTime: 124 ms, Request time: 125 ms.
190318-142655.210-160208 DBG1:doc_count: 10 , doc_size: 605  KB, Res code: 200, QTime: 108 ms, Request time: 110 ms.
190318-142655.304-160208 DBG1:doc_count: 10 , doc_size: 481  KB, Res code: 200, QTime: 89 ms, Request time: 90 ms.
190318-142655.410-160208 DBG1:doc_count: 10 , doc_size: 468  KB, Res code: 200, QTime: 101 ms, Request time: 102 ms.





Re: Solr index slow response

Aaron Yingcai Sun
In reply to this post by Emir Arnautović
Hi, Emir,

My system used to run with max 32 GB and the response time was bad as well. Swap is set to 4 GB, with 3.2 GB free; I doubt swap would affect it since there is so much free memory.

I could try setting Xms and Xmx to the same value, but I doubt how much that would change the response time.


BRs

//Aaron



Re: Solr index slow response

Emir Arnautović
In reply to this post by Aaron Yingcai Sun
Hi Aaron,
You are right - a large heap means that there will not be major GCs all the time, but eventually one will happen, and the larger the heap, the longer it will take. So with a 300GB heap it takes the observed 300s. If you used to run with a 32GB heap and it was slow, that probably means the heap was too small for your load, but if you did not run into OOM, it means it was starving yet could still handle the load.
In any case, I would not go with a super large heap. One option is to split your server into multiple Solr instances; that would let you run more instances with smaller heaps. I am not sure about your business case and whether it is possible to do this without switching to SolrCloud, but if you have 100s of clients that are completely separate, you can simply split the clients across several Solr instances running on the same server and have some logic that routes each client to the right instance.
If your average doc size is 5MB, then I would reduce the number of documents per request, which will reduce some of the load on the heap. Unfortunately, the only way to answer such questions is to run some tests.

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
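
As a rough sketch of the instance-splitting option, assuming the standard bin/solr launcher (the port, Solr home and heap size below are placeholders, not values from this thread):

    # second standalone instance on the same machine, with its own small heap
    bin/solr start -p 8984 -s /path/to/solr2/home -m 20g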





Re: Solr index slow response

Emir Arnautović
In reply to this post by Aaron Yingcai Sun
A 4GB swap on a 400GB machine does not make much sense, so disable it. Even with 4GB, some pages might get swapped out, and if those are Solr pages, it will affect Solr.

Setting Xms and Xmx to the same value will not solve your issue, but you will avoid heap resizes as the heap grows beyond Xms.

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
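
On Linux, that would typically be done along these lines (illustrative commands, run as root):

    swapoff -a                    # disable swap entirely
    sysctl -w vm.swappiness=1     # or keep swap but minimize swapping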





Re: Solr index slow response

Chris Ulicny
In reply to this post by Aaron Yingcai Sun
One other thing to look at besides the heap is your commit settings. We've
experienced something similar, and changing commit settings alleviated the
issue.

Are you opening a new searcher on every hard commit? If so, you might want to
reconsider and use the soft commit for the hourly creation of a new searcher.

The hard commit interval should be much lower, probably something on the
order of seconds (15000 ms) instead of hours (currently 3600000 ms). When the
hard commit fires, numerous merges might be firing off in the background
due to the volume of documents you are indexing, which might explain the
periodic bad response times shown in your logs.

It would depend on your specific scenario, but here's our setup. During
long periods of constant indexing of documents into a staging collection (~2
billion documents), we have the following commit settings:

softcommit: 3600000ms (for periodic validation of data, since it's not in
production)
hardcommit: openSearcher -> false, 15000ms (no document limit)

This makes the documents available for searching every hour, but doesn't
result in the large bursts of IO due to the infrequent hard commits.

For more info, Erick Erickson has a great write up:
https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Chris
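
For reference, a minimal sketch of these settings inside the <updateHandler> section of solrconfig.xml; the intervals are the ones mentioned in this message, so treat it as an illustration rather than a drop-in config:

    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:3600000}</maxTime>
    </autoSoftCommit>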


Re: Solr index slow response

Aaron Yingcai Sun
Hello, Chris


Thanks for the tips. I tried setting it as you suggested, but I don't see much improvement. Since I don't need the documents visible immediately, soft commit is disabled entirely.

The slow responses happen every few seconds; if they happened hourly I would suspect the hourly auto-commit, but they happen much more frequently. I don't see any CPU/RAM/network I/O/disk I/O bottleneck at the OS level. It just looks like the Solr server is blocking internally.


Before (solrconfig.xml):
      <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
      <openSearcher>true</openSearcher>

After:
      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
      <openSearcher>false</openSearcher>



190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 1405 ms, Request time: 1407 ms.
190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res code: 200, QTime: 1357 ms, Request time: 1360 ms.
190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 90 ms, Request time: 92 ms.
190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res code: 200, QTime: 5306 ms, Request time: 5308 ms.
190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res code: 200, QTime: 4775 ms, Request time: 4777 ms.
190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 20222 ms, Request time: 20224 ms.
190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 27732 ms, Request time: 27735 ms.
190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616  KB, Res code: 200, QTime: 28106 ms, Request time: 28109 ms.
190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 16703 ms, Request time: 16706 ms.
190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540  KB, Res code: 200, QTime: 28216 ms, Request time: 28218 ms.
190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673  KB, Res code: 200, QTime: 28133 ms, Request time: 28136 ms.
190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 27909 ms, Request time: 27912 ms.
190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529  KB, Res code: 200, QTime: 28172 ms, Request time: 28174 ms.
190318-162815.056-189986 DBG1:doc_count: 10 , doc_size: 602  KB, Res code: 200, QTime: 11375 ms, Request time: 11378 ms.
190318-162815.100-189988 DBG1:doc_count: 10 , doc_size: 530  KB, Res code: 200, QTime: 8663 ms, Request time: 8666 ms.
190318-162815.275-189980 DBG1:doc_count: 10 , doc_size: 553  KB, Res code: 200, QTime: 27526 ms, Request time: 27528 ms.
190318-162815.283-189997 DBG1:doc_count: 10 , doc_size: 600  KB, Res code: 200, QTime: 27529 ms, Request time: 27535 ms.
190318-162815.318-189961 DBG1:doc_count: 10 , doc_size: 661  KB, Res code: 200, QTime: 16617 ms, Request time: 16621 ms.
190318-162815.484-189952 DBG1:doc_count: 10 , doc_size: 549  KB, Res code: 200, QTime: 11707 ms, Request time: 11711 ms.
190318-162815.485-189993 DBG1:doc_count: 10 , doc_size: 618  KB, Res code: 200, QTime: 28315 ms, Request time: 28321 ms.
190318-162816.216-189972 DBG1:doc_count: 10 , doc_size: 493  KB, Res code: 200, QTime: 1320 ms, Request time: 1322 ms.
190318-162816.354-189972 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 130 ms, Request time: 132 ms.
190318-162816.471-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 111 ms, Request time: 113 ms.
190318-162816.586-189972 DBG1:doc_count: 10 , doc_size: 554  KB, Res code: 200, QTime: 110 ms, Request time: 111 ms.
190318-162816.716-189972 DBG1:doc_count: 10 , doc_size: 590  KB, Res code: 200, QTime: 124 ms, Request time: 125 ms.
190318-162820.494-189972 DBG1:doc_count: 10 , doc_size: 583  KB, Res code: 200, QTime: 3772 ms, Request time: 3774 ms.
190318-162820.574-189953 DBG1:doc_count: 10 , doc_size: 550  KB, Res code: 200, QTime: 32802 ms, Request time: 32804 ms.
190318-162820.609-189991 DBG1:doc_count: 10 , doc_size: 586  KB, Res code: 200, QTime: 33787 ms, Request time: 33790 ms.
190318-162820.621-189976 DBG1:doc_count: 10 , doc_size: 572  KB, Res code: 200, QTime: 33397 ms, Request time: 33400 ms.
190318-162820.622-189985 DBG1:doc_count: 10 , doc_size: 498  KB, Res code: 200, QTime: 32917 ms, Request time: 32919 ms.
190318-162820.987-190005 DBG1:doc_count: 10 , doc_size: 629  KB, Res code: 200, QTime: 22207 ms, Request time: 22209 ms.
190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584  KB, Res code: 200, QTime: 22800 ms, Request time: 22802 ms.
190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670  KB, Res code: 200, QTime: 34193 ms, Request time: 34195 ms.
190318-162821.062-189983 DBG1:doc_count: 10 , doc_size: 675  KB, Res code: 200, QTime: 22250 ms, Request time: 22252 ms.




BRs

//Aaron

________________________________
From: Chris Ulicny <[hidden email]>
Sent: Monday, March 18, 2019 2:54:25 PM
To: [hidden email]
Subject: Re: Solr index slow response

One other thing to look at besides the heap is your commit settings. We've
experienced something similar, and changing commit settings alleviated the
issue.

Are you opening a search on every hardcommit? If so, you might want to
reconsider and use the softcommit for the hourly creation of a new searcher.

The hardcommit interval should be much lower. Probably something on the
order of seconds (15000) instead of hours (currently 3600000). When the
hard commit fires, numerous merges might be firing off in the background
due to the volume of documents you are indexing, which might explain the
periodic bad response times shown in your logs.

It would depend on your specific scenario, but here's our setup. During
long periods of constant indexing of documents to a staging collection (~2
billion documents), we have following commit settings

softcommit: 3600000ms (for periodic validation of data, since it's not in
production)
hardcommit: openSearcher -> false, 15000ms (no document limit)

This makes the documents available for searching every hour, but doesn't
result in the large bursts of IO due to the infrequent hard commits.

For more info, Erick Erickson has a great write up:
https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Chris

On Mon, Mar 18, 2019 at 9:36 AM Aaron Yingcai Sun <[hidden email]> wrote:

> Hi, Emir,
>
> My system used to run with max 32GB, the response time is bad as well.
> swap is set to 4GB, there 3.2 free, I doubt swap would affect it since
> there is such huge free memory.
>
> I could try to with set Xms and Xmx to the same value, but I doubt how
> much would that change the response time.
>
>
> BRs
>
> //Aaron
>
> ________________________________
> From: Emir Arnautović <[hidden email]>
> Sent: Monday, March 18, 2019 2:19:19 PM
> To: [hidden email]
> Subject: Re: Solr index slow response
>
> Hi Aaron,
> Without looking too much into numbers, my bet would be that it is large
> heap that is causing issues. I would decrease is significantly (<30GB) and
> see if it is enough for your max load. Also, disable swap or reduce
> swappiness to min.
>
> In any case, you should install some monitoring tool that would help you
> do better analysis when you run into problems. One such tool is our
> monitoring solution: https://sematext.com/spm
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 18 Mar 2019, at 13:14, Aaron Yingcai Sun <[hidden email]> wrote:
> >
> > Hello, Emir,
> >
> > Thanks for the reply, this is the solr version and heap info, standalone
> single solr server. I don't have monitor tool connected. only look at
> 'top', has not seen cpu spike so far, when the slow response happens, cpu
> usage is not high at all, around 30%.
> >
> >
> > # curl 'http://.../solr/admin/info/system?wt=json&indent=true'
> > {
> >  "responseHeader":{
> >    "status":0,
> >    "QTime":27},
> >  "mode":"std",
> >  "solr_home":"/ardome/solr",
> >  "lucene":{
> >    "solr-spec-version":"6.5.1",
> >    "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 -
> jimczi - 2017-04-21 12:23:42",
> >    "lucene-spec-version":"6.5.1",
> >    "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75
> - jimczi - 2017-04-21 12:17:15"},
> >  "jvm":{
> >    "version":"1.8.0_144 25.144-b01",
> >    "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
> >    "spec":{
> >      "vendor":"Oracle Corporation",
> >      "name":"Java Platform API Specification",
> >      "version":"1.8"},
> >    "jre":{
> >      "vendor":"Oracle Corporation",
> >      "version":"1.8.0_144"},
> >    "vm":{
> >      "vendor":"Oracle Corporation",
> >      "name":"Java HotSpot(TM) 64-Bit Server VM",
> >      "version":"25.144-b01"},
> >    "processors":32,
> >    "memory":{
> >      "free":"69.1 GB",
> >      "total":"180.2 GB",
> >      "max":"266.7 GB",
> >      "used":"111 GB (%41.6)",
> >      "raw":{
> >        "free":74238728336,
> >        "total":193470136320,
> >        "max":286331502592,
> >        "used":119231407984,
> >        "used%":41.64103736566334}},
> >    "jmx":{
> >
> "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
> >      "classpath":"...",
> >      "commandLineArgs":["-Xms100G",
> >        "-Xmx300G",
> >        "-DSTOP.PORT=8079",
> >        "-DSTOP.KEY=..",
> >        "-Dsolr.solr.home=..",
> >        "-Djetty.port=8983"],
> >      "startTime":"2019-03-18T09:35:27.892Z",
> >      "upTimeMS":9258422}},
> >  "system":{
> >    "name":"Linux",
> >    "arch":"amd64",
> >    "availableProcessors":32,
> >    "systemLoadAverage":14.72,
> >    "version":"3.0.101-311.g08a8a9d-default",
> >    "committedVirtualMemorySize":2547960700928,
> >    "freePhysicalMemorySize":4530696192,
> >    "freeSwapSpaceSize":3486846976,
> >    "processCpuLoad":0.3257436126790475,
> >    "processCpuTime":93869450000000,
> >    "systemCpuLoad":0.3279781055816521,
> >    "totalPhysicalMemorySize":406480175104,
> >    "totalSwapSpaceSize":4302303232 <(430)%20230-3232>,
> >    "maxFileDescriptorCount":32768,
> >    "openFileDescriptorCount":385,
> >    "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14
> 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
> >    "uptime":" 13:09pm  up 5 days 21:23,  7 users,  load average: 14.72,
> 12.28, 11.48\n"}}
> >
> >
> >
> >
> > ________________________________
> > From: Emir Arnautović <[hidden email]>
> > Sent: Monday, March 18, 2019 12:10:30 PM
> > To: [hidden email]
> > Subject: Re: Solr index slow response
> >
> > Hi Aaron,
> > Which version of Solr? How did you configure your heap? Is it standalone
> Solr or SolrCloud? A single server? Do you use some monitoring tool? Do you
> see some spikes, pauses or CPU usage is constant?
> >
> > Thanks,
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> >> On 18 Mar 2019, at 11:47, Aaron Yingcai Sun <[hidden email]> wrote:
> >>
> >> Hello, Solr!
> >>
> >>
> >> We are having some performance issue when try to send documents for
> solr to index. The repose time is very slow and unpredictable some time.
> >>
> >>
> >> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM,
> while 300 GB is reserved for solr, while this happening, cpu usage is
> around 30%, mem usage is 34%.  io also look ok according to iotop. SSD disk.
> >>
> >>
> >> Our application send 100 documents to solr per request, json encoded.
> the size is around 5M each time. some times the response time is under 1
> seconds, some times could be 300 seconds, the slow response happens very
> often.
> >>
> >>
> >> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for
> 3600000ms; if 1000000 uncommited docs"
> >>
> >>
> >> There are around 100 clients sending those documents at the same time,
> but each for the client is blocking call which wait the http response then
> send the next one.
> >>
> >>
> >> I tried to make the number of documents smaller in one request, such as
> 20, but  still I see slow response time to time, like 80 seconds.
> >>
> >>
> >> Would you help to give some hint how improve the response time?  solr
> does not seems very loaded, there must be a way to make the response faster.
> >>
> >>
> >> BRs
> >>
> >> //Aaron
> >>
> >>
> >>
> >
>
>

Re: Solr index slow response

Walter Underwood
Solr is not designed to have consistent response times for updates. You are expecting Solr to do something that it does not do.

About Xms and Xmx: the JVM will continue to allocate memory until it hits the max. After it hits the max, it will start to collect garbage. A smaller Xms just wastes time doing allocations after the JVM is running. Avoid that by making Xms and Xmx the same.
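
As a concrete illustration, a minimal sketch assuming Solr is started through the bundled bin/solr scripts and picks up solr.in.sh (the SOLR_HEAP setting below is the shorthand for the same thing):

# in solr.in.sh: pin the heap so Xms and Xmx are equal
SOLR_JAVA_MEM="-Xms8g -Xmx8g"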

We run all of our JVMs with 8 GB of heap and the G1 collector. You probably do not need more than 8 GB unless you are doing high-cardinality facets or some other memory-hungry querying.

The first step would be to use a good configuration. We start our Java 8 JVMs with these parameters:

SOLR_HEAP=8g
# Use G1 GC  -- wunder 2017-01-23
# Settings from https://wiki.apache.org/solr/ShawnHeisey
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \
"

Use SSD for disks, with total space about 3X as big as the expected index size.

Have RAM not used by Solr or the OS that is equal to the expected index size.

After that, let’s figure out what the real requirement is. If you must have consistent response times for update requests, you’ll need to do that outside of Solr. But if you need high data import rates, we can probably help.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun <[hidden email]> wrote:
>
> Hello, Chris
>
>
> Thanks for the tips.  So I tried to set it as you suggested, not see too much improvement.  Since I don't need it visible immediately, softCommit is disabled totally.
>
> The slow response is happening every few seconds,  if it happens hourly I would suspect the hourly auto-commit.  But it happen much more frequently.  I don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS level.  It just looks like solr server is blocking internally itself.
>
>
> <       <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
> ---
>>      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> 16c16
> <       <openSearcher>true</openSearcher>
> ---
>>      <openSearcher>false</openSearcher>
>
>
>
> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 1405 ms, Request time: 1407 ms.
> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res code: 200, QTime: 1357 ms, Request time: 1360 ms.
> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 90 ms, Request time: 92 ms.
> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res code: 200, QTime: 5306 ms, Request time: 5308 ms.
> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res code: 200, QTime: 4775 ms, Request time: 4777 ms.
> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 20222 ms, Request time: 20224 ms.
> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 27732 ms, Request time: 27735 ms.
> 190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616  KB, Res code: 200, QTime: 28106 ms, Request time: 28109 ms.
> 190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 16703 ms, Request time: 16706 ms.
> 190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540  KB, Res code: 200, QTime: 28216 ms, Request time: 28218 ms.
> 190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673  KB, Res code: 200, QTime: 28133 ms, Request time: 28136 ms.
> 190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 27909 ms, Request time: 27912 ms.
> 190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529  KB, Res code: 200, QTime: 28172 ms, Request time: 28174 ms.
> 190318-162815.056-189986 DBG1:doc_count: 10 , doc_size: 602  KB, Res code: 200, QTime: 11375 ms, Request time: 11378 ms.
> 190318-162815.100-189988 DBG1:doc_count: 10 , doc_size: 530  KB, Res code: 200, QTime: 8663 ms, Request time: 8666 ms.
> 190318-162815.275-189980 DBG1:doc_count: 10 , doc_size: 553  KB, Res code: 200, QTime: 27526 ms, Request time: 27528 ms.
> 190318-162815.283-189997 DBG1:doc_count: 10 , doc_size: 600  KB, Res code: 200, QTime: 27529 ms, Request time: 27535 ms.
> 190318-162815.318-189961 DBG1:doc_count: 10 , doc_size: 661  KB, Res code: 200, QTime: 16617 ms, Request time: 16621 ms.
> 190318-162815.484-189952 DBG1:doc_count: 10 , doc_size: 549  KB, Res code: 200, QTime: 11707 ms, Request time: 11711 ms.
> 190318-162815.485-189993 DBG1:doc_count: 10 , doc_size: 618  KB, Res code: 200, QTime: 28315 ms, Request time: 28321 ms.
> 190318-162816.216-189972 DBG1:doc_count: 10 , doc_size: 493  KB, Res code: 200, QTime: 1320 ms, Request time: 1322 ms.
> 190318-162816.354-189972 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 130 ms, Request time: 132 ms.
> 190318-162816.471-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 111 ms, Request time: 113 ms.
> 190318-162816.586-189972 DBG1:doc_count: 10 , doc_size: 554  KB, Res code: 200, QTime: 110 ms, Request time: 111 ms.
> 190318-162816.716-189972 DBG1:doc_count: 10 , doc_size: 590  KB, Res code: 200, QTime: 124 ms, Request time: 125 ms.
> 190318-162820.494-189972 DBG1:doc_count: 10 , doc_size: 583  KB, Res code: 200, QTime: 3772 ms, Request time: 3774 ms.
> 190318-162820.574-189953 DBG1:doc_count: 10 , doc_size: 550  KB, Res code: 200, QTime: 32802 ms, Request time: 32804 ms.
> 190318-162820.609-189991 DBG1:doc_count: 10 , doc_size: 586  KB, Res code: 200, QTime: 33787 ms, Request time: 33790 ms.
> 190318-162820.621-189976 DBG1:doc_count: 10 , doc_size: 572  KB, Res code: 200, QTime: 33397 ms, Request time: 33400 ms.
> 190318-162820.622-189985 DBG1:doc_count: 10 , doc_size: 498  KB, Res code: 200, QTime: 32917 ms, Request time: 32919 ms.
> 190318-162820.987-190005 DBG1:doc_count: 10 , doc_size: 629  KB, Res code: 200, QTime: 22207 ms, Request time: 22209 ms.
> 190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584  KB, Res code: 200, QTime: 22800 ms, Request time: 22802 ms.
> 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670  KB, Res code: 200, QTime: 34193 ms, Request time: 34195 ms.
> 190318-162821.062-189983 DBG1:doc_count: 10 , doc_size: 675  KB, Res code: 200, QTime: 22250 ms, Request time: 22252 ms.
>
>
>
>
> BRs
>
> //Aaron
>
> ________________________________
> From: Chris Ulicny <[hidden email]>
> Sent: Monday, March 18, 2019 2:54:25 PM
> To: [hidden email]
> Subject: Re: Solr index slow response
>
> One other thing to look at besides the heap is your commit settings. We've
> experienced something similar, and changing commit settings alleviated the
> issue.
>
> Are you opening a search on every hardcommit? If so, you might want to
> reconsider and use the softcommit for the hourly creation of a new searcher.
>
> The hardcommit interval should be much lower. Probably something on the
> order of seconds (15000) instead of hours (currently 3600000). When the
> hard commit fires, numerous merges might be firing off in the background
> due to the volume of documents you are indexing, which might explain the
> periodic bad response times shown in your logs.
>
> It would depend on your specific scenario, but here's our setup. During
> long periods of constant indexing of documents to a staging collection (~2
> billion documents), we have following commit settings
>
> softcommit: 3600000ms (for periodic validation of data, since it's not in
> production)
> hardcommit: openSearcher -> false, 15000ms (no document limit)
>
> This makes the documents available for searching every hour, but doesn't
> result in the large bursts of IO due to the infrequent hard commits.
>
> For more info, Erick Erickson has a great write up:
> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Best,
> Chris
>
> On Mon, Mar 18, 2019 at 9:36 AM Aaron Yingcai Sun <[hidden email]> wrote:
>
>> Hi, Emir,
>>
>> My system used to run with max 32GB, the response time is bad as well.
>> swap is set to 4GB, there 3.2 free, I doubt swap would affect it since
>> there is such huge free memory.
>>
>> I could try to with set Xms and Xmx to the same value, but I doubt how
>> much would that change the response time.
>>
>>
>> BRs
>>
>> //Aaron
>>
>> ________________________________
>> From: Emir Arnautović <[hidden email]>
>> Sent: Monday, March 18, 2019 2:19:19 PM
>> To: [hidden email]
>> Subject: Re: Solr index slow response
>>
>> Hi Aaron,
>> Without looking too much into numbers, my bet would be that it is large
>> heap that is causing issues. I would decrease is significantly (<30GB) and
>> see if it is enough for your max load. Also, disable swap or reduce
>> swappiness to min.
>>
>> In any case, you should install some monitoring tool that would help you
>> do better analysis when you run into problems. One such tool is our
>> monitoring solution: https://sematext.com/spm
>>
>> HTH,
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>>
>>> On 18 Mar 2019, at 13:14, Aaron Yingcai Sun <[hidden email]> wrote:
>>>
>>> Hello, Emir,
>>>
>>> Thanks for the reply, this is the solr version and heap info, standalone
>> single solr server. I don't have monitor tool connected. only look at
>> 'top', has not seen cpu spike so far, when the slow response happens, cpu
>> usage is not high at all, around 30%.
>>>
>>>
>>> # curl 'http://.../solr/admin/info/system?wt=json&indent=true'
>>> {
>>> "responseHeader":{
>>>   "status":0,
>>>   "QTime":27},
>>> "mode":"std",
>>> "solr_home":"/ardome/solr",
>>> "lucene":{
>>>   "solr-spec-version":"6.5.1",
>>>   "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 -
>> jimczi - 2017-04-21 12:23:42",
>>>   "lucene-spec-version":"6.5.1",
>>>   "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75
>> - jimczi - 2017-04-21 12:17:15"},
>>> "jvm":{
>>>   "version":"1.8.0_144 25.144-b01",
>>>   "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
>>>   "spec":{
>>>     "vendor":"Oracle Corporation",
>>>     "name":"Java Platform API Specification",
>>>     "version":"1.8"},
>>>   "jre":{
>>>     "vendor":"Oracle Corporation",
>>>     "version":"1.8.0_144"},
>>>   "vm":{
>>>     "vendor":"Oracle Corporation",
>>>     "name":"Java HotSpot(TM) 64-Bit Server VM",
>>>     "version":"25.144-b01"},
>>>   "processors":32,
>>>   "memory":{
>>>     "free":"69.1 GB",
>>>     "total":"180.2 GB",
>>>     "max":"266.7 GB",
>>>     "used":"111 GB (%41.6)",
>>>     "raw":{
>>>       "free":74238728336,
>>>       "total":193470136320,
>>>       "max":286331502592,
>>>       "used":119231407984,
>>>       "used%":41.64103736566334}},
>>>   "jmx":{
>>>
>> "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
>>>     "classpath":"...",
>>>     "commandLineArgs":["-Xms100G",
>>>       "-Xmx300G",
>>>       "-DSTOP.PORT=8079",
>>>       "-DSTOP.KEY=..",
>>>       "-Dsolr.solr.home=..",
>>>       "-Djetty.port=8983"],
>>>     "startTime":"2019-03-18T09:35:27.892Z",
>>>     "upTimeMS":9258422}},
>>> "system":{
>>>   "name":"Linux",
>>>   "arch":"amd64",
>>>   "availableProcessors":32,
>>>   "systemLoadAverage":14.72,
>>>   "version":"3.0.101-311.g08a8a9d-default",
>>>   "committedVirtualMemorySize":2547960700928,
>>>   "freePhysicalMemorySize":4530696192,
>>>   "freeSwapSpaceSize":3486846976,
>>>   "processCpuLoad":0.3257436126790475,
>>>   "processCpuTime":93869450000000,
>>>   "systemCpuLoad":0.3279781055816521,
>>>   "totalPhysicalMemorySize":406480175104,
>>>   "totalSwapSpaceSize":4302303232 <(430)%20230-3232>,
>>>   "maxFileDescriptorCount":32768,
>>>   "openFileDescriptorCount":385,
>>>   "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14
>> 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
>>>   "uptime":" 13:09pm  up 5 days 21:23,  7 users,  load average: 14.72,
>> 12.28, 11.48\n"}}
>>>
>>>
>>>
>>>
>>> ________________________________
>>> From: Emir Arnautović <[hidden email]>
>>> Sent: Monday, March 18, 2019 12:10:30 PM
>>> To: [hidden email]
>>> Subject: Re: Solr index slow response
>>>
>>> Hi Aaron,
>>> Which version of Solr? How did you configure your heap? Is it standalone
>> Solr or SolrCloud? A single server? Do you use some monitoring tool? Do you
>> see some spikes, pauses or CPU usage is constant?
>>>
>>> Thanks,
>>> Emir
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>>>
>>>
>>>> On 18 Mar 2019, at 11:47, Aaron Yingcai Sun <[hidden email]> wrote:
>>>>
>>>> Hello, Solr!
>>>>
>>>>
>>>> We are having some performance issue when try to send documents for
>> solr to index. The repose time is very slow and unpredictable some time.
>>>>
>>>>
>>>> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM,
>> while 300 GB is reserved for solr, while this happening, cpu usage is
>> around 30%, mem usage is 34%.  io also look ok according to iotop. SSD disk.
>>>>
>>>>
>>>> Our application send 100 documents to solr per request, json encoded.
>> the size is around 5M each time. some times the response time is under 1
>> seconds, some times could be 300 seconds, the slow response happens very
>> often.
>>>>
>>>>
>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for
>> 3600000ms; if 1000000 uncommited docs"
>>>>
>>>>
>>>> There are around 100 clients sending those documents at the same time,
>> but each for the client is blocking call which wait the http response then
>> send the next one.
>>>>
>>>>
>>>> I tried to make the number of documents smaller in one request, such as
>> 20, but  still I see slow response time to time, like 80 seconds.
>>>>
>>>>
>>>> Would you help to give some hint how improve the response time?  solr
>> does not seems very loaded, there must be a way to make the response faster.
>>>>
>>>>
>>>> BRs
>>>>
>>>> //Aaron
>>>>
>>>>
>>>>
>>>
>>
>>


Re: Solr index slow response

Aaron Yingcai Sun
We have around 80 million documents to index, with a total index size of around 3 TB; I guess I'm not the first one to work with this amount of data. With such slow response times, the indexing process would take around two weeks, while the system resources are not very loaded, so there must be a way to speed it up.


To Walter: I don't see why G1GC would improve this; we only index, with no queries running in the background, and there is no memory constraint. It feels more like some internal threads are blocking each other.


I used to send more documents in one request, which gave much worse response times: 300 documents in one request could end up taking 20 minutes. Now I have changed to a maximum of 10 documents per request, and many responses still take around 30 seconds, while some of them are very fast (~100 ms). How come there is such a big difference? The document sizes do not differ that much.


I just want to speed it up, since nothing seems to be overloaded. Is there any faster way to index this amount of data?
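
For reference, each indexing request boils down to a batched JSON POST to the update handler; a single request can be reproduced from the shell roughly as below (a sketch: host, collection name and field names are placeholders, and committing is left to the autoCommit settings):

curl -s -X POST 'http://localhost:8983/solr/mycollection/update' \
     -H 'Content-Type: application/json' \
     --data-binary '[
       {"id": "doc-1", "title_t": "example document 1"},
       {"id": "doc-2", "title_t": "example document 2"}
     ]'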


BRs

//Aaron

________________________________
From: Walter Underwood <[hidden email]>
Sent: Monday, March 18, 2019 4:59:20 PM
To: [hidden email]
Subject: Re: Solr index slow response

Solr is not designed to have consistent response times for updates. You are expecting Solr to do something that it does not do.

About Xms and Xmx, the JVM will continue to allocate memory until it hits the max. After it hits the max, it will start to collect garbage. A smaller Xms just wastes time doing allocations after the JVM is running. Avoid that by making Xms and Xms the same.

We run all of our JVMs with 8 GB of heap and the G1 collector. You probably do not need more than 8 GB unless you are doing high-cardinality facets or some other memory-hungry querying.

The first step would be to use a good configuration. We start our Java 8 JVMs with these parameters:

SOLR_HEAP=8g
# Use G1 GC  -- wunder 2017-01-23
# Settings from https://wiki.apache.org/solr/ShawnHeisey
GC_TUNE=" \
-XX:+UseG1GC \
-XX:+ParallelRefProcEnabled \
-XX:G1HeapRegionSize=8m \
-XX:MaxGCPauseMillis=200 \
-XX:+UseLargePages \
-XX:+AggressiveOpts \


Use SSD for disks, with total space about 3X as big as the expected index size.

Have RAM not used by Solr or the OS that is equal to the expected index size.

After that, let’s figure out what the real requirement is. If you must have consistent response times for update requests, you’ll need to do that outside of Solr. But if you need high data import rates, we can probably help.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun <[hidden email]> wrote:
>
> Hello, Chris
>
>
> Thanks for the tips.  So I tried to set it as you suggested, not see too much improvement.  Since I don't need it visible immediately, softCommit is disabled totally.
>
> The slow response is happening every few seconds,  if it happens hourly I would suspect the hourly auto-commit.  But it happen much more frequently.  I don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS level.  It just looks like solr server is blocking internally itself.
>
>
> <       <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
> ---
>>      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> 16c16
> <       <openSearcher>true</openSearcher>
> ---
>>      <openSearcher>false</openSearcher>
>
>
>
> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 1405 ms, Request time: 1407 ms.
> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res code: 200, QTime: 1357 ms, Request time: 1360 ms.
> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 90 ms, Request time: 92 ms.
> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res code: 200, QTime: 5306 ms, Request time: 5308 ms.
> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res code: 200, QTime: 4775 ms, Request time: 4777 ms.
> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 20222 ms, Request time: 20224 ms.
> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 27732 ms, Request time: 27735 ms.
> 190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616  KB, Res code: 200, QTime: 28106 ms, Request time: 28109 ms.
> 190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 16703 ms, Request time: 16706 ms.
> 190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540  KB, Res code: 200, QTime: 28216 ms, Request time: 28218 ms.
> 190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673  KB, Res code: 200, QTime: 28133 ms, Request time: 28136 ms.
> 190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 27909 ms, Request time: 27912 ms.
> 190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529  KB, Res code: 200, QTime: 28172 ms, Request time: 28174 ms.
> 190318-162815.056-189986 DBG1:doc_count: 10 , doc_size: 602  KB, Res code: 200, QTime: 11375 ms, Request time: 11378 ms.
> 190318-162815.100-189988 DBG1:doc_count: 10 , doc_size: 530  KB, Res code: 200, QTime: 8663 ms, Request time: 8666 ms.
> 190318-162815.275-189980 DBG1:doc_count: 10 , doc_size: 553  KB, Res code: 200, QTime: 27526 ms, Request time: 27528 ms.
> 190318-162815.283-189997 DBG1:doc_count: 10 , doc_size: 600  KB, Res code: 200, QTime: 27529 ms, Request time: 27535 ms.
> 190318-162815.318-189961 DBG1:doc_count: 10 , doc_size: 661  KB, Res code: 200, QTime: 16617 ms, Request time: 16621 ms.
> 190318-162815.484-189952 DBG1:doc_count: 10 , doc_size: 549  KB, Res code: 200, QTime: 11707 ms, Request time: 11711 ms.
> 190318-162815.485-189993 DBG1:doc_count: 10 , doc_size: 618  KB, Res code: 200, QTime: 28315 ms, Request time: 28321 ms.
> 190318-162816.216-189972 DBG1:doc_count: 10 , doc_size: 493  KB, Res code: 200, QTime: 1320 ms, Request time: 1322 ms.
> 190318-162816.354-189972 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 130 ms, Request time: 132 ms.
> 190318-162816.471-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 111 ms, Request time: 113 ms.
> 190318-162816.586-189972 DBG1:doc_count: 10 , doc_size: 554  KB, Res code: 200, QTime: 110 ms, Request time: 111 ms.
> 190318-162816.716-189972 DBG1:doc_count: 10 , doc_size: 590  KB, Res code: 200, QTime: 124 ms, Request time: 125 ms.
> 190318-162820.494-189972 DBG1:doc_count: 10 , doc_size: 583  KB, Res code: 200, QTime: 3772 ms, Request time: 3774 ms.
> 190318-162820.574-189953 DBG1:doc_count: 10 , doc_size: 550  KB, Res code: 200, QTime: 32802 ms, Request time: 32804 ms.
> 190318-162820.609-189991 DBG1:doc_count: 10 , doc_size: 586  KB, Res code: 200, QTime: 33787 ms, Request time: 33790 ms.
> 190318-162820.621-189976 DBG1:doc_count: 10 , doc_size: 572  KB, Res code: 200, QTime: 33397 ms, Request time: 33400 ms.
> 190318-162820.622-189985 DBG1:doc_count: 10 , doc_size: 498  KB, Res code: 200, QTime: 32917 ms, Request time: 32919 ms.
> 190318-162820.987-190005 DBG1:doc_count: 10 , doc_size: 629  KB, Res code: 200, QTime: 22207 ms, Request time: 22209 ms.
> 190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584  KB, Res code: 200, QTime: 22800 ms, Request time: 22802 ms.
> 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670  KB, Res code: 200, QTime: 34193 ms, Request time: 34195 ms.
> 190318-162821.062-189983 DBG1:doc_count: 10 , doc_size: 675  KB, Res code: 200, QTime: 22250 ms, Request time: 22252 ms.
>
>
>
>
> BRs
>
> //Aaron
>
> ________________________________
> From: Chris Ulicny <[hidden email]>
> Sent: Monday, March 18, 2019 2:54:25 PM
> To: [hidden email]
> Subject: Re: Solr index slow response
>
> One other thing to look at besides the heap is your commit settings. We've
> experienced something similar, and changing commit settings alleviated the
> issue.
>
> Are you opening a search on every hardcommit? If so, you might want to
> reconsider and use the softcommit for the hourly creation of a new searcher.
>
> The hardcommit interval should be much lower. Probably something on the
> order of seconds (15000) instead of hours (currently 3600000). When the
> hard commit fires, numerous merges might be firing off in the background
> due to the volume of documents you are indexing, which might explain the
> periodic bad response times shown in your logs.
>
> It would depend on your specific scenario, but here's our setup. During
> long periods of constant indexing of documents to a staging collection (~2
> billion documents), we have following commit settings
>
> softcommit: 3600000ms (for periodic validation of data, since it's not in
> production)
> hardcommit: openSearcher -> false, 15000ms (no document limit)
>
> This makes the documents available for searching every hour, but doesn't
> result in the large bursts of IO due to the infrequent hard commits.
>
> For more info, Erick Erickson has a great write up:
> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Best,
> Chris
>
> On Mon, Mar 18, 2019 at 9:36 AM Aaron Yingcai Sun <[hidden email]> wrote:
>
>> Hi, Emir,
>>
>> My system used to run with max 32GB, the response time is bad as well.
>> swap is set to 4GB, there 3.2 free, I doubt swap would affect it since
>> there is such huge free memory.
>>
>> I could try to with set Xms and Xmx to the same value, but I doubt how
>> much would that change the response time.
>>
>>
>> BRs
>>
>> //Aaron
>>
>> ________________________________
>> From: Emir Arnautović <[hidden email]>
>> Sent: Monday, March 18, 2019 2:19:19 PM
>> To: [hidden email]
>> Subject: Re: Solr index slow response
>>
>> Hi Aaron,
>> Without looking too much into numbers, my bet would be that it is large
>> heap that is causing issues. I would decrease is significantly (<30GB) and
>> see if it is enough for your max load. Also, disable swap or reduce
>> swappiness to min.
>>
>> In any case, you should install some monitoring tool that would help you
>> do better analysis when you run into problems. One such tool is our
>> monitoring solution: https://sematext.com/spm
>>
>> HTH,
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>>
>>> On 18 Mar 2019, at 13:14, Aaron Yingcai Sun <[hidden email]> wrote:
>>>
>>> Hello, Emir,
>>>
>>> Thanks for the reply, this is the solr version and heap info, standalone
>> single solr server. I don't have monitor tool connected. only look at
>> 'top', has not seen cpu spike so far, when the slow response happens, cpu
>> usage is not high at all, around 30%.
>>>
>>>
>>> # curl 'http://.../solr/admin/info/system?wt=json&indent=true'
>>> {
>>> "responseHeader":{
>>>   "status":0,
>>>   "QTime":27},
>>> "mode":"std",
>>> "solr_home":"/ardome/solr",
>>> "lucene":{
>>>   "solr-spec-version":"6.5.1",
>>>   "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 -
>> jimczi - 2017-04-21 12:23:42",
>>>   "lucene-spec-version":"6.5.1",
>>>   "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75
>> - jimczi - 2017-04-21 12:17:15"},
>>> "jvm":{
>>>   "version":"1.8.0_144 25.144-b01",
>>>   "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
>>>   "spec":{
>>>     "vendor":"Oracle Corporation",
>>>     "name":"Java Platform API Specification",
>>>     "version":"1.8"},
>>>   "jre":{
>>>     "vendor":"Oracle Corporation",
>>>     "version":"1.8.0_144"},
>>>   "vm":{
>>>     "vendor":"Oracle Corporation",
>>>     "name":"Java HotSpot(TM) 64-Bit Server VM",
>>>     "version":"25.144-b01"},
>>>   "processors":32,
>>>   "memory":{
>>>     "free":"69.1 GB",
>>>     "total":"180.2 GB",
>>>     "max":"266.7 GB",
>>>     "used":"111 GB (%41.6)",
>>>     "raw":{
>>>       "free":74238728336,
>>>       "total":193470136320,
>>>       "max":286331502592,
>>>       "used":119231407984,
>>>       "used%":41.64103736566334}},
>>>   "jmx":{
>>>
>> "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
>>>     "classpath":"...",
>>>     "commandLineArgs":["-Xms100G",
>>>       "-Xmx300G",
>>>       "-DSTOP.PORT=8079",
>>>       "-DSTOP.KEY=..",
>>>       "-Dsolr.solr.home=..",
>>>       "-Djetty.port=8983"],
>>>     "startTime":"2019-03-18T09:35:27.892Z",
>>>     "upTimeMS":9258422}},
>>> "system":{
>>>   "name":"Linux",
>>>   "arch":"amd64",
>>>   "availableProcessors":32,
>>>   "systemLoadAverage":14.72,
>>>   "version":"3.0.101-311.g08a8a9d-default",
>>>   "committedVirtualMemorySize":2547960700928,
>>>   "freePhysicalMemorySize":4530696192,
>>>   "freeSwapSpaceSize":3486846976,
>>>   "processCpuLoad":0.3257436126790475,
>>>   "processCpuTime":93869450000000,
>>>   "systemCpuLoad":0.3279781055816521,
>>>   "totalPhysicalMemorySize":406480175104,
>>>   "totalSwapSpaceSize":4302303232 <(430)%20230-3232>,
>>>   "maxFileDescriptorCount":32768,
>>>   "openFileDescriptorCount":385,
>>>   "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14
>> 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
>>>   "uptime":" 13:09pm  up 5 days 21:23,  7 users,  load average: 14.72,
>> 12.28, 11.48\n"}}
>>>
>>>
>>>
>>>
>>> ________________________________
>>> From: Emir Arnautović <[hidden email]>
>>> Sent: Monday, March 18, 2019 12:10:30 PM
>>> To: [hidden email]
>>> Subject: Re: Solr index slow response
>>>
>>> Hi Aaron,
>>> Which version of Solr? How did you configure your heap? Is it standalone
>> Solr or SolrCloud? A single server? Do you use some monitoring tool? Do you
>> see some spikes, pauses or CPU usage is constant?
>>>
>>> Thanks,
>>> Emir
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>>>
>>>
>>>> On 18 Mar 2019, at 11:47, Aaron Yingcai Sun <[hidden email]> wrote:
>>>>
>>>> Hello, Solr!
>>>>
>>>>
>>>> We are having some performance issue when try to send documents for
>> solr to index. The repose time is very slow and unpredictable some time.
>>>>
>>>>
>>>> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM,
>> while 300 GB is reserved for solr, while this happening, cpu usage is
>> around 30%, mem usage is 34%.  io also look ok according to iotop. SSD disk.
>>>>
>>>>
>>>> Our application send 100 documents to solr per request, json encoded.
>> the size is around 5M each time. some times the response time is under 1
>> seconds, some times could be 300 seconds, the slow response happens very
>> often.
>>>>
>>>>
>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for
>> 3600000ms; if 1000000 uncommited docs"
>>>>
>>>>
>>>> There are around 100 clients sending those documents at the same time,
>> but each for the client is blocking call which wait the http response then
>> send the next one.
>>>>
>>>>
>>>> I tried to make the number of documents smaller in one request, such as
>> 20, but  still I see slow response time to time, like 80 seconds.
>>>>
>>>>
>>>> Would you help to give some hint how improve the response time?  solr
>> does not seems very loaded, there must be a way to make the response faster.
>>>>
>>>>
>>>> BRs
>>>>
>>>> //Aaron
>>>>
>>>>
>>>>
>>>
>>
>>


Re: Solr index slow response

Emir Arnautović
Just to add a different perspective here: how do you send documents to Solr? Are those log lines from your client? Maybe it is not Solr that is slow; could it be the network or the client itself? If you have some dry-run mode on the client, maybe try running it without Solr to eliminate the client from the suspects.

Do you observe similar spikes when you run indexing with fewer concurrent clients?

It is really hard to pinpoint the issue without looking at some monitoring tool.
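
One quick way to take the client and its HTTP stack out of the equation is to time a single request directly from the shell (a sketch; the URL and the payload file are placeholders):

curl -o /dev/null -s -w 'HTTP %{http_code}, total %{time_total}s\n' \
     -X POST 'http://localhost:8983/solr/mycollection/update' \
     -H 'Content-Type: application/json' \
     --data-binary @batch.json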

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun <[hidden email]> wrote:
>
> We have around 80 million documents to index, total index size around 3TB,  I guess I'm not the first one to work with this big amount of data. with such slow response time, the index process would take around 2 weeks. While the system resource is not very loaded, there must be a way to speed it up.
>
>
> To Walter, I don't see why G1GC would improve this, we only do index, no query in the background. There is no memory constraint.  it's more feel like some internal thread are blocking each other.
>
>
> I used to run with more documents in one request, that give much worse response time, 300 documents in one request could end up 20 minutes response time, now I changed to max 10 documents in one request, still many response time around 30 seconds, while some of them are very fast( ~100 ms).  How come there are such big difference? the documents size does not have such big difference.
>
>
> I just want to speed it up since nothing seems to be overloaded.  Are there any other faster way to index such big amount of data?
>
>
> BRs
>
> //Aaron
>
> ________________________________
> From: Walter Underwood <[hidden email]>
> Sent: Monday, March 18, 2019 4:59:20 PM
> To: [hidden email]
> Subject: Re: Solr index slow response
>
> Solr is not designed to have consistent response times for updates. You are expecting Solr to do something that it does not do.
>
> About Xms and Xmx, the JVM will continue to allocate memory until it hits the max. After it hits the max, it will start to collect garbage. A smaller Xms just wastes time doing allocations after the JVM is running. Avoid that by making Xms and Xms the same.
>
> We run all of our JVMs with 8 GB of heap and the G1 collector. You probably do not need more than 8 GB unless you are doing high-cardinality facets or some other memory-hungry querying.
>
> The first step would be to use a good configuration. We start our Java 8 JVMs with these parameters:
>
> SOLR_HEAP=8g
> # Use G1 GC  -- wunder 2017-01-23
> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> GC_TUNE=" \
> -XX:+UseG1GC \
> -XX:+ParallelRefProcEnabled \
> -XX:G1HeapRegionSize=8m \
> -XX:MaxGCPauseMillis=200 \
> -XX:+UseLargePages \
> -XX:+AggressiveOpts \
> “
>
> Use SSD for disks, with total space about 3X as big as the expected index size.
>
> Have RAM not used by Solr or the OS that is equal to the expected index size.
>
> After that, let’s figure out what the real requirement is. If you must have consistent response times for update requests, you’ll need to do that outside of Solr. But if you need high data import rates, we can probably help.
>
> wunder
> Walter Underwood
> [hidden email]
> http://observer.wunderwood.org/  (my blog)
>
>> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun <[hidden email]> wrote:
>>
>> Hello, Chris
>>
>>
>> Thanks for the tips.  So I tried to set it as you suggested, not see too much improvement.  Since I don't need it visible immediately, softCommit is disabled totally.
>>
>> The slow response is happening every few seconds,  if it happens hourly I would suspect the hourly auto-commit.  But it happen much more frequently.  I don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS level.  It just looks like solr server is blocking internally itself.
>>
>>
>> <       <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
>> ---
>>>     <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>> 16c16
>> <       <openSearcher>true</openSearcher>
>> ---
>>>     <openSearcher>false</openSearcher>
>>
>>
>>
>> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 1405 ms, Request time: 1407 ms.
>> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res code: 200, QTime: 1357 ms, Request time: 1360 ms.
>> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 90 ms, Request time: 92 ms.
>> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res code: 200, QTime: 5306 ms, Request time: 5308 ms.
>> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res code: 200, QTime: 4775 ms, Request time: 4777 ms.
>> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 20222 ms, Request time: 20224 ms.
>> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res code: 200, QTime: 27732 ms, Request time: 27735 ms.
>> 190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616  KB, Res code: 200, QTime: 28106 ms, Request time: 28109 ms.
>> 190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473  KB, Res code: 200, QTime: 16703 ms, Request time: 16706 ms.
>> 190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540  KB, Res code: 200, QTime: 28216 ms, Request time: 28218 ms.
>> 190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673  KB, Res code: 200, QTime: 28133 ms, Request time: 28136 ms.
>> 190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 27909 ms, Request time: 27912 ms.
>> 190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529  KB, Res code: 200, QTime: 28172 ms, Request time: 28174 ms.
>> 190318-162815.056-189986 DBG1:doc_count: 10 , doc_size: 602  KB, Res code: 200, QTime: 11375 ms, Request time: 11378 ms.
>> 190318-162815.100-189988 DBG1:doc_count: 10 , doc_size: 530  KB, Res code: 200, QTime: 8663 ms, Request time: 8666 ms.
>> 190318-162815.275-189980 DBG1:doc_count: 10 , doc_size: 553  KB, Res code: 200, QTime: 27526 ms, Request time: 27528 ms.
>> 190318-162815.283-189997 DBG1:doc_count: 10 , doc_size: 600  KB, Res code: 200, QTime: 27529 ms, Request time: 27535 ms.
>> 190318-162815.318-189961 DBG1:doc_count: 10 , doc_size: 661  KB, Res code: 200, QTime: 16617 ms, Request time: 16621 ms.
>> 190318-162815.484-189952 DBG1:doc_count: 10 , doc_size: 549  KB, Res code: 200, QTime: 11707 ms, Request time: 11711 ms.
>> 190318-162815.485-189993 DBG1:doc_count: 10 , doc_size: 618  KB, Res code: 200, QTime: 28315 ms, Request time: 28321 ms.
>> 190318-162816.216-189972 DBG1:doc_count: 10 , doc_size: 493  KB, Res code: 200, QTime: 1320 ms, Request time: 1322 ms.
>> 190318-162816.354-189972 DBG1:doc_count: 10 , doc_size: 631  KB, Res code: 200, QTime: 130 ms, Request time: 132 ms.
>> 190318-162816.471-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res code: 200, QTime: 111 ms, Request time: 113 ms.
>> 190318-162816.586-189972 DBG1:doc_count: 10 , doc_size: 554  KB, Res code: 200, QTime: 110 ms, Request time: 111 ms.
>> 190318-162816.716-189972 DBG1:doc_count: 10 , doc_size: 590  KB, Res code: 200, QTime: 124 ms, Request time: 125 ms.
>> 190318-162820.494-189972 DBG1:doc_count: 10 , doc_size: 583  KB, Res code: 200, QTime: 3772 ms, Request time: 3774 ms.
>> 190318-162820.574-189953 DBG1:doc_count: 10 , doc_size: 550  KB, Res code: 200, QTime: 32802 ms, Request time: 32804 ms.
>> 190318-162820.609-189991 DBG1:doc_count: 10 , doc_size: 586  KB, Res code: 200, QTime: 33787 ms, Request time: 33790 ms.
>> 190318-162820.621-189976 DBG1:doc_count: 10 , doc_size: 572  KB, Res code: 200, QTime: 33397 ms, Request time: 33400 ms.
>> 190318-162820.622-189985 DBG1:doc_count: 10 , doc_size: 498  KB, Res code: 200, QTime: 32917 ms, Request time: 32919 ms.
>> 190318-162820.987-190005 DBG1:doc_count: 10 , doc_size: 629  KB, Res code: 200, QTime: 22207 ms, Request time: 22209 ms.
>> 190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584  KB, Res code: 200, QTime: 22800 ms, Request time: 22802 ms.
>> 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670  KB, Res code: 200, QTime: 34193 ms, Request time: 34195 ms.
>> 190318-162821.062-189983 DBG1:doc_count: 10 , doc_size: 675  KB, Res code: 200, QTime: 22250 ms, Request time: 22252 ms.
>>
>>
>>
>>
>> BRs
>>
>> //Aaron
>>
>> ________________________________
>> From: Chris Ulicny <[hidden email]>
>> Sent: Monday, March 18, 2019 2:54:25 PM
>> To: [hidden email]
>> Subject: Re: Solr index slow response
>>
>> One other thing to look at besides the heap is your commit settings. We've
>> experienced something similar, and changing commit settings alleviated the
>> issue.
>>
>> Are you opening a search on every hardcommit? If so, you might want to
>> reconsider and use the softcommit for the hourly creation of a new searcher.
>>
>> The hardcommit interval should be much lower. Probably something on the
>> order of seconds (15000) instead of hours (currently 3600000). When the
>> hard commit fires, numerous merges might be firing off in the background
>> due to the volume of documents you are indexing, which might explain the
>> periodic bad response times shown in your logs.
>>
>> It would depend on your specific scenario, but here's our setup. During
>> long periods of constant indexing of documents to a staging collection (~2
>> billion documents), we have following commit settings
>>
>> softcommit: 3600000ms (for periodic validation of data, since it's not in
>> production)
>> hardcommit: openSearcher -> false, 15000ms (no document limit)
>>
>> This makes the documents available for searching every hour, but doesn't
>> result in the large bursts of IO due to the infrequent hard commits.
>>
>> For more info, Erick Erickson has a great write up:
>> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>
>> Best,
>> Chris
>>
>> On Mon, Mar 18, 2019 at 9:36 AM Aaron Yingcai Sun <[hidden email]> wrote:
>>
>>> Hi, Emir,
>>>
>>> My system used to run with max 32GB, the response time is bad as well.
>>> swap is set to 4GB, there 3.2 free, I doubt swap would affect it since
>>> there is such huge free memory.
>>>
>>> I could try to with set Xms and Xmx to the same value, but I doubt how
>>> much would that change the response time.
>>>
>>>
>>> BRs
>>>
>>> //Aaron
>>>
>>> ________________________________
>>> From: Emir Arnautović <[hidden email]>
>>> Sent: Monday, March 18, 2019 2:19:19 PM
>>> To: [hidden email]
>>> Subject: Re: Solr index slow response
>>>
>>> Hi Aaron,
>>> Without looking too much into numbers, my bet would be that it is large
>>> heap that is causing issues. I would decrease is significantly (<30GB) and
>>> see if it is enough for your max load. Also, disable swap or reduce
>>> swappiness to min.
>>>
>>> In any case, you should install some monitoring tool that would help you
>>> do better analysis when you run into problems. One such tool is our
>>> monitoring solution: https://sematext.com/spm
>>>
>>> HTH,
>>> Emir
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>>>
>>>
>>>> On 18 Mar 2019, at 13:14, Aaron Yingcai Sun <[hidden email]> wrote:
>>>>
>>>> Hello, Emir,
>>>>
>>>> Thanks for the reply, this is the solr version and heap info, standalone
>>> single solr server. I don't have monitor tool connected. only look at
>>> 'top', has not seen cpu spike so far, when the slow response happens, cpu
>>> usage is not high at all, around 30%.
>>>>
>>>>
>>>> # curl 'http://.../solr/admin/info/system?wt=json&indent=true'
>>>> {
>>>> "responseHeader":{
>>>>  "status":0,
>>>>  "QTime":27},
>>>> "mode":"std",
>>>> "solr_home":"/ardome/solr",
>>>> "lucene":{
>>>>  "solr-spec-version":"6.5.1",
>>>>  "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 -
>>> jimczi - 2017-04-21 12:23:42",
>>>>  "lucene-spec-version":"6.5.1",
>>>>  "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75
>>> - jimczi - 2017-04-21 12:17:15"},
>>>> "jvm":{
>>>>  "version":"1.8.0_144 25.144-b01",
>>>>  "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
>>>>  "spec":{
>>>>    "vendor":"Oracle Corporation",
>>>>    "name":"Java Platform API Specification",
>>>>    "version":"1.8"},
>>>>  "jre":{
>>>>    "vendor":"Oracle Corporation",
>>>>    "version":"1.8.0_144"},
>>>>  "vm":{
>>>>    "vendor":"Oracle Corporation",
>>>>    "name":"Java HotSpot(TM) 64-Bit Server VM",
>>>>    "version":"25.144-b01"},
>>>>  "processors":32,
>>>>  "memory":{
>>>>    "free":"69.1 GB",
>>>>    "total":"180.2 GB",
>>>>    "max":"266.7 GB",
>>>>    "used":"111 GB (%41.6)",
>>>>    "raw":{
>>>>      "free":74238728336,
>>>>      "total":193470136320,
>>>>      "max":286331502592,
>>>>      "used":119231407984,
>>>>      "used%":41.64103736566334}},
>>>>  "jmx":{
>>>>
>>> "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
>>>>    "classpath":"...",
>>>>    "commandLineArgs":["-Xms100G",
>>>>      "-Xmx300G",
>>>>      "-DSTOP.PORT=8079",
>>>>      "-DSTOP.KEY=..",
>>>>      "-Dsolr.solr.home=..",
>>>>      "-Djetty.port=8983"],
>>>>    "startTime":"2019-03-18T09:35:27.892Z",
>>>>    "upTimeMS":9258422}},
>>>> "system":{
>>>>  "name":"Linux",
>>>>  "arch":"amd64",
>>>>  "availableProcessors":32,
>>>>  "systemLoadAverage":14.72,
>>>>  "version":"3.0.101-311.g08a8a9d-default",
>>>>  "committedVirtualMemorySize":2547960700928,
>>>>  "freePhysicalMemorySize":4530696192,
>>>>  "freeSwapSpaceSize":3486846976,
>>>>  "processCpuLoad":0.3257436126790475,
>>>>  "processCpuTime":93869450000000,
>>>>  "systemCpuLoad":0.3279781055816521,
>>>>  "totalPhysicalMemorySize":406480175104,
>>>>  "totalSwapSpaceSize":4302303232 <(430)%20230-3232>,
>>>>  "maxFileDescriptorCount":32768,
>>>>  "openFileDescriptorCount":385,
>>>>  "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14
>>> 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
>>>>  "uptime":" 13:09pm  up 5 days 21:23,  7 users,  load average: 14.72,
>>> 12.28, 11.48\n"}}
>>>>
>>>>
>>>>
>>>>
>>>> ________________________________
>>>> From: Emir Arnautović <[hidden email]>
>>>> Sent: Monday, March 18, 2019 12:10:30 PM
>>>> To: [hidden email]
>>>> Subject: Re: Solr index slow response
>>>>
>>>> Hi Aaron,
>>>> Which version of Solr? How did you configure your heap? Is it standalone
>>> Solr or SolrCloud? A single server? Do you use some monitoring tool? Do you
>>> see some spikes, pauses or CPU usage is constant?
>>>>
>>>> Thanks,
>>>> Emir
>>>> --
>>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>>
>>>>
>>>>
>>>>> On 18 Mar 2019, at 11:47, Aaron Yingcai Sun <[hidden email]> wrote:
>>>>>
>>>>> Hello, Solr!
>>>>>
>>>>>
>>>>> We are having some performance issue when try to send documents for
>>> solr to index. The repose time is very slow and unpredictable some time.
>>>>>
>>>>>
>>>>> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM,
>>> while 300 GB is reserved for solr, while this happening, cpu usage is
>>> around 30%, mem usage is 34%.  io also look ok according to iotop. SSD disk.
>>>>>
>>>>>
>>>>> Our application send 100 documents to solr per request, json encoded.
>>> the size is around 5M each time. some times the response time is under 1
>>> seconds, some times could be 300 seconds, the slow response happens very
>>> often.
>>>>>
>>>>>
>>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for
>>> 3600000ms; if 1000000 uncommited docs"
>>>>>
>>>>>
>>>>> There are around 100 clients sending those documents at the same time,
>>> but each for the client is blocking call which wait the http response then
>>> send the next one.
>>>>>
>>>>>
>>>>> I tried to make the number of documents smaller in one request, such as
>>> 20, but  still I see slow response time to time, like 80 seconds.
>>>>>
>>>>>
>>>>> Would you help to give some hint how improve the response time?  solr
>>> does not seems very loaded, there must be a way to make the response faster.
>>>>>
>>>>>
>>>>> BRs
>>>>>
>>>>> //Aaron
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>

Reply | Threaded
Open this post in threaded view
|

Re: Solr index slow response

Aaron Yingcai Sun
"QTime" value is from the solr rest api response, extracted from the http/json payload.  The "Request time" is what I measured from client side, it's almost the same value as QTime, just some milliseconds difference.  I could provide tcpdump to prove that it is really solr slow response.

Those long response time is not really spikes, it's constantly happening, almost half of the request has such long delay.  The more document added in one request the more delay it has.
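
For reference, a minimal sketch of this kind of client-side measurement, assuming Python with the requests library and Solr's JSON update endpoint; the host and core name are placeholders, not the setup described in this thread:

import time
import requests

# Placeholder endpoint -- the real host and core name are not given in this thread.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?wt=json"

def send_batch(docs):
    """POST one batch of documents and compare Solr's QTime with wall-clock time."""
    started = time.monotonic()
    resp = requests.post(SOLR_UPDATE_URL, json=docs,
                         headers={"Content-Type": "application/json"})
    request_ms = (time.monotonic() - started) * 1000
    # The update response carries Solr's own timing in responseHeader.QTime.
    qtime_ms = resp.json()["responseHeader"]["QTime"]
    print(f"doc_count: {len(docs)}, Res code: {resp.status_code}, "
          f"QTime: {qtime_ms} ms, Request time: {request_ms:.0f} ms")
    return qtime_ms, request_ms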


Reply | Threaded
Open this post in threaded view
|

Re: Solr index slow response

Emir Arnautović
If you start indexing with just a single thread/client, do you still see slow bulks?

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/




Reply | Threaded
Open this post in threaded view
|

Re: Solr index slow response

Aaron Yingcai Sun
Yes, the same behavior occurs even with a single-threaded client. The following page says "In general, adding many documents per update request is faster than one per update request.", but in reality adding many documents per request results in a much longer response time, and it is not linear: the response time for 100 docs per request is larger than ten times the response time for 10 docs per request.


https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor
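
A rough, single-threaded sketch of how that comparison can be timed, assuming Python with the requests library; the host, core name, field names and synthetic documents are placeholders (the real documents in this thread are on the order of 50 KB each according to the logs):

import time
import requests

SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?wt=json"  # placeholder
TOTAL_DOCS = 1000

def make_doc(i):
    # Synthetic placeholder document; field names depend on the actual schema.
    return {"id": f"perf-test-{i}", "title_t": f"document {i}", "body_t": "x" * 1000}

for batch_size in (10, 100):
    docs = [make_doc(i) for i in range(TOTAL_DOCS)]
    started = time.monotonic()
    for start in range(0, TOTAL_DOCS, batch_size):
        batch = docs[start:start + batch_size]
        r = requests.post(SOLR_UPDATE_URL, json=batch,
                          headers={"Content-Type": "application/json"})
        r.raise_for_status()
    elapsed = time.monotonic() - started
    print(f"batch_size={batch_size}: {TOTAL_DOCS} docs in {elapsed:.1f}s")

If the batch_size=100 run takes longer in total than the batch_size=10 run (i.e. each 100-doc request costs more than ten 10-doc requests), that confirms the non-linear behavior described above on the server side.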






Reply | Threaded
Open this post in threaded view
|

Re: Solr index slow response

Chris Ulicny
Do you know what is causing the 30% CPU usage? That seems awfully high for
Solr when only indexing (unless it is iowait), but it could be expected
depending on any custom update processors and analyzers you may have.

Are you putting all of the documents into a single core or multiple cores?
Also, are you using SATA or PCIe solid-state drives?

We have a similar index situation: about 65 million documents that take up
about 4.5 TB for a single copy of the index. We've never tried to put all of
the documents into a single Solr core before, so I don't know how well that
would work.
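
A quick sketch of how to check whether that CPU figure includes iowait, assuming the psutil package is available (standard tools such as top, iostat or vmstat show the same breakdown):

import psutil

# Sample the CPU-time breakdown over 5 seconds; on Linux, "iowait" is time
# spent waiting on disk, which a plain CPU-usage percentage can hide.
cpu = psutil.cpu_times_percent(interval=5)
print(f"user={cpu.user}%  system={cpu.system}%  "
      f"iowait={getattr(cpu, 'iowait', 0.0)}%  idle={cpu.idle}%")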

On Tue, Mar 19, 2019 at 8:28 AM Aaron Yingcai Sun <[hidden email]> wrote:

> Yes, the same behavior even with a single thread client. The following
> page says "In general, adding many documents per update request is faster
> than one per update request."  but in reality, add many documents per
> request result in much longer response time, it's not liner, response time
> of 100 docs per request  is bigger than (the response time of 10 docs per
> request) * 10.
>
>
> https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor
>
> SolrPerformanceFactors - Solr Wiki<
> https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor>
> wiki.apache.org
> Schema Design Considerations. indexed fields. The number of indexed fields
> greatly increases the following: Memory usage during indexing ; Segment
> merge time
>
>
>
>
> ________________________________
> From: Emir Arnautović <[hidden email]>
> Sent: Tuesday, March 19, 2019 1:00:19 PM
> To: [hidden email]
> Subject: Re: Solr index slow response
>
> If you start indexing with just a single thread/client, do you still see
> slow bulks?
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 19 Mar 2019, at 12:54, Aaron Yingcai Sun <[hidden email]> wrote:
> >
> > "QTime" value is from the solr rest api response, extracted from the
> http/json payload.  The "Request time" is what I measured from client side,
> it's almost the same value as QTime, just some milliseconds difference.  I
> could provide tcpdump to prove that it is really solr slow response.
> >
> > Those long response time is not really spikes, it's constantly
> happening, almost half of the request has such long delay.  The more
> document added in one request the more delay it has.
> >
> > ________________________________
> > From: Emir Arnautović <[hidden email]>
> > Sent: Tuesday, March 19, 2019 12:30:33 PM
> > To: [hidden email]
> > Subject: Re: Solr index slow response
> >
> > Just to add different perspective here: how do you send documents to
> Solr? Are those log lines from your client? Maybe it is not Solr that is
> slow. Could it be network or client itself. If you have some dry run on
> client, maybe try running it without Solr to eliminate client from the
> suspects.
> >
> > Do you observe similar spikes when you run indexing with less concurrent
> clients?
> >
> > It is really hard to pinpoint the issue without looking at some
> monitoring tool.
> >
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> >> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun <[hidden email]> wrote:
> >>
> >> We have around 80 million documents to index, total index size around
> 3TB,  I guess I'm not the first one to work with this big amount of data.
> with such slow response time, the index process would take around 2 weeks.
> While the system resource is not very loaded, there must be a way to speed
> it up.
> >>
> >>
> >> To Walter, I don't see why G1GC would improve this, we only do index,
> no query in the background. There is no memory constraint.  it's more feel
> like some internal thread are blocking each other.
> >>
> >>
> >> I used to run with more documents in one request, that give much worse
> response time, 300 documents in one request could end up 20 minutes
> response time, now I changed to max 10 documents in one request, still many
> response time around 30 seconds, while some of them are very fast( ~100
> ms).  How come there are such big difference? the documents size does not
> have such big difference.
> >>
> >>
> >> I just want to speed it up since nothing seems to be overloaded.  Are
> there any other faster way to index such big amount of data?
> >>
> >>
> >> BRs
> >>
> >> //Aaron
> >>
> >> ________________________________
> >> From: Walter Underwood <[hidden email]>
> >> Sent: Monday, March 18, 2019 4:59:20 PM
> >> To: [hidden email]
> >> Subject: Re: Solr index slow response
> >>
> >> Solr is not designed to have consistent response times for updates. You
> are expecting Solr to do something that it does not do.
> >>
> >> About Xms and Xmx, the JVM will continue to allocate memory until it
> hits the max. After it hits the max, it will start to collect garbage. A
> smaller Xms just wastes time doing allocations after the JVM is running.
> Avoid that by making Xms and Xms the same.
> >>
> >> We run all of our JVMs with 8 GB of heap and the G1 collector. You
> probably do not need more than 8 GB unless you are doing high-cardinality
> facets or some other memory-hungry querying.
> >>
> >> The first step would be to use a good configuration. We start our Java
> 8 JVMs with these parameters:
> >>
> >> SOLR_HEAP=8g
> >> # Use G1 GC  -- wunder 2017-01-23
> >> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> >> GC_TUNE=" \
> >> -XX:+UseG1GC \
> >> -XX:+ParallelRefProcEnabled \
> >> -XX:G1HeapRegionSize=8m \
> >> -XX:MaxGCPauseMillis=200 \
> >> -XX:+UseLargePages \
> >> -XX:+AggressiveOpts \
> >> "
> >>
> >> Use SSD for disks, with total space about 3X as big as the expected
> index size.
> >>
> >> Have RAM not used by Solr or the OS that is equal to the expected index
> size.
> >>
> >> After that, let’s figure out what the real requirement is. If you must
> have consistent response times for update requests, you’ll need to do that
> outside of Solr. But if you need high data import rates, we can probably
> help.
> >>
> >> wunder
> >> Walter Underwood
> >> [hidden email]
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun <[hidden email]> wrote:
> >>>
> >>> Hello, Chris
> >>>
> >>>
> >>> Thanks for the tips.  So I tried to set it as you suggested, not see
> too much improvement.  Since I don't need it visible immediately,
> softCommit is disabled totally.
> >>>
> >>> The slow response is happening every few seconds,  if it happens
> hourly I would suspect the hourly auto-commit.  But it happen much more
> frequently.  I don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS
> level.  It just looks like solr server is blocking internally itself.
> >>>
> >>>
> >>> <       <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
> >>> ---
> >>>>    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> >>> 16c16
> >>> <       <openSearcher>true</openSearcher>
> >>> ---
> >>>>    <openSearcher>false</openSearcher>
> >>>
> >>>
> >>>
> >>> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res
> code: 200, QTime: 1405 ms, Request time: 1407 ms.
> >>> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res
> code: 200, QTime: 1357 ms, Request time: 1360 ms.
> >>> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res
> code: 200, QTime: 90 ms, Request time: 92 ms.
> >>> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res
> code: 200, QTime: 5306 ms, Request time: 5308 ms.
> >>> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res
> code: 200, QTime: 4775 ms, Request time: 4777 ms.
> >>> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res
> code: 200, QTime: 20222 ms, Request time: 20224 ms.
> >>> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res
> code: 200, QTime: 27732 ms, Request time: 27735 ms.
> >>> 190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616  KB, Res
> code: 200, QTime: 28106 ms, Request time: 28109 ms.
> >>> 190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473  KB, Res
> code: 200, QTime: 16703 ms, Request time: 16706 ms.
> >>> 190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540  KB, Res
> code: 200, QTime: 28216 ms, Request time: 28218 ms.
> >>> 190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673  KB, Res
> code: 200, QTime: 28133 ms, Request time: 28136 ms.
> >>> 190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631  KB, Res
> code: 200, QTime: 27909 ms, Request time: 27912 ms.
> >>> 190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529  KB, Res
> code: 200, QTime: 28172 ms, Request time: 28174 ms.
> >>> 190318-162815.056-189986 DBG1:doc_count: 10 , doc_size: 602  KB, Res
> code: 200, QTime: 11375 ms, Request time: 11378 ms.
> >>> 190318-162815.100-189988 DBG1:doc_count: 10 , doc_size: 530  KB, Res
> code: 200, QTime: 8663 ms, Request time: 8666 ms.
> >>> 190318-162815.275-189980 DBG1:doc_count: 10 , doc_size: 553  KB, Res
> code: 200, QTime: 27526 ms, Request time: 27528 ms.
> >>> 190318-162815.283-189997 DBG1:doc_count: 10 , doc_size: 600  KB, Res
> code: 200, QTime: 27529 ms, Request time: 27535 ms.
> >>> 190318-162815.318-189961 DBG1:doc_count: 10 , doc_size: 661  KB, Res
> code: 200, QTime: 16617 ms, Request time: 16621 ms.
> >>> 190318-162815.484-189952 DBG1:doc_count: 10 , doc_size: 549  KB, Res
> code: 200, QTime: 11707 ms, Request time: 11711 ms.
> >>> 190318-162815.485-189993 DBG1:doc_count: 10 , doc_size: 618  KB, Res
> code: 200, QTime: 28315 ms, Request time: 28321 ms.
> >>> 190318-162816.216-189972 DBG1:doc_count: 10 , doc_size: 493  KB, Res
> code: 200, QTime: 1320 ms, Request time: 1322 ms.
> >>> 190318-162816.354-189972 DBG1:doc_count: 10 , doc_size: 631  KB, Res
> code: 200, QTime: 130 ms, Request time: 132 ms.
> >>> 190318-162816.471-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res
> code: 200, QTime: 111 ms, Request time: 113 ms.
> >>> 190318-162816.586-189972 DBG1:doc_count: 10 , doc_size: 554  KB, Res
> code: 200, QTime: 110 ms, Request time: 111 ms.
> >>> 190318-162816.716-189972 DBG1:doc_count: 10 , doc_size: 590  KB, Res
> code: 200, QTime: 124 ms, Request time: 125 ms.
> >>> 190318-162820.494-189972 DBG1:doc_count: 10 , doc_size: 583  KB, Res
> code: 200, QTime: 3772 ms, Request time: 3774 ms.
> >>> 190318-162820.574-189953 DBG1:doc_count: 10 , doc_size: 550  KB, Res
> code: 200, QTime: 32802 ms, Request time: 32804 ms.
> >>> 190318-162820.609-189991 DBG1:doc_count: 10 , doc_size: 586  KB, Res
> code: 200, QTime: 33787 ms, Request time: 33790 ms.
> >>> 190318-162820.621-189976 DBG1:doc_count: 10 , doc_size: 572  KB, Res
> code: 200, QTime: 33397 ms, Request time: 33400 ms.
> >>> 190318-162820.622-189985 DBG1:doc_count: 10 , doc_size: 498  KB, Res
> code: 200, QTime: 32917 ms, Request time: 32919 ms.
> >>> 190318-162820.987-190005 DBG1:doc_count: 10 , doc_size: 629  KB, Res
> code: 200, QTime: 22207 ms, Request time: 22209 ms.
> >>> 190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584  KB, Res
> code: 200, QTime: 22800 ms, Request time: 22802 ms.
> >>> 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670  KB, Res
> code: 200, QTime: 34193 ms, Request time: 34195 ms.
> >>> 190318-162821.062-189983 DBG1:doc_count: 10 , doc_size: 675  KB, Res
> code: 200, QTime: 22250 ms, Request time: 22252 ms.
> >>>
> >>>
> >>>
> >>>
> >>> BRs
> >>>
> >>> //Aaron
> >>>
> >>> ________________________________
> >>> From: Chris Ulicny <[hidden email]>
> >>> Sent: Monday, March 18, 2019 2:54:25 PM
> >>> To: [hidden email]
> >>> Subject: Re: Solr index slow response
> >>>
> >>> One other thing to look at besides the heap is your commit settings.
> We've
> >>> experienced something similar, and changing commit settings alleviated
> the
> >>> issue.
> >>>
> >>> Are you opening a search on every hardcommit? If so, you might want to
> >>> reconsider and use the softcommit for the hourly creation of a new
> searcher.
> >>>
> >>> The hardcommit interval should be much lower. Probably something on the
> >>> order of seconds (15000) instead of hours (currently 3600000). When the
> >>> hard commit fires, numerous merges might be firing off in the
> background
> >>> due to the volume of documents you are indexing, which might explain
> the
> >>> periodic bad response times shown in your logs.
> >>>
> >>> It would depend on your specific scenario, but here's our setup. During
> >>> long periods of constant indexing of documents to a staging collection
> (~2
> >>> billion documents), we have following commit settings
> >>>
> >>> softcommit: 3600000ms (for periodic validation of data, since it's not
> in
> >>> production)
> >>> hardcommit: openSearcher -> false, 15000ms (no document limit)
> >>>
> >>> This makes the documents available for searching every hour, but
> doesn't
> >>> result in the large bursts of IO due to the infrequent hard commits.
> >>>
> >>> For more info, Erick Erickson has a great write up:
> >>>
> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >>>
> >>> Best,
> >>> Chris
> >>>
> >>> On Mon, Mar 18, 2019 at 9:36 AM Aaron Yingcai Sun <[hidden email]>
> wrote:
> >>>
> >>>> Hi, Emir,
> >>>>
> >>>> My system used to run with max 32GB, the response time is bad as well.
> >>>> swap is set to 4GB, there 3.2 free, I doubt swap would affect it since
> >>>> there is such huge free memory.
> >>>>
> >>>> I could try to with set Xms and Xmx to the same value, but I doubt how
> >>>> much would that change the response time.
> >>>>
> >>>>
> >>>> BRs
> >>>>
> >>>> //Aaron
> >>>>
> >>>> ________________________________
> >>>> From: Emir Arnautović <[hidden email]>
> >>>> Sent: Monday, March 18, 2019 2:19:19 PM
> >>>> To: [hidden email]
> >>>> Subject: Re: Solr index slow response
> >>>>
> >>>> Hi Aaron,
> >>>> Without looking too much into numbers, my bet would be that it is
> large
> >>>> heap that is causing issues. I would decrease is significantly
> (<30GB) and
> >>>> see if it is enough for your max load. Also, disable swap or reduce
> >>>> swappiness to min.
> >>>>
> >>>> In any case, you should install some monitoring tool that would help
> you
> >>>> do better analysis when you run into problems. One such tool is our
> >>>> monitoring solution: https://sematext.com/spm
> >>>>
> >>>> HTH,
> >>>> Emir
> >>>> --
> >>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>>
> >>>>
> >>>>
> >>>>> On 18 Mar 2019, at 13:14, Aaron Yingcai Sun <[hidden email]> wrote:
> >>>>>
> >>>>> Hello, Emir,
> >>>>>
> >>>>> Thanks for the reply, this is the solr version and heap info,
> standalone
> >>>> single solr server. I don't have monitor tool connected. only look at
> >>>> 'top', has not seen cpu spike so far, when the slow response happens,
> cpu
> >>>> usage is not high at all, around 30%.
> >>>>>
> >>>>>
> >>>>> # curl 'http://.../solr/admin/info/system?wt=json&indent=true'
> >>>>> {
> >>>>> "responseHeader":{
> >>>>> "status":0,
> >>>>> "QTime":27},
> >>>>> "mode":"std",
> >>>>> "solr_home":"/ardome/solr",
> >>>>> "lucene":{
> >>>>> "solr-spec-version":"6.5.1",
> >>>>> "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 -
> >>>> jimczi - 2017-04-21 12:23:42",
> >>>>> "lucene-spec-version":"6.5.1",
> >>>>> "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75
> >>>> - jimczi - 2017-04-21 12:17:15"},
> >>>>> "jvm":{
> >>>>> "version":"1.8.0_144 25.144-b01",
> >>>>> "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
> >>>>> "spec":{
> >>>>>   "vendor":"Oracle Corporation",
> >>>>>   "name":"Java Platform API Specification",
> >>>>>   "version":"1.8"},
> >>>>> "jre":{
> >>>>>   "vendor":"Oracle Corporation",
> >>>>>   "version":"1.8.0_144"},
> >>>>> "vm":{
> >>>>>   "vendor":"Oracle Corporation",
> >>>>>   "name":"Java HotSpot(TM) 64-Bit Server VM",
> >>>>>   "version":"25.144-b01"},
> >>>>> "processors":32,
> >>>>> "memory":{
> >>>>>   "free":"69.1 GB",
> >>>>>   "total":"180.2 GB",
> >>>>>   "max":"266.7 GB",
> >>>>>   "used":"111 GB (%41.6)",
> >>>>>   "raw":{
> >>>>>     "free":74238728336,
> >>>>>     "total":193470136320,
> >>>>>     "max":286331502592,
> >>>>>     "used":119231407984,
> >>>>>     "used%":41.64103736566334}},
> >>>>> "jmx":{
> >>>>>
> >>>>
> "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
> >>>>>   "classpath":"...",
> >>>>>   "commandLineArgs":["-Xms100G",
> >>>>>     "-Xmx300G",
> >>>>>     "-DSTOP.PORT=8079",
> >>>>>     "-DSTOP.KEY=..",
> >>>>>     "-Dsolr.solr.home=..",
> >>>>>     "-Djetty.port=8983"],
> >>>>>   "startTime":"2019-03-18T09:35:27.892Z",
> >>>>>   "upTimeMS":9258422}},
> >>>>> "system":{
> >>>>> "name":"Linux",
> >>>>> "arch":"amd64",
> >>>>> "availableProcessors":32,
> >>>>> "systemLoadAverage":14.72,
> >>>>> "version":"3.0.101-311.g08a8a9d-default",
> >>>>> "committedVirtualMemorySize":2547960700928,
> >>>>> "freePhysicalMemorySize":4530696192,
> >>>>> "freeSwapSpaceSize":3486846976,
> >>>>> "processCpuLoad":0.3257436126790475,
> >>>>> "processCpuTime":93869450000000,
> >>>>> "systemCpuLoad":0.3279781055816521,
> >>>>> "totalPhysicalMemorySize":406480175104,
> >>>>> "totalSwapSpaceSize":4302303232,
> >>>>> "maxFileDescriptorCount":32768,
> >>>>> "openFileDescriptorCount":385,
> >>>>> "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14
> >>>> 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
> >>>>> "uptime":" 13:09pm  up 5 days 21:23,  7 users,  load average: 14.72,
> >>>> 12.28, 11.48\n"}}
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> ________________________________
> >>>>> From: Emir Arnautović <[hidden email]>
> >>>>> Sent: Monday, March 18, 2019 12:10:30 PM
> >>>>> To: [hidden email]
> >>>>> Subject: Re: Solr index slow response
> >>>>>
> >>>>> Hi Aaron,
> >>>>> Which version of Solr? How did you configure your heap? Is it
> standalone
> >>>> Solr or SolrCloud? A single server? Do you use some monitoring tool?
> Do you
> >>>> see some spikes, pauses or CPU usage is constant?
> >>>>>
> >>>>> Thanks,
> >>>>> Emir
> >>>>> --
> >>>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On 18 Mar 2019, at 11:47, Aaron Yingcai Sun <[hidden email]> wrote:
> >>>>>>
> >>>>>> Hello, Solr!
> >>>>>>
> >>>>>>
> >>>>>> We are having some performance issue when try to send documents for
> >>>> solr to index. The repose time is very slow and unpredictable some
> time.
> >>>>>>
> >>>>>>
> >>>>>> Solr server is running on a quit powerful server, 32 cpus, 400GB
> RAM,
> >>>> while 300 GB is reserved for solr, while this happening, cpu usage is
> >>>> around 30%, mem usage is 34%.  io also look ok according to iotop.
> SSD disk.
> >>>>>>
> >>>>>>
> >>>>>> Our application send 100 documents to solr per request, json
> encoded.
> >>>> the size is around 5M each time. some times the response time is
> under 1
> >>>> seconds, some times could be 300 seconds, the slow response happens
> very
> >>>> often.
> >>>>>>
> >>>>>>
> >>>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for
> >>>> 3600000ms; if 1000000 uncommited docs"
> >>>>>>
> >>>>>>
> >>>>>> There are around 100 clients sending those documents at the same
> time,
> >>>> but each for the client is blocking call which wait the http response
> then
> >>>> send the next one.
> >>>>>>
> >>>>>>
> >>>>>> I tried to make the number of documents smaller in one request,
> such as
> >>>> 20, but  still I see slow response time to time, like 80 seconds.
> >>>>>>
> >>>>>>
> >>>>>> Would you help to give some hint how improve the response time?
> solr
> >>>> does not seems very loaded, there must be a way to make the response
> faster.
> >>>>>>
> >>>>>>
> >>>>>> BRs
> >>>>>>
> >>>>>> //Aaron
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >
>
>
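
For reference, the kind of bulk update discussed above can be reproduced with a
plain curl call against the JSON update handler, which takes the indexing
application out of the picture (host, collection and field names below are only
placeholders, and the commit is left to the autoCommit settings):

curl -s 'http://localhost:8983/solr/yourcollection/update?wt=json' \
     -H 'Content-Type: application/json' \
     --data-binary '[
       {"id": "doc-1", "title_t": "first test document"},
       {"id": "doc-2", "title_t": "second test document"}
     ]'

Timing a batch like this directly on the Solr host while the 100 clients are
running is a simple way to see whether the long QTime values show up even
without the application client involved.
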
Reply | Threaded
Open this post in threaded view
|

Re: Solr index slow response

Jörn Franke
In reply to this post by Aaron Yingcai Sun
It could be an issue with JDK 8, which may not be well suited to such large heaps. Consider running more nodes with smaller heaps (e.g. 31 GB each).
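
A rough sketch of what that could look like on each node, using the knobs that
already appear earlier in this thread (the exact file location and values
depend on your install, so treat these as placeholders):

# solr.in.sh
SOLR_HEAP="31g"        # stays below ~32 GB so the JVM keeps compressed oops
GC_TUNE="-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=200"

The documents would then be spread across several such nodes instead of one JVM
carrying a 300 GB heap.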

> Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun <[hidden email]>:
>
> Hello, Solr!
>
>
> We are having some performance issue when try to send documents for solr to index. The repose time is very slow and unpredictable some time.
>
>
> Solr server is running on a quit powerful server, 32 cpus, 400GB RAM, while 300 GB is reserved for solr, while this happening, cpu usage is around 30%, mem usage is 34%.  io also look ok according to iotop. SSD disk.
>
>
> Our application send 100 documents to solr per request, json encoded. the size is around 5M each time. some times the response time is under 1 seconds, some times could be 300 seconds, the slow response happens very often.
>
>
> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 3600000ms; if 1000000 uncommited docs"
>
>
> There are around 100 clients sending those documents at the same time, but each for the client is blocking call which wait the http response then send the next one.
>
>
> I tried to make the number of documents smaller in one request, such as 20, but  still I see slow response time to time, like 80 seconds.
>
>
> Would you help to give some hint how improve the response time?  solr does not seems very loaded, there must be a way to make the response faster.
>
>
> BRs
>
> //Aaron
>
>
>