SolrCloud: Configured socket timeouts not reflecting

Rahul Goswami
Hello,

I am running Solr 7.2.1 in cloud mode. To work around a hardware
bottleneck in our setup, I tried to raise distribUpdateSoTimeout and
socketTimeout above the default of 10 minutes. I did this by passing them as
system properties at Solr startup (-DdistribUpdateSoTimeout and
-DsocketTimeout). The Solr admin UI shows these values in the Args section
of the Dashboard. As a test, I set each of them to one hour (3600000 ms).
However, I still see socket read timeouts within a few minutes, so it looks
like the values are not taking effect. What am I missing? If this is a known
issue, is there a JIRA for it?
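
For reference, a minimal sketch of how a -D flag becomes visible inside the JVM and how the one-hour value works out in milliseconds. The class and method names (TimeoutProps, timeoutMs) are hypothetical and this is not Solr's own code path; it only confirms that a property reached the JVM, not that any particular client uses it.

```java
import java.util.concurrent.TimeUnit;

public class TimeoutProps {
    // Default cited in the post: 10 minutes, expressed in milliseconds.
    public static final int DEFAULT_TIMEOUT_MS = (int) TimeUnit.MINUTES.toMillis(10);

    /**
     * Reads an integer system property (as set with -Dname=value).
     * Falls back to the default when the property is unset or not numeric.
     * Hypothetical helper for illustration only.
     */
    public static int timeoutMs(String propertyName) {
        return Integer.getInteger(propertyName, DEFAULT_TIMEOUT_MS);
    }

    public static void main(String[] args) {
        System.out.println("distribUpdateSoTimeout = " + timeoutMs("distribUpdateSoTimeout") + " ms");
        System.out.println("socketTimeout = " + timeoutMs("socketTimeout") + " ms");
        // One hour in milliseconds, the test value used in the post.
        System.out.println("one hour = " + TimeUnit.HOURS.toMillis(1) + " ms");
    }
}
```

Run with, for example, java -DdistribUpdateSoTimeout=3600000 TimeoutProps to see the override picked up.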

Thanks,
Rahul
Re: SolrCloud: Configured socket timeouts not reflecting

Rahul Goswami
Hello,

I looked into the code to try to get to the root of this. It does look like
a bug (as of 7.2.1, the version we are using), but I wanted to confirm on the
user list before creating a JIRA. I found that the soTimeout field of the
ConcurrentUpdateSolrClient class (in the code referenced below) remains null,
and hence the default of 600000 ms is set as the timeout on the HttpPost
instance named "method":
https://github.com/apache/lucene-solr/blob/e6f6f352cfc30517235822b3deed83df1ee144c6/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.java#L334


When the call is finally made on the line below, the HttpClient does
carry the configured timeout (from solr.xml or -DdistribUpdateSoTimeout),
but it gets overridden by the hard default of 600000 set on the "method"
argument of the execute call.

https://github.com/apache/lucene-solr/blob/e6f6f352cfc30517235822b3deed83df1ee144c6/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.java#L348


The hard default of 600000 is set here:
https://github.com/apache/lucene-solr/blob/e6f6f352cfc30517235822b3deed83df1ee144c6/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.java#L333
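
To make the precedence problem concrete, here is a simplified, self-contained model of the behaviour described above. It does not use the real SolrJ or HttpClient classes; ModelHttpClient, ModelRequest and buildRequest are hypothetical stand-ins, and the only assumption carried over from the analysis is that a timeout stamped on the request wins over the one configured on the client.

```java
public class TimeoutPrecedenceModel {
    /** Hard default described in the post: 10 minutes in milliseconds. */
    public static final int HARD_DEFAULT_MS = 600_000;

    /** Hypothetical stand-in for a request that may carry its own timeout. */
    public static final class ModelRequest {
        public final Integer soTimeoutMs; // null means "not set on the request"

        public ModelRequest(Integer soTimeoutMs) {
            this.soTimeoutMs = soTimeoutMs;
        }
    }

    /** Hypothetical stand-in for a client configured from solr.xml or -D properties. */
    public static final class ModelHttpClient {
        private final int clientSoTimeoutMs;

        public ModelHttpClient(int clientSoTimeoutMs) {
            this.clientSoTimeoutMs = clientSoTimeoutMs;
        }

        /** A timeout set on the request takes precedence over the client-level value. */
        public int effectiveTimeoutMs(ModelRequest request) {
            return request.soTimeoutMs != null ? request.soTimeoutMs : clientSoTimeoutMs;
        }
    }

    /** Models the reported behaviour: a null soTimeout is replaced by the hard default. */
    public static ModelRequest buildRequest(Integer updateClientSoTimeoutMs) {
        int ms = updateClientSoTimeoutMs != null ? updateClientSoTimeoutMs : HARD_DEFAULT_MS;
        return new ModelRequest(ms);
    }

    public static void main(String[] args) {
        ModelHttpClient httpClient = new ModelHttpClient(3_600_000); // configured: one hour

        // soTimeout never handed to the update client: the hard default wins.
        System.out.println(httpClient.effectiveTimeoutMs(buildRequest(null)));      // prints 600000
        // soTimeout passed through: the configured value is honoured.
        System.out.println(httpClient.effectiveTimeoutMs(buildRequest(3_600_000))); // prints 3600000
    }
}
```

In this model the client-level setting is never consulted once the request carries any value, which is why the configured hour has no effect unless it is also handed to the update client.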


I created a local patch with the fix below, which works fine:
https://github.com/apache/lucene-solr/blob/86fe24cbef238d2042d68494bd94e2362a2d996e/solr/core/src/java/org/apache/solr/update/StreamingSolrClients.java#L69



// Pass the configured socket timeout to the update client explicitly.
client = new ErrorReportingConcurrentUpdateSolrClient.Builder(url, req, errors)
    .withHttpClient(httpClient)
    .withQueueSize(100)
    .withSocketTimeout(getSocketTimeout(req))
    .withThreadCount(runnerCount)
    .withExecutorService(updateExecutor)
    .alwaysStreamDeletes()
    .build();

// Resolve the timeout from the update shard handler config; fall back to
// the default when there is no request to read it from.
private int getSocketTimeout(SolrCmdDistributor.Req req) {
  if (req == null) {
    return UpdateShardHandlerConfig.DEFAULT_DISTRIBUPDATESOTIMEOUT;
  }
  return req.cmd.req.getCore().getCoreContainer().getConfig()
      .getUpdateShardHandlerConfig().getDistributedSocketTimeout();
}

I found this open JIRA on this issue:

https://issues.apache.org/jira/browse/SOLR-12550?jql=text%20~%20%22distribUpdateSoTimeout%22


Should I update the JIRA with this?

Thanks,
Rahul




Re: SolrCloud: Configured socket timeouts not reflecting

Gus Heck
Hi Rahul,

Did you try the patch in that issue? Also, food for thought:
https://issues.apache.org/jira/browse/SOLR-13457

-Gus



--
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)
Re: SolrCloud: Configured socket timeouts not reflecting

Rahul Goswami
Hi Gus,
Thanks for the response and for referencing the umbrella JIRA for this kind
of issue. I see that the patch won't solve the problem, since the builder
object used to instantiate the ConcurrentUpdateSolrClient doesn't itself
carry the timeout values. I did build a local solr-core binary to try the
patch nevertheless but, as I anticipated, it didn't help. I'll update the
JIRA and submit a patch.

Thank you,
Rahul

Re: SolrCloud: Configured socket timeouts not reflecting

Rahul Goswami
Hi Gus,

I have created a pull request for SOLR-12550
<https://issues.apache.org/jira/browse/SOLR-12550> and noted the affected
Solr version (7.2.1) in the comments. The fix is against branch_7_2. I
haven't tried reproducing the issue on the latest version, but I see that
the code for this part is different on master.

Regards,
Rahul
