HttpShardHandlerFactory


Mark Robinson
Hello,

I am trying to understand the socket timeout and connection timeout
settings in the HttpShardHandlerFactory:

       <shardHandler class="HttpShardHandlerFactory">
              <int name="socketTimeOut">10</int>
              <int name="connTimeOut">20</int>
       </shardHandler>

1. Could someone please help me understand the effect of using values as low
   as 10 ms and 20 ms, as given above, inside my /select handler?

2. What are the guidelines for setting these parameters? Should they be low
   or high?

3. How can I test the effect of this configuration after adding it to my
   /select handler, i.e. how can I make sure the snippet above is working?
   That is why I chose such low values: I thought that when I fired a query
   I would get both timeout errors in the logs, but I did not.
   Or is it that the socket times out and the connection is lost if no
   request comes in within those time frames (10 ms, 20 ms)? In that case,
   should I test with a load of, say, 100 TPS at these low values, then raise
   the values to maybe 1000 ms and 1500 ms respectively, and expect to see
   fewer timeout errors?

I am trying to understand how these parameters can be put to good use.

Thanks!
Mark

Re: HttpShardHandlerFactory

Shawn Heisey-2
On 8/16/2019 3:51 AM, Mark Robinson wrote:
> I am trying to understand the socket timeout and connection timeout
> settings in the HttpShardHandlerFactory:
>
>         <shardHandler class="HttpShardHandlerFactory">
>                <int name="socketTimeOut">10</int>
>                <int name="connTimeOut">20</int>
>         </shardHandler>

The shard handler is used when that Solr instance needs to make
connections to another Solr instance (which could be itself, as odd as
that might sound).  It does not apply to the requests that you make from
outside Solr.

> 1. Could someone please help me understand the effect of using values as low
> as 10 ms and 20 ms, as given above, inside my /select handler?

A connection timeout of 20 milliseconds *might* result in connections
not establishing at all.  This is translated down to the TCP socket as
the TCP connection timeout -- the time limit imposed on making the TCP
connection itself, which as I understand it is the completion of the
"SYN", "SYN/ACK", and "ACK" sequence.  If the two endpoints of the
connection are on a LAN, you might never see a problem from this -- LAN
connections are very low latency.  But if they are across the Internet,
they might never work.

The socket timeout of 10 milliseconds means that if the connection goes
idle for 10 milliseconds, it will be forcibly closed.  So if it took 25
milliseconds for the remote Solr instance to respond, this Solr instance
would have given up and closed the connection.  It is extremely common
for requests to take 100, 500, 2000, or more milliseconds to complete.

> 2. What are the guidelines for setting these parameters? Should they be low
> or high?

I would probably use a value of about 5000 milliseconds (five seconds) for
the connection timeout if everything's on a LAN.  I might go as high as
15000 (15 seconds) if there's a high latency network between them, but five
seconds is probably long enough too.

For the socket timeout, you want a value that's considerably longer than
you expect requests to ever take.  Probably somewhere between two and
five minutes (120000 to 300000 milliseconds).
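
As a rough sketch of those guidelines in configuration form: the values are
only the ballpark figures above, in milliseconds, and the element name and
the parameter spellings (connTimeout, socketTimeout) are assumptions based
on my reading of the "Format of solr.xml" reference guide page, so check
them against the guide for your Solr version.

```xml
<!-- Sketch only: element and parameter names are assumptions to verify
     against the reference guide for your Solr version. -->
<shardHandlerFactory class="HttpShardHandlerFactory">
  <!-- Time allowed to establish the TCP connection: 5 seconds -->
  <int name="connTimeout">5000</int>
  <!-- Time the connection may sit idle waiting for a response: 2 minutes -->
  <int name="socketTimeout">120000</int>
</shardHandlerFactory>
```

These settings only matter for connections Solr makes to other Solr
instances, so a single-core, non-distributed setup will not exercise them.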

> 3. How can I test the effect of this configuration after adding it to my
> /select handler, i.e. how can I make sure the snippet above is working?
> That is why I chose such low values: I thought that when I fired a query
> I would get both timeout errors in the logs, but I did not.
> Or is it that the socket times out and the connection is lost if no
> request comes in within those time frames (10 ms, 20 ms)? In that case,
> should I test with a load of, say, 100 TPS at these low values, then raise
> the values to maybe 1000 ms and 1500 ms respectively, and expect to see
> fewer timeout errors?

If you were running a multi-server SolrCloud setup (or a single-server
setup with multiple shards and/or replicas), you probably would see
problems from values that low.  But if Solr never has any need to make
connections to satisfy a request, then the values will never take effect.

If you want to control these values for requests made from outside Solr,
you will need to do it in your client software that is making the request.

Thanks,
Shawn

Re: HttpShardHandlerFactory

Michael Gibney
Mark,

Another thing to check: I believe the configuration you posted may not
actually be taking effect. Unless I'm mistaken, the correct element name
for configuring the shard handler is "shardHandler*Factory*", not
"shardHandler" ... as in, '<shardHandlerFactory
class="HttpShardHandlerFactory">...'

The element name is documented correctly in the refGuide page for "Format
of solr.xml":
https://lucene.apache.org/solr/guide/8_1/format-of-solr-xml.html#the-shardhandlerfactory-element

... but the incorrect (?) element name is included in the refGuide page for
"Distributed Requests":
https://lucene.apache.org/solr/guide/8_1/distributed-requests.html#configuring-the-shardhandlerfactory
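
For illustration, here is roughly how the original snippet might look with
only the element renamed, placed inside the /select handler. The handler
definition (name and class) is assumed rather than taken from your actual
solrconfig.xml, and the timeout values are simply the ones from the
original post, not recommendations. The parameter spellings are kept
exactly as posted; as far as I recall, the "Format of solr.xml" page spells
them socketTimeout and connTimeout, so that may be worth checking too.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- other handler settings omitted -->
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <!-- values and spellings copied unchanged from the original post -->
    <int name="socketTimeOut">10</int>
    <int name="connTimeOut">20</int>
  </shardHandlerFactory>
</requestHandler>
```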

Michael


Re: HttpShardHandlerFactory

Mark Robinson
In reply to this post by Shawn Heisey-2
Hello Shawn,

Thank you so much for the detailed response.
It was so helpful!

Thanks!
Mark.


Re: HttpShardHandlerFactory

Mark Robinson
In reply to this post by Michael Gibney
Hello Michael,

Thank you for pointing that out.
Today I am planning to try this out along with the insights Shawn had
shared.

Thanks!
Mark.
