solr 4.4 splitshard query

ashoknix
Hi,

  I have a legacy app that runs on Solr 4.4: a 4-node SolrCloud cluster
with 3 ZooKeeper nodes.

curl -v 'http://localhost:8980/solr/admin/collections?action=SPLITSHARD&collection=billdocs&shard=shard1&async=2000'

<lst name="responseHeader"><int name="status">500</int><int name="QTime">300009</int></lst>
<lst name="error"><str name="msg">splitshard the collection time out:300s</str>
<str name="trace">org.apache.solr.common.SolrException: splitshard the collection time out:300s
        at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
        at org.apache.solr.handler.admin.CollectionsHandler.handleSplitShardAction(CollectionsHandler.java:322)
        at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:136)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)

I have a few questions:

1.  The index is currently around 40GB.
2.  Right now it has a single shard, and we observe high query times.
3.  Would SPLITSHARD help with query times here, since the docs get
distributed across shards?

Please advise.

Thanks,
Ash



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: solr 4.4 splitshard query

Kelly, Frank
Whenever I hit a problem with SPLITSHARD, it's usually because I've run out of disk, as you're effectively doubling the disk space used by the shard.

However, for large indexes (and 40GB is pretty large), take a look at https://issues.apache.org/jira/browse/SOLR-5324
If that's the problem, one possible workaround is to reduce the number of replicas before splitting the shard, although that will likely increase your query times even more.
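
The disk-space point above can be checked before attempting a split. The following is only a minimal sketch: the function names and the `headroom` parameter are made up for illustration, and it assumes the split needs roughly one extra copy of the shard's index on the same volume, per the "doubling" rule of thumb.

```python
import shutil

GIB = 1024 ** 3


def free_bytes_on(path):
    """Return the free space, in bytes, on the volume that holds `path`."""
    return shutil.disk_usage(path).free


def has_room_for_split(index_bytes, free_bytes, headroom=1.0):
    """Rough pre-flight check (an assumption, not a Solr rule).

    Treats the split as needing `headroom` extra copies of the index,
    so headroom=1.0 means "about as much free space as the index uses".
    """
    return free_bytes >= index_bytes * headroom


# Example use, with a hypothetical data directory:
#   has_room_for_split(40 * GIB, free_bytes_on("/path/to/solr/data"))
```

Raising `headroom` above 1.0 leaves a margin for segment merges and transaction logs that happen while the split runs.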

-Frank



Re: solr 4.4 splitshard query

Shawn Heisey-2
On 12/5/2018 5:14 AM, ashoknix wrote:
> curl -v
> 'http://localhost:8980/solr/admin/collections?action=SPLITSHARD&collection=billdocs&shard=shard1&async=2000'
>
> <lst name="responseHeader"><int name="status">500</int><int
> name="QTime">300009</int></lst><lst name="error"><str name="msg">splitshard
> the collection time out:300s</str><str
> name="trace">org.apache.solr.common.SolrException: splitshard the collection
> time out:300s
<snip>
> 1.  Currently index size is around 40GB.
> 2.  Right now it has single shard - we observe query times high.
> 3.  Does SPLITSHARD helps here with query times?  Since docs gets
> distributed

You're trying to make the call async.  This is a good idea... but async
capability for the collections API was added in Solr 4.8.

https://issues.apache.org/jira/browse/SOLR-5477

Which means that in version 4.4, any collections API action that takes
longer than your collections API timeout is going to return this error. 
Your timeout appears to be 300 seconds.  I do not know whether the
splitshard will continue to operate on the server in this situation or not.
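
For reference, on Solr 4.8 or later (not on 4.4) the async pattern looks roughly like the sketch below: submit SPLITSHARD with an `async` request id, then poll the Collections API `REQUESTSTATUS` action with that id. The helper function names are made up for illustration; the host, port, collection, shard, and request id are taken from the original post. The code only builds the URLs and sends nothing.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, params):
    """Build a Collections API URL from a base Solr URL and query params."""
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


def split_shard_url(base_url, collection, shard, request_id):
    """URL that submits SPLITSHARD asynchronously (Solr 4.8+ only)."""
    return collections_api_url(base_url, {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "async": request_id,
    })


def request_status_url(base_url, request_id):
    """URL that polls the status of a previously submitted async request."""
    return collections_api_url(base_url, {
        "action": "REQUESTSTATUS",
        "requestid": request_id,
    })


BASE = "http://localhost:8980/solr"
print(split_shard_url(BASE, "billdocs", "shard1", "2000"))
print(request_status_url(BASE, "2000"))
```

Either URL can be passed to curl as in the original post. On 4.4 the `async` parameter has no effect, so this only becomes useful after an upgrade.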

Once you have successfully split your index, the following will apply: 
Increasing the shard count will increase the amount of work that Solr
must do to execute a query.  If your query rate is very low and your
system has idle CPUs, then the query might complete faster.  If your
query rate is high or you do not have idle CPUs, then splitting shards
will make your queries take longer.
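
The trade-off above can be illustrated with a toy model. This is a sketch under made-up assumptions, not a measurement: it supposes the single-shard search work divides evenly across shards, that every shard request carries a fixed overhead, and that merging results costs a fixed amount. All numbers and names are hypothetical.

```python
def idle_latency(shards, work=1.0, per_shard_overhead=0.05, merge_cost=0.02):
    """Latency when spare CPUs let all shard requests run in parallel."""
    return work / shards + per_shard_overhead + merge_cost


def total_cpu(shards, work=1.0, per_shard_overhead=0.05, merge_cost=0.02):
    """Total CPU time one query consumes across all shards."""
    return work + shards * per_shard_overhead + merge_cost


for n in (1, 2, 4):
    print(n, "shards: idle latency", idle_latency(n), "cpu", total_cpu(n))
```

In this model latency on an idle system falls as shards are added, while the total CPU consumed per query rises. Once the CPUs are saturated, the total work is what limits throughput, which is why a busy system can get slower after a split.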

Because the latest version of Solr is 7.5.0, I would not recommend
running any 4.x version.  There is zero possibility of bugs in 4.x
getting developer attention.  Bugs in 6.6.x MIGHT get attention, but
mostly only bugs in the current major release will be addressed.

Thanks,
Shawn