solr 6 at scale

solr 6 at scale

Nawab Zada Asad Iqbal
Hi all,

I am planning to upgrade my Solr 4.x installation to a recent stable
version. Should I get the latest 6.5.1 bits, or would a slightly older release
be better in terms of stability?
I am curious whether there is a way to see Solr 6.x adoption in large companies. I
have talked to a few people and they are also stuck on older major versions.

Anyone using Solr 6.x for a multi-terabyte index: how did you decide
which version to upgrade to?


Regards
Nawab

Re: solr 6 at scale

Walter Underwood
We are running 6.5.1 in a 16 node cluster, four shards and four replicas. It is performing brilliantly.

Our index is 18 million documents, but we have very heavy queries. Students are searching for homework help, so they paste in the entire problem. We truncate queries at 40 terms to limit the load, but we have a LOT of long queries. Our average query time is nicely under 500 milliseconds.
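
A minimal sketch of that kind of truncation, assuming terms are simply whitespace-separated words (the real system may count terms after analysis instead). The function name `truncate_query` is hypothetical and not taken from the setup described here.

```python
def truncate_query(query: str, max_terms: int = 40) -> str:
    """Keep only the first max_terms whitespace-separated terms of a query."""
    terms = query.split()
    return " ".join(terms[:max_terms])


# A pasted homework problem with 300 words is cut down to the first 40.
pasted = " ".join(f"word{i}" for i in range(300))
shortened = truncate_query(pasted)
print(len(shortened.split()))
```

Cutting at a fixed term count bounds the per-query work no matter how much text a user pastes in.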

I strongly recommend that you benchmark your data with your prod queries. JMeter can replay access logs.
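
As a rough illustration of preparing logs for replay, the sketch below pulls the `q` parameter out of access-log lines. The log format (a common-log-style request line), the core name `books`, and the helper `extract_queries` are all assumptions made for the example; adapt the pattern to whatever the production logs actually look like.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed format: the request appears as "GET <url> HTTP/x.y" inside each log line.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def extract_queries(log_lines):
    """Return the decoded q= values of all /select requests found in the log lines."""
    queries = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        url = urlsplit(match.group(1))
        if not url.path.endswith("/select"):
            continue  # skip pings, admin calls, and other handlers
        q_values = parse_qs(url.query).get("q")
        if q_values:
            queries.append(q_values[0])
    return queries


sample = [
    '10.0.0.1 - - [23/May/2017:17:27:00 -0700] "GET /solr/books/select?q=title%3Asolr&rows=10 HTTP/1.1" 200 512',
    '10.0.0.2 - - [23/May/2017:17:27:01 -0700] "GET /solr/admin/ping HTTP/1.1" 200 80',
]
print(extract_queries(sample))
```

The resulting list of real queries can then be written to a file and fed to a load-testing tool.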

Versions after 6.4.0 and before 6.5.1 have performance issues because of the metrics reporting. Use 6.5.1.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)



Re: solr 6 at scale

Erick Erickson
I'll quibble a little with Walter and say that 6.4.2 fixes the perf
problem in 6.4.0 and 6.4.1. That doesn't change his recommendation at
all: I'd go with 6.5.1.
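
For anyone scripting a version check, here is a small sketch of that rule. It encodes only what is stated in this thread (6.4.0 and 6.4.1 affected, fixed in 6.4.2); the helper names are hypothetical.

```python
def parse_version(text):
    """Turn a version string such as '6.4.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Releases named in this thread as having the metrics-related perf problem.
AFFECTED = {(6, 4, 0), (6, 4, 1)}


def has_metrics_slowdown(version):
    """True if the version is one of the releases reported as affected."""
    return parse_version(version) in AFFECTED


for candidate in ["6.3.0", "6.4.0", "6.4.1", "6.4.2", "6.5.1"]:
    print(candidate, has_metrics_slowdown(candidate))
```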

Best,
Erick


Re: solr 6 at scale

Toke Eskildsen-2
In reply to this post by Nawab Zada Asad Iqbal
On Tue, 2017-05-23 at 17:27 -0700, Nawab Zada Asad Iqbal wrote:
> Anyone using solr.6.x for multi-terabytes index size: how did you
> decide which version to upgrade to?

We are still stuck with 4.10 for our 70TB+ index (split into 83 shards),
due to some custom hacks that have not yet been ported. If not for the
hacks, we would probably have switched to Solr 6.x by now, as we would
very much like some of the newer features.

We do have a 2.8TB index (split into 6 shards, 2 replicas) running on
Solr 6.3, which was the newest version at installation time. As long as
there are known stable and well-functioning releases within the same
major version, we are fine with picking the latest release and seeing how
it goes: it is relatively easy to downgrade to an earlier release
within the same major version. We have not switched to 6.5.1 simply
because we have no pressing need for it - Solr 6.3 works well for us.

I guess it depends quite a bit on your need for stability. We are a
library and uptime is only "best effort".
-- 
Toke Eskildsen, Royal Danish Library

Re: solr 6 at scale

Shawn Heisey-2
On 5/24/2017 3:44 AM, Toke Eskildsen wrote:
> It is relatively easy to downgrade to an earlier release within the
> same major version. We have not switched to 6.5.1 simply because we
> have no pressing need for it - Solr 6.3 works well for us.

That strikes me as a little bit dangerous, unless your indexes are very
static.  The Lucene index format does occasionally change in minor
versions.  I do not know whether the index format changed from 6.3 to
6.5, but if it did, then 6.3 would not be able to read index segments
built by 6.5, which might mean that it would refuse to read the entire
index.

Thanks,
Shawn


Re: solr 6 at scale

Toke Eskildsen-2
Shawn Heisey <[hidden email]> wrote:
> On 5/24/2017 3:44 AM, Toke Eskildsen wrote:
>> It is relatively easy to downgrade to an earlier release within the
>> same major version. We have not switched to 6.5.1 simply because we
>> have no pressing need for it - Solr 6.3 works well for us.

> That strikes me as a little bit dangerous, unless your indexes are very
> static.  The Lucene index format does occasionally change in minor
> versions.

Err.. Okay? Thank you for that. I was under the impression that the index format was fixed (modulo critical bugs) for major versions. This will change our approach to updating.

Apologies for the confusion,
Toke

Re: solr 6 at scale

Nawab Zada Asad Iqbal
Thanks everyone for the responses. I will go with the latest bits for now
and will share how it goes.

@Toke, I stumbled upon your page last week but it seems that your huge
index doesn't receive a lot of query traffic. Mine is around 60TB and
receives around 120 queries per second; ~90 shards on 30 machines.


I look forward to hearing more scale stories.
Nawab


Re: solr 6 at scale

Walter Underwood
I remembered why we waited for 6.5.1. It is the object leak in the Zookeeper client code. A very slow leak, but worth getting a fix.

I tested our cluster at 6000 requests/minute. It is 18 million documents, four shards by four replicas on big AWS instances (c4.8xlarge). We have very long free text queries. Students enter queries with hundreds of words (copy/paste), but we truncate at 40 terms.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)



Re: solr 6 at scale

Toke Eskildsen-2
In reply to this post by Nawab Zada Asad Iqbal
Nawab Zada Asad Iqbal <[hidden email]> wrote:
> @Toke, I stumbled upon your page last week but it seems that your huge
> index doesn't receive a lot of query traffic.

It switches between two kinds of usage:

Everyday use is very low traffic by researchers using it interactively: 1-2 simultaneous queries, with faceting ranging from somewhat heavy to very heavy. Our setup is optimized towards this scenario and latency starts to go up pretty quickly if the number of simultaneous requests rises.

Now and then some cultural probes are being performed, where the index is being hammered continuously by multiple threads. Here it is our experience that max throughput for extremely simple queries (existence checks for social security numbers) is around 50 queries/second.

> Mine is around 60TB and receives around 120 queries per second; ~90 shards on 30 machines.

Sounds interesting. Do you have a more detailed write-up somewhere?

- Toke

Re: solr 6 at scale

Bram Van Dam
In reply to this post by Toke Eskildsen-2
>>> It is relatively easy to downgrade to an earlier release within the
>>> same major version. We have not switched to 6.5.1 simply because we
>>> have no pressing need for it - Solr 6.3 works well for us.
>
>> That strikes me as a little bit dangerous, unless your indexes are very
>> static.  The Lucene index format does occasionally change in minor
>> versions.
>
> Err.. Okay? Thank you for that. I was under the impression that the index format was fixed (modulo critical bugs) for major versions. This will change our approach to updating.

*Upgrading* (say 6.3 to 6.5.1) should be fine, because -- as I
understand it -- newer Lucene/Solr versions support reading older index
formats (up to the previous major version). Older versions reading newer
index would be ... difficult.
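
The rule of thumb above can be written down as a small check. This is only a sketch of the rule as described in this thread, not of Lucene's actual format checks, and it is deliberately conservative: it treats any older release reading a newer index as unsafe, even though the format only changes occasionally. The function names are hypothetical.

```python
def parse_version(text):
    """Parse '6.3' or '6.5.1' into a 3-part tuple, padding missing parts with 0."""
    parts = [int(part) for part in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def can_read(reader_version, index_version):
    """Conservative rule of thumb from this thread.

    A release is assumed to read indexes written by the same or an older
    release, going back at most one major version. A release reading an
    index written by a newer release is treated as unsafe.
    """
    reader = parse_version(reader_version)
    writer = parse_version(index_version)
    if writer > reader:
        return False  # downgrade: the older code may not know the newer format
    return writer[0] >= reader[0] - 1


print(can_read("6.5.1", "6.3"))   # upgrade within the same major version
print(can_read("6.3", "6.5.1"))   # downgrade within the same major version
print(can_read("6.5.1", "4.10"))  # skipping a major version
```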

 - Bram

Re: solr 6 at scale

Nawab Zada Asad Iqbal
In reply to this post by Toke Eskildsen-2
Hi Toke,

I don't have a blog, but here is the high-level idea:

I have a 31-machine cluster with 3 shards on each machine (93 shards). Each machine
has ~250 GB of RAM and a 3TB SSD for the search index (there is another drive for the OS
and other files). One Solr process runs per shard, each with a 48G heap. So we have
3 large files on the SSD.

That is just one cluster; we have 5 such clusters, which we can bring live
or take offline (for testing, maintenance, etc.). Usually 3 are active at any
time, each taking 1/3 of user traffic.
We don't rely on replication between these clusters. Our out-of-Solr
processes send writes to all the replicas in parallel. We don't use
SolrCloud, although it was available in Solr 4.5 (which we are using).
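
The per-machine numbers above work out as in the sketch below. It assumes the RAM figure is exactly 250 GB and that nothing besides the three Solr heaps takes a significant share of memory, so the "left over" figure is only an upper bound on what the OS could use for caching the index.

```python
machines = 31
shards_per_machine = 3
heap_per_solr_gb = 48
ram_per_machine_gb = 250  # stated as approximate; treated as exact here

total_shards = machines * shards_per_machine
heap_per_machine_gb = shards_per_machine * heap_per_solr_gb
ram_left_gb = ram_per_machine_gb - heap_per_machine_gb

print(f"shards in one cluster: {total_shards}")
print(f"JVM heap per machine: {heap_per_machine_gb} GB")
print(f"RAM left for OS and disk cache: {ram_left_gb} GB")
```

So roughly 144 GB of each machine goes to JVM heaps and at most about 106 GB remains for everything else.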


Thanks
Nawab



Re: solr 6 at scale

Toke Eskildsen-2
On Thu, 2017-05-25 at 15:56 -0700, Nawab Zada Asad Iqbal wrote:
> I have 31 machine cluster with 3 shards on each (93 shards). Each
> machine has 250~GB ram and 3TB SSD for search index (there is another
> drive for OS and stuff). One solr process runs for each shard with
> 48G heap. So we have 3 large files on the SSD.

So each shard is ~650GB, right? Which means 2TB of index and 1TB of
free space on the SSDs. In principle that is dangerous, as it can run
out of space during an index update, but unless you are using huge
segments, I guess the chances of that are low (I am not an expert in
segment merge mechanics).
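
The estimate can be reproduced as follows. This assumes the "around 60TB" total mentioned earlier in the thread is spread evenly over the 93 shards and uses 1 TB = 1000 GB; the variable names are just for illustration.

```python
total_index_tb = 60       # "around 60TB", from earlier in the thread
shards = 93
shards_per_machine = 3
ssd_tb = 3

per_shard_gb = total_index_tb * 1000 / shards
index_per_machine_tb = per_shard_gb * shards_per_machine / 1000
free_tb = ssd_tb - index_per_machine_tb

print(f"per shard: {per_shard_gb:.0f} GB")
print(f"index per machine: {index_per_machine_tb:.2f} TB")
print(f"free on SSD: {free_tb:.2f} TB ({free_tb / ssd_tb:.0%} of the drive)")
```

That gives roughly 645 GB per shard, a bit under 2 TB of index per machine, and a little over 1 TB free, in line with the figures above.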

We're also using a 1 Solr/shard setup, but with SolrCloud. Our initial
rationale for 1 Solr/shard was to avoid long GC-pauses due to large
heaps, but that does not seem to be a problem here. Now we stick to it
as it works fine and makes for simple logistics.
-- 
Toke Eskildsen, Royal Danish Library