question about setup for maximizing solr performance

Odysci
Hi,
I'm looking for some advice on improving the performance of our Solr setup,
in particular on the trade-off between using larger machines and using more,
smaller machines. Our full index has just over 100 million docs, and we do
almost all searches using fq's (with q=*:*) and facets. We are using Solr
8.3.
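
For readers unfamiliar with this query pattern, here is a minimal sketch of how such a request's parameters might be assembled: a match-all q, one fq per filter, and faceting switched on. The field names (category, year, author) and the helper name are hypothetical and do not come from the original post; only the q, fq, rows, facet and facet.field parameters are standard Solr ones.

```python
from urllib.parse import urlencode

def build_select_params(filters, facet_fields, rows=10):
    """Build a query string for a match-all search narrowed by filter queries."""
    # q=*:* matches every document; the fq clauses do the actual narrowing.
    params = [("q", "*:*"), ("rows", str(rows))]
    # Repeating fq gives one filter per clause.
    params += [("fq", clause) for clause in filters]
    if facet_fields:
        params.append(("facet", "true"))
        params += [("facet.field", field) for field in facet_fields]
    return urlencode(params)

# Hypothetical filters and facet field, for illustration only.
query = build_select_params(["category:books", "year:[2015 TO *]"], ["author"])
print(query)
```

The resulting string would be appended to a collection's /select handler URL. Keeping each filter in its own fq, rather than folding everything into q, is what this post's "fq's (with q=*:*)" describes.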

Currently, I have a SolrCloud setup with 2 physical machines (let's call
them A and B). My index is divided into 2 shards with 2 replicas each, such
that each machine has a full copy of the index.
The nodes and replicas are as follows:
Machine A:
      core_node3 / shard1_replica_n1
      core_node7 / shard2_replica_n4
Machine B:
      core_node5 / shard1_replica_n2
      core_node8 / shard2_replica_n6
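
The layout above can be written down as a small data structure to check the "full copy on each machine" property. This is only an illustrative sketch: the dictionary mirrors the listing in this post, and the function name is made up for the example.

```python
# Replica placement as listed above: machine -> {core node: shard}.
placement = {
    "A": {"core_node3": "shard1", "core_node7": "shard2"},
    "B": {"core_node5": "shard1", "core_node8": "shard2"},
}
all_shards = {"shard1", "shard2"}

def machines_with_full_index(placement, all_shards):
    """Return the machines that host at least one replica of every shard."""
    return sorted(
        machine
        for machine, cores in placement.items()
        if set(cores.values()) >= all_shards
    )

full = machines_with_full_index(placement, all_shards)
print(full)
```

A machine that appears in the result can answer a query from its local replicas alone, which is the property the 4-machine variant with 2 shards would preserve.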

My ZooKeeper setup uses 3 instances. Also, for most of the searches we do,
results come back from both shards (for the same search).

My experiments indicate that our setup is CPU-bound.
Due to cost constraints, I could either double the CPU in each of the 2
machines, or move to a 4-machine setup (using machines of the current size)
with 2 shards and 4 replicas (or 4 shards w/ 4 replicas). I assume that
keeping the full index on all machines will allow all searches to be evenly
distributed.

Does anyone have any insights on which would be better for maximizing
throughput when many searches are running at the same time?
thanks!

Reinaldo

Re: question about setup for maximizing solr performance

Shawn Heisey-2
On 6/1/2020 9:29 AM, Odysci wrote:
> Hi,
> I'm looking for some advice on improving performance of our solr setup.
<snip>

> Does anyone have any insights on what would be better for maximizing
> throughput on multiple searches being done at the same time?
> thanks!

In almost all cases, adding memory will provide the best performance
boost.  This is because memory is faster than disks, even SSDs.  I have
put relevant information on a wiki page so that it is easy for people to
find and digest:

https://cwiki.apache.org/confluence/display/solr/SolrPerformanceProblems
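
As a rough illustration of the memory point, the sketch below estimates how much of an on-disk index could sit in the operating system's disk cache once the Java heap and other processes are accounted for. All of the numbers and the function name are hypothetical, and this is a back-of-the-envelope estimate rather than a sizing rule.

```python
def page_cache_coverage(total_ram_gb, jvm_heap_gb, other_used_gb, index_size_gb):
    """Fraction (0.0 to 1.0) of the on-disk index that could fit in the OS disk cache."""
    # Memory not claimed by the JVM heap or other processes is what the OS
    # can use to cache index files.
    free_for_cache = max(total_ram_gb - jvm_heap_gb - other_used_gb, 0.0)
    return min(free_for_cache / index_size_gb, 1.0)

# Hypothetical machine: 64 GB RAM, 16 GB heap, 4 GB other usage, 80 GB of index.
coverage = page_cache_coverage(64, 16, 4, 80)
print(f"{coverage:.0%} of the index could be cached")
```

A value well below 1.0 suggests that many reads would go to disk, which is the situation where adding memory tends to help most.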

Thanks,
Shawn