Solr staying constant on popularity indexes

Solr staying constant on popularity indexes

Tech Id

Hi,

So I was a bit frustrated the other day when all of a sudden my Solr nodes started going into recovery.
Everything returned to normal after a rolling restart, but when I looked at the logs, I was surprised to see nothing!
Solr UI gave me no information during recovery.
Solr logs gave me no information as to what really happened.

And though I have not had the time to use Elasticsearch yet, a couple of friends have recommended it highly.

Here is a graph that shows a 30% gain of ES over Solr in less than 2 years:
[graph not preserved]
Being a long term Solr user, I tried to do a little comparison myself and actually found some interesting features in ES.

1. No ZooKeeper - I have burnt my hands with some ZooKeeper issues in the past and it is no fun to deal with. Kafka and Storm are also trying to burden ZooKeeper less and less because ZK cannot handle heavy traffic.
2. REST APIs - this is a big wow over the complicated syntax Solr uses. I think the V2 APIs are coming to address this, but they came a bit late in the game.
3. Client nodes - There is no equivalent in Solr. All nodes do scatter-gather in Solr, which adds scalability problems.
4. Much better logs in ES.
5. Cluster-level stats, hot-threads and similar APIs make monitoring easy.
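
For readers who have not used the APIs in point 5, here is a minimal sketch of the Elasticsearch endpoints I believe are meant: cluster health, cluster stats and node hot threads. The base URL `http://localhost:9200` is only an assumed default, and `monitoring_urls` is a hypothetical helper, not part of any client library.

```python
# Elasticsearch monitoring endpoints (paths as documented for the REST API).
ES_ENDPOINTS = {
    "cluster_health": "/_cluster/health",
    "cluster_stats": "/_cluster/stats",
    "hot_threads": "/_nodes/hot_threads",
}


def monitoring_urls(base_url, endpoints=ES_ENDPOINTS):
    """Return full URLs for each monitoring endpoint (hypothetical helper)."""
    base = base_url.rstrip("/")
    return {name: base + path for name, path in endpoints.items()}


# Assumed local node; adjust host and port for a real cluster.
for name, url in monitoring_urls("http://localhost:9200").items():
    print(name, url)
```

Each URL can be fetched with a plain HTTP GET, which is a large part of why these calls are convenient to script.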

So I just wanted to discuss some of these important points about ES vs Solr.

At the very least, we should try to improve our logs.
When a node is behaving badly, Solr gives absolutely no information about why it is behaving the way it is.
In the same debugging spirit, the Solr UI could also be improved to show the number of cores per node, the total number of down or recovering nodes, and the memory, CPU and disk used by each node, all of which would make the engineer's job a bit easier.
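
Until the UI shows this, some of it can be derived from the Collections API `CLUSTERSTATUS` response. The following is a rough sketch, assuming the usual response layout (`cluster` -> `collections` -> `shards` -> `replicas`, each replica carrying `node_name` and `state`). `summarize_cluster` and the sample data are made up for illustration, and memory/CPU/disk figures are not covered here.

```python
from collections import Counter


def summarize_cluster(status):
    """Count cores per node and replicas per state from a parsed CLUSTERSTATUS response."""
    cluster = status.get("cluster", {})
    cores_per_node = Counter()
    replicas_by_state = Counter()
    for collection in cluster.get("collections", {}).values():
        for shard in collection.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                cores_per_node[replica.get("node_name", "unknown")] += 1
                replicas_by_state[replica.get("state", "unknown")] += 1
    return {
        "live_nodes": len(cluster.get("live_nodes", [])),
        "cores_per_node": dict(cores_per_node),
        "replicas_by_state": dict(replicas_by_state),
    }


# Hand-written sample in the assumed response shape (not real cluster output).
sample = {
    "cluster": {
        "live_nodes": ["host1:8983_solr", "host2:8983_solr"],
        "collections": {
            "demo": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"node_name": "host1:8983_solr", "state": "active"},
                            "core_node2": {"node_name": "host2:8983_solr", "state": "recovering"},
                        }
                    }
                }
            }
        },
    }
}

print(summarize_cluster(sample))
```

In practice the input would be the JSON returned by `/solr/admin/collections?action=CLUSTERSTATUS&wt=json` on any node of the cluster.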


Cheers,
TI

Re: Solr staying constant on popularity indexes

Toke Eskildsen-2
On Mon, 2017-10-09 at 20:50 -0700, Tech Id wrote:
> Being a long term Solr user, I tried to do a little comparison myself
> and actually found some interesting features in ES.
>
> 1. No zookeeper  - I have burnt my hands with some zookeeper issues
> in the past and it is no fun to deal with. Kafka and Storm are also
> trying to burden zookeeper less and less because ZK cannot handle
> heavy traffic.

ZooKeeper is not the easiest beast to tame, but it does have its
plusses. The greatest being that it is pretty good at what it does:
https://aphyr.com/posts/291-call-me-maybe-zookeeper

Home-cooked distribution systems might be a lot easier to use,
primarily because they tend to be a perfect fit for the technology they
support, but they are hard to get right:
https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0

> 2. REST APIs - this is a big wow over the complicated syntax Solr
> uses. I think V2 APIs are coming to address this, but they did come a
> bit late in the game.

I guess you mean JSON APIs? Anyway, I fully agree that the old Solr
syntax is extremely clunky as soon as we move beyond the simple "just
supply a few search terms"-scenario.
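
To illustrate the difference, here is a small sketch that builds the same search twice: once as classic request parameters and once as a body for the JSON Request API. The field name `cat` and both helper functions are assumptions made up for the example.

```python
import json
from urllib.parse import urlencode


def classic_params(terms, category, rows=10):
    """Classic Solr syntax: everything is a flat request parameter."""
    return urlencode([
        ("q", terms),
        ("fq", f"cat:{category}"),
        ("rows", rows),
        ("facet", "true"),
        ("facet.field", "cat"),
    ])


def json_request(terms, category, rows=10):
    """JSON Request API body: the same search as one structured document."""
    return json.dumps({
        "query": terms,
        "filter": [f"cat:{category}"],
        "limit": rows,
        "facet": {"categories": {"type": "terms", "field": "cat"}},
    })


print(classic_params("solr cloud", "books"))
print(json_request("solr cloud", "books"))
```

The first string would be appended to `/solr/<collection>/select?`, the second sent as a POST body to `/solr/<collection>/query`. The gap widens as facets get nested, since the JSON form nests naturally and the parameter form does not.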

- Toke Eskildsen, Royal Danish Library


Re: Solr staying constant on popularity indexes

alessandro.benedetti
In reply to this post by Tech Id
Inline:

/"1. No zookeeper  - I have burnt my hands with some zookeeper issues in the
past and it is no fun to deal with. Kafka and Storm are also trying to
burden zookeeper less and less because ZK cannot handle heavy traffic."/

Where did you get this information? Is it based on some public
report, analysis or stress test, or on experience?
Anyway


/"3. Client nodes - No such equivalent in Solr. All nodes do scatter-gather
in Solr which adds scalability problems."/

Solr has no such thing, but I would say it is moving in that direction [1]
by adding different types of replicas.
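
As a rough sketch of what that looks like in practice: in Solr 7 the Collections API `CREATE` action accepts `nrtReplicas`, `tlogReplicas` and `pullReplicas` counts per shard. The helper below only builds the request URL. Its name, the base URL and the collection name are assumptions for illustration.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, shards, nrt=0, tlog=0, pull=0):
    """Build a Collections API CREATE URL with per-shard replica type counts."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": shards,
        "nrtReplicas": nrt,
        "tlogReplicas": tlog,
        "pullReplicas": pull,
    }
    return f"{base_url.rstrip('/')}/solr/admin/collections?{urlencode(params)}"


# Two TLOG replicas (leader-eligible) plus two PULL replicas serving queries only.
print(create_collection_url("http://localhost:8983", "demo", shards=2, tlog=2, pull=2))
```

PULL replicas only copy the index from the leader and never become leaders themselves, so they are the closest thing to query-only nodes, although they still hold data, unlike ES client nodes.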

Anyway, I agree with you: it is always useful to look for the weak points
(and having another great product for comparison is very useful).

[1]
https://lucene.apache.org/solr/guide/7_0/shards-and-indexing-data-in-solrcloud.html#types-of-replicas



-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr staying constant on popularity indexes

Bernd Fehling
In reply to this post by Tech Id
Questions coming to my mind:

Is there a "Resiliency Status" page for SolrCloud somewhere?

How would SolrCloud behave in a Jepsen test?

Regards
Bernd

On 10.10.2017 at 09:22, Toke Eskildsen wrote:

> On Mon, 2017-10-09 at 20:50 -0700, Tech Id wrote:
>> Being a long term Solr user, I tried to do a little comparison myself
>> and actually found some interesting features in ES.
>>
>> 1. No zookeeper  - I have burnt my hands with some zookeeper issues
>> in the past and it is no fun to deal with. Kafka and Storm are also
>> trying to burden zookeeper less and less because ZK cannot handle
>> heavy traffic.
>
> ZooKeeper is not the easiest beast to tame, but it does have its
> plusses. The greatest being that it is pretty good at what it does:
> https://aphyr.com/posts/291-call-me-maybe-zookeeper
>
> Home-cooked distribution systems might be a lot easier to use,
> primarily because they tend to be a perfect fit for the technology they
> support, but they are hard to get right:
> https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0
>
>> 2. REST APIs - this is a big wow over the complicated syntax Solr
>> uses. I think V2 APIs are coming to address this, but they did come a
>> bit late in the game.
>
> I guess you mean JSON APIs? Anyway, I fully agree that the old Solr
> syntax is extremely clunky as soon as we move beyond the simple "just
> supply a few search terms"-scenario.
>
> - Toke Eskildsen, Royal Danish Library
>

Re: Solr staying constant on popularity indexes

Charlie Hull-3
On 10/10/2017 11:02, Bernd Fehling wrote:
> Questions coming to my mind:
>
> Is there a "Resiliency Status" page for SolrCloud somewhere?
>
> How would SolrCloud behave in a Jepsen test?

This was done in 2014 - see
https://lucidworks.com/2014/12/10/call-maybe-solrcloud-jepsen-flaky-networks/

Charlie



--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk

Re: Solr staying constant on popularity indexes

S G
I find myself in the same boat as TI when a Solr node goes into recovery.
Solr UI and the logs are really of no help at that time.
It would be really nice to enhance the Solr UI with the features mentioned
in the original post.

