Solr cloud in kubernetes


Lars Karlsson
Hi, I wanted to hear whether anyone has successfully got SolrCloud running
on Kubernetes and can share the challenges and limitations.

I can't find many up-to-date GitHub projects, so it would be great if you
could point me to blog posts or other useful links.

Thanks in advance.

Re: Solr cloud in kubernetes

Björn Häuser
Hi Lars,

We are running Solr in Kubernetes, and after some initial problems it is now running quite stably.

Here is the setup we chose for Solr:

- a separate service for external traffic to Solr (called “solr”)
- a StatefulSet for Solr with 3 replicas, with another service (called “solr-discovery”)

We set SOLR_HOST (which is used for intra-cluster communication) to the DNS name of the pod inside the StatefulSet (for example, solr-0.solr-discovery.default.svc.cluster.local). This ensures that intra-cluster communication continues to work after a Solr pod restarts. In the beginning we used the pod's IP address, which caused problems when restarting pods: they tried to talk to the old IP addresses.
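A minimal sketch of how such a per-pod hostname is put together, assuming that “solr-discovery” is the headless service governing the StatefulSet and that the cluster uses the default "cluster.local" domain. The function names are hypothetical and only illustrate the naming pattern; they are not part of Solr or Kubernetes.

```python
from typing import List


def statefulset_pod_fqdn(statefulset: str, ordinal: int, service: str,
                         namespace: str = "default",
                         cluster_domain: str = "cluster.local") -> str:
    """Build the DNS name of one StatefulSet pod.

    Pattern: <statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster_domain>
    Assumption: <service> is the headless service named in the StatefulSet.
    """
    if ordinal < 0:
        raise ValueError("ordinal must be non-negative")
    pod_name = f"{statefulset}-{ordinal}"
    return f"{pod_name}.{service}.{namespace}.svc.{cluster_domain}"


def solr_hosts(replicas: int) -> List[str]:
    """Hostnames for a StatefulSet called "solr" behind "solr-discovery"."""
    return [statefulset_pod_fqdn("solr", i, "solr-discovery")
            for i in range(replicas)]
```

Each pod would then export its own entry as SOLR_HOST, so the name registered in ZooKeeper stays the same across restarts even though the pod IP changes.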

ZooKeeper inside Kubernetes is a different story. Use the latest version of Kubernetes, because old versions never re-resolved DNS names. For connecting to ZooKeeper we use the same approach: one service IP for all pods. The StatefulSet again works with a different service name.

The problems we are currently facing:

- Client timeouts whenever a Solr pod stops and starts again. We are currently trying to solve this with better readiness probes, with no success yet.
- Sometimes Solr collections do not recover completely after a pod restart and we have to force recovery manually. We have not fully investigated this yet.
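One possible direction for a stricter readiness check, sketched under assumptions: it takes the already-parsed JSON of a Collections API CLUSTERSTATUS response and reports replicas on a given node that are not in the "active" state. Fetching the response, the helper names, and the sample data are all hypothetical, and this is not the probe described in this post.

```python
from typing import Dict, List, Tuple

# Hypothetical sample shaped like a CLUSTERSTATUS response (trimmed).
NODE_0 = "solr-0.solr-discovery.default.svc.cluster.local:8983_solr"
NODE_1 = "solr-1.solr-discovery.default.svc.cluster.local:8983_solr"
SAMPLE_STATUS = {
    "cluster": {"collections": {"products": {"shards": {"shard1": {"replicas": {
        "core_node1": {"node_name": NODE_0, "state": "active"},
        "core_node2": {"node_name": NODE_1, "state": "recovering"},
    }}}}}}
}


def inactive_replicas(cluster_status: Dict,
                      node_name: str) -> List[Tuple[str, str, str, str]]:
    """Return (collection, shard, replica, state) for every replica hosted
    on node_name whose state is anything other than "active"."""
    problems = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                if replica.get("node_name") != node_name:
                    continue  # replica lives on another pod
                state = replica.get("state", "unknown")
                if state != "active":
                    problems.append((coll_name, shard_name, replica_name, state))
    return problems


def node_is_ready(cluster_status: Dict, node_name: str) -> bool:
    """Treat the node as ready only when none of its replicas are inactive."""
    return not inactive_replicas(cluster_status, node_name)
```

With the sample above, the node holding the "recovering" replica would be reported as not ready, while the other node would pass. Whether this helps with the client timeouts described here is untested.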

Hope this helps you!

Thanks
Björn

In reply to Lars Karlsson's message of 26 Aug 2017, 12:08.

Re: Solr cloud in kubernetes

Lars Karlsson
Thanks Björn for the detailed information, just wanted to understand:

When you say a separate service for external traffic, do you mean a
home-brewed one that proxies Solr queries?

And what is the difference between the above and "solr discovery"?

Do you specify pod anti-affinity for the Solr hosts?

Regards
Lars

In reply to Björn Häuser's message of Sat, 26 Aug 2017, 13:19.

Re: Solr cloud in kubernetes

Björn Häuser
Hi Lars,

Sorry, “external traffic” was the wrong name.

Basically, all traffic to Solr goes through a k8s service which uses all Solr pods as endpoints.
Additionally, we use another service for intra-cluster communication.

We do not use pod affinity.

Feel free to ask more questions if something is unclear.

Regards
Björn

In reply to Lars Karlsson's message of 28 Aug 2017, 22:28.

Re: Solr cloud in kubernetes

rajasaur
Hi Bjorn,

I'm trying a similar approach now (to get SolrCloud working on Kubernetes). I
run ZooKeeper as a StatefulSet, but not SolrCloud, which is causing issues
when my pods get destroyed and restarted.
I will try running with the -h option so that SOLR_HOST is used when Solr
connects to itself (and to ZooKeeper).

On another note, how do you store the indexes? I had an issue with my GCE
node (Node NotReady), which needed its kubelet restarted. Since the SolrCloud
pods were restarted along with it, all the data got wiped out. I'm just
wondering how you have set up your indexes in your SolrCloud Kubernetes
setup.

Thanks
Raja
 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Solr cloud in kubernetes

Björn Häuser
Hi Raja,

We run SolrCloud as a StatefulSet, and every pod has its own storage attached to it.

Thanks
Björn

In reply to rajasaur's message of 20 Nov 2017, 05:59.

Re: Solr cloud in kubernetes

Upayavira
We hope to switch from Rancher 1.x/Docker to Kubernetes/Rancher 2.0
soon.

Here are some utilities that we've used as run-once containers to start
everything up:

https://github.com/odoko-devops/solr-utils

Using a single image, run with many different configurations, we have
been able to stand up an entire Solr stack, from scratch, including
ZooKeeper, Solr, solr.xml, config upload, collection creation, replica
creation, content indexing, etc. It is a delight to see when it works.

Upayavira

In reply to Björn Häuser's message of Mon, 20 Nov 2017, 09:30 AM.

Re: Solr cloud in kubernetes

Paweł Ruciński
Hi,
I am trying to achieve the same thing: hosting Solr on k8s.
So far I have successfully created ZK as a StatefulSet (3 instances) with a
headless service. Apart from that, I created Deployment objects for the Solr
pods (again 3 instances). For each Solr pod I manually created a
persistent volume.

Now I am wondering whether there is a way to move to a StatefulSet for the
Solr instances. I see one constraint: when a Solr pod dies, it loses its
core.properties file, because that file is inside the Solr home directory.
The Solr data directory is a mounted persistent volume.

My question is: can I make Solr create core.properties files in a
different place than the Solr home directory (i.e. the Solr data directory)?

PS.
Your discussion was very informative for me, as I have just started with
both Solr and k8s.


Best regards,
Paweł Ruciński




Re: Solr cloud in kubernetes

ssivashunm
https://github.com/freedev/solrcloud-zookeeper-kubernetes
provides more detail about persistent disk usage for Solr data and home.

The issue I face is that, since all three StatefulSet replicas use the same
Solr port, instances 2 and 3 fail to start.




Re: Solr cloud in kubernetes

jonasdkhansen
Hi, I also got this running:
https://github.com/freedev/solrcloud-zookeeper-kubernetes

My problem is also that instances 2 and 3 will not start. If I exec into
them and run bin/solr start -cloud, I can start them on a port other
than 32080, but that's not what we want.

Does anyone have a solution to this yet?


