I have a quick question about using solrj to connect to multiple slaves.
My application is deployed on multiple boxes that have to talk to
multiple solr slaves. In order to take advantage of the queryResult
cache, each request from one of my app boxes should be redirected to the
same solr slave.
I'm using Apache to load balance between the slaves using sticky
sessions with jk2 (jsessionId cookie). Is this the right way to go
about load balancing multiple solr slaves when using solrj? If so, should
I look into making a patch for solrj so that each query can optionally
take a cookie parameter so the underlying HttpClient knows what
jsessionId to attach to the request?
On the other hand, I could be going about this all wrong.
: I have a quick question about using solrj to connect to multiple slaves.
: My application is deployed on multiple boxes that have to talk to
: multiple solr slaves. In order to take advantage of the queryResult
: cache, each request from one of my app boxes should be redirected to the
: same solr slave.
i've never once worried about "session affinity" when dealing with Solr
... if a query is common/important enough that it's going to be a cache
hit, it will probably be a cache hit on all the servers. besides which:
just because two queries come from the same client doesn't mean they have
anything to do with each other - i'm typically just as likely to get the
same query from two different clients as i am twice from the same client.
if you want to worry about smart load balancing, try to load balance based
on the nature of the URL query string ... make your load balancer pick
a slave by hashing on the "q" param for example.
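
to make the hashing idea concrete, here is a minimal sketch in plain Java (10 or later). it is not solrj code and not the configuration of any particular load balancer: the class name QueryHashBalancer, the methods extractQ and pickSlave, and the slave URLs are all hypothetical, chosen only for illustration. it assumes the raw query string is available and that every balancer sees the same, identically ordered slave list.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class QueryHashBalancer {

    // Return the URL-decoded value of the "q" parameter from a raw query
    // string, or an empty string when no "q" parameter is present.
    public static String extractQ(String queryString) {
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("q")) {
                return URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            }
        }
        return "";
    }

    // Pick a slave by hashing the "q" value. floorMod keeps the index
    // non-negative even when hashCode() is negative.
    public static String pickSlave(String queryString, List<String> slaves) {
        int index = Math.floorMod(extractQ(queryString).hashCode(), slaves.size());
        return slaves.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical slave URLs, for illustration only.
        List<String> slaves = List.of(
                "http://solr-slave1:8983/solr",
                "http://solr-slave2:8983/solr",
                "http://solr-slave3:8983/solr");

        // The same "q" maps to the same slave regardless of the other params.
        System.out.println(pickSlave("q=ipod&rows=10", slaves));
        System.out.println(pickSlave("rows=20&start=10&q=ipod", slaves));
    }
}
```

because only "q" is hashed, the same query lands on the same slave no matter which client sent it or what the other params are, so that slave's queryResultCache sees all the repeats. the catch with simple modulo hashing is that adding or removing a slave remaps most queries to a different box.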
the one situation where i worry about sending certain traffic to some Solr
boxes and other traffic to other Solr boxes is when i know that the client
apps have very different query usage patterns ... then i have two separate
tiers of Slaves -- identical indexes, but different solrconfigs. the
clients that hit my custom faceting plugin use one tier with a big custom
cache and filterCache. the clients that do more traditional searching
using dismax hit a second tier which has no custom cache, a smaller
filterCache and a bigger queryResultCache ... but even then i don't worry
about session IDs ... i just configure the two client applications with
different DNS aliases.