embedding solr in a webapp?

embedding solr in a webapp?

Joachim Martin
Hi,

We are looking at running read-only solr nodes embedded in our webapp
nodes.  This would give us the
additional features of solr over lucene, but would keep it in memory and
reduce the overhead of http/xml
transport of results.

Looks like we would just create a request handler and call
handleRequest(req,rsp), and deal with the
search results DocList ourselves.

Would there be any reason why this sort of setup would prohibit the use
of index replication in a master/slave
setup?

Does this make sense?  As you might guess, speed is more important than
flexibility.  We are using solr for
a content search, returning ids, and doing a secondary db lookup for
extended entity information.

Thanks --Joachim

Re: embedding solr in a webapp?

Yonik Seeley
On 6/7/06, Joachim Martin <[hidden email]> wrote:
> We are looking at running read-only solr nodes embedded in our webapp
> nodes.  This would give us the
> additional features of solr over lucene, but would keep it in memory and
> reduce the overhead of http/xml
> transport of results.
>
> Looks like we would just create a request handler and call
> handleRequest(req,rsp), and deal with the
> search results DocList ourselves.

Yes, that should work fine.

> Would there be any reason why this sort of setup would prohibit the use
> of index replication in a master/slave
> setup?

No, that should still work fine.

> Does this make sense?  As you might guess, speed is more important than
> flexibility.

It can make sense in certain cases... but it does cut down on your
flexibility to size the search tier independently of the appserver
tier.

Eliminating the IPC might get you 5% more performance, but at what
development & flexibility cost?  It's easier to buy a slightly faster
box, or simply add another server if you are running behind a
load-balancer.  You know your situation best of course :-)

>  We are using solr for
> a content search, returning ids, and doing a secondary db lookup for
> extended entity information.

You go through the trouble of avoiding one IPC call, but you add it
back in with the DB lookup... are the fields too large to store in
Lucene?

-Yonik

Re: embedding solr in a webapp?

Joachim Martin
Certainly, running a load-balanced Solr cluster will be our first
approach. I was just wondering if there were
any glaring problems with running Solr embedded in each webapp node.
Sounds like there are not.

As for the secondary db lookup, those will be cached, and are necessary
to filter results further based on
time (schedule) restrictions.

We will probably also implement a custom ResponseWriter that just
returns a comma-separated list of ids.
The IPC time is just one component of the overhead; XML parsing is another.
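
A minimal sketch of what that comma-separated id format could look like on both ends, assuming the ids are integers. IdListCodec and its methods are hypothetical helpers for illustration, not Solr classes, and the wiring into an actual Solr response writer is left out:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr: encodes and decodes the plain
// comma-separated id list that a custom writer might emit instead of XML.
// Assumption: document ids are integers.
final class IdListCodec {
    private IdListCodec() {}

    // Join the ids with commas, e.g. [12, 7, 42] becomes "12,7,42".
    static String encode(List<Integer> ids) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(ids.get(i).intValue());
        }
        return sb.toString();
    }

    // Split the body on commas; an empty body means no hits.
    static List<Integer> decode(String body) {
        List<Integer> ids = new ArrayList<Integer>();
        String trimmed = body.trim();
        if (trimmed.isEmpty()) {
            return ids;
        }
        for (String part : trimmed.split(",")) {
            ids.add(Integer.parseInt(part.trim()));
        }
        return ids;
    }

    public static void main(String[] args) {
        String body = encode(Arrays.asList(12, 7, 42));
        System.out.println(body);          // prints 12,7,42
        System.out.println(decode(body));  // prints [12, 7, 42]
    }
}
```

On the client side, decoding is a single split rather than an XML parse, which is the overhead this format is meant to avoid.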

Thanks  --Joachim
