embedding solr

embedding solr

Erik Hatcher
I have a client need to embed Solr behind an already built custom TCP/
IP interface (currently for Lucene, but want to swap in Solr to  
benefit from its additional goodness of course).   Have folks already  
done this?   Experiences?   Or perhaps there are some thoughts on why  
this may or may not be a good idea and any technical hurdles that  
might be encountered.

My hunch is this should be possible, and fairly cleanly so.  But I  
hesitate to exaggerate the ease of such a thing without asking first.

Many thanks,
        Erik


Re: embedding solr

Yonik Seeley-2
On 4/2/07, Erik Hatcher <[hidden email]> wrote:
> I have a client need to embed Solr behind an already built custom TCP/
> IP interface (currently for Lucene, but want to swap in Solr to
> benefit from its additional goodness of course).   Have folks already
> done this?   Experiences?   Or perhaps there are some thoughts on why
> this may or may not be a good idea and any technical hurdles that
> might be encountered.
>
> My hunch is this should be possible, and fairly cleanly so.  But I
> hesitate to exaggerate the ease of such a thing without asking first.

It should be relatively easy, especially since SolrIndexSearcher is an
IndexSearcher, and one can get an IndexReader from there.  So whatever
complex stuff may have been done on the query side, it should be
relatively easy to move it to a custom request handler that opens a
socket to handle raw TCP requests.
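
A minimal sketch of the socket side of that idea, assuming a protocol of
one request per line and one response line per request.  This is plain
java.net code, not a Solr request handler: the class name
LineProtocolBridge and the backend function are hypothetical, and the
backend is only a stand-in for wherever the call into Solr would go.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical bridge between a line-oriented TCP protocol and a search backend.
final class LineProtocolBridge {
    // Stand-in for the embedded search call: request line in, response line out.
    private final Function<String, String> backend;

    LineProtocolBridge(Function<String, String> backend) {
        this.backend = backend;
    }

    // Reads requests line by line until end of stream and writes one
    // response line for each request.
    void handle(Reader in, Writer out) {
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                out.write(backend.apply(line));
                out.write('\n');
                out.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Accepts a single connection and serves it until the client disconnects.
    void serveOne(ServerSocket server) throws IOException {
        try (Socket socket = server.accept()) {
            handle(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8),
                new BufferedWriter(
                    new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)));
        }
    }
}
```

Keeping handle() independent of the socket means the protocol logic can
be exercised with in-memory readers and writers before any port is
opened.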

Any tricky custom code on the update side might be harder to port to
Solr, though.

Of course, one should also evaluate the cost of migrating the clients
compared to developing a compatibility handler in Solr...

-Yonik

Re: embedding solr

Ryan McKinley
In reply to this post by Erik Hatcher
I have embedded Solr, skipping the HTTP transport altogether.  It was
remarkably easy to link directly to request handlers, skipping the
dispatch filter and using the DocList and associated data in the
SolrQueryResponse directly.

Assuming the existing TCP/IP interface is sending strings around,
filling up the appropriate queries should be no problem.
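
As a rough illustration of turning such a string into parameters, here
is a sketch that assumes the existing interface sends URL-style
"key=value&key=value" lines.  That wire format and the class name
RequestLineParser are assumptions made for the example, not something
described in this thread; the resulting map is the kind of thing that
could then be used to fill up the request parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for one request line of the assumed wire format.
final class RequestLineParser {
    // Splits "k1=v1&k2=v2" into an ordered map, URL-decoding keys and values.
    // A key without '=' maps to the empty string; a repeated key keeps its last value.
    static Map<String, String> parse(String line) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : line.trim().split("&")) {
            if (pair.isEmpty()) {
                continue; // skip empty segments such as those from "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, RequestLineParser.parse("q=title%3Asolr&rows=10") yields a
map with q mapped to "title:solr" and rows mapped to "10".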

The only issue I have had is trying to deal with Objects as much as
possible and getting the FieldType to marshal the fields into and out
of Lucene (SOLR-193).  One goal with SOLR-20 is to get an interface
that works the same with or without HTTP -- I know that's not what it
was intended for, but Solr makes Lucene so much more manageable even
without a server!




Re: embedding solr

Erik Hatcher
Yonik and Ryan,

Thank you for the quick and helpful responses.  I'll have to do some  
hacking from here on out and see where I get to, but I'm happy to  
know I'm in good company and that what I'm attempting is on a path  
already slightly worn.  :)

        Erik



Re: embedding solr

Daniel Einspanjer
In reply to this post by Ryan McKinley
Ryan,

Do you have any of this code you could share?  I am currently using
Solr to perform thousands of queries in a batch, and eliminating the
HTTP overhead is something I'd love to do if it isn't complicated.  We
need several of the extra features Solr provides, which is why we are
trying to use it instead of Lucene directly.

On 4/2/07, Ryan McKinley <[hidden email]> wrote:
> I have embedded solr skipping HTTP transport altogether.  It was
> remarkably easy to link directly to request handlers skipping the
> dispatch filter and using the DocList and associated data in the
> SolrQueryResponse directly.

Re: embedding solr

Ryan McKinley
There is nothing particularly magic to it.  It just fills up
SolrParams directly (see any of the tests), calls the requestHandler,
and then walks through the Documents.  Something like:


  // sreq is the SolrQueryRequest carrying the SolrParams; it has to be
  // built before this point (see any of the tests).
  SolrRequestHandler handler = core.getRequestHandler( "" ); // gets the standard one

  SolrQueryResponse rsp = new SolrQueryResponse();
  core.execute( handler, sreq, rsp );

  // Walk the matching documents, loading each one by its internal doc id.
  IndexReader reader = sreq.getSearcher().getReader();
  DocListAndSet response = (DocListAndSet)rsp.getValues().get( "response" );
  DocIterator iter = response.docList.iterator();
  while( iter.hasNext() ) {
    Document doc = reader.document( iter.next() );
    // ...
  }

ryan



Re: embedding solr

Daniel Einspanjer
That is good to hear. I guess I was overly worried when I saw your
mention of having trouble getting the field values with the correct
types.  I will be taking a look at this later this week.

Thank you very much for your prompt response.

On 4/10/07, Ryan McKinley <[hidden email]> wrote:
> There is nothing particularly magic to it.

Re: embedding solr

Chris Hostetter-3
In reply to this post by Ryan McKinley
:   core.execute( handler, sreq, rsp );
:
:   IndexReader reader = sreq.getSearcher().getReader();
:   DocListAndSet response = (DocListAndSet)rsp.getValues().get( "response" );
:   DocIterator iter = response.docList.iterator();
:   while( iter.hasNext() ) {
:     Document doc = reader.document( iter.next() );
:     // ...
:   }

I can imagine this might be a little easier to deal with if there was a
"no-op" QueryResponseWriter that just did this for you ... adding the
documents to a new Map in the SolrQueryRequest.getContext() so you can get
to them perhaps?




-Hoss


Re: embedding solr

Devangini
In reply to this post by Ryan McKinley
How are the SolrParams filled up directly? If I am not mistaken, shouldn't it be a SolrQueryRequest rather than SolrParams?