Sending Tika parse result to Solr

Daniel Knapp
Hello,


I want to send the Tika parse results of my data to my Solr server.
My file server is not my Solr server, so Solr Cell is not an option for me.

In Lucene I can pass my Reader object (the result of the parsing) to a Lucene Document for indexing.

Is this also possible with Solr? Or is there another or better way to do this?
I'm using SolrJ for the connection.


Regards,
Daniel


Re: Sending Tika parse result to Solr

Grant Ingersoll-2

On Nov 25, 2009, at 5:32 AM, Daniel Knapp wrote:

> Hello,
>
>
> I want to send the Tika parse results of my data to my Solr server.
> My file server is not my Solr server, so Solr Cell is not an option for me.
>
> In Lucene I can pass my Reader object (the result of the parsing) to a Lucene Document for indexing.
>
> Is this also possible with Solr? Or is there another or better way to do this?
> I'm using SolrJ for the connection.

You can't pass your Reader object, but I have opened https://issues.apache.org/jira/browse/SOLR-1526 to provide a SolrJ client-side equivalent of Solr Cell. If you'd like to contribute a patch, that would be great. Basically, you just need to have your Tika ContentHandler implementation create SolrInputDocuments (in batches, that is) and then send them to Solr. Using the streaming server (StreamingUpdateSolrServer in SolrJ) may also fit well with this model.
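
To make the idea concrete, here is a minimal sketch of the client-side pattern, written against the JDK only so it runs without Tika or SolrJ on the classpath. The names TextCollectingHandler and DocumentBatcher, the field names "id" and "text", and the use of a plain Map in place of SolrInputDocument are all assumptions made for illustration; none of them come from SOLR-1526. Tika parsers report extracted text as SAX events, so a SAX handler like this one could be handed to a Tika parser on the file server.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: accumulates the character data a parser emits as SAX events. */
class TextCollectingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    /** Returns the extracted text with leading and trailing whitespace removed. */
    String text() {
        return text.toString().trim();
    }
}

/**
 * Hypothetical batcher: buffers documents and hands them to a sink in groups.
 * Each Map stands in for a SolrInputDocument (assumption, to stay JDK-only).
 */
class DocumentBatcher {
    private final int batchSize;
    private final Consumer<List<Map<String, Object>>> sink;
    private List<Map<String, Object>> pending = new ArrayList<>();

    DocumentBatcher(int batchSize, Consumer<List<Map<String, Object>>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Queues one parsed file; sends the batch once it is full. */
    void add(String id, String text) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);     // assumed field name
        doc.put("text", text); // assumed field name
        pending.add(doc);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is queued; does nothing if the queue is empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<Map<String, Object>> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch);
    }
}
```

In a real client, the sink would presumably copy each map into a SolrInputDocument via addField, pass the collection to the SolrJ server's add method, and commit once the last batch has been flushed. The field names would have to match the Solr schema in use.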



--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search


Re: Re: Sending Tika parse result to Solr

Daniel Knapp
> > Hello,
> >
> > I want to send the Tika parse results of my data to my Solr server.
> > My file server is not my Solr server, so Solr Cell is not an option for me.
> >
> > In Lucene I can pass my Reader object (the result of the parsing) to a Lucene Document for indexing.
> >
> > Is this also possible with Solr? Or is there another or better way to do this?
> > I'm using SolrJ for the connection.
>
> You can't pass your reader object, but I have opened https://issues.apache.org/jira/browse/SOLR-1526 to provide a SolrJ client side equivalent of Solr Cell. If you'd like to contribute a patch that would be great. Basically, you just need to have your Handler override create a SolrInputDocument (batches, that is) and then send them to Solr.

Is there any documentation on how to do that? I'm new to Solr and don't exactly understand what you mean by that. Some more detailed information would be great.

Thank you in advance!

> Using the Streaming server may also fit well with this model.


