continuously creating index packages for katta with solr


Thomas Koch
Hi,

I'd like to use SOLR to create indices for deployment with katta. The plan is to
install a SOLR server on each crawler; the crawling script then sends the
content directly to the local SOLR server. Every 5-10 minutes I'd like to take
the current SOLR index, add it to katta, and let SOLR start again with an empty
index.
Does anybody have an idea how this could be achieved?
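The "crawler sends content to the local SOLR server" step could be sketched roughly as follows. This is only an illustration; the field names, document shape, and the localhost update URL are assumptions, not taken from an actual setup:

```python
import urllib.request

def docs_to_update_xml(docs):
    """Render a list of dicts as a Solr XML update message (<add><doc>...)."""
    fields = lambda d: "".join(
        '<field name="%s">%s</field>' % (k, v) for k, v in d.items())
    return "<add>%s</add>" % "".join(
        "<doc>%s</doc>" % fields(d) for d in docs)

def post_to_local_solr(xml, url="http://localhost:8983/solr/update"):
    # Send the update to the co-located Solr instance on the crawler machine.
    # The URL assumes a default single-core Solr install.
    req = urllib.request.Request(url, data=xml.encode("utf-8"),
                                 headers={"Content-Type": "text/xml"})
    return urllib.request.urlopen(req)
```

For example, `docs_to_update_xml([{"id": "1", "text": "hello"}])` produces the XML an update handler expects; a real crawler would also need escaping and error handling.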

Thanks a lot,

Thomas Koch, http://www.koch.ro
"Overwriting" cores with the same core name

Thomas Koch
Hi,

I'm currently evaluating the following solution: my crawler sends all docs to
a SOLR core named "WHATEVER". Every 5 minutes a new SOLR core with the same
name WHATEVER is created, but with a new datadir; the datadir contains a
timestamp in its name.
Now I can check for datadirs that are older than the newest one, and all of
these can be picked up for submission to katta.
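A rough sketch of the rotation logic described above; the base path, core name, and the CoreAdmin URL in the comment are assumptions and would need adapting:

```python
import os
import time

def new_datadir(base="/var/solr/data", ts=None):
    # Timestamped data directory for the next incarnation of the core,
    # e.g. /var/solr/data/WHATEVER-20240101-120000 (path is hypothetical).
    ts = ts or time.strftime("%Y%m%d-%H%M%S")
    return os.path.join(base, "WHATEVER-%s" % ts)

def stale_datadirs(dirs):
    # Everything except the lexicographically newest datadir is no longer
    # written to (assuming the timestamp naming above) and can go to katta.
    return sorted(dirs)[:-1]

# Re-creating the core under the same name with a fresh dataDir would be a
# CoreAdmin CREATE request, roughly (host/port assumed):
#   http://localhost:8983/solr/admin/cores?action=CREATE&name=WHATEVER&dataDir=<new_datadir()>
```

The fixed-width timestamp format makes plain string sorting equivalent to chronological sorting, which is what `stale_datadirs` relies on.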

Now there remain two questions:

- When the old core is closed, will there be an implicit commit?
- How can I be sure that no more work is in progress on an old core's datadir?

Thanks,

Thomas Koch, http://www.koch.ro