Splitting indexes

Upayavira
I've seen a few situations recently where the ability to 'reshard'
distributed Solr setups would be very useful.

Much of what is needed is already there (or relatively easy to
implement). The crucial bit is effective splitting of a Lucene index.

I've seen (and used) the MultiPassIndexSplitter in contrib, but have
seen suggestions here [1] that there might be a more efficient, lower
level way to achieve it - more akin to the process of merging two
indexes.

So, my question is, what would be involved in coding a single pass
IndexSplitter?

I don't expect it to be easy, but I'd really like to know (a) whether it
is possible and (b) approximately what you'd need to do.


Thanks in advance!

Upayavira

[1] http://www.slideshare.net/abial/eurocon2010


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Splitting indexes

Andrzej Białecki
On 2010-09-30 17:03, Upayavira wrote:

> I've seen a few situations recently where the ability to 'reshard'
> distributed solr setups would be very useful.
>
> Much of what is needed is already there (or relatively easy to
> implement). The crucial bit is effective splitting of a Lucene index.
>
> I've seen (and used) the MultiPassIndexSplitter in contrib, but have
> seen suggestions here [1] that there might be a more efficient, lower
> level way to achieve it - more akin to the process of merging two
> indexes.
>
> So, my question is, what would be involved in coding a single pass
> IndexSplitter?

Some free time of an interested and capable developer. :) Unfortunately
this is a scarce resource...

>
> I don't expect it to be easy, but I'd really like to know (a) whether it
> is possible and (b) approximately what you'd need to do.

a) yes
b) implement a class that writes out field and postings data, following
the same logic as the code already present in
org.apache.lucene.index.SegmentMerger.
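
To make (b) a little more concrete, here is a rough sketch of the idea on a
toy in-memory model, not on real Lucene structures. Everything in it is an
assumption made for illustration: the "index" is just a sorted map from term
to ascending doc ids, the class name SinglePassSplitSketch and the method
split are hypothetical, and round-robin routing (doc id modulo the number of
parts) is only one possible policy. It does not use any Lucene API and it
ignores stored fields, norms, term vectors and deletions, all of which a real
splitter would have to write out as well. The point is only that the term
dictionary and postings are walked once, and each posting is sent to its
target part with a remapped doc id.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy model only (not Lucene API): an index is a sorted map from
 * term to an ascending list of doc ids.
 */
final class SinglePassSplitSketch {

    /**
     * Splits the toy index into `parts` smaller indexes while reading
     * every postings list exactly once.
     */
    static List<Map<String, List<Integer>>> split(
            Map<String, List<Integer>> input, int maxDoc, int parts) {

        // Step 1: decide up front where each document goes and what its
        // doc id will be inside the target part. Routing here is
        // round-robin (doc % parts), which is an assumed policy.
        int[] newId = new int[maxDoc];
        int[] nextFree = new int[parts];
        for (int doc = 0; doc < maxDoc; doc++) {
            newId[doc] = nextFree[doc % parts]++;
        }

        // Step 2: open one output per part.
        List<Map<String, List<Integer>>> out = new ArrayList<>();
        for (int p = 0; p < parts; p++) {
            out.add(new TreeMap<>());
        }

        // Step 3: a single pass over terms and their postings. Each
        // posting is appended to the part that owns its document, with
        // the doc id rewritten. Ids stay ascending within each part
        // because newId grows with doc for a fixed part.
        for (Map.Entry<String, List<Integer>> e : input.entrySet()) {
            for (int doc : e.getValue()) {
                int p = doc % parts;
                out.get(p)
                   .computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                   .add(newId[doc]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> index = new TreeMap<>();
        index.put("lucene", List.of(0, 1, 3));
        index.put("solr", List.of(1, 2));
        index.put("split", List.of(0, 2, 3));

        List<Map<String, List<Integer>>> result = split(index, 4, 2);
        System.out.println(result.get(0)); // {lucene=[0], solr=[1], split=[0, 1]}
        System.out.println(result.get(1)); // {lucene=[0, 1], solr=[0], split=[1]}
    }
}
```

A multi-pass approach would instead read the whole input once per output
part; the sketch above touches each posting once regardless of how many
parts are produced.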

--
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

