[jira] [Commented] (LUCENE-4752) Merge segments to sort them

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604716#comment-13604716 ]

Shai Erera commented on LUCENE-4752:

Patch looks awesome! A few comments:

* LuceneTestCase: looks like the changes are that you ported the assertXYZ methods from TestDuelingCodecs, so I didn't review them. But, since LTC is quite big, perhaps we can move these methods to a util, e.g. CompareIndexes? I remember that when I wrote the sorting tests, I was looking for such methods!

* Can we make OneMerge.readers private and add OneMerge.add(AtomicReader) for IW to use? It looks odd that IW manipulates OneMerge.readers directly, but then calls OneMerge.getMergeReaders(). If you do that, then put a comment on readers explaining why it's private, so that we don't forget :).

* Can we remove SegmentMerger.add() methods in favor of a single merge(List<AtomicReader>)? Don't go overboard with it, only if it's trivial (as it's not directly related to this issue).
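
To make the last two suggestions concrete, here is a rough sketch of the shape I have in mind. It is not the actual Lucene code: AtomicReader, OneMerge and SegmentMerger below are simplified, hypothetical stand-ins that only borrow the names, and the body of merge() (summing maxDoc()) is a placeholder for the real merging work.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for Lucene's AtomicReader; only the name is borrowed.
interface AtomicReader {
    int maxDoc();
}

// Simplified stand-in for OneMerge, showing only the encapsulation idea.
class OneMerge {
    // Private on purpose: callers such as IndexWriter must go through add()
    // and getMergeReaders(), so nobody mutates the list directly.
    private final List<AtomicReader> readers = new ArrayList<>();

    // The single way for a caller to register a reader for this merge.
    void add(AtomicReader reader) {
        if (reader == null) {
            throw new NullPointerException("reader must not be null");
        }
        readers.add(reader);
    }

    // Read-only view of the readers to merge; a subclass could override this
    // (for example, to return wrapped readers) without exposing the field.
    List<AtomicReader> getMergeReaders() {
        return Collections.unmodifiableList(readers);
    }
}

// Simplified stand-in for SegmentMerger with one entry point instead of
// several add() overloads.
class SegmentMerger {
    // Placeholder logic: returns the total document count of all readers.
    static int merge(List<AtomicReader> readers) {
        int totalDocs = 0;
        for (AtomicReader reader : readers) {
            totalDocs += reader.maxDoc();
        }
        return totalDocs;
    }
}
```

With this shape, IW would only ever call add() and getMergeReaders(), and hand the resulting list to merge() in one call.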

> Merge segments to sort them
> ---------------------------
>                 Key: LUCENE-4752
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4752
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: David Smiley
>            Assignee: Adrien Grand
>         Attachments: LUCENE-4752.patch, LUCENE-4752.patch, LUCENE-4752.patch
> It would be awesome if Lucene could write the documents out in a segment based on a configurable order.  This of course applies to merging segments too. The benefit is increased locality on disk of documents that are likely to be accessed together.  This often applies to documents near each other in time, but also spatially.
