[jira] [Commented] (LUCENE-7792) Add optional concurrency to OfflineSorter

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-7792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978479#comment-15978479 ]

Michael McCandless commented on LUCENE-7792:

Thanks [~dawid.weiss]; I'll try your impl. above.

> Add optional concurrency to OfflineSorter
> -----------------------------------------
>                 Key: LUCENE-7792
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7792
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: master (7.0), 6.6
>         Attachments: LUCENE-7792.patch
> OfflineSorter is a heavy operation, but at heart it is an embarrassingly concurrent problem: if you have enough hardware concurrency (e.g. fast SSDs, multiple CPU cores), running it concurrently can be a big speedup.
> E.g., after reading a partition from the input, one thread can sort and write it, while another thread reads the next partition, etc.  Merging partitions can also be done in the background.  Some things still cannot be concurrent, e.g. the initial read from the input must be a single thread, as well as the final merge and writing to the final output.
> I think I found a fairly non-invasive way to add optional concurrency to this class, by adding an optional ExecutorService to OfflineSorter's ctor (similar to IndexSearcher) and using futures to represent each partition as we sort, and creating Callable classes for sorting and merging partitions.
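
For readers following the thread, the design described in the issue (one reading thread, an optional ExecutorService, one future per partition, a Callable for the sort step, and a single-threaded final merge) might be sketched roughly as below. This is not the attached patch and not OfflineSorter's real API: the names ConcurrentSortSketch, SortPartitionTask and Cursor are hypothetical, partitions are kept in memory rather than spilled to temporary files, and the background merging of intermediate partitions is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch only; NOT Lucene's OfflineSorter. Partitions stay in
// memory here, whereas an offline sorter writes them to temporary files.
public class ConcurrentSortSketch {

  /** Sorts one partition; stands in for "sort and write a partition". */
  static final class SortPartitionTask implements Callable<List<String>> {
    private final List<String> partition;

    SortPartitionTask(List<String> partition) {
      this.partition = partition;
    }

    @Override
    public List<String> call() {
      List<String> sorted = new ArrayList<>(partition); // private copy
      Collections.sort(sorted);
      return sorted;
    }
  }

  /** Read position inside one sorted partition, used by the final merge. */
  static final class Cursor {
    final List<String> items;
    int pos;

    Cursor(List<String> items) {
      this.items = items;
    }

    String head() {
      return items.get(pos);
    }
  }

  public static List<String> sort(List<String> input, int partitionSize, ExecutorService exec)
      throws InterruptedException, ExecutionException {
    if (partitionSize <= 0) {
      throw new IllegalArgumentException("partitionSize must be positive");
    }

    // Single-threaded "read": cut the input into partitions and hand each one
    // to the executor, so sorting overlaps with reading the next partition.
    List<Future<List<String>>> futures = new ArrayList<>();
    for (int start = 0; start < input.size(); start += partitionSize) {
      int end = Math.min(start + partitionSize, input.size());
      futures.add(exec.submit(new SortPartitionTask(input.subList(start, end))));
    }

    // Final merge on the calling thread: a k-way merge over the sorted
    // partitions, ordered by each partition's current head element.
    PriorityQueue<Cursor> queue =
        new PriorityQueue<>((a, b) -> a.head().compareTo(b.head()));
    for (Future<List<String>> future : futures) {
      List<String> sorted = future.get(); // blocks until that partition is sorted
      if (!sorted.isEmpty()) {
        queue.add(new Cursor(sorted));
      }
    }

    List<String> out = new ArrayList<>(input.size());
    while (!queue.isEmpty()) {
      Cursor cursor = queue.poll();
      out.add(cursor.head());
      cursor.pos++;
      if (cursor.pos < cursor.items.size()) {
        queue.add(cursor); // re-insert with its new head
      }
    }
    return out;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService exec = Executors.newFixedThreadPool(2);
    try {
      System.out.println(sort(Arrays.asList("pear", "apple", "fig", "kiwi", "date"), 2, exec));
    } finally {
      exec.shutdown();
    }
  }
}
```

Passing the executor in from the caller, rather than creating it inside the sorter, mirrors the "optional ExecutorService in the ctor" idea from the description: the caller decides how much concurrency to spend and owns the pool's lifecycle.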

This message was sent by Atlassian JIRA
