[jira] [Commented] (LUCENE-8950) FieldComparators Should Not Maintain Implicit PQs


Shalin Shekhar Mangar (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906978#comment-16906978 ]

Atri Sharma commented on LUCENE-8950:

I confess I do not have a very clean idea of how to implement this: typical usages of FieldComparator require the user to maintain a list of slots into the FieldComparator, and that list can implicitly grow as large as the queue itself. FieldComparator also provides a convenient API for comparing two values of the type maintained in the queue, which could serve as the starting point here.
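To make the concern concrete, here is a minimal sketch of a slot-based comparator. It is not Lucene's FieldComparator: the class name SlotBasedLongComparator and its methods are hypothetical and only mirror the general shape of slot-based versus value-based comparison. The point is that the comparator allocates one value per requested hit, so its backing array is as large as the queue the collector already maintains.

```java
// Hypothetical, simplified stand-in for a slot-based numeric comparator.
// This is NOT Lucene's FieldComparator; all names are illustrative only.
final class SlotBasedLongComparator {
    // One entry per requested hit: this array shadows the collector's queue.
    private final long[] values;

    SlotBasedLongComparator(int numHits) {
        this.values = new long[numHits];
    }

    // Store the value of a competitive document in the given slot.
    void copy(int slot, long docValue) {
        values[slot] = docValue;
    }

    // Compare two previously stored slots; requires the per-slot array.
    int compare(int slot1, int slot2) {
        return Long.compare(values[slot1], values[slot2]);
    }

    // Compare two caller-supplied values; requires no per-slot state.
    int compareValues(long first, long second) {
        return Long.compare(first, second);
    }

    // Number of values the comparator itself retains.
    int retainedValues() {
        return values.length;
    }
}
```

With numHits = 1000, retainedValues() is 1000 for each sort field, on top of whatever the caller's own queue holds. The value-based compareValues method, by contrast, needs none of that state.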


Here is a first cut of the proposal I have in mind:

1) Deprecate compare(slot, slot) so that new implementations do not depend on this method, but rather use compare(T val, T val).

2) Starting with some comparators (numeric comparators?), get rid of the implicit priority queue and make the user maintain those values.

3) Make numeric comparators track only the top and bottom values, as needed.


Note that I am treating numeric comparators as the starting point/example, but the approach should extend to other comparators as well.
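The three steps above could look roughly like the following sketch for an ascending sort on long values. Everything here is an assumption made for illustration: BarrierLongComparator and TopValuesCollector are hypothetical names, not Lucene classes, and the top value is assumed to act as a search-after style lower barrier. The comparator retains only the two barrier values, while the caller owns the queue of values.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical comparator that retains only the two barrier values.
final class BarrierLongComparator {
    private long bottom;        // weakest value currently in the caller's queue
    private boolean hasBottom;
    private long top;           // assumed search-after style lower barrier
    private boolean hasTop;

    void setBottom(long value) { bottom = value; hasBottom = true; }
    void setTop(long value) { top = value; hasTop = true; }

    // Value-based comparison; no slots involved.
    int compareValues(long first, long second) {
        return Long.compare(first, second);
    }

    // Ascending sort: competitive means after the top barrier (if set)
    // and strictly before the bottom barrier (if set).
    boolean isCompetitive(long candidate) {
        if (hasTop && compareValues(candidate, top) <= 0) return false;
        return !hasBottom || compareValues(candidate, bottom) < 0;
    }
}

// Caller-side collector that owns the queue of values.
final class TopValuesCollector {
    private final int numHits;
    private final BarrierLongComparator comparator = new BarrierLongComparator();
    // Max-heap, so the weakest (largest) retained value sits at the head.
    private final PriorityQueue<Long> queue =
        new PriorityQueue<>(Comparator.reverseOrder());

    TopValuesCollector(int numHits) {
        if (numHits < 1) throw new IllegalArgumentException("numHits must be >= 1");
        this.numHits = numHits;
    }

    void collect(long value) {
        if (queue.size() < numHits) {
            queue.add(value);
            // Once the queue is full, publish the bottom barrier.
            if (queue.size() == numHits) comparator.setBottom(queue.peek());
        } else if (comparator.isCompetitive(value)) {
            queue.poll();                        // evict the weakest value
            queue.add(value);
            comparator.setBottom(queue.peek());  // refresh the barrier
        }
    }

    // Retained values in ascending order.
    long[] sortedValues() {
        return queue.stream().mapToLong(Long::longValue).sorted().toArray();
    }
}
```

In this arrangement the comparator's memory use is constant regardless of the number of hits requested. The trade-off is that the caller must keep the bottom barrier in sync after every queue change.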


With [https://github.com/apache/lucene-solr/pull/831], getting values out of leaf comparators should be easy, so the logical step after this PR is to depend on compare(val, val) more than we rely on compare(slot, slot).


Happy to receive feedback and alternative proposals.

> FieldComparators Should Not Maintain Implicit PQs
> -------------------------------------------------
>                 Key: LUCENE-8950
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8950
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Atri Sharma
>            Priority: Major
> While doing some perf tests, I realised that FieldComparators inherently maintain implicit priority queues to keep documents in the given sort order. This is wasteful, especially with a multi-feature sort order and a large number of hits requested.
> We should change this to have FieldComparators maintain only the top and bottom values, and use them as barriers to compare against.
