[jira] [Commented] (LUCENE-7844) UnifiedHighlighter: simplify "maxPassages" input API

Hudson (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024934#comment-16024934 ]

David Smiley commented on LUCENE-7844:

> For example, a user may want to highlight a title fully (one passage) ...

For that case, the user _should_ be using WholeBreakIterator for that field, and thus they already need to subclass.
Does that make you feel any better?  If not, I'm not sure where this all leaves us right now.

I do like a FieldOptions (per-field options object) design over subclassing; again -- longer term.  I could imagine something like this:

  unifiedHighlighter.highlight(query, topDocs,
      unifiedHighlighter.fieldOptions("body", 3));

Indeed, WholeBreakIterator almost suggests a different FieldHighlighter that is simpler (no BI, Scorer)... yet the outcome would be a bunch more code for a likely immeasurable performance win, and it's all internal code, so the user's perceived complexity doesn't change.
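
To make the FieldOptions idea a little more concrete, here is a rough, self-contained sketch. Nothing in it is Lucene API: FieldOptions, FieldOptionsSketch, fieldOptions(...) and passageLimits(...) are hypothetical names, and passageLimits(...) only stands in for the part of a highlight(...) call that would read the per-field settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-field options object (an assumption, not part of Lucene).
// It carries only the two values used in the snippet above.
final class FieldOptions {
    final String field;
    final int maxPassages;

    FieldOptions(String field, int maxPassages) {
        if (maxPassages < 1) {
            throw new IllegalArgumentException("maxPassages must be >= 1");
        }
        this.field = field;
        this.maxPassages = maxPassages;
    }
}

// Hypothetical holder for the factory method and a stand-in for highlight(...).
final class FieldOptionsSketch {
    // Mirrors the imagined unifiedHighlighter.fieldOptions("body", 3) call.
    static FieldOptions fieldOptions(String field, int maxPassages) {
        return new FieldOptions(field, maxPassages);
    }

    // Collects the passage limit for each field, keyed by field name.
    // A real highlighter would use these while highlighting; this sketch
    // only shows that no parallel arrays are needed.
    static Map<String, Integer> passageLimits(FieldOptions... options) {
        Map<String, Integer> limits = new LinkedHashMap<>();
        for (FieldOptions option : options) {
            limits.put(option.field, option.maxPassages);
        }
        return limits;
    }
}
```

With this shape, each field's settings travel together in one object, so further per-field settings (for example a break iterator choice) could be added to FieldOptions without changing the method signature.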

> UnifiedHighlighter: simplify "maxPassages" input API
> ----------------------------------------------------
>                 Key: LUCENE-7844
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7844
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: David Smiley
>            Priority: Minor
>             Fix For: master (7.0)
>         Attachments: LUCENE_7844__UH_maxPassages_simplification.patch
> The "maxPassages" input to the UnifiedHighlighter can be provided as an array to some of the public methods on UnifiedHighlighter.  When it's provided as an array, each index in the array corresponds to the field at the same index in a parallel array. I think this is awkward, and furthermore it's inconsistent with the way this highlighter customizes things on a per-field basis.  Instead, the parameter can be a simple int default (not an array), and then there can be a protected method like {{getMaxPassageCount(String field)}} that returns an Integer which, when non-null, replaces the default value for this field.
> Aside from API simplicity and consistency, this will also remove some annoying parallel array sorting going on.
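
The override pattern described in the issue can be sketched roughly as follows. This is a standalone illustration, not the attached patch: SketchHighlighter, TitleAwareHighlighter and resolveMaxPassages(...) are hypothetical names, and only the getMaxPassageCount(String field) hook with its "null means use the default" contract is taken from the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the highlighter (an assumption, not Lucene's class).
class SketchHighlighter {
    // Per-field hook: return null to accept the caller's default,
    // or a non-null Integer to replace it for this field.
    protected Integer getMaxPassageCount(String field) {
        return null;
    }

    // Resolves the effective limit for each field from a single int default,
    // so the caller no longer passes an array parallel to the field names.
    Map<String, Integer> resolveMaxPassages(String[] fields, int defaultMaxPassages) {
        Map<String, Integer> resolved = new LinkedHashMap<>();
        for (String field : fields) {
            Integer override = getMaxPassageCount(field);
            resolved.put(field, override != null ? override : defaultMaxPassages);
        }
        return resolved;
    }
}

// Example subclass: highlight "title" as a single passage, default elsewhere.
class TitleAwareHighlighter extends SketchHighlighter {
    @Override
    protected Integer getMaxPassageCount(String field) {
        return "title".equals(field) ? Integer.valueOf(1) : null;
    }
}
```

Calling new TitleAwareHighlighter().resolveMaxPassages(new String[] {"title", "body"}, 3) resolves "title" to 1 and "body" to 3, with no array sorting involved.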

This message was sent by Atlassian JIRA
