Highlighting and passage sizing backwards-compatibility

I want to draw some attention to a change coming in LUCENE-9093 relating to the UnifiedHighlighter and how it sizes Passages. I'll link to the pertinent summary comment:

The contributor and I are very happy with the improvements, and we think they are good for basically everyone. There is a new configuration option that can be set to approximate the previous behavior, but the result is not identical. Consequently, if someone wrote highlighting tests in their app that assert final passages, and, let's say, configured the sizing alignment to be closer to the current behavior, the passages will still differ some of the time. Perhaps 5% of the time, as a very rough guess? If the new "0.5" default is used, then the figure is probably much higher, at ~30% (another rough guess). Nonetheless, we made these changes because we think the results are better. So if it breaks someone's tests, well, they can and should update them, because the fragments will be sized better. For users who demand the utmost control, it remains possible to supply a BreakIterator impl of their choosing and avoid LengthGoalBreakIterator.
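To make the alignment idea concrete, here is a rough, self-contained sketch of how an alignment fraction can shift a passage around a match. It is not the LengthGoalBreakIterator algorithm from LUCENE-9093. It uses only java.text.BreakIterator on word boundaries, and the class name, the passage method, and all of its parameters are hypothetical names made up for illustration. It assumes 0.0 means "match at the left edge of the passage" and 0.5 means "match roughly centered".

```java
import java.text.BreakIterator;
import java.util.Locale;

// Illustration only: NOT Lucene's implementation, just the general idea.
class PassageAlignmentSketch {

    // Builds a passage of roughly targetLength chars around [matchStart, matchEnd).
    // alignment is the fraction of the spare length placed before the match:
    // 0.0 puts the match at the left edge, 0.5 centers it, 1.0 puts it at the right edge.
    static String passage(String text, int matchStart, int matchEnd,
                          int targetLength, float alignment) {
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        words.setText(text);

        // Spare characters to distribute around the match.
        int slack = Math.max(0, targetLength - (matchEnd - matchStart));
        int before = Math.round(slack * alignment);

        int idealStart = Math.max(0, matchStart - before);
        int idealEnd = Math.min(text.length(), matchEnd + (slack - before));

        // Snap outward to the nearest word boundary so no word is cut in half.
        int start = words.isBoundary(idealStart) ? idealStart : words.preceding(idealStart);
        int end = words.isBoundary(idealEnd) ? idealEnd : words.following(idealEnd);

        return text.substring(start, end).trim();
    }

    public static void main(String[] args) {
        String text = "alpha beta gamma delta epsilon zeta eta theta";
        int matchStart = text.indexOf("delta");
        int matchEnd = matchStart + "delta".length();

        System.out.println(passage(text, matchStart, matchEnd, 17, 0.0f)); // delta epsilon zeta
        System.out.println(passage(text, matchStart, matchEnd, 17, 0.5f)); // gamma delta epsilon
        System.out.println(passage(text, matchStart, matchEnd, 17, 1.0f)); // beta gamma delta
    }
}
```

The same match produces three different passages depending only on the alignment value, which is why a test that asserts exact passage text can start failing when the alignment default changes.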

Note: The UnifiedHighlighter is labelled @lucene.experimental, so its API and behavior are allowed to change between releases.

Are others cool with this? If we *had* to retain the old behavior, we could in 8.x choose the pivot point based on the left edge of the first match, as it did before. That would still leave most of the good change here, but some of the finesse would require users to wait until 9.

~ David Smiley
Apache Lucene/Solr Search Developer