hl.preserveMulti in Unified highlighter?

Walter Underwood
It looks like hl.preserveMulti is only implemented in the Original highlighter. Has anyone looked at doing this for the Unified highlighter?

We need to preserve order in the highlights for a multi-valued field.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)


Re: hl.preserveMulti in Unified highlighter?

Walter Underwood
In testing, hl.preserveMulti=true works with the unified highlighter. But the documentation says that the parameter is only implemented in the original highlighter.

Is the documentation wrong? Can we trust this to keep working with unified?

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Mar 26, 2019, at 12:08 PM, Walter Underwood <[hidden email]> wrote:
>
> It looks like hl.preserveMulti is only implemented in the Original highlighter. Has anyone looked at doing this for the Unified highlighter?
>
> We need to preserve order in the highlights for a multi-valued field.
>
> wunder
> Walter Underwood
> [hidden email] <mailto:[hidden email]>
> http://observer.wunderwood.org/  (my blog)
>


Re: hl.preserveMulti in Unified highlighter?

Walter Underwood
We are testing 6.6.1.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On Mar 29, 2019, at 11:02 AM, Walter Underwood <[hidden email]> wrote:
>
> In testing, hl.preserveMulti=true works with the unified highlighter. But the documentation says that the parameter is only implemented in the original highlighter.
>
> Is the documentation wrong? Can we trust this to keep working with unified?
>
> wunder
> Walter Underwood
> [hidden email]
> http://observer.wunderwood.org/  (my blog)
>

Re: hl.preserveMulti in Unified highlighter?

david.w.smiley@gmail.com
Hi Walter,

No, the UnifiedHighlighter does not behave as if this setting were true.

The docs say:

`hl.preserveMulti`::
If `true`, multi-valued fields will return all values in the order they
were saved in the index. If `false`, the default, only values that match
the highlight request will be returned.


The first sentence there is the essence of it.  Notice it's not conditional
on whether there are highlights or not.  The UH won't return values lacking
a highlight. Even hl.defaultSummary isn't triggered because *some* of the
values have a highlight.
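
To make the documented semantics concrete, here is a minimal plain-Java sketch. It is not Solr code: the class and method names are made up for illustration, and matching is reduced to a simple substring test. It only shows what the two settings of `hl.preserveMulti` are documented to return for a multi-valued field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, not part of Solr or Lucene.
class PreserveMultiSemantics {

    // Wraps every occurrence of term in <em> tags; a value without the term is returned unchanged.
    static String markUp(String value, String term) {
        return value.replace(term, "<em>" + term + "</em>");
    }

    // preserveMulti=true: every value comes back, in stored order.
    // preserveMulti=false: only values containing the term come back.
    static List<String> highlight(List<String> values, String term, boolean preserveMulti) {
        List<String> out = new ArrayList<>();
        for (String value : values) {
            if (preserveMulti || value.contains(term)) {
                out.add(markUp(value, term));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("red apple", "green pear", "apple pie");
        System.out.println(highlight(values, "apple", true));
        System.out.println(highlight(values, "apple", false));
    }
}
```

With `preserveMulti` set to true the non-matching value stays in its original position, which is the behavior the Unified highlighter does not provide.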

As I look at the pertinent code right now, I imagine a solution would be to
provide a custom PassageFormatter.  If we can assume for this use-case that
you can use hl.bs.type=WHOLE as well, then a simpler PassageFormatter
could basically ignore the passage starts & ends and merely mark up the
original content in its entirety, which is a null-concatenated sequence of all
the values for this field for a document.
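
A rough sketch of that idea in plain Java follows. It deliberately avoids the Lucene classes so it can run on its own; in a real plugin the logic would presumably live in a PassageFormatter subclass, with the match offsets read from the Passage objects. The class name, the `int[]{start, end}` offset pairs, and the `<em>` tags are assumptions for illustration, and the separator is assumed to be the null character, as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical stand-in for a "whole content" formatter.
class WholeContentFormatterSketch {

    // Assumption: the values of a multi-valued field are joined with a null character.
    static final char SEP = '\u0000';

    // matches: {start, end} offsets into content (end exclusive),
    // assumed sorted by start and non-overlapping.
    static List<String> format(String content, List<int[]> matches) {
        StringBuilder marked = new StringBuilder();
        int pos = 0;
        for (int[] m : matches) {
            marked.append(content, pos, m[0]);                          // text before the match
            marked.append("<em>").append(content, m[0], m[1]).append("</em>");
            pos = m[1];
        }
        marked.append(content, pos, content.length());                   // text after the last match

        // Split the marked-up content back into one entry per field value.
        // Passage boundaries are never consulted, so values without a match survive.
        List<String> values = new ArrayList<>();
        int from = 0;
        for (int i = 0; i <= marked.length(); i++) {
            if (i == marked.length() || marked.charAt(i) == SEP) {
                values.add(marked.substring(from, i));
                from = i + 1;
            }
        }
        return values;
    }
}
```

Because the whole content is walked once regardless of where passages begin and end, every value is returned in stored order, highlighted or not.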

~ David


On Fri, Mar 29, 2019 at 2:02 PM Walter Underwood <[hidden email]>
wrote:

> We are testing 6.6.1.
>
> wunder
> Walter Underwood
> [hidden email]
> http://observer.wunderwood.org/  (my blog)
>

Re: hl.preserveMulti in Unified highlighter?

Anthony Groves
Hi Walter,

I did something very similar to what David is suggesting when switching
from the PostingsHighlighter to the UnifiedHighlighter in Solr 7.

In order to include non-highlighted items (exact ordering) when using
preserveMulti, we used a custom PassageFormatter that ignored the start and
end offsets:
https://github.com/oreillymedia/ifpress-solr-plugin/blob/bf3b07c5be32fbcfa7b6fdfd439d511ef60dab68/src/main/java/com/ifactory/press/db/solr/highlight/HighlightFormatter.java#L35

I was actually surprised to see not much of a performance hit from
essentially removing the offset usage, but our highlighted fields aren't
extremely large :-)

Hope that helps!
Anthony

*Anthony Groves*  | Technical Lead, Search

O'Reilly Media, Inc.  | https://www.linkedin.com/in/anthonygroves/


On Fri, May 22, 2020 at 4:59 PM David Smiley <[hidden email]>
wrote:

> Hi Walter,
>
> No, the UnifiedHighlighter does not behave as if this setting were true.
>
> The docs say:
>
> `hl.preserveMulti`::
> If `true`, multi-valued fields will return all values in the order they
> were saved in the index. If `false`, the default, only values that match
> the highlight request will be returned.
>
>
> The first sentence there is the essence of it.  Notice it's not conditional
> on whether there are highlights or not.  The UH won't return values lacking
> a highlight. Even hl.defaultSummary isn't triggered because *some* of the
> values have a highlight.
>
> As I look at the pertinent code right now, I imagine a solution would be to
> provide a custom PassageFormatter.  If we can assume for this use-case that
> you can use hl.bs.type=WHOLE as well, then a simpler PassageFormatter
> could basically ignore the passage starts & ends and merely mark up the
> original content in its entirety, which is a null-concatenated sequence of all
> the values for this field for a document.
>
> ~ David
>
>

Re: hl.preserveMulti in Unified highlighter?

Walter Underwood
I’m a little amused that this thread has become active after almost two months of silence.

I think we just used the old highlighter. I don’t even remember now.

wunder
Walter Underwood
[hidden email]
http://observer.wunderwood.org/  (my blog)

> On May 23, 2020, at 9:14 AM, Anthony Groves <[hidden email]> wrote:
>
> Hi Walter,
>
> I did something very similar to what David is suggesting when switching
> from the PostingsHighlighter to the UnifiedHighlighter in Solr 7.
>
> In order to include non-highlighted items (exact ordering) when using
> preserveMulti, we used a custom PassageFormatter that ignored the start and
> end offsets:
> https://github.com/oreillymedia/ifpress-solr-plugin/blob/bf3b07c5be32fbcfa7b6fdfd439d511ef60dab68/src/main/java/com/ifactory/press/db/solr/highlight/HighlightFormatter.java#L35
>
> I was actually surprised to see not much of a performance hit from
> essentially removing the offset usage, but our highlighted fields aren't
> extremely large :-)
>
> Hope that helps!
> Anthony
>
> *Anthony Groves*  | Technical Lead, Search
>
> O'Reilly Media, Inc.  | https://www.linkedin.com/in/anthonygroves/
>
>

Re: hl.preserveMulti in Unified highlighter?

david.w.smiley@gmail.com
Better late than never?  I added some new mail filters to bring topics of
interest to my attention.

Anyway, this seems like an important use case.

Anthony:  You'd probably benefit from also setting hl.bs.type=WHOLE since
clearly you want whole values (no snippets/fragments of values).  If I get
around to implementing hl.preserveMulti for the UH, I'll have it make this
assumption likewise.
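
For reference, the request parameters discussed in this thread could be assembled as in the small Java sketch below. The helper class and the field name passed to `hl.fl` are placeholders; only `hl`, `hl.method`, `hl.fl`, and `hl.bs.type` are set, and `hl.preserveMulti` is left out since the UH does not implement it.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustration only: hypothetical helper that builds a Solr query string.
class HighlightParamsSketch {

    static String queryString(String query, String field) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", query);
        params.put("hl", "true");            // turn highlighting on
        params.put("hl.method", "unified");  // select the Unified highlighter
        params.put("hl.fl", field);          // field to highlight (placeholder name)
        params.put("hl.bs.type", "WHOLE");   // treat each value as one passage

        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joined.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(queryString("apple", "tags"));
    }
}
```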

~ David


On Sat, May 23, 2020 at 1:48 PM Walter Underwood <[hidden email]>
wrote:

> I’m a little amused that this thread has become active after almost two
> months of silence.
>
> I think we just used the old highlighter. I don’t even remember now.
>
> wunder
> Walter Underwood
> [hidden email]
> http://observer.wunderwood.org/  (my blog)
>