Search highlighter for custom Query implementations - how to?

Lukáš Vlček
Hi,

What is the recommended way to use custom Query implementations with the
Lucene (3.3.0) Highlighter framework?

In short, what worries me is that
WeightedSpanTermExtractor#extract(Query, Map<String, WeightedSpanTerm>)
accepts a general Query parameter, but internally it tests against
particular implementations ( ... if query instanceof BooleanQuery then ... )
with no default branch to fall back to. What if I have my own (or a
third-party) implementation of Query? Even with some built-in Lucene
Query types (like BoostingQuery), the extract() method can silently fall
through without telling me that this particular Query implementation is
unknown to it.
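
To make the concern concrete, here is a minimal, self-contained sketch of an
instanceof-based extractor. All class names (SketchQuery, SketchExtractor and
so on) are hypothetical stand-ins, not Lucene's real classes, and the
"unhandled" list is an assumed addition that the extractor discussed here does
not have. It only shows how a final else branch can surface query types that
would otherwise disappear without a trace.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; these are NOT Lucene's classes.
abstract class SketchQuery {}

final class SketchTermQuery extends SketchQuery {
    final String term;
    SketchTermQuery(String term) { this.term = term; }
}

final class SketchBooleanQuery extends SketchQuery {
    final List<SketchQuery> clauses = new ArrayList<>();
    SketchBooleanQuery add(SketchQuery q) { clauses.add(q); return this; }
}

// A query type the extractor was never taught about.
final class SketchCustomQuery extends SketchQuery {}

class SketchExtractor {
    // Names of query types that matched no instanceof test (hypothetical addition).
    final List<String> unhandled = new ArrayList<>();

    void extract(SketchQuery query, List<String> terms) {
        if (query instanceof SketchBooleanQuery) {
            // Recurse into each clause of the compound query.
            for (SketchQuery clause : ((SketchBooleanQuery) query).clauses) {
                extract(clause, terms);
            }
        } else if (query instanceof SketchTermQuery) {
            terms.add(((SketchTermQuery) query).term);
        } else {
            // Without this branch the unknown query would be dropped silently.
            unhandled.add(query.getClass().getSimpleName());
        }
    }
}

class ExtractorSketchDemo {
    public static void main(String[] args) {
        SketchQuery query = new SketchBooleanQuery()
                .add(new SketchTermQuery("lucene"))
                .add(new SketchCustomQuery());
        SketchExtractor extractor = new SketchExtractor();
        List<String> terms = new ArrayList<>();
        extractor.extract(query, terms);
        System.out.println(terms);               // prints [lucene]
        System.out.println(extractor.unhandled); // prints [SketchCustomQuery]
    }
}
```

Recording the unknown type instead of throwing keeps the known clauses
highlightable while still letting the caller find out that something was
skipped.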

AFAIK a typical use of the Highlighter API goes like this:

Query query = _some_query_instance_;
// pass query as is? or query.rewrite(), or ... ?
QueryScorer scorer = new QueryScorer(query, "field");
// ... etc ...
Highlighter highlighter = new Highlighter(scorer);
// ... etc ...

That said, I am looking for advice on how to use the Highlighter API
correctly, so that I do not have to read the source code of
WeightedSpanTermExtractor in advance to learn whether I need to call
rewrite() (or some other custom method) on the query object.

Regards,
Lukas

Re: Search highlighter for custom Query implementations - how to?

Michael Sokolov
Lukas, there really isn't any support for custom Query types in
Highlighter, as you've found. If you inherit from one of the types it
does support, or rewrite your query to one of them, that should work,
but the Query class just doesn't provide enough for Highlighter to work
with in the general case. There is work in LUCENE-3318 which could
eventually help, but it's a ways away from being committed, I think. At
least that's my read of the code.
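
The "rewrite your query to one of them" route can be sketched roughly as
below. This is a self-contained illustration with hypothetical stand-in
classes (DemoQuery, DemoSynonymQuery, DemoRewriter), not Lucene's real API:
the stand-in rewrite() takes no arguments, and nested clauses are assumed to
be primitive already. The idea is only that a custom query which rewrites
itself into supported types becomes visible to an extractor that knows
nothing about the custom class.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; these are NOT Lucene's classes.
abstract class DemoQuery {
    // By default a query is already in its primitive form.
    DemoQuery rewrite() { return this; }
}

final class DemoTermQuery extends DemoQuery {
    final String term;
    DemoTermQuery(String term) { this.term = term; }
}

final class DemoOrQuery extends DemoQuery {
    final List<DemoQuery> clauses;
    DemoOrQuery(List<DemoQuery> clauses) { this.clauses = clauses; }
}

// Hypothetical custom query: matches any of a fixed set of synonyms.
final class DemoSynonymQuery extends DemoQuery {
    final String[] synonyms;
    DemoSynonymQuery(String... synonyms) { this.synonyms = synonyms; }

    @Override
    DemoQuery rewrite() {
        // Express the custom query in terms of the two supported types.
        List<DemoQuery> clauses = new ArrayList<>();
        for (String s : synonyms) {
            clauses.add(new DemoTermQuery(s));
        }
        return new DemoOrQuery(clauses);
    }
}

class DemoRewriter {
    // Rewrite repeatedly until the query returns itself (a fixed point).
    static DemoQuery rewriteFully(DemoQuery query) {
        DemoQuery current = query;
        for (DemoQuery next = current.rewrite(); next != current; next = current.rewrite()) {
            current = next;
        }
        return current;
    }

    // Term collection that only understands the two supported types.
    static void collectTerms(DemoQuery query, List<String> out) {
        if (query instanceof DemoOrQuery) {
            for (DemoQuery clause : ((DemoOrQuery) query).clauses) {
                collectTerms(clause, out);
            }
        } else if (query instanceof DemoTermQuery) {
            out.add(((DemoTermQuery) query).term);
        }
        // Anything else falls through silently, as described in this thread.
    }
}

class RewriteSketchDemo {
    public static void main(String[] args) {
        DemoQuery custom = new DemoSynonymQuery("fast", "quick");

        List<String> before = new ArrayList<>();
        DemoRewriter.collectTerms(custom, before);

        List<String> after = new ArrayList<>();
        DemoRewriter.collectTerms(DemoRewriter.rewriteFully(custom), after);

        System.out.println(before); // prints []
        System.out.println(after);  // prints [fast, quick]
    }
}
```

Whether rewriting is enough in a real setup depends on what the custom query
rewrites to; if the result is again an unsupported type, it falls through just
the same.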

-Mike

On 9/9/2011 7:53 AM, Lukáš Vlček wrote:

> Hi,
>
> What is the recommended way to use custom Query implementations with Lucene
> (3.3.0) Highlighter framework?




Re: Search highlighter for custom Query implementations - how to?

Lukáš Vlček
Thanks Michael,

at least I would say that it would be good to state this clearly in the
Lucene API documentation.

More specifically, the Javadoc of both
WeightedSpanTermExtractor#getWeightedSpanTerms and
#getWeightedSpanTermsWithScores should reflect this, because both call the
private method extract(Query query, Map<String,WeightedSpanTerm> terms),
which checks for specific Query subclasses only. As I pointed out, not only
custom Query implementations fall through without any results, but also
some built-in Lucene queries (like BoostingQuery), and that is confusing
IMHO.

Should a ticket be opened?

Regards,
Lukas

On Sat, Sep 10, 2011 at 12:08 AM, Michael Sokolov wrote:

> Lukas there really isn't any support for custom Query types in Highlighter,
> as you've found.  If you inherit from one of the types it does support, or
> rewrite your query to one of them, that should work, but the Query class
> just doesn't provide enough support for Highlighter to work with in the
> general case. There is work in LUCENE-3318 which could eventually help, but
> it's a ways away from being committed I think.  At least that's my read of
> the code.
>
> -Mike