How to limit rows to which highlighting applies

Alex Baranau
Hello Solr users and devs!

Is there a way to limit the number of rows to which highlighting applies? I
don't see an "hl.rows" or similar parameter described anywhere, so it looks
like I need to extend HighlightComponent to enable that. If it is not
currently possible, do you think it's worth adding?

FYI, my use case: I display only 5, 10 or 20 rows on the results page, but I
need many more rows (100-500) to display additional data on the same page.
Queries can be very complex and their execution time (QueryComponent) is
significant, so I want to fetch everything in a single request. However, I
noticed that as the number of rows increases, the time spent in
HighlightComponent increases dramatically. I don't need highlighting at all
for those additional hundreds of rows.
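
To make the trade-off concrete, here is a minimal sketch of the requests
involved. It only builds URLs using the standard q, rows, fl, hl and hl.fl
parameters; the endpoint, the query string and the field names ("content",
"id") are assumptions for illustration. The two-request variant limits
highlighting to the displayed rows, but it may run the expensive query
twice, which is exactly what I would like to avoid.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port and core to your own setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def select_url(query, rows, highlight, fields="*,score", hl_fields="content"):
    """Build a Solr select URL; hl.fl is added only when highlighting is on."""
    params = {"q": query, "rows": rows, "fl": fields, "wt": "json"}
    if highlight:
        params["hl"] = "true"
        params["hl.fl"] = hl_fields  # "content" is an assumed field name
    else:
        params["hl"] = "false"
    return SOLR_SELECT + "?" + urlencode(params)


# Single request: highlighting is computed for all 500 returned rows.
single = select_url("solr highlighting", rows=500, highlight=True)

# Two requests: highlight only the 20 displayed rows, then fetch just the
# ids of the larger set without highlighting.
displayed = select_url("solr highlighting", rows=20, highlight=True)
extra = select_url("solr highlighting", rows=500, highlight=False, fields="id")
```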

Actually, *ideally* it would be great to be able to specify the fields
returned for those extra rows as well. So I tend to think that this feature
should not be implemented by changing HighlightComponent's behaviour, but by
changing QueryComponent or an even "bigger" part, so that a Solr query
allows specifying extra group(s) of rows to fetch, along with parameters for
them that do not influence the search itself (formatting/highlighting,
fields to return, etc.). That way, we could execute *one* search query and
fetch different data for different purposes.
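
As a rough illustration of the idea (a toy model in plain Python, not Solr
code and not an existing Solr API), the sketch below runs the ranking step
once and then builds several row groups from it, each with its own row
count, field list and highlighting switch. All names here (run_query,
respond, the "groups" structure) are hypothetical.

```python
def run_query(index, term):
    """Stand-in for the expensive search: rank doc ids by term frequency."""
    scored = [(text.lower().split().count(term), doc_id)
              for doc_id, text in index.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [doc_id for count, doc_id in ranked if count > 0]


def highlight(text, term):
    """Wrap every occurrence of the (lowercase) term in <em> tags."""
    return " ".join(f"<em>{w}</em>" if w.lower() == term else w
                    for w in text.split())


def respond(index, term, groups):
    """Execute the search once, then build one row list per group."""
    ranked = run_query(index, term)  # the only search execution
    response = {}
    for name, spec in groups.items():
        rows = []
        for doc_id in ranked[:spec["rows"]]:
            row = {"id": doc_id}
            if "text" in spec["fields"]:
                row["text"] = index[doc_id]
            if spec["highlight"]:
                row["highlighting"] = highlight(index[doc_id], term)
            rows.append(row)
        response[name] = rows
    return response


index = {"a": "solr solr rocks", "b": "solr search", "c": "lucene only"}
groups = {
    "displayed": {"rows": 1, "fields": ["id", "text"], "highlight": True},
    "extra": {"rows": 10, "fields": ["id"], "highlight": False},
}
response = respond(index, "solr", groups)
```

Here "displayed" carries the highlighted top row, while "extra" carries only
ids for the larger set, and the ranking is computed a single time.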

Does this all make sense to you guys?

Thank you,
Alex Baranau
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
Lucene ecosystem search :: http://search-lucene.com/<http://search-hadoop.com/>
Re: How to limit rows to which highlighting applies

MitchK
Alex,

it sounds like it would make sense.
Use cases could be, e.g., clustering or similar techniques.
However, in my opinion this is not the right angle from which to approach such a modification.

That is, one wants several result sets. I could imagine running a primary query (the query for the displayed results) and a second query to compute clustering results.
Now, you want to do different things with the result sets.
The primary query needs faceting, highlighting, spellcheck and much more, whereas the additional query only needs clustering or something like that. In your case, you do not want to apply highlighting to the whole set, since you do not need that information for every row.

This is a general problem, and I think a solution that makes it possible to create more than one "resultset" for a single Solr request would serve more general use cases.

What do you think?

Kind regards,
- Mitch

Re: How to limit rows to which highlighting applies

Alex Baranau
Hello Mitch,

Agreed. Basically you described the same context/needs. Your suggestion to
make it possible to create more than one resultset for a single Solr request
is exactly what I meant in the last paragraph of my initial message (I
called it "specifying extra group(s) of rows" and "we could execute *one*
search query and fetch different data"). Sorry if my imprecise terminology
made that unclear ;)

Is there a JIRA issue for this, or something close to it? If not, perhaps we
should create one to start a discussion among developers, as it looks like
the feature is in demand in the community.

Thank you,
Alex.

In reply to MitchK's message of Sun, Aug 22, 2010 at 5:55 PM.
Re: How to limit rows to which highlighting applies

Otis Gospodnetic-2
In reply to this post by Alex Baranau
I haven't looked at them closely, but take a look at:

https://issues.apache.org/jira/browse/SOLR-1093
https://issues.apache.org/jira/browse/SOLR-2026

Incidentally, I found them with:
http://search-lucene.com/?q=multiple+queries&fc_project=Solr&fc_type=jira

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----

> From: Alex Baranau <[hidden email]>
> To: [hidden email]
> Sent: Sun, August 22, 2010 7:51:49 AM
> Subject: How to limit rows to which highlighting applies