Best practices to return total records vs total filtered records in query?


Todd Stevenson

I have a use case that I would think is common, but I cannot find any help for it.

 

I want to run a query that returns a list of records to display in an HTML table in an app. The table shows only n records of the complete data set at a time, but can page through the whole set. Solr handles this wonderfully: I specify the offset and limit of the records and rerun the query for each page.

 

Another facet of this use case is filtering the records returned, typically by typing filter text in a search box. This should filter the records and display only those that match the filter string. This works fine in Solr with one exception: I would like to display, alongside the table, the total number of records in the data set (which I can) and the total number of filtered records in the data set (which I cannot).

 

Is there a way for Solr to return both the total record count and the total filtered record count (possibly based on the q and fq queries)?

 

The only way I can see to do this is to run the query twice, once with the filter string included and once without. This seems to be terribly inefficient. Is there a better way?

 

Todd Stevenson

Care Transformation Application Developer

Intermountain Healthcare

3930 Parkway Blvd |Salt Lake City, UT 84120

Office: 801-442-5112 | Cell: 801-589-1115

[hidden email]

 


Re: Best practices to return total records vs total filtered records in query?

Erick Erickson
I’m not sure I get the problem.

How do you “filter the records and only display those that match the filter string”? Do you attach an fq clause to the original query? If so, the numFound of that response _is_ the number of docs that match the filter (and the original query), and the numFound from the original query could be preserved on the app side. Or you could send a bogus parameter carrying the original count along with the query, and it should be echoed back in the results.
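A minimal sketch of the "bogus parameter" idea, under some assumptions: `origTotal` is a made-up parameter name, the request handler is assumed to ignore parameters it does not recognise, and `echoParams=explicit` is used so the request parameters come back in `responseHeader.params` of a JSON response. No request is sent here; the functions only build the parameters and read an already parsed response body.

```python
def filtered_query_params(q, fq, orig_total, start=0, rows=10):
    """Build parameters for the filtered request, carrying the earlier count along.

    `origTotal` is a hypothetical name chosen for this sketch; it is not a
    Solr parameter. echoParams=explicit asks Solr to echo the request
    parameters back in responseHeader.params.
    """
    return {
        "q": q,
        "fq": fq,
        "start": start,
        "rows": rows,
        "echoParams": "explicit",
        "origTotal": orig_total,
    }


def counts_from_response(body):
    """Return (total, filtered) from a parsed JSON response body."""
    echoed = body["responseHeader"]["params"]
    total = int(echoed["origTotal"])          # echoed parameters are strings
    filtered = body["response"]["numFound"]   # docs matching q AND fq
    return total, filtered
```

Simply keeping the first numFound in a variable on the app side avoids the extra parameter entirely; the echo is only useful when the code handling the response has no access to that state.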

OTOH, if what you’re after is getting the total number of docs matched by _just_ the filter query including docs not matched by the original query, then no, there’s nothing in Solr to do that OOB, you’d have to send the query again.

If you were willing to write some Solr code, fqs are just a bitSet that you could return in the result set. I’m not at all sure how difficult that would be.

But before going there, let’s assume 1> you haven’t specified fq={!cache=false}… and 2> your entry in filterCache won’t be aged out by the time you get to asking about it. There’s very little work done if all the query has to do is check the filterCache. You might want to just try it and see what the QTimes are. The sequence would be:

1> q=original query
2> user types some stuff in the text box
3> q=original query&fq=stuff in the text box
4> q=*:*&fq=stuff in the text box&rows=0

q=*:* is a special short-circuit query that does very little work, and by specifying rows=0 you’re not returning any docs. The numFound returned from 4> is the number of docs matched _only_ by the fq clause. It shouldn’t be nearly as expensive as you fear; I’d measure first before doing any Solr coding.
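The sequence above can be sketched as the three requests an app would issue. This is an illustration only: the host, port and collection name in `SELECT_URL` are placeholders, and the helper names are invented for the example. The functions build the URLs and read numFound from a parsed JSON body; actually sending the requests and comparing QTime is left to whatever HTTP client the app already uses.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute your own host and collection.
SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def select_url(q, fq=None, rows=10, start=0):
    """Build a /select URL with properly escaped query parameters."""
    params = [("q", q), ("start", start), ("rows", rows)]
    if fq is not None:
        params.append(("fq", fq))
    return SELECT_URL + "?" + urlencode(params)


def query_sequence(original_q, filter_text):
    """Return the URLs for steps 1>, 3> and 4> of the sequence."""
    return [
        select_url(original_q),                     # 1> original query
        select_url(original_q, fq=filter_text),     # 3> original query + filter
        select_url("*:*", fq=filter_text, rows=0),  # 4> count for the filter alone
    ]


def num_found(body):
    """Extract numFound from a parsed JSON response body."""
    return body["response"]["numFound"]
```

Because the last request uses the same fq string as the one before it, it can be answered from the filterCache entry created by step 3>, provided that entry has not been evicted in the meantime.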

Or all this is off base and I don’t understand the problem at all.

Best,
Erick


> On Mar 24, 2020, at 7:01 PM, Todd Stevenson <[hidden email]> wrote:
> [...]