Understanding Performance of Function Query

Understanding Performance of Function Query

sidharth228
Hi,

I'm working with the "edismax" and "function-query" parsers in Solr and am
having difficulty understanding whether the query time taken by the
function query makes sense. The query I'm trying to optimize looks as
follows:

q={!func sum($q1,$q2,$q3)} where q1,q2,q3 are edismax queries.

Each edismax query on its own returns a QTime well under 50 ms, but the
function query seems to be the rate-determining step, since the combined
query above takes around 200-300 ms. I also measured the performance of a
function query that uses only constants.

The QTime results for different q are as follows:

   - 97 ms for q={!func} sum(10,20)
   - 109 ms for q={!func} sum(10,20,30)
   - 127 ms for q={!func} sum(10,20,30,40)
   - 145 ms for q={!func} sum(10,20,30,40,50)

Does this trend make sense? Are function-queries expected to be this slow?

What makes edismax queries so much faster?

What can I do to optimize my original query (which has the edismax
subqueries q1, q2, q3) so that it runs in under 100 ms?

I originally posted this question on StackOverflow
<https://stackoverflow.com/questions/55352565/understanding-solr-function-query-performance>
with no success, so any help here would be appreciated.

Re: Understanding Performance of Function Query

Erik Hatcher-4
Function queries in ‘q’ score EVERY DOCUMENT. Use ‘bf’ or ‘boost’ for the function part, so it's only computed on the docs that match the main query.
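
As a rough illustration of the two request shapes (a sketch only, untested against a live Solr): the field names "title" and "body", the user text, and the core URL are all made up, and expressing the sub-queries through query($q1) and query($q2) is an assumption about how the original request might be written.

```python
from urllib.parse import urlencode

# Hypothetical user text; "title" and "body" below are made-up field names.
USER_TEXT = "solr performance"

# Shape 1: the function is the main query, so per the advice above it is
# computed for every document in the index.
func_in_q = {
    "q": "{!func}sum(query($q1),query($q2))",
    "q1": "{!edismax qf=title v=$text}",
    "q2": "{!edismax qf=body v=$text}",
    "text": USER_TEXT,
}

# Shape 2: edismax picks the matching docs and the additive boost function
# (bf) is only computed for those matches.
func_in_bf = {
    "defType": "edismax",
    "q": USER_TEXT,
    "qf": "title",
    "bf": "query($q2)",
    "q2": "{!edismax qf=body v=$text}",
    "text": USER_TEXT,
}


def select_url(base, params):
    """Build a /select URL with URL-encoded request parameters."""
    return base + "/select?" + urlencode(params)
```

For example, select_url("http://localhost:8983/solr/mycore", func_in_bf) gives a URL that can be pasted into a browser, with "mycore" standing in for the real core name. Note that shape 2 only returns docs that match the main query.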

    Erik

> On Apr 9, 2019, at 03:29, Sidharth Negi <[hidden email]> wrote:
>
> q={!func sum($q1,$q2,$q3)} where q1,q2,q3 are edismax queries.

Re: Understanding Performance of Function Query

sidharth228
I did in fact use the "bf" parameter for the individual edismax queries.

However, I can't condense these edismax queries into a single edismax
query, because each of them uses different fields in "qf".

Basically, what I'm trying to do is this: each of these edismax queries
(q1, q2, q3) has its own logic and scores docs with it. I then want to
combine these scores into an overall score by summing them.

What options do I have for implementing this?

Re: Understanding Performance of Function Query

Erik Hatcher-4
Maybe something like q=

    ({!edismax .... v=$q1} OR {!edismax .... v=$q2} OR {!edismax ... v=$q3})

and set q1, q2, q3 as needed (or set them all to the same value, maybe with a different qf and such on each clause).
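
A small sketch of how such a request could be assembled, assuming the outer expression is handled by the default "lucene" parser (which accepts nested {!parser ...} clauses). The qf values here are invented placeholders, since the real edismax arguments are elided above.

```python
# Hypothetical qf value per sub-query; the real edismax arguments are not
# given in the thread, so these field names are placeholders.
CLAUSE_FIELDS = {"q1": "title", "q2": "body", "q3": "tags"}


def nested_or_query(clause_fields):
    """Join one {!edismax ...} clause per sub-query with OR."""
    clauses = [
        "{!edismax qf=%s v=$%s}" % (qf, name)
        for name, qf in clause_fields.items()
    ]
    return "(" + " OR ".join(clauses) + ")"


def build_params(user_text, clause_fields=CLAUSE_FIELDS):
    """Request parameters: the combined q plus one parameter per sub-query."""
    params = {"q": nested_or_query(clause_fields)}
    for name in clause_fields:
        # Here every clause searches for the same text; each could differ.
        params[name] = user_text
    return params
```

A qf that lists several fields would need quoting inside the local params, for example qf='title body'.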

      Erik

> On Apr 9, 2019, at 09:12, sidharth228 <[hidden email]> wrote:
>
> What options do I have of implementing this?

Re: Understanding Performance of Function Query

sidharth228
This does indeed reduce the time, but it doesn't quite do what I wanted.
This approach penalizes docs through the "coord" factor. In other words,
for a doc that scores 5 on just one query (and nothing on the others), the
resulting score is now 5/3, since only one of the three clauses matches.
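
To spell out the arithmetic, here is a simplified model of the classic coord behaviour (matching clauses divided by total clauses, multiplied into the summed clause scores). It ignores every other part of the similarity, such as query normalization, so it is only an illustration and not Lucene's actual code.

```python
def coord(matching_clauses, total_clauses):
    """Fraction of the boolean query's clauses that matched the doc."""
    return matching_clauses / total_clauses


def boolean_or_score(clause_scores):
    """Sum of the matching clause scores, scaled by the coord factor.

    clause_scores has one entry per clause; None means "did not match".
    """
    matched = [s for s in clause_scores if s is not None]
    if not matched:
        return 0.0
    return sum(matched) * coord(len(matched), len(clause_scores))
```

With boolean_or_score([5.0, None, None]) the sum is 5 and the coord factor is 1/3, which gives the 5/3 described above. When all three clauses match, the factor is 1 and the scores are simply added.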

1. Why does the above query work at all? I can't find this query syntax in
any docs or books on Solr. Can you point me to your source for it?

2. Which parser is used to parse the larger query? The parsedQuery field
(with debug=true) gives no info about the parser used for the larger
query.

3. What if I did not want to sum the scores of q1, q2, q3 but rather
wanted to use their values in some other way (e.g. sqrt(q1) + sqrt(q2) +
0.6*q3)? Is there no clean way to implement a flow of computations on
sub-query scores?

On Tue, Apr 9, 2019 at 7:40 PM Erik Hatcher <[hidden email]> wrote:

> maybe something like q=
>
>     ({!edismax .... v=$q1} OR {!edismax .... v=$q2} OR {!edismax ... v=$q3})
>
>  and setting q1, q2, q3 as needed (or all to the same maybe with different
> qf’s and such)
>
>       Erik

Re: Understanding Performance of Function Query

sidharth228
To those interested: I was able to disable the coord factor by overriding
it in a custom similarity class (CustomSimilarity) packaged as a new jar
file. With that in place, the scores from multiple edismax queries are
effectively summed.
However, I'd still be interested in any other methods that go beyond
direct sums and support other scoring logic, e.g. sqrt(q1) + sqrt(q2) +
0.6*q3.

On Wed, Apr 17, 2019 at 6:20 PM Sidharth Negi <[hidden email]>
wrote:

> This does indeed reduce the time, but doesn't quite do what I wanted. This
> approach penalizes the docs based on "coord" factor. In other words, for a
> doc with scores=5 on just one query (and nothing on others), the resulting
> score would now be 5/3 since only one clause matches.
>
> 3. What if I did not want to sum (the scores of q1, q2, q3) but rather
> wanted to use their values in some other way (eg. sqrt(q1) + sqrt(q2) +
> 0.6*q3). Is there no way of cleanly implementing a flow of computations to
> be done on sub-query scores?