Huge Query execution time for multiple ORs


Huge Query execution time for multiple ORs

Faraz Fallahi
Hi,

I have a question regarding Solr queries.
My query basically contains thousands of OR conditions for authors
(author:name1 OR author:name2 OR author:name3 OR author:name4 ...).
The execution time on my index is huge (around 15 sec). When I tag all the
associated documents with a custom field and value like authorlist:1 and
then change my query to just search for authorlist:1, it executes in 78
ms. How come there is such a big difference in execution time?
Can somebody please explain why there is such a difference (maybe the query
parser?) and whether there is a way to speed this up?

Thanks for the help

Re: Huge Query execution time for multiple ORs

Mikhail Khludnev-2
Long queries hurt Solr on many layers. You can experiment with
https://lucene.apache.org/solr/guide/7_1/other-parsers.html#terms-query-parser


--
Sincerely yours
Mikhail Khludnev
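
For anyone who wants to try the terms query parser linked above, here is a minimal sketch of how the long OR chain could be rewritten. The helper name `terms_query` is made up for illustration; the `{!terms f=...}` local-params syntax with comma-separated values is the one described on the linked page. Check that your Solr version ships this parser before relying on it.

```python
def terms_query(field, values):
    """Build a terms-query-parser string such as {!terms f=author}a,b,c."""
    # The terms parser splits its input on commas by default, so this
    # simple helper refuses values that contain one.
    for value in values:
        if "," in value:
            raise ValueError(f"value contains the separator: {value!r}")
    return "{!terms f=" + field + "}" + ",".join(values)


authors = ["name1", "name2", "name3"]
query = terms_query("author", authors)
print(query)
```

The resulting string can be sent as `q` or, if scoring is not needed, as `fq`.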

Re: Huge Query execution time for multiple ORs

Toke Eskildsen-2
In reply to this post by Faraz Fallahi
On Tue, 2017-11-28 at 11:07 +0100, Faraz Fallahi wrote:
> I have a question regarding Solr queries.
> My query basically contains thousands of OR conditions for authors
> (author:name1 OR author:name2 OR author:name3 OR author:name4 ...).
> The execution time on my index is huge (around 15 sec). When I tag
> all the associated documents with a custom field and value like
> authorlist:1 and then change my query to just search for
> authorlist:1, it executes in 78 ms. How come there is such a big
> difference in execution time?

Due to the nature of inverted indexes (which lie at the heart of
Solr), your thousands of OR clauses mean thousands of lookups, whereas
your authorlist means a single lookup. On top of this, the results for
each author need to be merged with the results for the other authors,
whereas for authorlist the results are there directly.
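
To make the lookup-and-merge point concrete, here is a toy sketch of an inverted index in Python. It is only an illustration under simplified assumptions (a plain dict of term to set of doc ids, made-up documents); Lucene's real data structures and merging are far more sophisticated.

```python
from collections import defaultdict


def build_index(docs, field):
    """Map each value of `field` to the set of doc ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for value in doc.get(field, []):
            index[value].add(doc_id)
    return index


def or_query(index, terms):
    """Union the postings of all terms; count one lookup per term."""
    result = set()
    lookups = 0
    for term in terms:
        result |= index.get(term, set())  # one lookup plus one merge step
        lookups += 1
    return result, lookups


# Made-up documents: docs 1-3 belong to the tagged author list.
docs = {
    1: {"author": ["name1"], "authorlist": ["1"]},
    2: {"author": ["name2"], "authorlist": ["1"]},
    3: {"author": ["name1", "name3"], "authorlist": ["1"]},
    4: {"author": ["other"]},
}
author_index = build_index(docs, "author")
authorlist_index = build_index(docs, "authorlist")

or_hits, or_lookups = or_query(author_index, ["name1", "name2", "name3"])
tag_hits, tag_lookups = or_query(authorlist_index, ["1"])
print(or_hits == tag_hits, or_lookups, tag_lookups)
```

Both queries return the same documents, but the OR version pays one lookup and one merge per author, while the tagged version pays one lookup in total.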

If your author lists are static, indexing them as you did in your test
is the best solution.

If they are not static, using a filter query will ensure that they are
at least cached subsequently, so that only the first call will be
slow.

If they are semi-static and there are not too many of them, you could
do warm-up filter queries for all the different groups so that users
do not pay the first-call penalty. This requires your filter cache to
be large enough to hold all the author lists.
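
The caching and warm-up idea can be sketched with a toy cache in Python. This is not Solr's filter cache implementation: the class, the first-in-first-out eviction, the filter strings and the `slow_lookup` placeholder are all assumptions made for illustration. It only shows why the cache must be big enough to hold every warmed group.

```python
class FilterCache:
    """Toy stand-in for a filter cache: filter query string -> doc id set."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = {}
        self.misses = 0

    def get(self, filter_query, compute):
        if filter_query not in self.entries:
            self.misses += 1  # the slow path: the filter must be evaluated
            if len(self.entries) >= self.max_entries:
                # Evict the oldest inserted entry (simple FIFO for the sketch).
                oldest = next(iter(self.entries))
                del self.entries[oldest]
            self.entries[filter_query] = compute(filter_query)
        return self.entries[filter_query]


def slow_lookup(filter_query):
    # Hypothetical placeholder for evaluating a big OR filter on the index.
    return frozenset({1, 2, 3})


# One (made-up) filter per author group.
group_filters = ["author:(name1 OR name2)", "author:(name3 OR name4)"]

cache = FilterCache(max_entries=2)
for fq in group_filters:  # warm-up: pay the slow path once per group
    cache.get(fq, slow_lookup)
misses_after_warmup = cache.misses

cache.get(group_filters[0], slow_lookup)  # a user request after warm-up
print(cache.misses - misses_after_warmup)
```

If `max_entries` were smaller than the number of groups, warming the later groups would evict the earlier ones and users would hit the slow path again.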

- Toke Eskildsen, Royal Danish Library


Re: Huge Query execution time for multiple ORs

Faraz Fallahi
Hi

Thanks for all the replies.
I think tagging them is probably the best solution in any case.

Best regards


Re: Huge Query execution time for multiple ORs

Faraz Fallahi
In reply to this post by Toke Eskildsen-2
Hi Toke,

Just to be clear and to make sure I understand: does this mean that a query of the form
author:name1 OR author:name2 OR author:name3

is processed like this, for example:

1 query against the index with author:name1, getting 4 results
Then 1 query against the index with author:name2, getting 3 results
Then 1 query against the index with author:name3, getting 1 result

And in the end all results are merged and I get a result of 8?

So a query with a thousand authors will be split into a thousand single
queries against the index?

Do I understand this correctly?

Thanks for the help
Faraz



Re: Huge Query execution time for multiple ORs

Emir Arnautović
Hi Faraz,
It is a bit worse than that: it also needs to calculate the score, so for each matching doc of one query part it has to check whether it appears in the results of the other query parts. If you use the terms query parser, you avoid calculating the score; all docs will have a score of 1.
Solr is based on Lucene, which is mainly an inverted index (https://en.wikipedia.org/wiki/Inverted_index), so knowing that helps to understand how expensive some queries are. It is relatively easy to figure out what steps are needed for different query types. Of course, Lucene includes a lot of smartness and is probably not using the naive approach, but it cannot avoid the limitations of an inverted index.
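
As a rough illustration of the extra work, here is a toy Python sketch using Faraz's numbers (4, 3 and 1 matches). The doc ids and per-term scores are invented, and summing per-term scores is a deliberate simplification; real Lucene scoring is more involved. The point is that a doc matched by several clauses has to be found and combined, so the merged result can be smaller than 8.

```python
from collections import defaultdict

# Made-up postings: term -> {doc id: score contribution of that term}.
postings = {
    "name1": {1: 0.5, 2: 0.25, 3: 0.5, 4: 0.25},  # 4 matches
    "name2": {3: 0.5, 5: 0.25, 6: 0.5},           # 3 matches
    "name3": {6: 0.25},                           # 1 match
}


def scored_or(postings, terms):
    """Merge postings and add up the score of docs matched by several terms."""
    scores = defaultdict(float)
    for term in terms:
        for doc_id, score in postings.get(term, {}).items():
            scores[doc_id] += score
    return dict(scores)


def constant_score_or(postings, terms):
    """Merge postings only; every match gets the same score of 1."""
    matches = set()
    for term in terms:
        matches |= set(postings.get(term, {}))
    return {doc_id: 1.0 for doc_id in matches}


terms = ["name1", "name2", "name3"]
scored = scored_or(postings, terms)
constant = constant_score_or(postings, terms)
print(len(scored), scored[3], scored[6])
print(len(constant))
```

Docs 3 and 6 each match two authors, so the 4 + 3 + 1 postings collapse to 6 distinct documents in both variants; only the scored variant has to combine per-term contributions.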

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/




Re: Huge Query execution time for multiple ORs

Faraz Fallahi
Uff... I see. Thanks for the explanation :)


Re: Huge Query execution time for multiple ORs

Faraz Fallahi
Hi guys,

Sorry to bother you again, but I am really confused:

I've used the Solr admin website and created a query with lots of ORs using Solr
4.7.

When I execute the query without a sort, it takes roughly 3.5 - 4
seconds.
When I execute it with a sort on a field called pubdate, it takes about
4 - 4.5 seconds.
When I execute it with a sort on the guid field, it takes about 7 - 8
seconds!

After your explanations I was expecting the query without a sort to be the
slowest. What am I missing here?

Best regards
Faraz


Re: Huge Query execution time for multiple ORs

Emir Arnautović
Hi Faraz,
When you say query without sort, I assume you mean that you omit the sort parameter, so results are sorted by score. That is expected to be slower than the equivalent query without score calculation, e.g. the same query run as fq.
What you observe can be explained by:
* Solr calculates the score even when results are not sorted by score and the score is not returned (do you return the score? I am not sure about this; I did not check the code)
* The field that you are using for sorting does not have doc values, so it has to be uninverted
* The field that you are using for sorting is not in the OS cache, so it is read from disk.

Try comparing the same query run as q=... and as fq=... Make sure that your filter cache is disabled if you are repeating the same queries and averaging.
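
A minimal Python sketch of the two request variants to compare. The host, core name and helper names are assumptions for illustration; `q`, `fq`, `sort` and `rows` are standard Solr request parameters, and the `{!cache=false}` local param is one way to keep a single filter out of the filter cache while measuring. The sketch only builds the URLs; send them with whatever HTTP client you use and compare the reported QTime.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; adjust host and core name to your installation.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def scored_request(clause, sort=None):
    """The OR clause runs as the main query, so matches are scored."""
    params = {"q": clause, "rows": 10}
    if sort:
        params["sort"] = sort
    return SOLR_SELECT + "?" + urlencode(params)


def filtered_request(clause, sort=None):
    """The OR clause runs as an uncached filter; q matches everything."""
    params = {"q": "*:*", "fq": "{!cache=false}" + clause, "rows": 10}
    if sort:
        params["sort"] = sort
    return SOLR_SELECT + "?" + urlencode(params)


clause = "author:(name1 OR name2 OR name3)"
url_q = scored_request(clause, sort="pubdate desc")
url_fq = filtered_request(clause, sort="pubdate desc")

# Decode the query strings again to show what will be sent.
for url in (url_q, url_fq):
    print(parse_qs(urlsplit(url).query))
```

Running each variant with and without `sort` separates the cost of scoring from the cost of sorting on a particular field.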

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/




Re: Huge Query execution time for multiple ORs

Faraz Fallahi
Will do, thanks.
