Very low filter cache hit ratio

Saurabh Sharma
Hi All,

I am trying to run an index on SolrCloud version 7.3.1 with 3 nodes.
I plan to index the records with a full index once a day and a delta index
every 30 minutes. The purpose of keeping a stale index was to make use of
Solr's caches. But to my surprise, when I put real traffic on this index,
cache usage was very low. It varied between 0 and 10% irrespective of the
size of the filter cache.

I tried varying the cache size but nothing happened and usage was very low.
Most of the fields in the index are stored/doc values.

I tried with cache sizes of 1024, 10024, 100024.

What could be the reasons for the low cache usage?
How can I leverage the cache feature for high-traffic indexes?

Thanks
Saurabh Sharma

Re: Very low filter cache hit ratio

Shawn Heisey-2
On 5/29/2019 6:57 AM, Saurabh Sharma wrote:
> What could be the reasons for the low cache usage?
> How can I leverage the cache feature for high-traffic indexes?

Your usage apparently does not use the exact same query (or filter
query, in the case of filterCache) very often.

In order to achieve a high hit ratio on a cache, the same query will
need to be used by many users.  That's not happening here.  I'm betting
that each user is sending something unique to Solr - which means it will
be impossible to get a hit, unless that user sends the same query again.
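
The effect described above can be reproduced with a small simulation. This is only a sketch: it uses a plain LRU cache keyed by the filter query string as a stand-in, not Solr's actual filterCache implementation, and the field names `category` and `user_id` are hypothetical.

```python
from collections import OrderedDict
import random


class LruCache:
    """Toy LRU cache keyed by the filter query string (not Solr's implementation)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = OrderedDict()
        self.hits = 0
        self.lookups = 0

    def lookup(self, key):
        self.lookups += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
        else:
            self.entries[key] = True
            if len(self.entries) > self.max_size:
                self.entries.popitem(last=False)  # evict least recently used

    @property
    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


def simulate(filters, cache_size=1024):
    """Feed a stream of filter strings through the cache and return the hit ratio."""
    cache = LruCache(cache_size)
    for fq in filters:
        cache.lookup(fq)
    return cache.hit_ratio


rng = random.Random(42)
# Hypothetical traffic: 50 distinct filters shared by all users...
shared = [f"category:{rng.randrange(50)}" for _ in range(10_000)]
# ...versus a filter that is different on every request.
unique = [f"user_id:{i}" for i in range(10_000)]

shared_ratio = simulate(shared)
unique_ratio = simulate(unique)
unique_ratio_big_cache = simulate(unique, cache_size=100_024)

print(f"shared filters: {shared_ratio:.3f}")
print(f"unique filters: {unique_ratio:.3f}")
print(f"unique filters, cache of 100024: {unique_ratio_big_cache:.3f}")
```

In this model the unique stream never produces a hit no matter how large the cache is, which matches the observation that changing the cache size made no difference.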

Thanks,
Shawn

Re: Very low filter cache hit ratio

Saurabh Sharma
Hi Shawn,

Many filters are common among the queries. AFAIK, filter cache entries are
created per filter, and by that logic one should get a good hit ratio for
those cached filter conditions. I tried creating a cache of size 100K, and
that too did not produce a good hit ratio. Is there any documentation or
advice on using the various caches efficiently and on how they work internally?

Thanks
Saurabh


Re: Very low filter cache hit ratio

Atita Arora
You can refer to this one:
https://teaspoon-consulting.com/articles/solr-cache-tuning.html

HTH,
Atita


RE: Very low filter cache hit ratio

Markus Jelsma-2
In reply to this post by Saurabh Sharma
Hello,

What that article does not mention is that you must never use NOW in a filter query without rounding it down. If you do use it, round it down to the minute, hour, or day to avoid flooding the filter cache.

Regards,
Markus


Re: Very low filter cache hit ratio

Shawn Heisey-2
In reply to this post by Saurabh Sharma
On 5/29/2019 7:33 AM, Saurabh Sharma wrote:
> Many filters are common among the queries. AFAIK, filter cache entries are
> created per filter, and by that logic one should get a good hit ratio for
> those cached filter conditions. I tried creating a cache of size 100K, and
> that too did not produce a good hit ratio. Is there any documentation or
> advice on using the various caches efficiently and on how they work internally?

In order to produce a cache hit, the query or filter must be identical
in every way.  Whitespace and all.  And it must be identical after parts
of it are substituted or expanded by Solr.

Take note of the reply you received from Markus Jelsma.  The "NOW"
keyword is replaced by a current timestamp with millisecond accuracy --
which effectively means that queries using NOW are always different and
cannot produce a cache hit.  Rounding the timestamp using NOW/HOUR or
NOW/DAY, if that fits user requirements, can be one solution to that
problem.
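
A rough way to see why rounding matters is to model how the filter text looks once NOW has been replaced by a concrete timestamp. The sketch below is not Solr's date math parser; `resolve_now` is a hypothetical helper that only substitutes timestamps so the resulting strings can be compared as cache keys, and the field name `last_modified` is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical filter queries; only the use of NOW differs.
RAW_FQ = "last_modified:[NOW-7DAYS TO NOW]"
ROUNDED_FQ = "last_modified:[NOW/DAY-7DAYS TO NOW/DAY+1DAY]"


def fmt(dt):
    """Format a datetime with millisecond precision, e.g. 2019-05-29T13:00:00.137Z."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


def resolve_now(fq, now):
    """Simplified stand-in for NOW substitution (not Solr's parser).

    The rounded forms are replaced first so that the bare NOW rule does
    not clobber them. The remaining date math (-7DAYS, +1DAY) is left as
    text, which is enough to compare the strings as cache keys.
    """
    day = now.replace(hour=0, minute=0, second=0, microsecond=0)
    hour = now.replace(minute=0, second=0, microsecond=0)
    return (
        fq.replace("NOW/DAY", fmt(day))
        .replace("NOW/HOUR", fmt(hour))
        .replace("NOW", fmt(now))
    )


# 1000 requests spread over a little more than two minutes.
start = datetime(2019, 5, 29, 13, 0, 0, tzinfo=timezone.utc)
times = [start + timedelta(milliseconds=137 * i) for i in range(1000)]

raw_keys = {resolve_now(RAW_FQ, t) for t in times}
rounded_keys = {resolve_now(ROUNDED_FQ, t) for t in times}

print(len(raw_keys), "distinct keys without rounding")
print(len(rounded_keys), "distinct key(s) with NOW/DAY")
```

In this model every unrounded request yields its own key, while all requests on the same day share a single key when NOW/DAY is used.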

Be careful with defining a large filterCache.  The memory requirements
can become VERY extreme.

Thanks,
Shawn

Re: Very low filter cache hit ratio

Erick Erickson
In reply to this post by Markus Jelsma-2
You must show us the _exact_ filter queries you’re using, or at least a representative sample.

Bumping the cache up very high is almost always the wrong thing to do. Each entry takes approximately maxDoc/8 bytes so unless your corpus is very small, you’ll eventually blow memory up.
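
As a back-of-the-envelope check of the maxDoc/8 figure, the sketch below multiplies it by the cache sizes tried earlier in this thread. The maxDoc value of 10 million is an assumption for illustration only, and the result is a rough estimate for a completely full cache, not a measurement.

```python
def filter_cache_estimate_bytes(max_doc, cache_size):
    """Rough estimate: one bit per document for each cached filter entry."""
    bytes_per_entry = max_doc // 8
    return bytes_per_entry * cache_size


ASSUMED_MAX_DOC = 10_000_000  # hypothetical corpus size, not from the thread

for size in (1024, 10024, 100024):  # cache sizes mentioned in the original post
    total = filter_cache_estimate_bytes(ASSUMED_MAX_DOC, size)
    print(f"size={size:>6}: about {total / 2**30:.1f} GiB if the cache fills up")
```

With these assumptions each entry is about 1.25 MB, so even the smallest of the three sizes would need over a gigabyte of heap once full.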

To Markus’ point about NOW, a full treatment is here: https://dzone.com/articles/solr-date-math-now-and-filter

Best,
Erick
