Page faults

Page faults

Branham, Jeremy (Experis)

Does anyone know if it is typical for a Solr cluster to show lots of page faults (50-100 per second) under heavy load?

We are load testing a cluster with 8 nodes, and my performance engineer has brought this to my attention.

I don’t know enough about memory management to say whether it is normal or not.

 

The performance doesn’t appear to be suffering, but I don’t want to overlook a potential hazard.

 

Thanks!

Jeremy Branham

[hidden email]

Allstate Insurance Company | UCV Technology Services | Information Services Group

 


Re: Page faults

Erick Erickson
Images do not come through, so we don't see what you're seeing.

That said, I'd expect page faults to happen:

1> when indexing. Besides what you'd expect (new segments
     written to disk), there's segment merging going on in
     the background which has to read segments from disk
     in order to merge.

2> when querying, any fields returned as part of a doc
     that has stored=true docValues=false will require
     a disk access to get the stored data.

Best,
Erick



Re: Page faults

Christopher Schultz
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Erick,

On 1/7/19 11:52, Erick Erickson wrote:

> Images do not come through, so we don't see what you're seeing.
>
> That said, I'd expect page faults to happen:
>
> 1> when indexing. Besides what you'd expect (new segments written
> to disk), there's segment merging going on in the background which
> has to read segments from disk in order to merge.
>
> 2> when querying, any fields returned as part of a doc that has
> stored=true docValues=false will require a disk access to get the
> stored data.

A page fault is not necessarily a disk access. It almost always *is*,
but it's not because the application is calling fopen(). It's because
the OS is performing a memory operation which often results in a dip
into virtual memory.
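
For reference, Linux distinguishes minor faults (the page is already in memory and only needs to be mapped) from major faults (the page has to be read from disk). The following is a minimal Python sketch, assuming a Unix-like system, that reads both counters for the current process with the standard `resource` module. It only illustrates the two counters; inspecting a running Solr JVM would need a different source of data, which is not shown here.

```python
import resource


def page_fault_counts():
    """Return (minor, major) page-fault counts for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_minflt: faults serviced without I/O; ru_majflt: faults that required I/O.
    return usage.ru_minflt, usage.ru_majflt


before = page_fault_counts()
# Allocate a 64 MiB buffer; touching newly mapped pages may register as minor faults.
buf = bytearray(64 * 1024 * 1024)
after = page_fault_counts()

print("minor faults during allocation:", after[0] - before[0])
print("major faults during allocation:", after[1] - before[1])
```

If the faults reported by a monitoring tool are overwhelmingly minor, they do not imply disk reads.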

Jeremy, are these page-faults occurring on all the machines in your
cluster, or only some? What is the hardware configuration of each
machine (specifically, memory)? What are your JVM settings for your
Solr instances? Is anything else running on these nodes?

It would help to understand what's happening on your servers. "I'm
seeing page faults" doesn't really help us help you.

Thanks,
- -chris

-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlwzpYsACgkQHPApP6U8
pFgSHxAAgaXV5wkwV7Ru2QyhnvxUnIWY4Iom0IdZYrDuZBDxmFx9wzE7P33zmR3E
nrgZCqBtAMdxRSwG9BfyKircChZBssqtQpskw6mgJyzRyGvKVJjJ68r0vEio3Kjo
HjaJczBFWvdOKm42W1Li4SeymGyYXu/jmdkWLcIbEM4BgDQLf1HhSEphDeZzP4ST
GNDBrIA6XkUJwE1r58FUuj9l0XSKUAPLOPNAx1qGiAn4fKdbysVHvLcvJvJzC0pC
1kx000r+Mqdd61EzhM20ZDIvg2F3vgFgGCUtB31hIi18bfD8whoAafL2FSMkIccD
H7X09PpUK8qPM/oQgqCKTtfmVR3M2pi3CSxLFSQ1/QucnF2wxWknOOWUH1TMU/L2
KUQHS6GwuTk+R/8PxdBRsZI8ON3MVb690ECV4QplYlkrtygXrLRg2YOgifgAXsKL
5Kg2mrpKoxfNnDWaRksy4GUDTsSxbkd1rpnHJEZ8le26HXvz9wrug/FtNPzqP8S9
dan2gkgiSqOM9GKlKkA72ROyQDhZa5YiXfGNdRrmfkiQzlDBEcGpD8pg1GwskRJl
yidTBfvRSyCHsI5NBGf65nTG+2WfUnr8wClHVK5QQGVilHBn6KzeHeDTL9ZpHvcn
GhkDMvc+9f8DR7Hr/mTiGjYIAvJZYiIJeYUoe0Bl2BHmGDv0tEk=
=OpZo
-----END PGP SIGNATURE-----

Re: Re: Page faults

Branham, Jeremy (Experis)
Thanks Erick/Chris for the information.
The page faults are occurring on each node of the cluster.
These are VMs running SOLR v7.2.1 on RHEL 7. CPUx8, 64GB mem.

We’re collecting GC information and using a DynaTrace agent, so I’m not sure if / how much that contributes to the overhead.

This cluster is used strictly for type-ahead/auto-complete functionality.

I’ve also just noticed that the shards are imbalanced – 2 having about 90GB and 2 having about 18GB of data.
Having just joined this team, I’m not too familiar yet with the documents or queries/updates [and maybe not relevant to the page faults].
Although, I did check the schema, and most of the fields are stored=true, docValues=true

Solr v7.2.1
OS: RHEL 7

Collection Configuration -
Shard count: 4
configName: pdv201806
replicationFactor: 2
maxShardsPerNode: 1
router: compositeId
autoAddReplicas: false

Cache configuration –
<filterCache class="solr.FastLRUCache"
             size="20000"
             initialSize="5000"
             autowarmCount="10"/>
<queryResultCache class="solr.LRUCache"
                  size="5000"
                  initialSize="1000"
                  autowarmCount="0"/>
<documentCache class="solr.LRUCache"
               size="15000"
               initialSize="512"/>

<enableLazyFieldLoading>true</enableLazyFieldLoading>


JVM Information/Configuration –
java.runtime.version: 1.8.0_162-b12

-XX:+CMSParallelRemarkEnabled
-XX:+CMSScavengeBeforeRemark
-XX:+ParallelRefProcEnabled
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+ScavengeBeforeFullGC
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseConcMarkSweepGC
-XX:+UseGCLogFileRotation
-XX:+UseParNewGC
-XX:-OmitStackTraceInFastThrow
-XX:CMSInitiatingOccupancyFraction=70
-XX:CMSMaxAbortablePrecleanTime=6000
-XX:ConcGCThreads=4
-XX:GCLogFileSize=20M
-XX:MaxTenuringThreshold=8
-XX:NewRatio=3
-XX:ParallelGCThreads=8
-XX:PretenureSizeThreshold=64m
-XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90
-Xms16g
-Xmx32g
-Xss256k
-verbose:gc
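
A back-of-the-envelope sketch of what these numbers imply, assuming index files are read through the OS page cache (memory-mapped I/O) and that Solr is the only significant consumer of memory on the VM. The figures come from this message; the variable names are illustrative only.

```python
GIB = 1024 ** 3

ram = 64 * GIB            # VM memory reported above
max_heap = 32 * GIB       # -Xmx32g
large_replica = 90 * GIB  # approximate size of the two larger shards
small_replica = 18 * GIB  # approximate size of the two smaller shards

# Memory left for the OS page cache if the heap grows to its maximum.
# This ignores JVM off-heap overhead and the OS itself, so it is optimistic.
page_cache = ram - max_heap

for name, size in (("large", large_replica), ("small", small_replica)):
    fraction = min(1.0, page_cache / size)
    print(f"{name} replica: at most {fraction:.0%} of the index fits in the page cache")
```

If those assumptions hold, a node hosting a 90GB replica cannot keep the whole index cached, so some page faults under query load would be expected there.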


 
Jeremy Branham
[hidden email]


Re: Re: Page faults

Erick Erickson
Having some replicas at 90G and some at 18G is totally unexpected with
compositeId routing unless you're using "multi-level routing", see:
https://lucidworks.com/2014/01/06/multi-level-composite-id-routing-solrcloud/

But let's be clear what we're talking about here. I'm talking about
specifically the size of the index on disk for any particular
_replica_, meaning the size in places similar to:
pdv201806_shard1_replica1/data/index. I've never seen as much
disparity as you're talking about so we should get to the bottom of
that.

Do you have massive numbers of deleted docs in any of those shards?
The admin screen for any particular replica will show this number.


On another note: Your cache sizes are probably not part of the page
fault question, but on the surface they're badly misconfigured, at
least the filterCache and queryResultCache. Each entry in the
filterCache is a map entry: the key is roughly the query and the value
is a bitset bounded by maxDoc/8 bytes. So if you have, say, 8M documents,
each filterCache entry could theoretically be 1MB (give or take) and you
could have up to 20,000 of them, around 20GB in the worst case. You're
probably just being lucky and either not having very many distinct fq
clauses or are indexing often enough that the cache isn't growing for
very long before being flushed.
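
The worst-case arithmetic can be sketched in a few lines of Python. The 8M document count is the hypothetical figure used above, the cache size is the one from the posted configuration, and the function name is made up for illustration.

```python
def filter_cache_worst_case_bytes(max_doc: int, cache_size: int) -> int:
    """Upper bound: every entry is a full bitset with one bit per document."""
    bytes_per_entry = max_doc // 8
    return bytes_per_entry * cache_size


max_doc = 8_000_000   # hypothetical document count
cache_size = 20_000   # filterCache size from the posted configuration

per_entry = max_doc // 8
total = filter_cache_worst_case_bytes(max_doc, cache_size)
print(f"per entry: {per_entry / 1e6:.0f} MB, worst case total: {total / 1e9:.0f} GB")
# prints "per entry: 1 MB, worst case total: 20 GB"
```

That upper bound is a large share of a 32GB maximum heap, which is why the configured size looks risky even if it is never reached in practice.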

Your queryResultCache takes up a lot less space, but still it's quite
large. It has two primary purposes:
1> paging. It generally stores a few integers (40 is common, maybe
     several hundred but who cares?) so hitting the next page won't
     have to search again. This isn't terribly important in modern
     installations.

2> being used in autowarming to pre-load parts of the index into memory.

I'd consider knocking each of these back to the defaults (512), except
I'd put the autowarm count at, say, 16 or so.

The document cache is less clear; the recommendation is (number of
simultaneous queries you expect) X (your average rows parameter)
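
As a worked example of that rule of thumb, here is a tiny Python sketch. The input values (50 concurrent queries, 10 rows each) are invented for illustration and are not measurements from this cluster.

```python
def document_cache_size(concurrent_queries: int, avg_rows: int) -> int:
    """Rule of thumb: expected simultaneous queries times the average rows value."""
    return concurrent_queries * avg_rows


# Hypothetical type-ahead workload: 50 queries in flight, 10 suggestions each.
size = document_cache_size(concurrent_queries=50, avg_rows=10)
print(size)  # prints 500
```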

Best,
Erick


Re: Re: Re: Page faults

Branham, Jeremy (Experis)
Thanks for the information Erick –
I’ve learned there are 2 ‘classes’ of documents being stored in this collection.
There are about 4x as many documents in class A as class B.
When the documents are indexed, the document ID includes the key prefix like ‘A/1!’ or ‘B/1!’, which I understand spreads the documents over ½ of the available shards.

I don’t suppose there is a way to say “I want 75% of the shards to store class A, and 25% to store class B”.
If we dropped the ‘/1’ from the prefix, all the documents would be indexed on a single shard, correct?
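
A rough Python sketch of how the number of prefix bits affects the spread, assuming the compositeId router builds a 32-bit hash from the top N bits of the shard-key hash and the remaining bits of the document-ID hash (N being the number after the slash, 16 when it is omitted). CRC32 is used purely as a stand-in for the hash function, the shard ranges are simplified to four equal unsigned slices, and all function names are hypothetical.

```python
import zlib


def stand_in_hash(text: str) -> int:
    """32-bit hash. Stand-in only: Solr uses MurmurHash3, not CRC32."""
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def composite_hash(shard_key: str, doc_id: str, key_bits: int = 16) -> int:
    """Top `key_bits` bits come from the shard key hash, the rest from the doc ID hash."""
    key_mask = (0xFFFFFFFF << (32 - key_bits)) & 0xFFFFFFFF
    id_mask = 0xFFFFFFFF ^ key_mask
    return (stand_in_hash(shard_key) & key_mask) | (stand_in_hash(doc_id) & id_mask)


def shard_for(hash_value: int, num_shards: int = 4) -> int:
    """Map a hash onto equal slices of the 32-bit space (simplified, unsigned)."""
    return (hash_value * num_shards) >> 32


def shards_used(shard_key: str, key_bits: int, num_docs: int = 1000) -> set:
    """Collect the shards hit by `num_docs` synthetic IDs sharing one shard key."""
    return {
        shard_for(composite_hash(shard_key, f"doc{i}", key_bits))
        for i in range(num_docs)
    }


for bits in (1, 16):
    print(f"prefix A with {bits} key bit(s): shards {sorted(shards_used('A', bits))}")
```

Under these assumptions, one key bit pins only the top bit of the hash, so a class can reach at most half of the hash range (two of four shards), while 16 key bits pin the top two bits as well and confine each class to a single shard.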


Currently, half the servers are under heavy load, and the other half are under-utilized. [8 servers total, 4 shards with replication factor of 2]
I’ve considered a few remedies, but I’m not sure which would be best.

We could drop the document ID prefix and let SOLR distribute the documents evenly, then use a discriminator field to filter queries.
- Requires re-indexing
- Code changes in our APIs and indexing process
We could create 2 separate collections.
- Requires re-indexing
- Code changes in our APIs and indexing process
- Lost ability to query all the docs at once
We could split the shards.
- More than 1 shard would be on a node. What if we end up with 2 big replicas on a single node?

If we split the shards, I’m unsure how the prefix would work in this scenario.
Would ‘A/1!’ continue to use the original shard range?

Like if we split just the 2 big shards –
4 shards become 6
Does ‘A/1!’ spread the documents across 3 shards [half of the new total] or across the 4 new shards?

Or if we split all 4 shards, ‘A/1!’ should spread across 8 shards, which would be half of the new total.
Could it be difficult trying to balance 8 shards across 8 servers?
I’m concerned 2 big shards would end up on the same server, and we would have imbalance again.

I think dropping the prefix altogether would be the easiest to maintain and scale, but it has a code impact on our apps.
Or maybe I’m over-thinking the complexity of splitting the shards, and they will balance out naturally.

I’ll split the shards in our test environment to see what happens.

 
Jeremy Branham
[hidden email]


Re: Re: Re: Page faults

Erick Erickson
bq: We could create 2 separate collections.
- Requires re-indexing
- Code changes in our APIs and indexing process
- Lost ability to query all the docs at once ***

*** Not quite true. You can create an alias that points to multiple
collections. HOWEVER, since the scores are computed using different
stats (term frequencies, field length, even terms) the scores may not
be comparable so your results (assuming you're sorting by score) may
be skewed.

That said, my first choice would be the first one you suggested: Drop the prefix
and let Solr distribute docs. Then use an "fq" clause as your discriminator....
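
A minimal Python sketch of what such a filtered request could look like. The field name `doc_class`, the host, and the use of `pdv201806` as the collection name are placeholders for illustration; the code only builds the URL and sends nothing.

```python
from urllib.parse import urlencode


def typeahead_query_url(base_url: str, collection: str, text: str, doc_class: str) -> str:
    """Build a /select URL that restricts results to one document class via fq."""
    params = {
        "q": text,
        "fq": f"doc_class:{doc_class}",  # hypothetical discriminator field
        "rows": 10,
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


url = typeahead_query_url("http://localhost:8983/solr", "pdv201806", "bran", "A")
print(url)
```

Because the discriminator has only two values, the fq clause would add just two distinct entries to the filterCache.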

What people often do with this is create a new collection and index to
it in the background, i.e. don't serve any queries from it. When you're
happy it's in good shape, create a collection alias to it to seamlessly
switch your queries over to it. Of course you have to be indexing to
_both_ collections if your current collection needs to be up to date.
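
The switch-over step could look roughly like this in Python. It builds a Collections API CREATEALIAS request URL; the alias name `typeahead`, the collection name `typeahead_v2`, and the host are invented for the example, and nothing is actually sent.

```python
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collections: list) -> str:
    """Build a CREATEALIAS request that points `alias` at the given collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Point the alias the applications query at the freshly indexed collection.
url = create_alias_url("http://localhost:8983/solr", "typeahead", ["typeahead_v2"])
print(url)
```

Clients that query the alias rather than the collection name would not need a code change when the alias is repointed.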

There are various tricks you can play to minimize hardware requirements once
you decide on whether you want to do that or not.

Best,
Erick

On Wed, Jan 9, 2019 at 11:56 AM Branham, Jeremy (Experis)
<[hidden email]> wrote:

>
> Thanks for the information Erick –
> I’ve learned there are 2 ‘classes’ of documents being stored in this collection.
> There are about 4x as many documents in class A as class B.
> When the documents are indexed, the document ID includes the key prefix like ‘A/1!’ or ‘B/1!’, which I understand spreads the documents over ½ of the available shards.
>
> I don’t suppose there is a way to say “I want 75% of the shards to store class A, and 25% to store class B”.
> If we dropped the ‘/1’ from the prefix, all the documents would be indexed on a single shard, correct?
>
>
> Currently, half the servers are under heavy load, and the other half are under-utilized. [8 servers total, 4 shards with replication factor of 2]
> I’ve considered a few remedies, but I’m not sure which would be best.
>
> We could drop the document ID prefix and let SOLR distribute the documents evenly, then use a discriminator field to filter queries.
> - Requires re-indexing
> - Code changes in our APIs and indexing process
> We could create 2 separate collections.
> - Requires re-indexing
> - Code changes in our APIs and indexing process
> - Lost ability to query all the docs at once
> We could split the shards.
> - More than 1 shard would be on a node. What if we end up with 2 big replicas on a single node?
>
> If we split the shards, I’m unsure how the prefix would work in this scenario.
> Would ‘A/1!’ continue to use the original shard range?
>
> Like if we split just the 2 big shards –
> 4 shards become 6
> Does ‘A/1!’ spread the documents across 3 shards [half of the new total] or across the 4 new shards?
>
> Or if we split all 4 shards, ‘A/1!’ should spread across 8 shards, which would be half of the new total.
> Could it be difficult trying to balance 8 shards across 8 servers?
> I’m concerned 2 big shards would end up on the same server, and we would have imbalance again.
>
> I think dropping the prefix altogether would be the easiest to maintain and scale, but has a code-impact on our apps.
> Or maybe I’m over-thinking the complexity of splitting the shards, and they will balance out naturally.
>
> I’ll split the shards in our test environment to see what happens.
>
>
> Jeremy Branham
> [hidden email]
>
> On 1/7/19, 6:13 PM, "Erick Erickson" <[hidden email]> wrote:
>
>     having some replicas at 90G and some at 18G is totally unexpected with
>     compositeId routing unless you're using "multi-level routing", see:
>     https://lucidworks.com/2014/01/06/multi-level-composite-id-routing-solrcloud/
>
>     But let's be clear what we're talking about here. I'm talking about
>     specifically the size of the index on disk for any particular
>     _replica_, meaning the size in places similar to:
>     pdv201806_shard1_replica1/data/index. I've never seen as much
>     disparity as you're talking about so we should get to the bottom of
>     that.
>
>     Do you have massive numbers of deleted docs in any of those shards?
>     The admin screen for any particular replica will show this number.
>
>
>     On another note: Your cache sizes are probably not part of the page
>     fault question, but on the surface they're badly misconfigured, at
>     least the filterCache and queryResultCache. Each entry in the
>     filterCache is a map entry, the key is roughly the query and the value
>     is bounded by maxDoc/8 bytes. So if you have, say, 8M documents, your
>     filterCache entries could theoretically be 1MB each (give or take) and you
>     could have up to 20,000 of them. You're probably just being lucky and
>     either not having very many distinct fq clauses or are indexing often
>     enough that it isn't growing for very long before being flushed.
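
The arithmetic above can be written out as a small worst-case estimate. This is a sketch only: it ignores the size of the keys and assumes every entry is a full bitset of maxDoc bits, which real caches will rarely reach.

```python
def filter_cache_worst_case_bytes(max_doc, cache_size):
    """Upper bound: each entry may hold a bitset of maxDoc bits (maxDoc/8 bytes)."""
    bytes_per_entry = max_doc // 8
    return bytes_per_entry * cache_size


# Figures from the thread: about 8M documents, filterCache size=20000.
worst = filter_cache_worst_case_bytes(max_doc=8_000_000, cache_size=20_000)
print(f"configured size: {worst / 1e9:.1f} GB worst case")

# Same index with the default size of 512 for comparison.
default = filter_cache_worst_case_bytes(max_doc=8_000_000, cache_size=512)
print(f"default size:    {default / 1e9:.3f} GB worst case")
```

Against the -Xmx32g heap listed later in the thread, the configured size leaves a theoretical ceiling that is a large share of the heap, which is why the smaller default is suggested.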
>
>     Your queryResultCache takes up a lot less space, but still it's quite
>     large. It has two primary purposes:
>     1> paging. It generally stores a few integers (40 is common, maybe several hundred but who cares?) so hitting the next page won't have to search again. This isn't terribly important in modern installations.
>
>     2> being used in autowarming to pre-load parts of the index into memory.
>
>     I'd consider knocking each of these back to the defaults (512), except
>     I'd put the autowarm count at, say, 16 or so.
>
>     The document cache is less clear; the recommendation is (number of
>     simultaneous queries you expect) X (your average rows parameter).
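
As a quick illustration of that rule of thumb, with made-up load figures (the 100 concurrent queries and 10 rows below are assumptions, not numbers from this cluster):

```python
def document_cache_size(concurrent_queries, avg_rows):
    """Rule of thumb from the thread: concurrent queries x average rows."""
    return concurrent_queries * avg_rows


# Hypothetical load: 100 simultaneous type-ahead queries asking for 10 rows each.
suggested = document_cache_size(concurrent_queries=100, avg_rows=10)
print("suggested documentCache size:", suggested)
```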
>
>     Best,
>     Erick
>
>     On Mon, Jan 7, 2019 at 12:43 PM Branham, Jeremy (Experis)
>     <[hidden email]> wrote:
>     >
>     > Thanks Erick/Chris for the information.
>     > The page faults are occurring on each node of the cluster.
>     > These are VMs running SOLR v7.2.1 on RHEL 7. CPUx8, 64GB mem.
>     >
>     > We’re collecting GC information and using a DynaTrace agent, so I’m not sure if / how much that contributes to the overhead.
>     >
>     > This cluster is used strictly for type-ahead/auto-complete functionality.
>     >
>     > I’ve also just noticed that the shards are imbalanced – 2 having about 90GB and 2 having about 18GB of data.
>     > Having just joined this team, I’m not too familiar yet with the documents or queries/updates [and maybe not relevant to the page faults].
>     > Although, I did check the schema, and most of the fields are stored=true, docValues=true
>     >
>     > Solr v7.2.1
>     > OS: RHEL 7
>     >
>     > Collection Configuration -
>     > Shard count: 4
>     > configName: pdv201806
>     > replicationFactor: 2
>     > maxShardsPerNode: 1
>     > router: compositeId
>     > autoAddReplicas: false
>     >
>     > Cache configuration –
>     > filterCache class="solr.FastLRUCache"
>     >                  size="20000"
>     >                  initialSize="5000"
>     >                  autowarmCount="10"
>     > queryResultCache class="solr.LRUCache"
>     >                       size="5000"
>     >                       initialSize="1000"
>     >                       autowarmCount="0"
>     > documentCache class="solr.LRUCache"
>     >                    size="15000"
>     >                    initialSize="512"
>     >
>     > enableLazyFieldLoading=true
>     >
>     >
>     > JVM Information/Configuration –
>     > java.runtime.version: 1.8.0_162-b12
>     >
>     > -XX:+CMSParallelRemarkEnabled
>     > -XX:+CMSScavengeBeforeRemark
>     > -XX:+ParallelRefProcEnabled
>     > -XX:+PrintGCApplicationStoppedTime
>     > -XX:+PrintGCDateStamps
>     > -XX:+PrintGCDetails
>     > -XX:+PrintGCTimeStamps
>     > -XX:+PrintHeapAtGC
>     > -XX:+PrintTenuringDistribution
>     > -XX:+ScavengeBeforeFullGC
>     > -XX:+UseCMSInitiatingOccupancyOnly
>     > -XX:+UseConcMarkSweepGC
>     > -XX:+UseGCLogFileRotation
>     > -XX:+UseParNewGC
>     > -XX:-OmitStackTraceInFastThrow
>     > -XX:CMSInitiatingOccupancyFraction=70
>     > -XX:CMSMaxAbortablePrecleanTime=6000
>     > -XX:ConcGCThreads=4
>     > -XX:GCLogFileSize=20M
>     > -XX:MaxTenuringThreshold=8
>     > -XX:NewRatio=3
>     > -XX:ParallelGCThreads=8
>     > -XX:PretenureSizeThreshold=64m
>     > -XX:SurvivorRatio=4
>     > -XX:TargetSurvivorRatio=90
>     > -Xms16g
>     > -Xmx32g
>     > -Xss256k
>     > -verbose:gc
>     >
>     >
>     >
>     > Jeremy Branham
>     > [hidden email]
>     >
>     > On 1/7/19, 1:16 PM, "Christopher Schultz" <[hidden email]> wrote:
>     >
>     >     -----BEGIN PGP SIGNED MESSAGE-----
>     >     Hash: SHA256
>     >
>     >     Erick,
>     >
>     >     On 1/7/19 11:52, Erick Erickson wrote:
>     >     > Images do not come through, so we don't see what you're seeing.
>     >     >
>     >     > That said, I'd expect page faults to happen:
>     >     >
>     >     > 1> when indexing. Besides what you'd expect (new segments written
>     >     > to disk), there's segment merging going on in the background which
>     >     > has to read segments from disk in order to merge.
>     >     >
>     >     > 2> when querying, any fields returned as part of a doc that has
>     >     > stored=true docValues=false will require a disk access to get the
>     >     > stored data.
>     >
>     >     A page fault is not necessarily a disk access. It almost always *is*,
>     >     but it's not because the application is calling fopen(). It's because
>     >     the OS is performing a memory operation which often results in a dip
>     >     into virtual memory.
>     >
>     >     Jeremy, are these page-faults occurring on all the machines in your
>     >     cluster, or only some? What is the hardware configuration of each
>     >     machine (specifically, memory)? What are your JVM settings for your
>     >     Solr instances? Is anything else running on these nodes?
>     >
>     >     It would help to understand what's happening on your servers. "I'm
>     >     seeing page faults" doesn't really help us help you.
>     >
>     >     Thanks,
>     >     - -chris
>     >
>     >     > On Mon, Jan 7, 2019 at 8:35 AM Branham, Jeremy (Experis)
>     >     > <[hidden email]> wrote:
>     >     >>
>     >     >> Does anyone know if it is typical behavior for a SOLR cluster to
>     >     >> have lots of page faults (50-100 per second) under heavy load?
>     >     >>
>     >     >> We are performing load testing on a cluster with 8 nodes, and my
>     >     >> performance engineer has brought this information to attention.
>     >     >>
>     >     >> I don’t know enough about memory management to say it is normal
>     >     >> or not.
>     >     >>
>     >     >>
>     >     >>
>     >     >> The performance doesn’t appear to be suffering, but I don’t want
>     >     >> to overlook a potential hazard.
>     >     >>
>     >     >>
>     >     >>
>     >     >> Thanks!
>     >     >>
>     >     >>
>     >     >>
>     >     >>
>     >     >>
>     >     >>
>     >     >>
>     >     >>
>     >     >>
>     >     >> Jeremy Branham
>     >     >>
>     >     >> [hidden email]
>     >     >>
>     >     >> Allstate Insurance Company | UCV Technology Services |
>     >     >> Information Services Group
>     >     >>
>     >     >>
>     >     >
>     >     -----BEGIN PGP SIGNATURE-----
>     >     Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/
>     >
>     >     iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAlwzpYsACgkQHPApP6U8
>     >     pFgSHxAAgaXV5wkwV7Ru2QyhnvxUnIWY4Iom0IdZYrDuZBDxmFx9wzE7P33zmR3E
>     >     nrgZCqBtAMdxRSwG9BfyKircChZBssqtQpskw6mgJyzRyGvKVJjJ68r0vEio3Kjo
>     >     HjaJczBFWvdOKm42W1Li4SeymGyYXu/jmdkWLcIbEM4BgDQLf1HhSEphDeZzP4ST
>     >     GNDBrIA6XkUJwE1r58FUuj9l0XSKUAPLOPNAx1qGiAn4fKdbysVHvLcvJvJzC0pC
>     >     1kx000r+Mqdd61EzhM20ZDIvg2F3vgFgGCUtB31hIi18bfD8whoAafL2FSMkIccD
>     >     H7X09PpUK8qPM/oQgqCKTtfmVR3M2pi3CSxLFSQ1/QucnF2wxWknOOWUH1TMU/L2
>     >     KUQHS6GwuTk+R/8PxdBRsZI8ON3MVb690ECV4QplYlkrtygXrLRg2YOgifgAXsKL
>     >     5Kg2mrpKoxfNnDWaRksy4GUDTsSxbkd1rpnHJEZ8le26HXvz9wrug/FtNPzqP8S9
>     >     dan2gkgiSqOM9GKlKkA72ROyQDhZa5YiXfGNdRrmfkiQzlDBEcGpD8pg1GwskRJl
>     >     yidTBfvRSyCHsI5NBGf65nTG+2WfUnr8wClHVK5QQGVilHBn6KzeHeDTL9ZpHvcn
>     >     GhkDMvc+9f8DR7Hr/mTiGjYIAvJZYiIJeYUoe0Bl2BHmGDv0tEk=
>     >     =OpZo
>     >     -----END PGP SIGNATURE-----
>     >
>     >
>
>