PriorityQueue:initialize consistently showing up as hot spot while profiling

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

PriorityQueue:initialize consistently showing up as hot spot while profiling

Aaron Daubman
Greetings,

I've been seeing this call chain come up fairly frequently when
debugging longer-QTime queries under Solr 3.6.1 but have not been able
to understand from the code what is really going on - the call graph
and code follow below.

Would somebody please explain to me:
1) Why this would show up frequently as a hotspot
2) If it is expected to do so
3) If there is anything I should look in to that may help performance
where this frequently shows up as the long pole in the QTime tent
4) What the code is doing and why heap is being allocated as an
apparently giant object (which also is apparently not unheard of due
to MAX_VALUE wrapping check)

---call-graph---
Filter - SolrDispatchFilter:doFilter (method time = 12 ms, total time = 487 ms)
 Filter - SolrDispatchFilter:execute:365 (method time = 0 ms, total
time = 109 ms)
  org.apache.solr.core.SolrCore:execute:1376 (method time = 0 ms,
total time = 109 ms)
   org.apache.solr.handler.RequestHandlerBase:handleRequest:129
(method time = 0 ms, total time = 109 ms)
    org.apache.solr.handler.component.SearchHandler:handleRequestBody:186
(method time = 0 ms, total time = 109 ms)
     com.echonest.solr.component.EchoArtistGroupingComponent:process:188
(method time = 0 ms, total time = 109 ms)
      org.apache.solr.search.SolrIndexSearcher:search:375 (method time
= 0 ms, total time = 96 ms)
       org.apache.solr.search.SolrIndexSearcher:getDocListC:1176
(method time = 0 ms, total time = 96 ms)
        org.apache.solr.search.SolrIndexSearcher:getDocListNC:1209
(method time = 0 ms, total time = 96 ms)
         org.apache.solr.search.SolrIndexSearcher:getProcessedFilter:796
(method time = 0 ms, total time = 26 ms)
          org.apache.solr.search.BitDocSet:andNot:185 (method time = 0
ms, total time = 13 ms)
           org.apache.lucene.util.OpenBitSet:clone:732 (method time =
13 ms, total time = 13 ms)
          org.apache.solr.search.BitDocSet:intersection:31 (method
time = 0 ms, total time = 13 ms)
           org.apache.solr.search.DocSetBase:intersection:90 (method
time = 0 ms, total time = 13 ms)
            org.apache.lucene.util.OpenBitSet:and:808 (method time =
13 ms, total time = 13 ms)
         org.apache.lucene.search.TopFieldCollector:create:916 (method
time = 0 ms, total time = 46 ms)
          org.apache.lucene.search.FieldValueHitQueue:create:175
(method time = 0 ms, total time = 46 ms)
           org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue:<init>:111
(method time = 0 ms, total time = 46 ms)
            org.apache.lucene.search.SortField:getComparator:409
(method time = 0 ms, total time = 13 ms)
             org.apache.lucene.search.FieldComparator$FloatComparator:<init>:400
(method time = 13 ms, total time = 13 ms)
            org.apache.lucene.util.PriorityQueue:initialize:108
(method time = 33 ms, total time = 33 ms)
---snip---


org.apache.lucene.util.PriorityQueue:initialize - hotspot is line 108:
    heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
unchecked cast works always

---PriorityQueue.java---
  /** Subclass constructors must call this. */
  @SuppressWarnings("unchecked")
  protected final void initialize(int maxSize) {
    size = 0;
    int heapSize;
    if (0 == maxSize)
      // We allocate 1 extra to avoid if statement in top()
      heapSize = 2;
    else {
      if (maxSize == Integer.MAX_VALUE) {
        // Don't wrap heapSize to -1, in this case, which
        // causes a confusing NegativeArraySizeException.
        // Note that very likely this will simply then hit
        // an OOME, but at least that's more indicative to
        // caller that this values is too big.  We don't +1
        // in this case, but it's very unlikely in practice
        // one will actually insert this many objects into
        // the PQ:
        heapSize = Integer.MAX_VALUE;
      } else {
        // NOTE: we add +1 because all access to heap is
        // 1-based not 0-based.  heap[0] is unused.
        heapSize = maxSize + 1;
      }
    }
    heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
unchecked cast works always
    this.maxSize = maxSize;

    // If sentinel objects are supported, populate the queue with them
    T sentinel = getSentinelObject();
    if (sentinel != null) {
      heap[1] = sentinel;
      for (int i = 2; i < heap.length; i++) {
        heap[i] = getSentinelObject();
      }
      size = maxSize;
    }
  }
---snip---


Thanks, as always!
     Aaron
Reply | Threaded
Open this post in threaded view
|

Re: PriorityQueue:initialize consistently showing up as hot spot while profiling

Mikhail Khludnev
what's the value of rows param
http://wiki.apache.org/solr/CommonQueryParameters#rows ?

On Fri, Oct 5, 2012 at 6:56 AM, Aaron Daubman <[hidden email]> wrote:

> Greetings,
>
> I've been seeing this call chain come up fairly frequently when
> debugging longer-QTime queries under Solr 3.6.1 but have not been able
> to understand from the code what is really going on - the call graph
> and code follow below.
>
> Would somebody please explain to me:
> 1) Why this would show up frequently as a hotspot
> 2) If it is expected to do so
> 3) If there is anything I should look in to that may help performance
> where this frequently shows up as the long pole in the QTime tent
> 4) What the code is doing and why heap is being allocated as an
> apparently giant object (which also is apparently not unheard of due
> to MAX_VALUE wrapping check)
>
> ---call-graph---
> Filter - SolrDispatchFilter:doFilter (method time = 12 ms, total time =
> 487 ms)
>  Filter - SolrDispatchFilter:execute:365 (method time = 0 ms, total
> time = 109 ms)
>   org.apache.solr.core.SolrCore:execute:1376 (method time = 0 ms,
> total time = 109 ms)
>    org.apache.solr.handler.RequestHandlerBase:handleRequest:129
> (method time = 0 ms, total time = 109 ms)
>     org.apache.solr.handler.component.SearchHandler:handleRequestBody:186
> (method time = 0 ms, total time = 109 ms)
>      com.echonest.solr.component.EchoArtistGroupingComponent:process:188
> (method time = 0 ms, total time = 109 ms)
>       org.apache.solr.search.SolrIndexSearcher:search:375 (method time
> = 0 ms, total time = 96 ms)
>        org.apache.solr.search.SolrIndexSearcher:getDocListC:1176
> (method time = 0 ms, total time = 96 ms)
>         org.apache.solr.search.SolrIndexSearcher:getDocListNC:1209
> (method time = 0 ms, total time = 96 ms)
>          org.apache.solr.search.SolrIndexSearcher:getProcessedFilter:796
> (method time = 0 ms, total time = 26 ms)
>           org.apache.solr.search.BitDocSet:andNot:185 (method time = 0
> ms, total time = 13 ms)
>            org.apache.lucene.util.OpenBitSet:clone:732 (method time =
> 13 ms, total time = 13 ms)
>           org.apache.solr.search.BitDocSet:intersection:31 (method
> time = 0 ms, total time = 13 ms)
>            org.apache.solr.search.DocSetBase:intersection:90 (method
> time = 0 ms, total time = 13 ms)
>             org.apache.lucene.util.OpenBitSet:and:808 (method time =
> 13 ms, total time = 13 ms)
>          org.apache.lucene.search.TopFieldCollector:create:916 (method
> time = 0 ms, total time = 46 ms)
>           org.apache.lucene.search.FieldValueHitQueue:create:175
> (method time = 0 ms, total time = 46 ms)
>
>  org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue:<init>:111
> (method time = 0 ms, total time = 46 ms)
>             org.apache.lucene.search.SortField:getComparator:409
> (method time = 0 ms, total time = 13 ms)
>
>  org.apache.lucene.search.FieldComparator$FloatComparator:<init>:400
> (method time = 13 ms, total time = 13 ms)
>             org.apache.lucene.util.PriorityQueue:initialize:108
> (method time = 33 ms, total time = 33 ms)
> ---snip---
>
>
> org.apache.lucene.util.PriorityQueue:initialize - hotspot is line 108:
>     heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
> unchecked cast works always
>
> ---PriorityQueue.java---
>   /** Subclass constructors must call this. */
>   @SuppressWarnings("unchecked")
>   protected final void initialize(int maxSize) {
>     size = 0;
>     int heapSize;
>     if (0 == maxSize)
>       // We allocate 1 extra to avoid if statement in top()
>       heapSize = 2;
>     else {
>       if (maxSize == Integer.MAX_VALUE) {
>         // Don't wrap heapSize to -1, in this case, which
>         // causes a confusing NegativeArraySizeException.
>         // Note that very likely this will simply then hit
>         // an OOME, but at least that's more indicative to
>         // caller that this values is too big.  We don't +1
>         // in this case, but it's very unlikely in practice
>         // one will actually insert this many objects into
>         // the PQ:
>         heapSize = Integer.MAX_VALUE;
>       } else {
>         // NOTE: we add +1 because all access to heap is
>         // 1-based not 0-based.  heap[0] is unused.
>         heapSize = maxSize + 1;
>       }
>     }
>     heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
> unchecked cast works always
>     this.maxSize = maxSize;
>
>     // If sentinel objects are supported, populate the queue with them
>     T sentinel = getSentinelObject();
>     if (sentinel != null) {
>       heap[1] = sentinel;
>       for (int i = 2; i < heap.length; i++) {
>         heap[i] = getSentinelObject();
>       }
>       size = maxSize;
>     }
>   }
> ---snip---
>
>
> Thanks, as always!
>      Aaron
>



--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: PriorityQueue:initialize consistently showing up as hot spot while profiling

Aaron Daubman
On Fri, Oct 5, 2012 at 4:33 AM, Mikhail Khludnev
<[hidden email]> wrote:
> what's the value of rows param
> http://wiki.apache.org/solr/CommonQueryParameters#rows ?

Very interesting question - so, for historic reasons lost to me, we
pass in a huge (10000000?) number for rows and this hits our custom
component, which has its own internal maximum for real rows returned.
(This is a custom grouping component, so I am guessing the large
number of rows had to do with trying not to limit what got grouped?).

Is the value of rows what is used for that heap allocation?

Thanks,
     Aaron

>
> On Fri, Oct 5, 2012 at 6:56 AM, Aaron Daubman <[hidden email]> wrote:
>
>> Greetings,
>>
>> I've been seeing this call chain come up fairly frequently when
>> debugging longer-QTime queries under Solr 3.6.1 but have not been able
>> to understand from the code what is really going on - the call graph
>> and code follow below.
>>
>> Would somebody please explain to me:
>> 1) Why this would show up frequently as a hotspot
>> 2) If it is expected to do so
>> 3) If there is anything I should look in to that may help performance
>> where this frequently shows up as the long pole in the QTime tent
>> 4) What the code is doing and why heap is being allocated as an
>> apparently giant object (which also is apparently not unheard of due
>> to MAX_VALUE wrapping check)
>>
>> ---call-graph---
>> Filter - SolrDispatchFilter:doFilter (method time = 12 ms, total time =
>> 487 ms)
>>  Filter - SolrDispatchFilter:execute:365 (method time = 0 ms, total
>> time = 109 ms)
>>   org.apache.solr.core.SolrCore:execute:1376 (method time = 0 ms,
>> total time = 109 ms)
>>    org.apache.solr.handler.RequestHandlerBase:handleRequest:129
>> (method time = 0 ms, total time = 109 ms)
>>     org.apache.solr.handler.component.SearchHandler:handleRequestBody:186
>> (method time = 0 ms, total time = 109 ms)
>>      com.echonest.solr.component.EchoArtistGroupingComponent:process:188
>> (method time = 0 ms, total time = 109 ms)
>>       org.apache.solr.search.SolrIndexSearcher:search:375 (method time
>> = 0 ms, total time = 96 ms)
>>        org.apache.solr.search.SolrIndexSearcher:getDocListC:1176
>> (method time = 0 ms, total time = 96 ms)
>>         org.apache.solr.search.SolrIndexSearcher:getDocListNC:1209
>> (method time = 0 ms, total time = 96 ms)
>>          org.apache.solr.search.SolrIndexSearcher:getProcessedFilter:796
>> (method time = 0 ms, total time = 26 ms)
>>           org.apache.solr.search.BitDocSet:andNot:185 (method time = 0
>> ms, total time = 13 ms)
>>            org.apache.lucene.util.OpenBitSet:clone:732 (method time =
>> 13 ms, total time = 13 ms)
>>           org.apache.solr.search.BitDocSet:intersection:31 (method
>> time = 0 ms, total time = 13 ms)
>>            org.apache.solr.search.DocSetBase:intersection:90 (method
>> time = 0 ms, total time = 13 ms)
>>             org.apache.lucene.util.OpenBitSet:and:808 (method time =
>> 13 ms, total time = 13 ms)
>>          org.apache.lucene.search.TopFieldCollector:create:916 (method
>> time = 0 ms, total time = 46 ms)
>>           org.apache.lucene.search.FieldValueHitQueue:create:175
>> (method time = 0 ms, total time = 46 ms)
>>
>>  org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue:<init>:111
>> (method time = 0 ms, total time = 46 ms)
>>             org.apache.lucene.search.SortField:getComparator:409
>> (method time = 0 ms, total time = 13 ms)
>>
>>  org.apache.lucene.search.FieldComparator$FloatComparator:<init>:400
>> (method time = 13 ms, total time = 13 ms)
>>             org.apache.lucene.util.PriorityQueue:initialize:108
>> (method time = 33 ms, total time = 33 ms)
>> ---snip---
>>
>>
>> org.apache.lucene.util.PriorityQueue:initialize - hotspot is line 108:
>>     heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
>> unchecked cast works always
>>
>> ---PriorityQueue.java---
>>   /** Subclass constructors must call this. */
>>   @SuppressWarnings("unchecked")
>>   protected final void initialize(int maxSize) {
>>     size = 0;
>>     int heapSize;
>>     if (0 == maxSize)
>>       // We allocate 1 extra to avoid if statement in top()
>>       heapSize = 2;
>>     else {
>>       if (maxSize == Integer.MAX_VALUE) {
>>         // Don't wrap heapSize to -1, in this case, which
>>         // causes a confusing NegativeArraySizeException.
>>         // Note that very likely this will simply then hit
>>         // an OOME, but at least that's more indicative to
>>         // caller that this values is too big.  We don't +1
>>         // in this case, but it's very unlikely in practice
>>         // one will actually insert this many objects into
>>         // the PQ:
>>         heapSize = Integer.MAX_VALUE;
>>       } else {
>>         // NOTE: we add +1 because all access to heap is
>>         // 1-based not 0-based.  heap[0] is unused.
>>         heapSize = maxSize + 1;
>>       }
>>     }
>>     heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
>> unchecked cast works always
>>     this.maxSize = maxSize;
>>
>>     // If sentinel objects are supported, populate the queue with them
>>     T sentinel = getSentinelObject();
>>     if (sentinel != null) {
>>       heap[1] = sentinel;
>>       for (int i = 2; i < heap.length; i++) {
>>         heap[i] = getSentinelObject();
>>       }
>>       size = maxSize;
>>     }
>>   }
>> ---snip---
>>
>>
>> Thanks, as always!
>>      Aaron
>>
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Tech Lead
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: PriorityQueue:initialize consistently showing up as hot spot while profiling

Mikhail Khludnev
okay. huge rows value is no.1 way to kill Lucene. It's not possible,
absolutely. You need to rethink logic of your component. Check Solr's
FieldCollapsing code, IIRC it makes second search to achieve similar goal.
Also check PostFilter and DelegatingCollector classes, their approach can
also be handy for your task.

On Fri, Oct 5, 2012 at 2:38 PM, Aaron Daubman <[hidden email]> wrote:

> On Fri, Oct 5, 2012 at 4:33 AM, Mikhail Khludnev
> <[hidden email]> wrote:
> > what's the value of rows param
> > http://wiki.apache.org/solr/CommonQueryParameters#rows ?
>
> Very interesting question - so, for historic reasons lost to me, we
> pass in a huge (10000000?) number for rows and this hits our custom
> component, which has its own internal maximum for real rows returned.
> (This is a custom grouping component, so I am guessing the large
> number of rows had to do with trying not to limit what got grouped?).
>
> Is the value of rows what is used for that heap allocation?
>
absolutely. it's classic priority queue algorithm upon binary heap.


>
> Thanks,
>      Aaron
>
> >
> > On Fri, Oct 5, 2012 at 6:56 AM, Aaron Daubman <[hidden email]> wrote:
> >
> >> Greetings,
> >>
> >> I've been seeing this call chain come up fairly frequently when
> >> debugging longer-QTime queries under Solr 3.6.1 but have not been able
> >> to understand from the code what is really going on - the call graph
> >> and code follow below.
> >>
> >> Would somebody please explain to me:
> >> 1) Why this would show up frequently as a hotspot
> >> 2) If it is expected to do so
> >> 3) If there is anything I should look in to that may help performance
> >> where this frequently shows up as the long pole in the QTime tent
> >> 4) What the code is doing and why heap is being allocated as an
> >> apparently giant object (which also is apparently not unheard of due
> >> to MAX_VALUE wrapping check)
> >>
> >> ---call-graph---
> >> Filter - SolrDispatchFilter:doFilter (method time = 12 ms, total time =
> >> 487 ms)
> >>  Filter - SolrDispatchFilter:execute:365 (method time = 0 ms, total
> >> time = 109 ms)
> >>   org.apache.solr.core.SolrCore:execute:1376 (method time = 0 ms,
> >> total time = 109 ms)
> >>    org.apache.solr.handler.RequestHandlerBase:handleRequest:129
> >> (method time = 0 ms, total time = 109 ms)
> >>
> org.apache.solr.handler.component.SearchHandler:handleRequestBody:186
> >> (method time = 0 ms, total time = 109 ms)
> >>      com.echonest.solr.component.EchoArtistGroupingComponent:process:188
> >> (method time = 0 ms, total time = 109 ms)
> >>       org.apache.solr.search.SolrIndexSearcher:search:375 (method time
> >> = 0 ms, total time = 96 ms)
> >>        org.apache.solr.search.SolrIndexSearcher:getDocListC:1176
> >> (method time = 0 ms, total time = 96 ms)
> >>         org.apache.solr.search.SolrIndexSearcher:getDocListNC:1209
> >> (method time = 0 ms, total time = 96 ms)
> >>          org.apache.solr.search.SolrIndexSearcher:getProcessedFilter:796
> >> (method time = 0 ms, total time = 26 ms)
> >>           org.apache.solr.search.BitDocSet:andNot:185 (method time = 0
> >> ms, total time = 13 ms)
> >>            org.apache.lucene.util.OpenBitSet:clone:732 (method time =
> >> 13 ms, total time = 13 ms)
> >>           org.apache.solr.search.BitDocSet:intersection:31 (method
> >> time = 0 ms, total time = 13 ms)
> >>            org.apache.solr.search.DocSetBase:intersection:90 (method
> >> time = 0 ms, total time = 13 ms)
> >>             org.apache.lucene.util.OpenBitSet:and:808 (method time =
> >> 13 ms, total time = 13 ms)
> >>          org.apache.lucene.search.TopFieldCollector:create:916 (method
> >> time = 0 ms, total time = 46 ms)
> >>           org.apache.lucene.search.FieldValueHitQueue:create:175
> >> (method time = 0 ms, total time = 46 ms)
> >>
> >>
>  org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue:<init>:111
> >> (method time = 0 ms, total time = 46 ms)
> >>             org.apache.lucene.search.SortField:getComparator:409
> >> (method time = 0 ms, total time = 13 ms)
> >>
> >>  org.apache.lucene.search.FieldComparator$FloatComparator:<init>:400
> >> (method time = 13 ms, total time = 13 ms)
> >>             org.apache.lucene.util.PriorityQueue:initialize:108
> >> (method time = 33 ms, total time = 33 ms)
> >> ---snip---
> >>
> >>
> >> org.apache.lucene.util.PriorityQueue:initialize - hotspot is line 108:
> >>     heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
> >> unchecked cast works always
> >>
> >> ---PriorityQueue.java---
> >>   /** Subclass constructors must call this. */
> >>   @SuppressWarnings("unchecked")
> >>   protected final void initialize(int maxSize) {
> >>     size = 0;
> >>     int heapSize;
> >>     if (0 == maxSize)
> >>       // We allocate 1 extra to avoid if statement in top()
> >>       heapSize = 2;
> >>     else {
> >>       if (maxSize == Integer.MAX_VALUE) {
> >>         // Don't wrap heapSize to -1, in this case, which
> >>         // causes a confusing NegativeArraySizeException.
> >>         // Note that very likely this will simply then hit
> >>         // an OOME, but at least that's more indicative to
> >>         // caller that this values is too big.  We don't +1
> >>         // in this case, but it's very unlikely in practice
> >>         // one will actually insert this many objects into
> >>         // the PQ:
> >>         heapSize = Integer.MAX_VALUE;
> >>       } else {
> >>         // NOTE: we add +1 because all access to heap is
> >>         // 1-based not 0-based.  heap[0] is unused.
> >>         heapSize = maxSize + 1;
> >>       }
> >>     }
> >>     heap = (T[]) new Object[heapSize]; // T is unbounded type, so this
> >> unchecked cast works always
> >>     this.maxSize = maxSize;
> >>
> >>     // If sentinel objects are supported, populate the queue with them
> >>     T sentinel = getSentinelObject();
> >>     if (sentinel != null) {
> >>       heap[1] = sentinel;
> >>       for (int i = 2; i < heap.length; i++) {
> >>         heap[i] = getSentinelObject();
> >>       }
> >>       size = maxSize;
> >>     }
> >>   }
> >> ---snip---
> >>
> >>
> >> Thanks, as always!
> >>      Aaron
> >>
> >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> > Tech Lead
> > Grid Dynamics
> >
> > <http://www.griddynamics.com>
> >  <[hidden email]>
>



--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <[hidden email]>
Reply | Threaded
Open this post in threaded view
|

Re: PriorityQueue:initialize consistently showing up as hot spot while profiling

Aaron Daubman
Hi Mikhail,

On Fri, Oct 5, 2012 at 7:15 AM, Mikhail Khludnev
<[hidden email]> wrote:
> okay. huge rows value is no.1 way to kill Lucene. It's not possible,
> absolutely. You need to rethink logic of your component. Check Solr's
> FieldCollapsing code, IIRC it makes second search to achieve similar goal.
> Also check PostFilter and DelegatingCollector classes, their approach can
> also be handy for your task.

This sounds like it could be a much saner way to handle the task,
however, I'm not sure what I should be looking at for the
'FieldCollapsing code' you mention - can you point me to a class?

Also, is there anything more you can say about PostFilter and
DelegatingCollector classes - I reviewed them but it was not obvious
to me what they were doing that would allow me to reduce the large
rows param we use to ensure all relevant docs are included in the
grouping and limiting occurs at the group level, rather than
pre-grouping...

Thanks again,
      Aaron