[jira] [Comment Edited] (LUCENE-5015) Unexpected performance difference between SamplingAccumulator and StandardFacetAccumulator


JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664980#comment-13664980 ]

Rob Audenaerde edited comment on LUCENE-5015 at 5/23/13 8:23 AM:
-----------------------------------------------------------------

I use a MatchAllDocsQuery(), so all 5 million documents are retrieved as hits.
               
      was (Author: robau):
    Yes, I use a MatchAllDocsQuery()
                 

> Unexpected performance difference between SamplingAccumulator and StandardFacetAccumulator
> ------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-5015
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5015
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/facet
>    Affects Versions: 4.3
>            Reporter: Rob Audenaerde
>            Priority: Minor
>
> I have an unexpected performance difference between the SamplingAccumulator and the StandardFacetAccumulator.
> The case is an index with about 5M documents and each document containing about 10 fields. I created a facet on each of those fields. When searching to retrieve facet-counts (using 1 CountFacetRequest), the SamplingAccumulator is about twice as fast as the StandardFacetAccumulator. This is expected and a nice speed-up.
> However, when I use more CountFacetRequests to retrieve facet-counts for more than one field, the speed of the SamplingAccumulator decreases, to the point where the StandardFacetAccumulator is faster.
> {noformat}
> FacetRequests  Sampling    Standard
>  1               391 ms     1100 ms
>  2               531 ms     1095 ms
>  3               948 ms     1108 ms
>  4              1400 ms     1110 ms
>  5              1901 ms     1102 ms
> {noformat}
> Is this behaviour normal? I did not expect it, since the SamplingAccumulator should need to do less work.
> Some code to show what I do:
> {code}
> searcher.search( facetsQuery, facetsCollector );
> final List<FacetResult> collectedFacets = facetsCollector.getFacetResults();
> {code}
> {code}
> final FacetSearchParams facetSearchParams = new FacetSearchParams( facetRequests );
> FacetsCollector facetsCollector;
> if ( isSampled )
> {
>     // Sampled path: facet counts are accumulated over a random sample of the hits.
>     facetsCollector = FacetsCollector.create(
>         new SamplingAccumulator( new RandomSampler(), facetSearchParams, searcher.getIndexReader(), taxo ) );
> }
> else
> {
>     // Non-sampled path: facet counts are accumulated over all hits.
>     facetsCollector = FacetsCollector.create(
>         FacetsAccumulator.create( facetSearchParams, searcher.getIndexReader(), taxo ) );
> }
> {code}
>
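
To make the trend in the timing table easier to read, here is a minimal sketch that computes the average increase per additional facet request and the first request count at which sampling becomes slower. It uses only the numbers reported above; the class and method names (FacetTimingTrend, avgIncreasePerRequest, firstCrossover) are hypothetical helpers, not part of Lucene, and the sketch only summarizes the measurements, it does not explain their cause.

```java
import java.util.Locale;

public class FacetTimingTrend {
    // Timings copied from the table in the issue description (milliseconds).
    static final int[] REQUESTS = {1, 2, 3, 4, 5};
    static final double[] SAMPLING_MS = {391, 531, 948, 1400, 1901};
    static final double[] STANDARD_MS = {1100, 1095, 1108, 1110, 1102};

    /** Average change in time per additional facet request, from first row to last row. */
    static double avgIncreasePerRequest(int[] requests, double[] ms) {
        int last = ms.length - 1;
        return (ms[last] - ms[0]) / (requests[last] - requests[0]);
    }

    /** First request count at which sampling is slower than standard, or -1 if never. */
    static int firstCrossover(int[] requests, double[] sampling, double[] standard) {
        for (int i = 0; i < requests.length; i++) {
            if (sampling[i] > standard[i]) {
                return requests[i];
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "sampling: %.1f ms per extra request%n",
                avgIncreasePerRequest(REQUESTS, SAMPLING_MS));
        System.out.printf(Locale.ROOT, "standard: %.1f ms per extra request%n",
                avgIncreasePerRequest(REQUESTS, STANDARD_MS));
        System.out.printf(Locale.ROOT, "sampling slower from %d requests%n",
                firstCrossover(REQUESTS, SAMPLING_MS, STANDARD_MS));
    }
}
```

With the reported numbers, the sampled path grows by about 377.5 ms per additional request while the standard path stays essentially flat (0.5 ms), and sampling becomes the slower option from 4 requests onward.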
