Am I missing something obvious about subfacets?


Scott Blum
Hi folks,

I think I'm missing something fundamental about how subfacets work and relate to each other.  I don't really understand when it's legal to reference a subfacet, and when it's not.

For example, here's a simple subfacet example from http://yonik.com/solr-subfacets/

      top_authors:{ 
        type: terms,
        field: author,
        limit: 7,
        sort: "revenue desc",
        facet:{
          revenue: "sum(sales)"
        }
      }

Notice how "revenue" is referenced from top_authors.sort?  The thing is, I haven't seen any examples where the subfacet is referenced *except* for sort.

What I'm actually trying to accomplish is doing some post-processing with the aggregation functions.  But I don't want to aggregate at the "leaf buckets" -- I'm trying to aggregate *across* buckets at a higher level.  You could think of this as similar to the "reduce" phase of a map-reduce.

In other words, I know how to fan *out* with subfacets to create more buckets; what I don't know is how to then reduce those buckets into aggregate statistics about the buckets.
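To make the "reduce" step concrete, here is a rough client-side sketch in Python. It assumes the response has the usual JSON Facet shape (a "facets" object holding a "buckets" list per facet); the sample authors and numbers are made up, and reduce_buckets is a hypothetical helper, not part of Solr.

```python
import json
from statistics import mean

# Hypothetical response fragment in the usual JSON Facet shape.
# The authors and revenue figures are invented for illustration.
sample_response = json.loads("""
{
  "facets": {
    "count": 6,
    "top_authors": {
      "buckets": [
        {"val": "author_a", "count": 3, "revenue": 300.0},
        {"val": "author_b", "count": 2, "revenue": 150.0},
        {"val": "author_c", "count": 1, "revenue": 60.0}
      ]
    }
  }
}
""")


def reduce_buckets(response, facet_name, metric):
    """Collapse one facet's buckets into summary statistics for `metric`."""
    buckets = response["facets"][facet_name]["buckets"]
    # Skip buckets that do not carry the metric at all.
    values = [b[metric] for b in buckets if metric in b]
    if not values:
        return {"buckets": 0}
    return {
        "buckets": len(values),
        "total": sum(values),
        "mean": mean(values),
        "max": max(values),
        "min": min(values),
    }


summary = reduce_buckets(sample_response, "top_authors", "revenue")
print(summary)
```

Note that this only sees the buckets that were actually returned, so with limit: 7 the statistics describe the top 7 authors rather than all of them.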

What am I missing?

Thanks!
Scott


Re: Am I missing something obvious about subfacets?

Yonik Seeley
On Fri, May 19, 2017 at 6:36 PM, Scott Blum <[hidden email]> wrote:

> Notice how "revenue" is referenced from top_authors.sort?  The thing is, I
> haven't seen any examples where the subfacet is referenced *except* for
> sort.

That's currently all we can do.
There's an issue open for filtering buckets by calculated metrics, but
I don't think that addresses what you're talking about.

> What I'm actually trying to accomplish is doing some post-processing with
> the aggregation functions.  But I don't want to aggregate at the "leaf
> buckets" -- I'm trying to aggregate *across* buckets at a higher level.

Some small subset may be covered by
https://issues.apache.org/jira/browse/SOLR-8998 child rollups
or https://issues.apache.org/jira/browse/SOLR-10545 compound field faceting

But streaming expressions are more likely the right solution for
more general-purpose computation.  If there is a common enough
faceting use case that needs to be faster, we can look at how to fit it
into the JSON Facet API.  Some of the streaming expressions stuff
already starts off with faceting as the source for its input stream.

-Yonik
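
As a rough illustration of faceting as a stream source, the Python sketch below builds a facet() streaming expression that mirrors the top_authors request and URL-encodes it for the /stream handler. The collection name "books", the localhost URL, and the facet_expression helper are assumptions for illustration; check the parameter names against the reference guide for your Solr version before relying on them.

```python
from urllib.parse import urlencode


def facet_expression(collection, bucket_field, metric_field, limit):
    """Build a facet() streaming expression string (hypothetical helper)."""
    metric = f"sum({metric_field})"
    return (
        f'facet({collection}, q="*:*", buckets="{bucket_field}", '
        f'bucketSorts="{metric} desc", bucketSizeLimit={limit}, {metric})'
    )


# "books" is an assumed collection name; author/sales come from the example above.
expr = facet_expression("books", "author", "sales", 7)

# Assumed local Solr instance; the expression is sent as the `expr` parameter.
stream_url = "http://localhost:8983/solr/books/stream"
body = urlencode({"expr": expr})

print(expr)
print(stream_url + "?" + body)
```

Any cross-bucket reduction would then be expressed by wrapping this source in further streaming decorators, which is outside what this sketch shows.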






Re: Am I missing something obvious about subfacets?

Scott Blum
Thanks, Yonik!  Glad to hear I'm not completely crazy.  We should probably update to 6.x and try out streaming.

On Sat, May 20, 2017 at 7:13 AM, Yonik Seeley <[hidden email]> wrote:
> Some of the streaming expressions stuff
> already starts off with faceting as the source for its input stream.



Re: Am I missing something obvious about subfacets?

Joel Bernstein
Hi Scott,

You may find this ticket interesting:


On Sun, May 21, 2017 at 5:08 PM, Scott Blum <[hidden email]> wrote:
Thanks, Yonik!  Glad to hear I'm not completely crazy.  We should probably update to 6.x and try out streaming.
