Median in Solr json facet api

Median in Solr json facet api

Anil-2
Hi,

Good morning.
I don't see a median aggregation in the JSON Facet API documentation. Could you
please point me to the documentation on creating custom JSON Facet API aggregations?
Thanks.

Regards,
Anil
Re: Median in Solr json facet api

Toke Eskildsen-2
On Wed, 2018-11-14 at 17:53 +0530, Anil wrote:
> I don't see median aggregation in JSON facet api documentation.

It's the 50th percentile:


https://lucene.apache.org/solr/guide/7_5/json-facet-api.html#metrics-example

- Toke Eskildsen, Royal Danish Library
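
As an illustration of the suggestion above, here is a minimal sketch of a JSON Facet API request body that asks for the 50th percentile. The field name filesize_d is borrowed from the Streaming Expression example elsewhere in this thread, and the helper name median_facet_request is hypothetical; the code only builds and prints the request body and does not contact Solr.

```python
import json

# Assumption: "filesize_d" is the numeric field of interest (taken from the
# Streaming Expression example in this thread).
FIELD = "filesize_d"


def median_facet_request(field: str, query: str = "*:*") -> dict:
    """Build a JSON request body whose facet asks for the 50th percentile."""
    return {
        "query": query,
        "limit": 0,  # return no documents, only the aggregation
        "facet": {
            # percentile(field, 50) is the median estimate
            "median": f"percentile({field},50)",
        },
    }


body = median_facet_request(FIELD)
print(json.dumps(body, indent=2))
```

The printed body could then be POSTed to a collection's query handler, for example something like http://localhost:8983/solr/collection1/query (host, port and collection name are assumptions for a local setup).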


Re: Median in Solr json facet api

Joel Bernstein
The JSON facet API uses the t-digest approach to estimate the percentiles.

You can also use Solr Math Expressions to take a random sample from a field
and estimate the median from the sample. Here is the Streaming Expression:

let(a=random(collection1, q="*:*", fl="filesize_d", rows="25000"),
     b=col(a, filesize_d),
     median=percentile(b, 50))

The example above takes a random sample and assigns it to variable "a".
Then the values of the filesize_d field from the sample (in variable "a") are
copied to a vector and assigned to variable "b".
Then the percentile function is called on the vector and the result is
assigned to variable "median".

The results look like this:

{
  "result-set": {
    "docs": [
      { "median": 39980.53459335005 },
      { "EOF": true, "RESPONSE_TIME": 365 }
    ]
  }
}

You can adjust the sample size to see how it affects the estimate.
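
To get a feel for that effect without a Solr instance, here is a small local simulation in Python. It is only a sketch: the population is synthetic (log-normally distributed values standing in for file sizes, which is an assumption), and the helper name sample_median is hypothetical. It draws random samples of different sizes and compares each sample median with the median of the full population.

```python
import random
import statistics


def sample_median(values, sample_size, rng):
    """Median of a random sample drawn without replacement."""
    sample = rng.sample(values, sample_size)
    return statistics.median(sample)


rng = random.Random(42)  # fixed seed so runs are repeatable

# Assumption: a skewed synthetic population standing in for filesize_d.
population = [rng.lognormvariate(10, 1) for _ in range(200_000)]
true_median = statistics.median(population)

errors = {}
for n in (250, 2_500, 25_000):
    estimate = sample_median(population, n, rng)
    errors[n] = abs(estimate - true_median) / true_median
    print(f"rows={n:>6}  estimate={estimate:12.2f}  relative error={errors[n]:.4f}")
```

Larger samples tend to give estimates closer to the full-population median, at the cost of moving more documents through the random() source.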

Here is the link to Solr Math Expressions in the User Guide:

https://lucene.apache.org/solr/guide/7_5/math-expressions.html

Joel Bernstein
http://joelsolr.blogspot.com/


Re: Median in Solr json facet api

Anil-2
Thanks Toke and Joel.
