The JSON facet API uses the t-digest approach to estimate the percentiles.
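
For comparison, a median request through the JSON facet API could be sketched as below. This is only a sketch: the field name filesize_d is borrowed from the expression further down, while the helper name and the "median" facet label are made up for illustration. It builds the request body and does not send anything to Solr.

```python
import json

def median_facet_request(field, query="*:*"):
    """Build a JSON request body asking for the 50th percentile of `field`.

    The helper name is hypothetical; percentile(field,50) is the JSON facet
    aggregation that returns the t-digest based estimate.
    """
    return {
        "query": query,
        "limit": 0,  # only the aggregate is wanted, not the matching documents
        "facet": {
            "median": f"percentile({field},50)"
        },
    }

body = median_facet_request("filesize_d")
print(json.dumps(body, indent=2))
```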

You can also use Solr Math Expressions to take a random sample from a field and estimate the median from the sample. Here is the Streaming Expression:

let(a=random(collection1, q="*:*", fl="filesize_d", rows="25000"),
    b=col(a, filesize_d),
    median=percentile(b, 50))

The example above takes a random sample and assigns it to variable "a".

Then the filesize_d values from the sample (in variable "a") are copied to a vector and assigned to variable "b".

Then the percentile function is called on the vector and the result is assigned to variable "median".
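
If you want to run the expression from a script, one way is to send it to the collection's /stream handler in the "expr" parameter. The sketch below only builds the URL. The base URL (localhost:8983) and the helper name are assumptions for illustration, so adjust them to your setup.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# The expression from this message, kept as one string.
EXPR = (
    'let(a=random(collection1, q="*:*", fl="filesize_d", rows="25000"), '
    'b=col(a, filesize_d), '
    'median=percentile(b, 50))'
)

def stream_url(base_url, collection, expr):
    """Return a /stream URL carrying `expr`, URL-encoded (hypothetical helper)."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})

# Assumed local Solr address; not taken from the original message.
url = stream_url("http://localhost:8983/solr", "collection1", EXPR)
print(url)

# Decoding the query string gives the expression back unchanged.
decoded = parse_qs(urlsplit(url).query)["expr"][0]
print(decoded == EXPR)
```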

The results look like this:

{
  "result-set": {
    "docs": [
      { "median": 39980.53459335005 },
      { "EOF": true, "RESPONSE_TIME": 365 }
    ]
  }
}

You can adjust the sample size (the rows parameter) to see how it affects the estimate.
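
To get a feel for the effect of sample size without touching Solr, here is a rough local sketch in Python. It uses a synthetic log-normal population as stand-in data (not real file sizes), and statistics.median, which may interpolate slightly differently from Solr's percentile function. The function name and the seed are arbitrary.

```python
import random
import statistics

def sample_median(values, sample_size, rng):
    """Estimate the median from a random sample drawn without replacement."""
    sample = rng.sample(values, sample_size)
    return statistics.median(sample)

rng = random.Random(42)  # fixed seed so runs are repeatable

# Synthetic stand-in for a numeric field such as filesize_d.
population = [rng.lognormvariate(10, 1) for _ in range(100_000)]
exact = statistics.median(population)

# Larger samples should generally land closer to the exact median.
for size in (100, 1_000, 25_000, len(population)):
    estimate = sample_median(population, size, rng)
    rel_err = abs(estimate - exact) / exact
    print(f"rows={size:>7}  estimate={estimate:12.2f}  relative error={rel_err:.4%}")
```

Sampling the whole population reproduces the exact median, so the last row is a sanity check rather than an estimate.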

Here is the link to Solr Math Expressions in the User Guide:

https://lucene.apache.org/solr/guide/7_5/math-expressions.html

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Nov 14, 2018 at 8:21 AM Toke Eskildsen <[hidden email]> wrote: