StatsComponent and sint?

Jonathan Rochkind
Man, what types of fields is StatsComponent actually known to work with?

With an sint, it seems to have trouble if there are any documents with null values for the field. It appears to decide that a null/empty/blank value is -1325166535, and is thus the minimum value.

At least if I'm interpreting what's going on right. Anyone run into this?

Re: StatsComponent and sint?

Chris Hostetter-3

: With an sint, it seems to have trouble if there are any documents with
: null values for the field. It appears to decide that a null/empty/blank
: value is -1325166535, and is thus the minimum value.

1) there is really no such thing as a "null" value for a field ... there
are documents that have no value for that field -- but that's different
from actually indexing a null value (Solr is not an RDBMS)

I attempted to reproduce the problem you are describing by changing the
solr 1.4.1 schema.xml so that the "popularity" field used type "sint" and
then indexed all of the sample documents.  exactly one of those documents
has no value for the "popularity" field (id:UTF8TEST) and these are the
results that i got from the following request...

http://localhost:8983/solr/select/?wt=json&q=*%3A*%0D%0A&version=2.2&start=0&rows=00&indent=on&stats=true&stats.field=popularity
{
 "responseHeader":{
  "status":0,
  "QTime":1,
  "params":{
        "indent":"on",
        "start":"0",
        "q":"*:*\r\n",
        "stats":"true",
        "stats.field":"popularity",
        "wt":"json",
        "version":"2.2",
        "rows":"00"}},
 "response":{"numFound":19,"start":0,"docs":[]
 },
 "stats":{
  "stats_fields":{
        "popularity":{
         "min":0.0,
         "max":10.0,
         "sum":102.0,
         "count":18,
         "missing":1,
         "sumOfSquares":702.0,
         "mean":5.666666666666667,
         "stddev":2.700762419587999}}}}

As you can see, it correctly recognized that the "min" value was 0.0, and
that 1 of the 19 total docs had no value for that field.
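
The figures in that stats block can also be cross-checked against each other. The following is a minimal sketch in Python, not part of the original message, that recomputes the mean and standard deviation from the reported "sum", "sumOfSquares" and "count". The use of the sample (n - 1) form of the standard deviation is an assumption, chosen because it reproduces the reported "stddev"; the variable names are illustrative only.

```python
import math

# Values copied from the "popularity" stats block in the response above.
total = 102.0            # "sum"
sum_of_squares = 702.0   # "sumOfSquares"
count = 18               # "count": docs that have a value for the field
missing = 1              # "missing": docs with no value for the field
num_found = 19           # "numFound" for the *:* query

# Mean over the documents that actually have a value.
mean = total / count

# Assumption: sample standard deviation (n - 1 in the denominator).
variance = (sum_of_squares - total * total / count) / (count - 1)
stddev = math.sqrt(variance)

# Docs with a value plus docs without one should account for every match.
accounted_for = (count + missing == num_found)

print(mean, stddev, accounted_for)
```

The document with no value is counted under "missing" and left out of "count", so it does not contribute to "min", "sum" or "mean".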


If you can't reproduce these types of results with your own data, then we
need to see a lot more details about your specific situation (schema.xml,
raw data, query urls, results, etc...) to try and understand what you are
seeing.
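
For anyone gathering those details, here is a rough sketch in Python of one way to build the stats query URL and pull a single field's stats out of the JSON response. The helper names `build_stats_url` and `field_stats` are hypothetical, the base URL is the one used in the request above, and the sample response is trimmed from the output shown earlier. Nothing here talks to a running Solr instance.

```python
import json
from urllib.parse import urlencode


def build_stats_url(base_url, field, query="*:*"):
    """Build a select URL that asks StatsComponent for stats on one field.

    Hypothetical helper: only parameters that appear in the request above
    are used (q, rows, wt, indent, stats, stats.field).
    """
    params = {
        "q": query,
        "rows": 0,            # no documents needed, only the stats section
        "wt": "json",
        "indent": "on",
        "stats": "true",
        "stats.field": field,
    }
    return base_url + "?" + urlencode(params)


def field_stats(response_text, field):
    """Return the stats dictionary for one field from a JSON response body."""
    body = json.loads(response_text)
    return body["stats"]["stats_fields"][field]


# Trimmed copy of the response shown earlier, used here as sample input.
sample_response = """
{"response": {"numFound": 19, "start": 0, "docs": []},
 "stats": {"stats_fields": {"popularity": {
     "min": 0.0, "max": 10.0, "sum": 102.0, "count": 18, "missing": 1}}}}
"""

url = build_stats_url("http://localhost:8983/solr/select/", "popularity")
stats = field_stats(sample_response, "popularity")
print(url)
print(stats["min"], stats["missing"])
```

Posting the generated URL together with the raw response makes it easier for others to compare results against the same query.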


-Hoss

Re: StatsComponent and sint?

Jonathan Rochkind
Thanks Hoss, the problem was transient. I believe my index had become
corrupted: I changed the schema but hadn't fully deleted all the
documents that had been indexed under the previous version of the
schema. My fault.