Understanding fieldNorm differences between 3.6.1 and 4.9 solrs

Aaron Daubman
While trying to track down some subtle scoring differences (which
occasionally cause significant ordering differences) among search results, I
wrote a parser to normalize debug.explain.structured JSON output.

It appears that every score that differs comes down to a difference in
fieldNorm: the 3.6.1 Solr uses 0.109375 as the fieldNorm, and the 4.9 Solr
uses 0.125. [1]

What would cause the two versions to use different field norms (and only
rather infrequently, since the majority of scores are identical, as
desired)?

Thanks,
      Aaron

[1] Here's a snippet of the diff (of the output from my
debug.explain.structured normalizer) for one such difference
(- is 3.6.1, + is 4.9):

      "06808040cd523a296abaf26025148c85": {
    -   "_value": 0.83961660000000005,
    +   "_value": 0.85474813000000005,
        "description": "product of:",
        "details": [
          {
    -       "_value": 2.623802,
    +       "_value": 2.6710880000000001,
            "description": "sum of:",
            "details": [
              {
    -           "_value": 0.064461969999999993,
    +           "_value": 0.073670830000000007,
                "description": "weight(t_style:alternative
                "details": [
                  {
                    "_value": 0.062980229999999998,
                    "description": "queryWeight",
                    "details": [
                      {
                        "_value": 4.1850079999999998,
                        "description": "idf(137871)"
                      }
                    ]
                  },
                  {
    -               "_value": 1.0235270999999999,
    +               "_value": 1.1697453,
                    "description": "fieldWeight",
                    "details": [
                      {
                        "_value": 2.2360679999999999,
                        "description": "tf(freq=5)"
                      },
                      {
                        "_value": 4.1850079999999998,
                        "description": "idf(137871)"
                      },
                      {
    -                   "_value": 0.109375,
    +                   "_value": 0.125,
    -                   "description": "fieldNorm"
    +                   "description": "fieldNorm"
                      }
                    ]
                  }
                ]
              },
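
As a quick sanity check that fieldNorm alone accounts for the gap in this clause, the explain numbers can be recombined by hand. This is only a sketch: it assumes fieldWeight is the plain product tf * idf * fieldNorm and that the clause weight is queryWeight * fieldWeight, which is what the nesting of the explain output suggests. The function name `clause_score` is made up for illustration.

```python
# Values copied from the explain output for the t_style:alternative clause.
TF = 2.2360679999999999             # tf(freq=5)
IDF = 4.1850079999999998            # idf(137871)
QUERY_WEIGHT = 0.062980229999999998


def clause_score(field_norm):
    """Return (fieldWeight, weight) for a given fieldNorm.

    Assumes fieldWeight = tf * idf * fieldNorm and
    weight = queryWeight * fieldWeight.
    """
    field_weight = TF * IDF * field_norm
    return field_weight, QUERY_WEIGHT * field_weight


for label, norm in (("3.6.1", 0.109375), ("4.9", 0.125)):
    fw, w = clause_score(norm)
    print(f"{label}: fieldWeight={fw:.7f} weight={w:.8f}")
```

Both recomputed pairs agree with the reported values to within single-precision rounding, so tf, idf and queryWeight are not contributing to the difference here.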

Re: Understanding fieldNorm differences between 3.6.1 and 4.9 solrs

Aaron Daubman
Wow - so apparently I have terrible recall and should re-read the thread I
started on this same topic almost two years ago, when I upgraded from 1.4 to
3.6 and hit a very similar fieldNorm issue! =)
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201207.mbox/%3CCALyTvnpwZMj4zxPbK0abVpnyRJny=QAuiJdqmj7E3ZgNv7Utpg@...%3E

In the meantime, I'm still happy to hear any new thoughts / suggestions on
keeping similarity consistent across upgrades.

Thanks again,
       Aaron
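
For anyone comparing the two values: 0.109375 and 0.125 look like neighbouring steps of a one-byte norm encoding, so a very small change in the token count of a field could be enough to flip between them. The sketch below is a Python port, written from memory, of Lucene's SmallFloat.floatToByte315 / byte315ToFloat, and it assumes the field uses the default similarity with norm = boost / sqrt(numTerms). The helper name `stored_norm` is made up for illustration, and none of this is a confirmed diagnosis for the index in this thread.

```python
import math
import struct


def float_to_byte315(f):
    """Encode a float into one byte (truncating), ported from SmallFloat."""
    # Reinterpret the float32 bit pattern as a signed 32-bit int, like a Java int.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21
    base = (63 - 15) << 3
    if small <= base:
        return 0 if bits <= 0 else 1    # zero, negative, or underflow
    if small >= base + 0x100:
        return 255                      # overflow (Java returns (byte) -1)
    return small - base


def byte315_to_float(b):
    """Decode a byte produced by float_to_byte315 back into a float."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(num_terms, boost=1.0):
    """Norm as it would be read back after the one-byte round trip.

    Assumption: the raw norm is boost / sqrt(num_terms).
    """
    return byte315_to_float(float_to_byte315(boost / math.sqrt(num_terms)))


# Token counts on either side of the 0.125 / 0.109375 boundary.
for n in (41, 64, 65, 83, 84):
    print(n, stored_norm(n))
```

Under these assumptions, a field with 41 to 64 counted terms reads back as 0.125 and one with 65 to 83 reads back as 0.109375, so a document sitting near 64 terms would change norm if the two versions count even one token differently.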


On Tue, Jul 1, 2014 at 11:14 PM, Aaron Daubman <[hidden email]> wrote:
