exact matches on a join

rhys J
I am trying to do a join across 2 cores, and the join itself is working
properly.

One core has report_as, and the other core has debt_id.

If I enter report_as:"Freeman", I expect to get 272 results, but I get
557.

When I do a database search on the matched fields, it shows me that
report_as:"Freeman" is also matching 'A-1 Freeman'.

I have tried boosting the score with report_as:"Freeman"^2, but I get the
same results from the API and from the browser.

Here is my query:

{
  "responseHeader":{
    "status":0,
    "QTime":5,
    "params":{
      "q":"( * )",
      "indent":"on",
      "fl":"debt_id, score",
      "cursorMark":"*",
      "sort":"score desc, id desc",
      "fq":"{!join from=debtor_id to=debt_id fromIndex=dbtr}( report_as:\"Freeman\"^2)",
      "rows":"1000"}},
  "response":{"numFound":557,"start":0,"maxScore":1.0,"docs":[
      {
        "debt_id":"485435",
        "score":1.0},
      {
        "debt_id":"485435",
        "score":1.0},
      {
        "debt_id":"482795",
        "score":1.0},
      {
        "debt_id":"482795",
        "score":1.0},
      {
        "debt_id":"482794",
        "score":1.0},
      {
        "debt_id":"482794",
        "score":1.0},
      {
        "debt_id":"482794",
        "score":1.0},

SKIP

      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",
        "score":1.0},
      {
        "debt_id":"396925",


These are the correct matches, which I can verify against the
database, but their scores are the same as those of the docs matching
on 'A-1 Freeman'.

Is my scoring set up wrong?
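
A note on the flat scores: as far as I know, a Solr fq clause only restricts which documents are returned and does not contribute to the score, so a boost placed inside fq would not be expected to change anything. The score comes from q alone. The following Python sketch is only an illustration of how the parameters fit together. The helper name build_join_params is made up, and the values are modelled on the response header above.

```python
from typing import Optional
from urllib.parse import urlencode


def build_join_params(name: str, boost: Optional[float] = None) -> dict:
    """Rebuild request parameters like those echoed in the response header."""
    clause = f'report_as:"{name}"'
    if boost is not None:
        clause += f"^{boost:g}"  # e.g. report_as:"Freeman"^2
    return {
        "q": "( * )",  # main query, copied from the request; this is what is scored
        "fl": "debt_id, score",
        "sort": "score desc, id desc",
        "cursorMark": "*",
        "rows": "1000",
        # The join lives in fq, so it only filters the result set.
        "fq": f"{{!join from=debtor_id to=debt_id fromIndex=dbtr}}({clause})",
    }


plain = build_join_params("Freeman")
boosted = build_join_params("Freeman", boost=2)

# Only fq differs between the two requests; q is identical in both.
changed = sorted(k for k in plain if plain[k] != boosted[k])
print(changed)             # ['fq']
print(urlencode(boosted))  # query string to append to the core's /select URL
```

Because q is the same in both requests, every document that passes the filter would get the same score from it, which is consistent with the uniform 1.0 in the response.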

Thanks,

Rhys

Re: exact matches on a join

Jason Gerlowski
Are these fields "string" or "text" fields?

Text fields are analyzed, which splits their values into a series of
terms. That's why the query "Freeman" matches the document
"A-1 Freeman": "A-1 Freeman" gets split into multiple terms, and the
"Freeman" query matches one of them. Text fields are what you use when
you want matches to have some wiggle room based on your analyzers.

String fields are geared towards exact matches. No analysis is done,
so a query for "Freeman" would only match docs whose value is exactly
"Freeman".
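
The difference can be sketched in a few lines of Python. This is only a rough model, not Solr's actual analysis chain: the analyze function below is an assumed stand-in (lowercase, then split on anything that is not a letter or digit), and the real behaviour depends on the analyzers configured for the field type.

```python
import re
from typing import List


def analyze(value: str) -> List[str]:
    """Assumed stand-in for a text analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]


def text_field_matches(query: str, stored: str) -> bool:
    """Phrase-style match: the query terms appear contiguously in the stored terms."""
    q_terms = analyze(query)
    doc_terms = analyze(stored)
    if not q_terms:
        return False
    n = len(q_terms)
    return any(doc_terms[i:i + n] == q_terms for i in range(len(doc_terms) - n + 1))


def string_field_matches(query: str, stored: str) -> bool:
    """No analysis: the stored value must equal the query exactly."""
    return query == stored


values = ["Freeman", "A-1 Freeman"]

print(analyze("A-1 Freeman"))
# ['a', '1', 'freeman']
print([v for v in values if text_field_matches("Freeman", v)])
# ['Freeman', 'A-1 Freeman']
print([v for v in values if string_field_matches("Freeman", v)])
# ['Freeman']
```

In this model the text-style match returns both values because "freeman" is one of the terms produced from "A-1 Freeman", while the string-style match returns only the identical value.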

Jason

On Tue, Nov 19, 2019 at 2:44 PM rhys J <[hidden email]> wrote:

> If I enter report_as:"Freeman", I expect to get 272 results, but I get
> 557.
>
> When I do a database search on the matched fields, it shows me that
> report_as:"Freeman" is also matching 'A-1 Freeman'.

Re: exact matches on a join

rhys J
On Thu, Nov 21, 2019 at 8:04 AM Jason Gerlowski <[hidden email]>
wrote:

> Are these fields "string" or "text" fields?
>
> Text fields receive analysis that splits them into a series of terms.
> That's why the query "Freeman" matches the document "A-1 Freeman".
> "A-1 Freeman" gets split up into multiple terms, and the "Freeman"
> query matches one of those terms.  Text fields are what you use when
> you want matches to have some wiggle room based on your analyzers.
>
> String fields are much more geared towards exact matches.  No analysis
> is done, so a query for "Freeman" would only match docs who have that
> value identically.
>
>
Thanks, this was the conclusion I came to as well. When I asked, they
decided that those matches were acceptable and that the field should
stay a TextField.

Rhys