mlt.interestingTerms in MoreLikeThisComponent: standalone vs cloud

Thomas Corthals
Hi all,

When I run the following command:

curl "http://127.0.0.1:8983/solr/techproducts/select?omitHeader=true&wt=json&q=apache&fl=id&mlt=true&mlt.fl=manu%2Ccat&mlt.mintf=1&mlt.mindf=1&mlt.interestingTerms=list"

… against Solr 8.5 in standalone mode with techproducts, I get
interestingTerms in the output:

{
  "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"UTF8TEST"},
      {
        "id":"SOLR1000"}]
  },
  "interestingTerms":{
    "UTF8TEST":["cat:search",
      "manu:foundation",
      "manu:software",
      "manu:apache",
      "cat:software"],
    "SOLR1000":["cat:search",
      "manu:foundation",
      "manu:software",
      "manu:apache",
      "cat:software",
      "cat:search",
      "manu:foundation",
      "manu:software",
      "manu:apache",
      "cat:software"]},
  "moreLikeThis":{
    "UTF8TEST":{"numFound":1,"start":0,"docs":[
        {
          "id":"SOLR1000"}]
    },
    "SOLR1000":{"numFound":1,"start":0,"docs":[
        {
          "id":"UTF8TEST"}]
    }}}

… against the same version of Solr in SolrCloud mode with techproducts, it
is missing the interestingTerms data:

{
  "response":{"numFound":2,"start":0,"maxScore":0.5722849,"docs":[
      {
        "id":"UTF8TEST"},
      {
        "id":"SOLR1000"}]
  },
  "moreLikeThis":[
    "UTF8TEST",{"numFound":1,"start":0,"maxScore":4.5925217,"docs":[
        {
          "id":"SOLR1000"}]
    },
    "SOLR1000",{"numFound":1,"start":0,"maxScore":4.5925217,"docs":[
        {
          "id":"UTF8TEST"}]
    }]}

The same thing happens with mlt.interestingTerms=details.
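
For anyone wanting to reproduce this, the request can also be built with the parameters spelled out. This is a minimal Python sketch, assuming the same local techproducts URL as in the curl command; the function name build_mlt_url is only illustrative and not part of Solr.

```python
from urllib.parse import urlencode

# Same endpoint as in the curl command above (local techproducts example).
BASE_URL = "http://127.0.0.1:8983/solr/techproducts/select"


def build_mlt_url(interesting_terms="list", base_url=BASE_URL):
    """Build the MoreLikeThisComponent request used in this thread.

    interesting_terms is passed through as mlt.interestingTerms
    ("list" and "details" are the two values tried in this thread).
    """
    params = {
        "omitHeader": "true",
        "wt": "json",
        "q": "apache",
        "fl": "id",
        "mlt": "true",
        "mlt.fl": "manu,cat",   # urlencode escapes the comma as %2C
        "mlt.mintf": 1,
        "mlt.mindf": 1,
        "mlt.interestingTerms": interesting_terms,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_mlt_url("list"))
    print(build_mlt_url("details"))
```

Running the same URL against a standalone instance and a SolrCloud instance should make the difference in the two responses easy to compare.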

Is this a bug or an undocumented limitation?

Kind regards,

Thomas

Re: mlt.interestingTerms in MoreLikeThisComponent: standalone vs cloud

Cassandra Targett
It’s an undocumented limitation: interesting terms are not returned for a distributed query (the kind SolrCloud makes) when MLT is used through the search component.
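
Client code that has to work against both modes may want to treat interestingTerms as optional. The following is a rough Python sketch based only on the two responses shown in the original post: it assumes the response has already been parsed from JSON, that each document carries an "id" field (as with fl=id), and that moreLikeThis arrives either as an object or as a flat list alternating document id and result. The helper name normalize_mlt is hypothetical.

```python
def normalize_mlt(response):
    """Return (similar_ids_by_doc, interesting_terms_or_None).

    Handles both shapes of "moreLikeThis" seen in this thread:
    a JSON object (standalone) and a flat [key, value, key, value, ...]
    list (SolrCloud).
    """
    mlt = response.get("moreLikeThis", {})
    if isinstance(mlt, list):
        # Pair up alternating entries: even positions are document ids,
        # odd positions are the corresponding result objects.
        mlt = dict(zip(mlt[0::2], mlt[1::2]))

    similar = {
        doc_id: [doc["id"] for doc in result.get("docs", [])]
        for doc_id, result in mlt.items()
    }
    # Absent in the distributed response, so default to None.
    return similar, response.get("interestingTerms")


if __name__ == "__main__":
    cloud_style = {
        "moreLikeThis": [
            "UTF8TEST", {"numFound": 1, "start": 0, "docs": [{"id": "SOLR1000"}]},
            "SOLR1000", {"numFound": 1, "start": 0, "docs": [{"id": "UTF8TEST"}]},
        ]
    }
    similar, terms = normalize_mlt(cloud_style)
    print(similar)
    print("interestingTerms present:", terms is not None)
```

This only papers over the difference in response shape; it does not recover the terms themselves when the distributed response omits them.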

The interesting terms support for the search component was added in 8.2 in https://issues.apache.org/jira/browse/SOLR-12304. That issue mentions that adding distributed query support was out of scope, but the limitation never made it into the documentation.

I recently overhauled the MLT documentation extensively (https://issues.apache.org/jira/browse/SOLR-15243) and added this caveat to the docs. The 8.8 Ref Guide will be republished in conjunction with the release of 8.8.2 (likely by early next week), and the updated Guide will include the changes made as part of SOLR-15243.
On Apr 7, 2021, 1:33 PM -0500, Thomas Corthals <[hidden email]> wrote:

> [original message quoted in full; see above]