Solr - phrase suggestion returning duplicate


ruby
I'm trying to enable phrase suggestion in my application by using
AnalyzingInfixLookupFactory and DocumentDictionaryFactory. My configuration
looks like this:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="indexPath">suggester_infix_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
    <str name="suggestAnalyzerFieldType">suggestType</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
<requestHandler name="/suggesthandler" class="solr.SearchHandler"
startup="lazy" >
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

I have the following documents indexed:

<doc>
<field name="id">44</field>
<field name="title"></field>
</doc>

<doc>
<field name="id">11</field>
<field name="title">Video gaming: the history</field>
</doc>

<doc>
<field name="id">55</field>
<field name="title">Video games: multiplayer gaming</field>
</doc>

<doc>
<field name="id">33</field>
<field name="title">Video gaming: the history</field>
</doc>

After indexing the documents and building the suggester, I get duplicate
suggestions when I query:

suggest.q=video
returns
[
      {
        "id":"44",
        "title":"Video gaming: the history"},
      {
        "id":"33",
        "title":"Video games: multiplayer gaming"},
      {
        "id":"44",
        "title":"Video gaming: the history"}]

Is this a known bug in the Solr suggester? Shouldn't the suggester return
unique suggestions by default?


Thanks



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Solr - phrase suggestion returning duplicate

Erick Erickson
Is "id" the actual <uniqueKey> in your schema? If you indexed the same
document twice, the second one should overwrite the first one so
getting two docs back with the same ID is strange.

Best,
Erick

On Tue, Nov 7, 2017 at 10:43 AM, ruby <[hidden email]> wrote:


Re: Solr - phrase suggestion returning duplicate

ruby
Yes, id is a unique field.

I found the following issue in Jira:
https://issues.apache.org/jira/browse/LUCENE-6336

It lists 4.10.3 and 5.0 as affected versions. I'm using Solr 6.1 and still
seeing this issue.

You can reproduce it by indexing the documents I shared and querying.
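
A minimal sketch of the reproduction requests, under stated assumptions: the base URL and the core name `mycore` are hypothetical placeholders, while the handler path `/suggesthandler` and the dictionary name `mySuggester` come from the configuration in the first post. The script only builds the URLs and the JSON payload; sending them is left to whichever HTTP client is at hand. Since `buildOnCommit` is false in that configuration, the suggest request asks for a build explicitly via `suggest.build`.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; "mycore" is a placeholder core name, not from the thread.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"

# The four documents from the original post (id 44 has an empty title there).
DOCS = [
    {"id": "44", "title": ""},
    {"id": "11", "title": "Video gaming: the history"},
    {"id": "55", "title": "Video games: multiplayer gaming"},
    {"id": "33", "title": "Video gaming: the history"},
]


def update_request(docs):
    """Return (url, body) for a JSON update request that also commits."""
    url = f"{SOLR_CORE_URL}/update?{urlencode({'commit': 'true'})}"
    return url, json.dumps(docs)


def suggest_url(query, build=False):
    """Return the URL for a request to the /suggesthandler handler."""
    params = {
        "suggest": "true",
        "suggest.dictionary": "mySuggester",
        "suggest.q": query,
        "wt": "json",
    }
    if build:
        # buildOnStartup/buildOnCommit are false, so build on demand.
        params["suggest.build"] = "true"
    return f"{SOLR_CORE_URL}/suggesthandler?{urlencode(params)}"


if __name__ == "__main__":
    url, body = update_request(DOCS)
    print("POST", url, "(Content-Type: application/json)")
    print(body)
    print("GET", suggest_url("video", build=True))
```

Documents 11 and 33 carry the same title, which is the pair expected to surface as a duplicate suggestion.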




Re: Solr - phrase suggestion returning duplicate

ruby
In reply to this post by Erick Erickson
Yes, id is a unique field in my schema.

I found the following Jira issue:
https://issues.apache.org/jira/browse/LUCENE-6336

It looks related to me, but it does not say whether it was fixed. Is it
fixed in Solr 6.1? That is the version I'm using.




Re: Solr - phrase suggestion returning duplicate

alessandro.benedetti
Hi Ruby,
I participated in that discussion at the time, and the issue is definitely
still open.

It's on my long to-do list; I hope I will be able to contribute a solution
sooner or later.
In case you decide to use an entire new index for the autosuggestion, you
can potentially manage that on your own.
But out of the box, you are going to get that problem.

There is a related issue that addresses the problem on the SolrJ client
side [1], but it has not been merged into the Solr code either.

[1] https://issues.apache.org/jira/browse/SOLR-8672
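
Until something like that is merged, one option is to filter duplicates in the client after the response arrives. The following is only a sketch of that idea in Python, not the SOLR-8672 patch: the function name `dedupe_suggestions` is hypothetical, the input mirrors the simplified `id`/`title` result list shown in the original post, and treating titles that differ only in case or whitespace as duplicates is an assumption.

```python
def dedupe_suggestions(suggestions, key="title"):
    """Keep the first suggestion for each distinct text, preserving order.

    Two suggestions count as duplicates when their text matches after
    collapsing whitespace and ignoring case (an assumption; adjust as needed).
    """
    seen = set()
    unique = []
    for suggestion in suggestions:
        normalized = " ".join(suggestion[key].split()).casefold()
        if normalized in seen:
            continue
        seen.add(normalized)
        unique.append(suggestion)
    return unique


# The result list from the original post.
results = [
    {"id": "44", "title": "Video gaming: the history"},
    {"id": "33", "title": "Video games: multiplayer gaming"},
    {"id": "44", "title": "Video gaming: the history"},
]

if __name__ == "__main__":
    for suggestion in dedupe_suggestions(results):
        print(suggestion["title"])
```

Because filtering happens after the suggester has already applied `suggest.count`, the client may end up with fewer suggestions than requested; asking for more than needed and trimming afterwards is one way around that.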




-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io

Re: Solr - phrase suggestion returning duplicate

ruby
Alessandro, thanks for your reply.

What do you mean by "In case you decide to use an entire new index for the
autosuggestion, you can potentially manage that on your own"?

Is this duplicate issue a problem with the DocumentDictionaryFactory?




Re: Solr - phrase suggestion returning duplicate

alessandro.benedetti
"In case you decide to use an entire new index for the
autosuggestion, you
can potentially manage that on your own"

This refers to the fact that it is possible to define an index just for
autocompletion.
You can model the documents as you prefer in this additional index, defining
the field types that best fit your needs and then managing the documents in
the index yourself (so you can avoid duplicates according to your own rules).

Then you can configure a request handler and manage the query side as you
prefer.
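
A minimal sketch of the index-side part of that approach, under stated assumptions: the helper `build_suggestion_docs` and the field names `suggestion` and `source_ids` are hypothetical, and the duplicate rule (same text after collapsing whitespace and ignoring case) is only one possible choice. Each distinct title becomes one document whose `id` is a hash of the normalized text, so if `id` is the uniqueKey of the dedicated index, re-indexing the same title overwrites the existing entry instead of adding a second one.

```python
import hashlib


def normalize(text):
    """Collapse whitespace and ignore case when comparing suggestion text."""
    return " ".join(text.split()).casefold()


def build_suggestion_docs(source_docs, field="title"):
    """Build one document per distinct value of `field` for a suggestion index."""
    by_key = {}
    for doc in source_docs:
        text = (doc.get(field) or "").strip()
        if not text:
            continue  # skip documents with an empty or missing title
        key = hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()
        entry = by_key.setdefault(
            key, {"id": key, "suggestion": text, "source_ids": []}
        )
        entry["source_ids"].append(doc["id"])
    return list(by_key.values())


# The four documents from the original post.
source_docs = [
    {"id": "44", "title": ""},
    {"id": "11", "title": "Video gaming: the history"},
    {"id": "55", "title": "Video games: multiplayer gaming"},
    {"id": "33", "title": "Video gaming: the history"},
]

if __name__ == "__main__":
    for doc in build_suggestion_docs(source_docs):
        print(doc["suggestion"], doc["source_ids"])
```

The resulting documents would then be sent to the dedicated index in place of the original ones, and the suggester or query handler would be pointed at the `suggestion` field.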

Regards


