Created: (SOLR-167) synonym filter mixes up terms from different synonyms

Created: (SOLR-167) synonym filter mixes up terms from different synonyms

Sebastian Nagel (Jira)
synonym filter mixes up terms from different synonyms
-----------------------------------------------------

                 Key: SOLR-167
                 URL: https://issues.apache.org/jira/browse/SOLR-167
             Project: Solr
          Issue Type: Bug
          Components: update
    Affects Versions: 1.1.0, 1.2
            Reporter: Mike Klaas
         Assigned To: Yonik Seeley
             Fix For: 1.1.0, 1.2


SynonymFilter can mix up options from different synonyms, sometimes inserting the wrong word and sometimes using the wrong offset.  The issue appears to be the use of the matched ArrayList in SynonymFilter.

To reproduce: add "best buy,bestbuy,bb" to the example's synonym list.  Then view verbose analysis of the query analyzer output for "Best buy - Acer Aspire AS5610-2273 - $599. Windows vista, 1 GB RAM"

"gigabytes" becomes a synonym of "Best buy", and the offsets of the remainder of the "GB" synonyms are incorrect.

Assigning to Yonik as this code is too hairy for me to fix without further study.
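
For reference, the expected behaviour can be sketched with a small toy model. This is not SynonymFilter's code: the Token tuple, the expand() helper, and the "gb" synonym group are assumptions made for illustration, and each multi-word alternative is kept as a single token for simplicity. The point it shows is that every emitted synonym should carry the offsets of the span it was matched from, so "gigabytes" should map back to "GB" and never to "Best buy".

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    text: str     # token text as it would be emitted
    start: int    # start offset into the original string
    end: int      # end offset (exclusive)
    pos_inc: int  # 1 = new position, 0 = stacked on the previous token


# Assumed synonym groups (a hypothetical stand-in for the synonym list).
SYNONYM_GROUPS = [
    ["best buy", "bestbuy", "bb"],
    ["gb", "gib", "gigabyte", "gigabytes"],
]

# Map each alternative, as a tuple of lowercase words, to its whole group.
LOOKUP = {tuple(alt.split()): group for group in SYNONYM_GROUPS for alt in group}
MAX_WORDS = max(len(key) for key in LOOKUP)

QUERY = "Best buy - Acer Aspire AS5610-2273 - $599. Windows vista, 1 GB RAM"


def expand(text: str) -> List[Token]:
    """Tokenize on word characters and expand synonyms, longest match first."""
    words = [(m.group().lower(), m.start(), m.end())
             for m in re.finditer(r"\w+", text)]
    out: List[Token] = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_WORDS, len(words) - i), 0, -1):
            key = tuple(w for w, _, _ in words[i:i + n])
            group = LOOKUP.get(key)
            if group is not None:
                # All alternatives share the offsets of the matched span.
                start, end = words[i][1], words[i + n - 1][2]
                for k, alt in enumerate(group):
                    out.append(Token(alt, start, end, 1 if k == 0 else 0))
                i += n
                break
        else:
            # No synonym matched here: pass the word through unchanged.
            word, start, end = words[i]
            out.append(Token(word, start, end, 1))
            i += 1
    return out


if __name__ == "__main__":
    for tok in expand(QUERY):
        print(tok, repr(QUERY[tok.start:tok.end]))
```

Comparing each token's offsets against the slice of the original string, as the final loop does, is one way to spot the kind of mismatch described above.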

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Work started: (SOLR-167) synonym filter mixes up terms from different synonyms

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on SOLR-167 started by Yonik Seeley.

Commented: (SOLR-167) synonym filter mixes up terms from different synonyms

Sebastian Nagel (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12474545 ]

Yonik Seeley commented on SOLR-167:
-----------------------------------

I still need to figure out the incorrect offsets, but the part where "gigabytes" shows up as a synonym of "Best buy" was a display error in analysis.jsp (see SOLR-168).


Resolved: (SOLR-167) synonym filter mixes up terms from different synonyms

Sebastian Nagel (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-167.
-------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 1.1.0)
                       (was: 1.2)

I just committed a fix for this.
