[jira] [Created] (LUCENE-3980) Word order seems to affect proximity searching


Tim Allison (Jira)
Word order seems to affect proximity searching
----------------------------------------------

                 Key: LUCENE-3980
                 URL: https://issues.apache.org/jira/browse/LUCENE-3980
             Project: Lucene - Java
          Issue Type: Bug
          Components: core/search
            Reporter: Ian Pooley
            Priority: Minor


It would appear that the order of words within a search query affects a proximity search.

For instance, for the text "The proximity operator seems to match differently based on word order", a match is found for "proximity order"~8 but is not found for "order proximity"~8. In order for the latter to find a match, it needs to be changed to "order proximity"~10.

Both the text and the query are processed using org.apache.lucene.analysis.standard.StandardAnalyzer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


[jira] [Commented] (LUCENE-3980) Word order seems to affect proximity searching

Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253469#comment-13253469 ]

Robert Muir commented on LUCENE-3980:
-------------------------------------

Have you seen the documentation of slop in PhraseQuery?

{noformat}
The slop is in fact an edit-distance, where the units correspond to
moves of terms in the query phrase out of position.  For example, to switch
the order of two words requires two moves (the first move places the words
atop one another), so to permit re-orderings of phrases, the slop must be
at least two.
{noformat}
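The documentation quoted above can be made concrete with a small model. The sketch below is not Lucene's implementation: it assumes each query term occurs exactly once in the text and that stop words keep their positions (which is consistent with the numbers in the report), and the class and method names are hypothetical. It shifts each term's text position back by its offset in the query phrase and takes the spread of the shifted positions as the smallest slop that would match.

```java
import java.util.Arrays;
import java.util.List;

public class SlopSketch {
    // Simplified model (assumption): every query term occurs exactly once in the text.
    // Each term's text position is shifted back by its offset in the query phrase;
    // the spread of the shifted positions is the smallest slop that still matches.
    public static int requiredSlop(List<String> textTokens, String... queryTerms) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int offset = 0; offset < queryTerms.length; offset++) {
            int position = textTokens.indexOf(queryTerms[offset]);
            if (position < 0) {
                throw new IllegalArgumentException("term not in text: " + queryTerms[offset]);
            }
            int shifted = position - offset;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Lowercased tokens of the reported text; stop words are kept so positions are preserved.
        List<String> text = Arrays.asList(
            "the proximity operator seems to match differently based on word order".split(" "));
        System.out.println(requiredSlop(text, "proximity", "order")); // prints 8
        System.out.println(requiredSlop(text, "order", "proximity")); // prints 10
    }
}
```

Under these assumptions the two terms are 9 positions apart, so the in-order phrase needs 9 - 1 = 8 and the reversed phrase needs 9 + 1 = 10, which matches the behaviour reported in the issue.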
               


[jira] [Closed] (LUCENE-3980) Word order seems to affect proximity searching

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/LUCENE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Pooley closed LUCENE-3980.
------------------------------

    Resolution: Not A Problem

Thanks, that makes sense now. The challenge is going to be trying to explain this to our users as they want to be able to type in a query and find the same results irrespective of word order.
               


[jira] [Commented] (LUCENE-3980) Word order seems to affect proximity searching

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253511#comment-13253511 ]

Uwe Schindler commented on LUCENE-3980:
---------------------------------------

If the word order does not matter, why use a phrase query? A simple BooleanQuery on all terms would be fine.
               



[jira] [Commented] (LUCENE-3980) Word order seems to affect proximity searching

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/LUCENE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253558#comment-13253558 ]

Ian Pooley commented on LUCENE-3980:
------------------------------------

The problem is that the queries are generated by a QueryParser from large, complex query strings created by our internal users. Most of these queries return exactly what they expect but, a couple of days ago, one of the users noticed that "A B"~5 within one of these queries returned slightly different results from a query that was identical other than the clause that was "B A"~5.

Now that I have a better idea as to what is going on under the covers, my challenge is to translate that into non-technical rules that will allow all permutations of "A B C D E..."~n to give the desired answer.
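One way to explore such rules is to brute-force every ordering of the query terms and see how much slop the worst one needs. The sketch below is only a model, not Lucene code: it assumes distinct terms that each occur once in the text, takes their text positions as input, and uses hypothetical class and method names.

```java
public class PermutationSlopSketch {
    // Spread of (text position - query offset) for a non-empty array of positions
    // listed in query order. Simplified model: each term appears once in the text.
    public static int requiredSlop(int[] positionsInQueryOrder) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int offset = 0; offset < positionsInQueryOrder.length; offset++) {
            int shifted = positionsInQueryOrder[offset] - offset;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    // Largest required slop over every ordering of the query terms.
    public static int worstCaseSlop(int[] positions) {
        return worst(positions.clone(), 0);
    }

    // Enumerates permutations in place by swapping each remaining element into slot k.
    private static int worst(int[] p, int k) {
        if (k == p.length) {
            return requiredSlop(p);
        }
        int best = 0;
        for (int i = k; i < p.length; i++) {
            swap(p, k, i);
            best = Math.max(best, worst(p, k + 1));
            swap(p, k, i); // restore before trying the next candidate
        }
        return best;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseSlop(new int[] {1, 10}));    // prints 10
        System.out.println(requiredSlop(new int[] {1, 5, 10}));  // prints 7
        System.out.println(worstCaseSlop(new int[] {1, 5, 10})); // prints 11
    }
}
```

In this model the worst ordering puts the term that occurs last in the text first in the query and the term that occurs first in the text last, so for n terms spanning s positions the worst case is s + n - 1, while the in-order phrase needs only s - (n - 1). A possible user-facing rule, under the same assumptions, is that a slop chosen for the in-order phrase must be raised by 2(n - 1) to cover every ordering.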
               
