[jira] Created: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
CharArraySet.contains(char[] text, int off, int len) does not work
------------------------------------------------------------------

                 Key: LUCENE-1163
                 URL: https://issues.apache.org/jira/browse/LUCENE-1163
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 2.3
            Reporter: Thomas Peuss


I am trying to use CharArraySet in a filter I am writing. I use char arrays heavily in my code to speed things up, and I stumbled upon a bug in CharArraySet while doing so.

The method _public boolean contains(char[] text, int off, int len)_ does not seem to work.

When I do

{code}
if (set.contains(buffer, offset, length)) {
  ...
}
{code}

my code fails.

But when I do

{code}
if (set.contains(new String(buffer, offset, length))) {
   ...
}
{code}

everything works as expected.

Both variants should behave the same. I am attaching a small piece of code that shows the problem.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[jira] Updated: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Peuss updated LUCENE-1163:
---------------------------------

    Attachment: CharArraySetShowBug.java

A simple piece of code that shows the problem.

[jira] Assigned: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-1163:
------------------------------------------

    Assignee: Michael McCandless

[jira] Updated: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1163:
---------------------------------------

    Attachment: LUCENE-1163.patch

Indeed it's really a bug -- thank you for finding this & reporting it Thomas!

We were ignoring the offset when computing the hash code internally.

Lucene always passes '0' for this offset (only used in StopFilter currently) so it wasn't hitting any existing Lucene test cases.
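
The effect of ignoring the offset can be illustrated with a minimal, self-contained sketch. This is not Lucene's actual CharArraySet code or the attached patch: the class OffsetHashSketch, its fixed 16-slot table, and the buggyHash/fixedHash helpers are hypothetical names, and the sketch assumes the faulty hash simply started at index 0 instead of at off.

```java
/** Hypothetical stand-in for a char[]-keyed hash set; NOT Lucene's CharArraySet. */
public class OffsetHashSketch {
    // Fixed-size power-of-two table with linear probing; no resizing (sketch only).
    private final char[][] entries = new char[16][];

    /** Assumed shape of the bug: hashes text[0..len) and ignores 'off'. */
    static int buggyHash(char[] text, int off, int len) {
        int code = 0;
        for (int i = 0; i < len; i++) {
            code = code * 31 + text[i];
        }
        return code;
    }

    /** Corrected variant: hashes text[off..off+len). */
    static int fixedHash(char[] text, int off, int len) {
        int code = 0;
        final int stop = off + len;
        for (int i = off; i < stop; i++) {
            code = code * 31 + text[i];
        }
        return code;
    }

    /** Stores a copy of the string's characters (duplicates are not checked). */
    void add(String s) {
        char[] chars = s.toCharArray();
        int pos = fixedHash(chars, 0, chars.length) & (entries.length - 1);
        while (entries[pos] != null) {
            pos = (pos + 1) & (entries.length - 1);
        }
        entries[pos] = chars;
    }

    /** Looks up text[off..off+len), using either the buggy or the fixed hash. */
    boolean contains(char[] text, int off, int len, boolean useBuggyHash) {
        int code = useBuggyHash ? buggyHash(text, off, len) : fixedHash(text, off, len);
        int pos = code & (entries.length - 1);
        while (entries[pos] != null) {
            if (matches(entries[pos], text, off, len)) {
                return true;
            }
            pos = (pos + 1) & (entries.length - 1);
        }
        return false;
    }

    /** Compares a stored entry against the slice text[off..off+len). */
    private static boolean matches(char[] entry, char[] text, int off, int len) {
        if (entry.length != len) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (entry[i] != text[off + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        OffsetHashSketch set = new OffsetHashSketch();
        set.add("lucene");
        char[] buffer = "a lucene".toCharArray();
        System.out.println("buggy, offset 2: " + set.contains(buffer, 2, 6, true));
        System.out.println("fixed, offset 2: " + set.contains(buffer, 2, 6, false));
    }
}
```

In this sketch the buggy lookup at offset 2 hashes the wrong characters, probes an empty slot, and misses, while the fixed lookup finds the entry. A lookup at offset 0 succeeds with either hash, which is consistent with callers that always pass 0 never exposing the problem.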

I turned your example into a test case in the attached patch. I will commit shortly.

[jira] Commented: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565748#action_12565748 ]

Thomas Peuss commented on LUCENE-1163:
--------------------------------------

Thanks for the quick response. I can confirm that the patch fixes the problem.

[jira] Commented: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565751#action_12565751 ]

Michael McCandless commented on LUCENE-1163:
--------------------------------------------

Super, thanks Thomas! I just committed this.

[jira] Resolved: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1163.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.4

[jira] Commented: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568050#action_12568050 ]

Michael McCandless commented on LUCENE-1163:
--------------------------------------------

I'll port this one to 2.3.1 as well.

[jira] Commented: (LUCENE-1163) CharArraySet.contains(char[] text, int off, int len) does not work

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568056#action_12568056 ]

Michael McCandless commented on LUCENE-1163:
--------------------------------------------

Backported to 2.3