[jira] Created: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
LuceneDictionary skips first word in enumeration
------------------------------------------------

                 Key: LUCENE-763
                 URL: http://issues.apache.org/jira/browse/LUCENE-763
             Project: Lucene - Java
          Issue Type: Bug
          Components: Other
    Affects Versions: 2.0.0
         Environment: Windows Sun JRE 1.4.2_10_b03
            Reporter: Dan Ertman


The current code for LuceneDictionary will always skip the first word of the TermEnum. The reason is that it doesn't initially retrieve TermEnum.term - its first call is to TermEnum.next, which moves it past the first term (line 76).
To see this problem cause a failure, add this test to TestSpellChecker:
      similar = spellChecker.suggestSimilar("eihgt", 2);
      assertEquals(1, similar.length);
      assertEquals("eight", similar[0]);

Because "eight" is the first word in the index, the test will fail.
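
A minimal sketch of the skipping pattern described above, assuming a cursor that is already positioned on its first term when it is created. PositionedCursor and both collect methods are hypothetical stand-ins written for illustration; they are not the Lucene TermEnum API or the actual LuceneDictionary code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipFirstTermDemo {

    /** Hypothetical stand-in for an enumeration that starts ON its first term. */
    static final class PositionedCursor {
        private final List<String> terms;
        private int pos = 0; // already positioned on the first term

        PositionedCursor(List<String> terms) {
            this.terms = terms;
        }

        /** Current term, or null once the cursor has run past the end. */
        String term() {
            return pos < terms.size() ? terms.get(pos) : null;
        }

        /** Advances to the next term; returns false when exhausted. */
        boolean next() {
            pos++;
            return pos < terms.size();
        }
    }

    /** Pattern from the report: advance first, then read. Drops the first term. */
    static List<String> collectAdvanceFirst(List<String> terms) {
        PositionedCursor cursor = new PositionedCursor(terms);
        List<String> out = new ArrayList<>();
        while (cursor.next()) {
            out.add(cursor.term());
        }
        return out;
    }

    /** Alternative: read the current term before advancing. */
    static List<String> collectReadFirst(List<String> terms) {
        PositionedCursor cursor = new PositionedCursor(terms);
        List<String> out = new ArrayList<>();
        while (cursor.term() != null) {
            out.add(cursor.term());
            cursor.next();
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("eight", "five", "four");
        System.out.println(collectAdvanceFirst(index)); // [five, four]
        System.out.println(collectReadFirst(index));    // [eight, five, four]
    }
}
```

With "eight" as the first entry, the advance-first loop never sees it, which matches the failure of the suggested test.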


--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462042 ]

Steven Parkes commented on LUCENE-763:
--------------------------------------

I was wondering about something very similar just recently: to call TermEnum.next() or not to call TermEnum.next() to get the first term. However, in my case I use terms() rather than terms( Term ) and there's the rub.

After looking through things, there looks to be an inconsistency between the two cases. terms( Term ) seeks such that the new TermEnum object is ready. On the other hand, terms() leaves the enum state "before" the first term: you need to call next() first and calling term() earlier will return null.

I've only tried this against SegmentReader#terms(...).

This difference of behaviour isn't mentioned in the documentation.
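
The two starting states described above can be sketched as follows. This is only an illustration under stated assumptions: Cursor, beforeFirst and onFirst are hypothetical names that mimic the behaviour reported for terms() and terms( Term ); they are not Lucene classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CursorStartDemo {

    /** Hypothetical cursor; pos is -1 for "before first" and 0 for "on first". */
    static final class Cursor {
        private final List<String> terms;
        private int pos;

        private Cursor(List<String> terms, int startPos) {
            this.terms = terms;
            this.pos = startPos;
        }

        /** Mimics the reported terms() behaviour: nothing is current until next(). */
        static Cursor beforeFirst(List<String> terms) {
            return new Cursor(terms, -1);
        }

        /** Mimics the reported terms( Term ) behaviour: the first term is current. */
        static Cursor onFirst(List<String> terms) {
            return new Cursor(terms, 0);
        }

        String term() {
            return (pos >= 0 && pos < terms.size()) ? terms.get(pos) : null;
        }

        boolean next() {
            if (pos < terms.size()) {
                pos++;
            }
            return pos < terms.size();
        }
    }

    /** Loop shape for a cursor that starts before the first term. */
    static List<String> drainBeforeFirst(List<String> terms) {
        Cursor c = Cursor.beforeFirst(terms);
        List<String> out = new ArrayList<>();
        while (c.next()) {
            out.add(c.term());
        }
        return out;
    }

    /** Loop shape for a cursor that starts on the first term. */
    static List<String> drainOnFirst(List<String> terms) {
        Cursor c = Cursor.onFirst(terms);
        List<String> out = new ArrayList<>();
        if (c.term() == null) {
            return out; // empty enumeration
        }
        do {
            out.add(c.term());
        } while (c.next());
        return out;
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("apple", "banana");
        System.out.println(Cursor.beforeFirst(index).term()); // null
        System.out.println(Cursor.onFirst(index).term());     // apple
        System.out.println(drainBeforeFirst(index));          // [apple, banana]
        System.out.println(drainOnFirst(index));              // [apple, banana]
    }
}
```

Each loop shape is correct only for its own starting state, which is why swapping them silently drops or misreads the first term.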

It would be nice to have the same behaviour for both calls, but I'm a little worried that half the existing code would break. Should we just document the existing behaviour?

In that case, the spell checker does just need to get rid of the extra next() call.

While investigating, I noticed there are several other issues around the spell checker now, both the functional code and test code. It plays a bit fast and loose with when index readers and writers are opened. Perhaps it used to work, depending on when things got flushed to disk, but it doesn't work for me now under the trunk.

[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12462054 ]

Dan Ertman commented on LUCENE-763:
-----------------------------------

Ah, that makes sense. So terms() basically behaves like a ResultSet: the marker is before the first entry when initialized. Unfortunately, SpellChecker uses the other one, terms(Term).

[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:
--------------------------------------

    Attachment: LuceneDictionary.java

This is a fixed LuceneDictionary.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:
--------------------------------------

    Attachment: TestLuceneDictionary.java

This is a unit test case for LuceneDictionary that makes sure it doesn't skip any of the words in the original index.

[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500051 ]

Christian Mallwitz commented on LUCENE-763:
-------------------------------------------

I have added a fixed LuceneDictionary.java and a unit test case for it which should go to
   contrib/spellchecker/src/java/org/apache/lucene/search/spell/LuceneDictionary.java
and
   contrib/spellchecker/src/test/org/apache/lucene/search/spell/TestLuceneDictionary.java
respectively.

This is on top of the current lucene-trunk.

Cheers
Christian


[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500230 ]

Daniel Naber commented on LUCENE-763:
-------------------------------------

Thanks for your patch. I think there's a problem with the iterator which might not occur often, but it should be fixed nonetheless: calling next() only has an effect if hasNext() has been called before. You can see that by commenting out "assertTrue("Second element doesn't exist.", it.hasNext());" in the test case: the test will then fail, although, to my understanding, hasNext() should have no side effects. Could you change your patch accordingly?
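
One common way to meet this requirement is a look-ahead buffer guarded by a boolean flag, sketched below. This is an assumption about the general technique, not the code in the attached patch; WordIterator and its fields are hypothetical, and a plain list stands in for the index terms (assumed to contain no null entries).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LookAheadIteratorDemo {

    /** Hypothetical iterator over words; the list stands in for the index terms. */
    static final class WordIterator implements Iterator<String> {
        private final List<String> words;
        private int pos = -1;            // index of the last word fetched
        private boolean fetched = false; // true while 'pending' is an unconsumed look-ahead
        private String pending;          // null means the source is exhausted

        WordIterator(List<String> words) {
            this.words = words;
        }

        /** Loads the next word once; repeated calls do nothing until it is consumed. */
        private void fetchIfNeeded() {
            if (fetched) {
                return;
            }
            pos++;
            pending = pos < words.size() ? words.get(pos) : null;
            fetched = true;
        }

        @Override
        public boolean hasNext() {
            fetchIfNeeded();
            return pending != null;
        }

        @Override
        public String next() {
            fetchIfNeeded();
            if (pending == null) {
                throw new NoSuchElementException();
            }
            fetched = false; // the look-ahead has now been consumed
            return pending;
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

    /** Calls next() n times without ever calling hasNext(). */
    static List<String> takeWithoutHasNext(List<String> words, int n) {
        Iterator<String> it = new WordIterator(words);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(it.next());
        }
        return out;
    }

    /** Calls hasNext() several times, then returns the first next(). */
    static String firstAfterRepeatedHasNext(List<String> words, int calls) {
        Iterator<String> it = new WordIterator(words);
        for (int i = 0; i < calls; i++) {
            it.hasNext();
        }
        return it.next();
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("eight", "five");
        System.out.println(takeWithoutHasNext(index, 2));        // [eight, five]
        System.out.println(firstAfterRepeatedHasNext(index, 3)); // eight
    }
}
```

Because both methods go through fetchIfNeeded(), next() advances on its own and any number of hasNext() calls leave the position unchanged.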


[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:
--------------------------------------

    Attachment:     (was: LuceneDictionary.java)

[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:
--------------------------------------

    Attachment:     (was: TestLuceneDictionary.java)

[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:
--------------------------------------

    Attachment: TestLuceneDictionary.java

New extended unit test case for class LuceneDictionary

[jira] Updated: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Mallwitz updated LUCENE-763:
--------------------------------------

    Attachment: LuceneDictionary.java

Fixed class LuceneDictionary

Re: [jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

Christian Mallwitz
In reply to this post by JIRA jira@apache.org
I knew the boolean flag which was in the class in the first place was
used for something ... :-)

Anyway, I have uploaded updated class and unit test files.

Thanks
Christian


[jira] Closed: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

     [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Naber closed LUCENE-763.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 2.2

Thanks, patch applied.


[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500521 ]

Steven Parkes commented on LUCENE-763:
--------------------------------------

Can we also update the javadocs to reflect the different semantics between terms() and terms(term)? Here's some possible verbiage. (It also tweaks the "after the given term" wording, which I think isn't correct?)

{noformat}
Index: src/java/org/apache/lucene/index/IndexReader.java
===================================================================
--- src/java/org/apache/lucene/index/IndexReader.java   (revision 543284)
+++ src/java/org/apache/lucene/index/IndexReader.java   (working copy)
@@ -539,16 +539,21 @@
     setNorm(doc, field, Similarity.encodeNorm(value));
   }
 
-  /** Returns an enumeration of all the terms in the index.
-   * The enumeration is ordered by Term.compareTo().  Each term
-   * is greater than all that precede it in the enumeration.
+  /** Returns an enumeration of all the terms in the index.  The
+   * enumeration is ordered by Term.compareTo().  Each term is greater
+   * than all that precede it in the enumeration.  Note that after
+   * calling {@link #terms()}, {@link TermEnum#next()} must be called
+   * on the resulting enumeration before calling other methods such as
+   * {@link TermEnum#term()}.
    * @throws IOException if there is a low-level IO error
    */
   public abstract TermEnum terms() throws IOException;
 
-  /** Returns an enumeration of all terms after a given term.
-   * The enumeration is ordered by Term.compareTo().  Each term
-   * is greater than all that precede it in the enumeration.
+  /** Returns an enumeration of all terms starting at a given term. If
+   * the given term does not exist, the enumeration is positioned at the
+   * first term greater than the supplied term.  The enumeration is
+   * ordered by Term.compareTo().  Each term is greater than all that
+   * precede it in the enumeration.
    * @throws IOException if there is a low-level IO error
    */
   public abstract TermEnum terms(Term t) throws IOException;
{noformat}
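
The "starting at a given term" wording proposed for terms(Term) can be illustrated with a sorted set. This is a sketch under assumptions: plain strings in natural order stand in for Term objects ordered by Term.compareTo(), and seek is a hypothetical helper, not a Lucene method.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class SeekDemo {

    /**
     * Returns the entry a seek to 'target' would be positioned on:
     * the target itself if present, else the first greater entry,
     * else null when nothing at or after the target exists.
     */
    static String seek(TreeSet<String> terms, String target) {
        return terms.ceiling(target);
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList("apple", "banana", "cherry"));
        System.out.println(seek(terms, "banana"));    // banana
        System.out.println(seek(terms, "blueberry")); // cherry
        System.out.println(seek(terms, "zebra"));     // null
        // Everything from the seek position onward, in order:
        System.out.println(terms.tailSet("blueberry", true)); // [cherry]
    }
}
```

An existing term is returned as the starting position rather than skipped, which is the difference between "starting at" and "after" a given term.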


[jira] Commented: (LUCENE-763) LuceneDictionary skips first word in enumeration

JIRA jira@apache.org
In reply to this post by JIRA jira@apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500863 ]

Daniel Naber commented on LUCENE-763:
-------------------------------------

Thanks, Steven. Your javadoc changes have also been committed now.

